TCS Daily

Is the Hockey Stick Broken?

By Willie Soon and David Legates - October 27, 2004 12:00 AM

It's dubbed the hockey stick. It is a rather simple-looking graph -- with a long, stable shaft and a fast-rising blade -- that purports to represent averaged Northern Hemisphere temperatures over the last thousand years. More than that, in global climate reports -- particularly the United Nations' Intergovernmental Panel on Climate Change (IPCC) Third Assessment Report (TAR) in 2001 -- it's used as proof that mankind's industrial revolution has, over the last hundred years, started dangerously pushing up global temperatures, thus justifying restrictions on emissions of human-produced greenhouse gases.

But there's a problem. The hockey stick may well be broken.

A research paper recently published in the journal Science by Professor Hans von Storch and colleagues has found significant problems with the hockey stick. Von Storch, the leader of the research team at the Institute of Coastal Research at Geesthacht, Germany, calls the hockey stick "junk" or "rubbish."[1]

Figure 1: The cartoon of the 1,000-year temperature history as reported in the 2001 UN IPCC's TAR based on work by Mann, Bradley and Hughes (1998, 1999) [as MBH98, MBH99 here]. The curve resembles a hockey stick with little change from 1000-1900 (the shaft) and then a sharp warming trend from 1900-2000 (the blade). This superposition of the hockey stick image is strictly to enhance the popular discussion here.

How important is that in the debate about what, if anything, the world should do about climate change? Perhaps quite a bit.

A little background is needed here.

The IPCC hockey stick was originally produced by Michael Mann, Raymond Bradley and Malcolm Hughes, first in 1998 for the period 1400-1980. Then, in 1999, with no major progress in the science or database, it was quickly expanded to the full 1000-1980 interval. We will call the two studies MBH98 and MBH99 hereafter.

Now, since no thermometer readings were available for almost 850 of the 1,000 years, MBH98 and MBH99 first selected what were the then-available temperature proxies. Those proxies were based on tree-growth, coral and ice core records from about 105 sites across the globe. To verify the accuracy of the temperature data derived from those proxies, the methodology of MBH98 and MBH99 tested recent proxy data to see if it fit the available geographical patterns of temperature observed by available thermometer measurements for the last 80-150 years or so. Those proxies that did not fit the pattern were essentially ignored. For those that did, it was assumed that the same geographical pattern of change seen in the very short thermometer record would hold true for the full 1,000 years into the past.

It was with those two critical but unsubstantiated steps that MBH99 then came up with the well-known IPCC hockey stick temperature averaged for the Northern Hemisphere (Figure 1).

Since 1999, several climate researchers have challenged those underlying assumptions for deriving the hockey stick, but with little effect on limiting the hockey stick's use as an illustration purportedly helping prove human-induced global warming.

The heavy criticism by Von Storch and colleagues in Science may change that. It exposes a clear methodological problem in the MBH99 hockey stick rendition of the 1000-year Northern Hemisphere temperature history. That rendition improperly smoothed out large temperature variations over the 1000-1900 interval that made up the supposedly stable shaft of the hockey stick, as seen in Figure 1.

How did Von Storch and colleagues show this?

As the researchers explained it in Science, they used computer climate model outputs to generate the test temperature data series at a number of locations on Earth. By using such computer model climate outputs, rather than actual temperature proxy records, the researchers could take advantage of a completely known set of information about temperature variability in every location on Earth (true, of course, only in a computer model) to perform various sensitivity tests. These tests could then show how different combinations of locations for available test temperature data series throughout the 1000-year climate history could systematically influence or bias the statistical reconstruction of the Northern Hemisphere-wide temperature record. Biased results from different statistical methods of combining the computer-produced temperature data for the selected geographical locations could then be compared with the benchmark model results that captured the full range of variability, accurately averaged over the whole Northern Hemisphere.

Von Storch and colleagues added confidence to their adopted methodology by showing that their primary conclusion and results held true using the sophisticated computer climate models from both the Max Planck Institute for Meteorology (results labeled ECHO-G or ECHO-G II in Figures 2 and 3) and the U.K. Meteorological Office's Hadley Centre. (The Hadley Centre results are not shown here but appear in Von Storch and colleagues' paper.)

As expert reviewers commenting on Von Storch et al.'s paper for Science magazine noted:

"Accepting Von Storch et al.'s results does not mean that we must also accept that their simulated temperature history is close to reality -- merely that it is a reasonable representation of climate behavior for which any valid reconstruction method should perform adequately."

In short, the temperature records from the computer model over the last 1,000 years need not be absolutely correct as long as the model outputs can be shown to simulate observed reality in recent years reasonably well. And as can be seen in the top panel of Figure 2, that was the case.

Von Storch et al. show in the top panel of Figure 2 that their ECHO-G computer simulation yields realistic temperature variability similar to the observed Northern Hemisphere temperature (labeled NCEP Reanalysis) in the test period 1948-1990. By contrast, the hockey stick curve, adopted by the IPCC TAR (curve labeled Mann et al. 1999 or MBH99 in the bottom panel of Figure 2), already underestimates the observed range of temperature change from 1948-1980.

Figure 2: The contrast between the good agreement (top panel) in simulated (curves labeled ECHO-G and ECHO-G II in the top panel) and observed (curve labeled NCEP Reanalysis) Northern Hemisphere temperatures and the poor agreement (bottom panel) in reconstructed (curve labeled Mann et al. 1999 or MBH99 in the bottom panel) and observed (curve labeled NCEP Reanalysis) Northern Hemisphere temperatures. Von Storch and colleagues reported in their new Science paper that "the standard deviation of the MBH99 reconstructed NH [Northern Hemisphere] temperature in the period 1948-1980 is 0.9 K, compared to 1.4 K derived from NCEP [data]." In other words, the hockey stick method adopted by the IPCC TAR significantly flattens or misses observed trends in available instrumental data (1948-1980).

Going back through past centuries, Von Storch and colleagues identified a peculiar problem with the method used to develop the hockey stick. Using the same statistical averaging method on the computer-simulated temperature sampled at the 105 sites where the original hockey stick proxies were taken, and adding a realistic estimate of error and uncertainty of temperature at each geographical location, Von Storch and colleagues came up with a remarkable finding. As illustrated by the orange curve in Figure 3, the finding agreed reasonably well with the original hockey stick results represented by the blue curve.

What does such agreement mean? Not confirmation of MBH99's results, but a refutation. The orange curve was built from model temperatures whose true variability is fully known, so its flatness can only come from the method. As Von Storch and colleagues noted, the hockey stick methodology systematically and significantly underestimated the full range of temperature variability of the last 1,000 years, as represented by the black curve in Figure 3. By ignoring the hockey stick's rules for statistical averaging and instead computing the simple arithmetic average of the temperatures from the same 105 sites, Von Storch and colleagues produced a temperature curve that agreed well with the full range of variability, as shown in that benchmark black curve.
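The logic of such a pseudo-proxy test can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not the ECHO-G experiment and not the MBH99 algorithm: the "true" hemispheric temperature is a hypothetical red-noise (AR(1)) series, each of the 105 pseudo-proxies is that series plus independent white noise, and the "regression" reconstruction simply regresses the known temperature on each proxy over a 100-year calibration window and averages the predictions. All names and parameter values (`ar1`, `pseudo_proxy_experiment`, `noise_sd`, and so on) are invented for the example.

```python
import numpy as np


def ar1(n, rho, sigma, rng):
    """Red-noise series x[t] = rho * x[t-1] + e[t], with e ~ N(0, sigma^2)."""
    x = np.zeros(n)
    shocks = rng.normal(0.0, sigma, n)
    for t in range(1, n):
        x[t] = rho * x[t - 1] + shocks[t]
    return x


def pseudo_proxy_experiment(n_years=1000, n_sites=105, n_calib=100,
                            noise_sd=0.3, seed=0):
    """Return (std ratio of regression recon, std ratio of simple average),
    each relative to the known 'true' series over the pre-calibration years."""
    rng = np.random.default_rng(seed)

    # Known benchmark: a hypothetical hemispheric-mean temperature series.
    truth = ar1(n_years, rho=0.9, sigma=0.1, rng=rng)

    # Pseudo-proxies: the benchmark plus independent local noise at each site.
    proxies = truth[:, None] + rng.normal(0.0, noise_sd, (n_years, n_sites))

    calib = slice(n_years - n_calib, n_years)   # years with "thermometers"
    past = slice(0, n_years - n_calib)          # years to be reconstructed

    # Stand-in for a regression-based method: calibrate each proxy against
    # the benchmark in the calibration window, then average the predictions.
    recon_reg = np.zeros(n_years)
    for j in range(n_sites):
        slope, intercept = np.polyfit(proxies[calib, j], truth[calib], 1)
        recon_reg += intercept + slope * proxies[:, j]
    recon_reg /= n_sites

    # Simple arithmetic average of the same pseudo-proxies.
    recon_avg = proxies.mean(axis=1)

    sd_truth = truth[past].std()
    return recon_reg[past].std() / sd_truth, recon_avg[past].std() / sd_truth


if __name__ == "__main__":
    ratio_reg, ratio_avg = pseudo_proxy_experiment()
    print(f"regression recon / truth std: {ratio_reg:.2f}")
    print(f"simple average  / truth std: {ratio_avg:.2f}")
```

In this toy setup the noise in each proxy pulls the calibrated regression slope toward zero, so the averaged reconstruction varies less than the benchmark, while the plain average of the same proxies keeps roughly the benchmark's variability. How closely that mirrors the behavior of the actual reconstruction method depends on details the sketch does not model.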

Figure 3: The approximate agreement between the hockey stick curve (blue curve calculated by Mann, Bradley and Hughes, 1999-MBH99) and the statistical reconstruction (orange curve) from climate model outputs of Von Storch et al. after adopting the methodology of MBH99. Temperature variability over several decades to a century is greatly underestimated by the methodology in MBH99, compared to the large changes (black curve) simulated in their climate model ECHO-G. The September 27 press release by Von Storch and colleagues further notes that "the associated error bars [grey areas encasing the orange curve] from the reconstruction methods [i.e., as derived by the methodology of MBH99] are inaccurate."

Bottom line: the large underestimation of temperature change seen in Figure 3 (contrasting the blue and orange curves with the black curve) is mainly an artifact of the hockey stick's averaging rules.

The authors of the Science paper put it this way:

"widely-used methods [e.g., IPCC TAR 2001] to reconstruct past global climate variations ... probably underestimate the amplitude of the real variations by a factor of up to two, and possibly more."

Von Storch bluntly summed up his results with the following comment reported in Der Spiegel on October 4:

"We were able to show in a publication in Science that this [hockey stick] graph contains assumptions that are not permissible. Methodologically it is wrong: Rubbish [or Junk]."[1]

Von Storch and colleagues aren't the only ones who've reached that conclusion.

In a recent popular article in MIT's Technology Review, Professor Richard Muller, drawing on the careful re-assessment and checking by two independent Canadian researchers, Stephen McIntyre and Ross McKitrick, highlighted another very serious methodological problem in the IPCC rendition of the 1,000-year hockey stick temperature history: under the hockey stick methodology, the hockey stick shape of the temperature history curve can be generated automatically from random data series at each location (i.e., in contrast to data series from actual climate proxies or computer model outputs).
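A toy version of that random-data test can be written in Python. The article does not spell out the mechanism, so the sketch rests on an assumption drawn from McIntyre and McKitrick's published critique: that the principal-component step centered each series on its mean over the recent calibration period rather than over the full record. The code below is not the MBH code or the McIntyre-McKitrick code. It generates hypothetical red-noise series, extracts the leading principal component under both centering choices, and scores each with a simple "hockey stick index" (how far the mean of the last years sits from the overall mean, in standard deviations). Function names and parameter values are invented for illustration.

```python
import numpy as np


def red_noise(n_years, n_series, rho, rng):
    """Independent AR(1) series in columns: x[t] = rho * x[t-1] + e[t]."""
    x = np.zeros((n_years, n_series))
    shocks = rng.normal(size=(n_years, n_series))
    for t in range(1, n_years):
        x[t] = rho * x[t - 1] + shocks[t]
    return x


def first_pc(data, center_rows):
    """Leading principal-component time series after subtracting from each
    column its mean over the rows selected by center_rows."""
    centered = data - data[center_rows].mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, 0] * s[0]


def hockey_stick_index(series, n_calib):
    """|mean of last n_calib values - overall mean| / overall std."""
    return abs(series[-n_calib:].mean() - series.mean()) / series.std()


def compare_centering(n_trials=10, n_years=600, n_series=50,
                      n_calib=80, rho=0.9, seed=0):
    """Average hockey stick index of PC1 for full vs. short centering."""
    rng = np.random.default_rng(seed)
    full, short = [], []
    for _ in range(n_trials):
        data = red_noise(n_years, n_series, rho, rng)
        # Conventional PCA: center on the mean of the whole record.
        pc_full = first_pc(data, slice(None))
        # Assumed "short" centering: center on the last n_calib years only.
        pc_short = first_pc(data, slice(-n_calib, None))
        full.append(hockey_stick_index(pc_full, n_calib))
        short.append(hockey_stick_index(pc_short, n_calib))
    return float(np.mean(full)), float(np.mean(short))


if __name__ == "__main__":
    full_hsi, short_hsi = compare_centering()
    print(f"mean hockey stick index, full centering:  {full_hsi:.2f}")
    print(f"mean hockey stick index, short centering: {short_hsi:.2f}")
```

The intuition is that centering on a short recent window makes any series whose recent mean happens to differ from its long-term mean look unusually variable, so the leading component leans on exactly those series and inherits a bent end. The sketch contains no climate signal at all, which is the point of the random-data check.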

Muller remarked that:

"A phony hockey stick is more dangerous than a broken one -- if we know it is broken. It is our responsibility as scientists to look at the data in an unbiased way, and draw whatever conclusions follow. When we discover a mistake, we admit it, learn from it, and perhaps discover once again the value of caution."

In short, the new paper in Science by Von Storch and colleagues confirms what several other climate researchers have long maintained. The hockey stick curve -- which is a mathematical construct, as opposed to actual temperature information recorded at individual locations -- is problematic because it yields air temperature changes on timescales of a few decades to a century that are simply too muted to fit the phenomena of the Medieval Warm Period (ca. 800-1300) and Little Ice Age (ca. 1300-1900). Both are well recorded in historical documents and recognized in indirect climate data from tree-ring and coral growth and from the isotopic content of ice cores and stalagmites collected around the world.

This is traditional science, with results from one group tested by others. What makes this case important, though, was explained by Von Storch in Der Spiegel:

"The Mann graph [i.e., the hockey stick of IPCC TAR] indicates that it was never warmer during the last ten thousand years than it is today. ... In recent years it [the hockey stick] has been elevated to the status of truth by the UN appointed science body, the Intergovernmental Panel on Climate Change (IPCC). This handicapped all that research which strives to make a realistic distinction between human influences and climate and natural variability."


[1] From the German phrase "Quatsch"

Willie Soon is a physicist at the Solar, Stellar, and Planetary Sciences Division of the Harvard-Smithsonian Center for Astrophysics and an astronomer at the Mount Wilson Observatory. His book (with Steven Yaskell) The Maunder Minimum and the Variable Sun-Earth Connection was published by World Scientific Publishing Company earlier this year. David Legates is Associate Professor in Climatology in the Center for Climatic Research at the University of Delaware. The views presented here by Soon and Legates are solely their own and do not represent the views of the institutions where they work.

