Tales from the Thermometer

Global warming is the “hottest” environmental issue of the day, quite possibly of all time. Yet it’s increasingly clear that most people, even those who are passionate about the issue (on both sides), aren’t very well-informed about what earth’s temperature is doing, what it has done in the past, and what it’s likely to do in the future. There’s quite a gap between what most people know about the subject, and what people need to know.

I’d like to discuss the history of earth’s temperature according to thermometers.

Worldwide, there are many organizations that are “keepers of the thermometer.” That’s not to say there aren’t other excellent record-keepers and researchers, but a handful lead the field. Temperature measurements are made every day, every night, every season of every year ad infinitum, from thousands upon thousands of locations over land and sea. These groups have collected past observations and checked and double-checked them, so that errors can be corrected when possible and discarded when not. They determine the changes over time, and the differences between different regions of the earth. And they make their data freely available on the web. Five are best-known and most often used:

  • HadCRUT: The Hadley Centre/Climatic Research Unit in the U.K.
  • GISTEMP: From the Goddard Institute for Space Studies (GISS), part of NASA.
  • NOAA: the National Oceanic and Atmospheric Administration
  • Berkeley Earth: The Berkeley Earth Surface Temperature project
  • Cowtan & Way: from researchers in the U.K. and Canada

    Temperature Anomaly

    What climate researchers are most interested in is temperature change. Also, temperature differences can generally be measured with much greater precision and reliability than absolute temperature. So, what is usually studied is not temperature itself, but temperature anomaly.

    Temperature anomaly is just the difference between the temperature and what it used to be at the same time of year during some “reference period,” called the baseline period. As an example, for HadCRUT the reference period is 1961 through 1990, while for NASA GISS it’s 1951 through 1980. If it’s hotter now than during the reference period, the temperature anomaly is positive; if it’s colder now, the anomaly is negative.

    Anomaly doesn’t just isolate temperature change from temperature itself. It also eliminates the yearly cycle of the seasons. After all, we’re not really interested in the fact that summer is hotter than winter; we already knew that. We’re most interested in whether this summer is hotter or colder than a typical summer used to be (during the baseline period). Defining anomaly as the difference from the baseline average at the same time of year removes the seasonal cycle, so we can focus on meaningful changes over time rather than the seasonal changes we already knew about.
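To make the arithmetic concrete, here’s a small Python sketch of computing anomalies. The numbers are invented toy values (a seasonal cycle plus a small trend), not real observations; only the bookkeeping is the point:

```python
import numpy as np

# Hypothetical toy data: three years of monthly temperatures (deg C)
# with a seasonal cycle plus a small warming trend.
months = np.arange(36)
temps = 10.0 + 8.0 * np.sin(2 * np.pi * months / 12) + 0.02 * months

# Baseline climatology: the average for each calendar month over a
# reference period (here, the first two years stand in for it).
baseline = temps[:24].reshape(2, 12).mean(axis=0)

# Anomaly = observed temperature minus the baseline mean for the SAME
# calendar month, which removes the seasonal cycle.
anomaly = temps - baseline[months % 12]
```

By construction the anomalies average to zero over the baseline period, and what remains afterward is the change over time.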

    Global estimates are computed by determining the geographic distribution of temperature anomaly worldwide, and averaging that to get the best estimate of global temperature anomaly. This is the data that will tell us whether or not earth’s temperature has changed, or is changing.

    Historical Thermometer Readings

    Until the latter part of the 19th century, we didn’t have enough temperature measurements from enough places to do the job. The HadCRUT time series runs from 1850 to the present, as do the Berkeley data and the Cowtan & Way data, while the NASA and NOAA data go from 1880 to today. Let’s look at the NASA data, starting with a graph very similar to the first one of its kind I saw: global average temperature anomaly, averaged throughout the year, for every year from 1880 through 2017. Each red bar represents the difference between the year’s temperature and the 1961-through-1990 average (the baseline period used for these graphs). The coldest year recorded is 1904, at -0.495; the hottest is 2016, at +0.986 (measured in degrees Celsius, °C).
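For readers who download the data themselves, picking out the coldest and hottest years is a one-liner once the annual series is loaded. The series below is an invented stand-in (so its extremes won’t match the real 1904 and 2016 values); it only shows the mechanics:

```python
import numpy as np

# Toy stand-in for the NASA annual anomaly series; real values come
# from the GISTEMP download.
rng = np.random.default_rng(7)
years = np.arange(1880, 2018)
anomaly = 0.007 * (years - 1880) - 0.3 + rng.normal(0, 0.1, years.size)

# The extremes are just an argmin/argmax away.
coldest_year = years[anomaly.argmin()]
hottest_year = years[anomaly.argmax()]
```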

    I’ll also show the same exact information in a different form. Here’s the global average yearly temperature anomaly as a line graph; each dot shows the average yearly temperature and I’ve connected them with a line.

    There’s lots of wiggling around, fluctuation from year to year. But superimposed on these year-to-year changes are some more persistent trends: slight warming from about 1915 to about 1940, a levelling off from then until 1970, and a sharp rise from 1970 to the present.

    These graphs (the forms most people have seen) are actually slowed-down versions of real temperature change. In real life, temperature changes from moment to moment. But by taking averages over 1-year periods, we have “slowed down” the changes, eliminating those that happen on timescales shorter than a year. We can go into greater time detail by looking at global average temperature, averaged not over every year, but over each month:

    The line represents the difference between the given month’s temperature, and the average for that month from 1961 to 1990. The (relatively) coldest month is January 1893, at -0.8, while the hottest is February 2016 at +1.34.
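The “slowing down” by yearly averaging described above is just block averaging. A sketch, again with invented data standing in for a real monthly record:

```python
import numpy as np

# Hypothetical monthly anomaly record: ten years of noisy months
# riding on a slow warming trend.
rng = np.random.default_rng(0)
monthly = 0.01 * np.arange(120) + rng.normal(0, 0.2, 120)

# Averaging each block of 12 months into one value removes variations
# faster than a year, leaving the slower changes.
annual = monthly.reshape(-1, 12).mean(axis=1)
```

The annual series jumps around much less from point to point than the monthly one, which is exactly the effect the averaging is meant to have.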

    The monthly averages show even more up-and-down fluctuation than the yearly averages. Each month is unique, with its own set of influences that are nearly impossible to predict. This leads to short-term, month-to-month variation that is essentially unpredictable but confined to a limited range. Such variations happen on very rapid timescales. I’ll call this very fast natural variation; it’s sometimes also called scatter.

    There are also fluctuations that have some persistence, but still happen rapidly. Such changes last way longer than a single month or two but not more than a few years. This is fast natural variation of global temperature, and it too has a limited range of variation. Sometimes we can figure out exactly why a particular fast natural variation has occurred.

    Finally, there are the slower, even more persistent changes. For example, the temperature is higher, on average, today than it was in the past; we noted the warming from about 1970 to the present, when in addition to scatter and fast natural variations, temperature also showed a steady rise. Such persistent changes are slow variations; we can also call them trends.

    Slow and Fast Changes

    It’s natural to want to separate the long-term (slow) changes from the short-term (fast) changes and the very-short-term (very fast) changes; in fact, we want to know the changes on all timescales. We’ve already seen one way to do this: by averaging over every year, we eliminated changes whose duration is much less than a year. Essentially we got rid of the very fast changes, giving us a better look at the fast changes (and the slow changes too).

    We can carry this technique further, by averaging over even longer timescales. If, instead of 1-year averages, we take 10-year averages, then we’ll eliminate not only the very fast variations, but the fast variations as well, leaving only the long-term changes: the trends. Plotting the slow 10-year averages in red and the fast 1-year averages in blue, we get this:
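The same block-averaging trick extends directly to 10-year blocks. A toy sketch (invented data, not the NASA record):

```python
import numpy as np

# Toy annual anomalies: a slow trend plus fast year-to-year noise,
# 140 "years" long.
rng = np.random.default_rng(1)
annual = 0.008 * np.arange(140) + rng.normal(0, 0.1, 140)

# Averaging over 10-year blocks removes the fast variation as well,
# leaving mostly the trend.
decadal = annual.reshape(-1, 10).mean(axis=1)
```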

    What Causes Changes?

    Many of Earth’s temperature changes, especially the very fast month-to-month changes, happen for reasons we don’t fully understand. But there are some things we do know, things that cause year-to-year changes which are sometimes quite prominent. Let’s look at NASA monthly average temperature data since about 1970:

    I’ve marked some notable events, with labels for “Mt. Pinatubo volcano” in blue and “El Niño” in red. That’s because we do know the reasons behind the downward dip in 1993 and the upward spikes in 1998 and 2016.

    A very large volcanic explosion throws a lot of junk into the atmosphere. If it’s big enough, it can inject material very high, into the stratosphere, and that junk can take years to settle out of the air. Among the things volcanoes emit are sulfur compounds, and chemical reactions in the atmosphere turn much of those into sulfates. Those in turn can assemble into aerosols: tiny particles in the air. Sulfate aerosols tend to be bright, scattering some of the incoming sunlight right back to space, so they block part of the sunlight from reaching earth’s surface. Since sunlight is the ultimate source of energy for Earth’s climate, reducing the incoming sunlight has a net cooling effect. The Mt. Pinatubo explosion caused a temporary cooling of the globe for a few years afterward.

    El Niño is a spreading out of the pool of warm water in the eastern-central equatorial Pacific ocean. When it spreads out, it’s exposed to more air, releasing more of its heat into the atmosphere and warming the climate. Both 1998 and 2016 brought some of the strongest El Niño events ever recorded, making those years extra hot (even for the time).

    Those changes are only temporary, and although they may be rare, they do happen again and again. As such, they’re not climate change; it would be more appropriate to refer to them as “weather,” and more scientific to call them “decadal variability.”


    Another way to isolate long-term change from short-term change is to smooth the data by fitting a mathematical curve, tuned so that it “smooths out” fluctuations faster than the timescale we’re interested in.

    The figure below shows two smooth-fit curves to the data, one tuned for 10-year changes (red) and another for 1-year changes (blue). The smallest details differ from the graph of averages, but the results are essentially the same: fast natural variations like El Niño and Mt. Pinatubo, and two episodes of warming, from about 1915 to 1940 and from about 1970 to the present.
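One simple way to build such a tunable smooth is a Gaussian-weighted moving average. This is a stand-in for illustration only, not the curve-fitting method actually used for the figures; the data are invented:

```python
import numpy as np

def smooth(t, y, timescale):
    """Gaussian-weighted moving average.  `timescale` (same units as t)
    controls which fluctuations get smoothed away."""
    out = np.empty(y.size)
    for i, ti in enumerate(t):
        w = np.exp(-0.5 * ((t - ti) / timescale) ** 2)
        out[i] = np.sum(w * y) / np.sum(w)
    return out

# Toy series: slow trend + fast wiggles + noise.
rng = np.random.default_rng(2)
t = np.arange(1880, 2020, 1.0)
y = 0.008 * (t - 1880) + 0.1 * np.sin(t) + rng.normal(0, 0.1, t.size)

slow = smooth(t, y, timescale=10.0)  # keeps only the slow changes
fast = smooth(t, y, timescale=1.0)   # keeps the fast variation too
```

The short-timescale curve follows the data closely, while the long-timescale curve irons out everything but the trend.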

    Even the slow curve still shows wiggles that aren’t really part of climate change. To isolate climate from weather, we need an even longer smoothing timescale. The usual choice is 30 years, because that has proved a useful choice in recent history (though it was intended for studying climate, not climate change). In my opinion, a better choice is a 20-year smoothing timescale. Doing that gives us this:


    The smooth curve isn’t the exact climate signal isolated from the weather fluctuations, it’s only an estimate. Therefore there is uncertainty associated with it. One of the benefits of mathematics is that it not only enables us to make this estimate, it also gives us some idea of how uncertain our estimate is. Here’s another version of the previous graph:

    There are two additional red lines, above and below the smooth-fit estimate, showing the probable range in which the actual climate signal value lies. The true value could be outside those limits, but probably not. If “probably” and “uncertainty” make you a bit uncomfortable, accept the fact that there’s no getting around it. That’s the nature of measurement and estimation in science — almost nothing is certain, but with skill and luck we can at least determine roughly how uncertain our estimates are.
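As a rough illustration of how such a band can be estimated, here is a deliberately simplified sketch: a moving average plus a band built from the residual scatter. The real analysis is more careful (for one thing, autocorrelation widens the band), and the data here are invented:

```python
import numpy as np

# Toy anomaly series: trend plus noise.
rng = np.random.default_rng(3)
y = 0.005 * np.arange(200) + rng.normal(0, 0.15, 200)

# A centered 21-point moving average stands in for the smooth estimate.
k = 21
est = np.convolve(y, np.ones(k) / k, mode="valid")

# A rough uncertainty band: residual scatter around the estimate,
# reduced by sqrt(window size), times ~2 for a roughly 95% range.
resid = y[k // 2 : k // 2 + est.size] - est
half_width = 2.0 * resid.std() / np.sqrt(k)
lower, upper = est - half_width, est + half_width
```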

    Here’s another view; it isn’t necessarily the most scientifically precise, but I find it appealing:

    Even Earlier

    We can’t estimate global temperature from thermometers before about 1850, because there aren’t enough measurements from enough places. But there are some isolated places with longer temperature records than that. The longest of all is the Central England Temperature (CET).

    Central England Temperature (CET) has been reconstructed from thermometer measurements from 1659 to the present. The earliest data, from 1659 to 1670, are so much less precise than later data that monthly average temperature is only recorded to the nearest degree Celsius. From 1671 to late 1722, temperature is estimated to the nearest half a degree, except for a brief episode from 1699 to 1706 when data are more precise, to a tenth of a degree. Since late 1722 CET has been recorded as average monthly temperature to the nearest tenth of a degree.

    In those early times, weather observations were not yet made systematically by weather services. The Central England Temperature is computed by combining several weather stations in England. In the beginning few stations were available, and from 1707 to 1722 even data from Delft in The Netherlands had to be used.

    And what do the data say? This:

    This is temperature, not temperature anomaly, so the cycle of the seasons is still present. Here it is with the seasonal cycle removed to form temperature anomaly:

    The long-term pattern isn’t immediately obvious. In large part that’s because the level of fluctuation, of random scatter, tends to be a lot bigger for a single location than for the entire Earth.

    Let’s smooth it with a mathematical curve, in the hope of seeing the long-term pattern better:

    The smooth fit (the red line) does show some signs of pattern. This is far more obvious if we expand the graph to show more detail, with just the smooth fit:

    Now we can definitely see patterns. The most recent period (from 1990 to the present) has the highest average temperature anomaly by far, and the change since the previous 20-year average (1970 to 1989) is the largest change in the entire CET record. If we stick to the more reliable part of the data, from 1723 to the present, we see that the smoothed values (an estimate of the climate part rather than the weather) never got more than 0.23 degrees hotter or colder than zero until 1940. The most recent value is 1.2. The recent rapid rise doesn’t look like what’s happened before.

    But this is only indicative, not conclusive, because it’s only one small part of the globe. To estimate global temperature before the age of thermometry, we have to estimate temperature from other data. These are called proxy data, and come in many forms: tree rings, microfossils, borehole temperature profiles, coral reefs, ice cores, and many more. The best approach is to take all the available temperature proxy data and combine it with the best statistical tools. Then we can hope to see the temperature history of the earth over the last few thousand years. But that is the subject of a future post.

    You can get global temperature data yourself online:

    Berkeley Earth
    Cowtan & Way

    It’s not always obvious how to access and process the data. We’ll cover some of that in future posts.

  • 7 thoughts on “Tales from the Thermometer”

    1. In future, there might want to be a discussion of dependency. That is, it’s different if you have a collection of measurements which, by hook or by crook, are taken pretty much independently of one another, and a collection which is taken in series. So, if an estimate of variability is desired, whether a variance or a standard deviation of some kind, how this gets done really matters. Simply taking the variability of the estimate used for independent observations is misleading and will be too low. There are devices, such as the Politis and Romano stationary bootstrap, which can be enlisted to sample these statistics at varying window sizes which are constrained to have a mean window length. Provably, these estimate the long-term variability of the signal much better.

      And, I suppose, the next level up is an appreciation of stationarity. While that is technically important, in many practical cases it is, in my opinion, a red herring. Because most natural systems have continuity in observables and their derivatives, if stationarity is violated, it is violated smoothly. So, to some extent, depending upon the parameter, stationarity might as well be true, even if it isn’t technically.
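For readers curious about the mechanics, here is a minimal numpy sketch of the Politis-Romano stationary bootstrap the comment mentions. The series, block length, and resample count are illustrative choices, not recommendations:

```python
import numpy as np

def stationary_bootstrap(x, mean_block_len, n_resamples, seed=None):
    """Politis-Romano stationary bootstrap of the mean: resample the
    series in wrap-around blocks whose lengths are geometric with the
    given mean, preserving serial dependence within each block."""
    rng = np.random.default_rng(seed)
    n = len(x)
    p = 1.0 / mean_block_len
    stats = np.empty(n_resamples)
    for b in range(n_resamples):
        idx = np.empty(n, dtype=int)
        i = 0
        while i < n:
            start = rng.integers(n)                    # random block start
            length = min(rng.geometric(p), n - i)      # geometric length
            idx[i : i + length] = (start + np.arange(length)) % n
            i += length
        stats[b] = x[idx].mean()
    return stats

# AR(1) toy series: positively autocorrelated, like temperature data.
rng = np.random.default_rng(4)
x = np.zeros(500)
for i in range(1, 500):
    x[i] = 0.8 * x[i - 1] + rng.normal()

boot = stationary_bootstrap(x, mean_block_len=20, n_resamples=500, seed=5)
naive_se = x.std() / np.sqrt(x.size)  # assumes independent observations
block_se = boot.std()                 # respects the serial dependence
```

For a series like this, the naive standard error (which assumes independence) is noticeably smaller than the block-bootstrap estimate, which is exactly the point the comment makes.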

      [Response: Important stuff. Also pretty advanced. I intend to get there eventually — my target audience is the best and brightest of young people (of all ages) — but I don’t want to assume they already have knowledge, so before we get to the deep end of the pool we have to learn our way around shallower waters. Not shallow, just shallower than the stationary bootstrap; I haven’t even introduced the concept of “stationary” yet!]


    2. Roly

      Nice summary of a complicated topic. It’s worth noting that there hasn’t been a pause in warming recently, and you can learn a lot by noting who and which publications claim there has.

      [Response: I’ll certainly be demonstrating the lack of a “pause” in future posts. But I won’t mention who and where that mistaken idea is promoted. This blog is about learning some science, not politics or propaganda.]


      1. Roly

        Ok, sounds good. But I recall that early in my environmental science degree there was a module to help you find valid science on the internet and know what to avoid. A pretty important skill for youngsters.

        [Response: An important point. Perhaps I’ll modify my stance — but still keep the laser focus on science.]


    3. Nicely done, in my opinion. I think this fills an important niche, in that many people want to understand the basic concepts, but may not need the full development of them as at “Open Mind.” As we’ve come to expect, the exposition is clear.

      (I do have one nit to pick: the second NASA uncertainty graph bit feels incomplete. You haven’t explained why it’s “appealing,” or what the yellow uncertainty bands are. As a reader I’m left wondering “So what?” Personally, I’d either expand it or drop it.)


    4. Alan J

      Hi Tamino,

      This is a wonderfully informative post, and I am enjoying this new blog. Please excuse an ignorant question, but I have long been confused about this part of why temperature anomalies are used. You say, “Also, temperature differences can generally be measured with much greater precision and reliability than absolute temperature. So, what is usually studied is not temperature itself, but temperature anomaly.”

      Why is this so? I understand the rest of the rationale behind anomalies, but this has always eluded me. Why is it more reliable and precise to measure anomalies? Don’t thermometers tell us absolute temperatures anyway? So we’re taking the absolute measurement and converting to an anomaly that is telling us something more accurately than the original reading did?

      [Response: First thing: the question is not ignorant. It’s important.

      There are two things. First, thermometers try to measure absolute temperature, but of course there’s error involved. This is usually in the form of a “bias” (in the statistical sense, not the ideological sense). A thermometer might consistently read 1.17 degrees too hot or too cold, but when it shows warming or cooling, the error in the amount is overwhelmingly likely to be much less. No, the bias isn’t the same at all temperatures like we wish it were, but the error in a temperature change is less than the error you’d get with absolute temperature. By using only the changes rather than the absolute temperature, we improve results (but they’re still not perfect!).

      Second, we’re trying to get a global average. If you put your thermometer in a valley it’ll read hotter, on a mountain it’ll read colder. When you include that data in a regional or global average it can reflect the local conditions more than the regional/global value. This is particularly troublesome when one station drops out and another becomes active. Using raw temperature means you’ll see a change simply because your instrument went from a hot place to a cold place or vice versa.

      Working with anomalies solves these problems. It’s not perfect! But by using anomaly we can actually get a meaningful result.]


      1. Alan J

        Thank you for your response, it gave me an ‘aha!’ moment! So to check my understanding, even if my thermometer is not giving me a “right” temperature (i.e. if I put another thermometer next to it, one would always be “offset” relative to the other – my thermometer reads 23 degrees when it is actually 24 degrees), if what I’m looking for is a change in the temperature of that spot over time, it doesn’t matter if my thermometer is reading too high or low, because the amount of change should be the same between both thermometers? (i.e., if it has changed by 2 degrees, the “right” thermometer will have gone from 24 to 26, while my “wrong” one will have gone from 23 to 25.)

        [Response: That pretty well sums it up. The constancy of temperature *change* from one thermometer to another isn’t perfect — but it’s better than absolute reading.]

        If that is the case, then when they say an anomaly is measured relative to a baseline, say 1961 through 1990, do they mean that they determine the baseline for each location (each station has its own baseline average)? Or is the baseline some kind of average of many stations?

        [Response: Different groups use different methods. Typically, one computes anomalies for a single station without worrying about a baseline. Then, when they’re combined (averaged) they’re offset so that they have the same mean value during the baseline. But it gets tricky when stations don’t cover the entire baseline period. I guess the “take-home” message is that it’s quite a complex process, requiring a lot of care to get it right.]
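The offset-cancelling behavior discussed in this exchange can be demonstrated in a few lines. The two stations and all the numbers below are hypothetical:

```python
import numpy as np

# Two hypothetical stations sharing the same warming trend: a warm
# valley site and a cold mountain site, 8 degrees apart.
rng = np.random.default_rng(6)
n = 40  # years
signal = 0.02 * np.arange(n)
valley = 14.0 + signal + rng.normal(0, 0.1, n)
mountain = 6.0 + signal + rng.normal(0, 0.1, n)

# Each station's anomaly relative to its OWN baseline (first 30 years),
# so both records have zero mean over the baseline period...
anom_v = valley - valley[:30].mean()
anom_m = mountain - mountain[:30].mean()

# ...and averaging the anomalies: the 8-degree gap between the sites
# has cancelled out, leaving only the shared change over time.
regional = (anom_v + anom_m) / 2
```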

