Global average temperature at Earth’s surface has been changing for some time. I’d like to discuss how it has changed since 1880, the point from which we have enough data from enough places to get a reasonable estimate of the global average temperature.

But first we need to learn about what trend means and how we estimate it. That’s what part 1 is about. In the next post (part 2) I’ll discuss the actual trend history.

We’re mainly interested in *climate* change rather than weather changes. To tell them apart, we need to understand that there are two distinct aspects of global temperature changes: *trend* and *fluctuation*.

Temperature fluctuates naturally. It does so all the time; fluctuation is inevitable. *That is not climate change*. Here’s some data for yearly average temperature from 1970 through 2017, but I won’t mention yet where these data come from:

It’s obvious that fluctuations happen. But is there any trend? What is a trend, anyway?

I’ll do another post about what trend *really* means, but for now I’ll give a decent definition that will help you understand its essence. **Trend** is change that has a pattern and that persists: change that isn’t just random, not just fluctuation. Be advised: this definition is imperfect, but it’ll do for now.

Is there any lasting pattern of change? Is it getting hotter overall (apart from fluctuations), or getting colder overall, or following some more complex pattern? It doesn’t *look* like there’s any change other than the ubiquitous fluctuations, but one of the great lessons of statistics is that “looks like” is a poor way to tell; it’s just too easy for us to “see” what isn’t really there, or *not* to see what is.

One way to investigate is to test whether or not there’s an overall increase or decrease. One of the best ways to do that, probably the most common way, is something called *linear regression*.

Suppose there were a trend, an overall increase or decrease, which happened at a steady rate so that the trend follows a perfectly straight line. It goes up or down at the same rate every year. There are also, of course, those never-ending fluctuations, so the *model* we apply is that the data are *linear trend* plus *random noise*.

We then find the straight line, of all possible straight lines, that comes closest to the actual data. Linear regression is a mathematical technique to find that “best-fit” straight line. I used it on these data and got this:

The linear regression fit is shown as the thick red line. It certainly doesn’t seem to be going up or down — at least not by much. This suggests that there’s no overall increase or decrease, no trend. We could also say that the trend is “flat.”
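
For readers who want to try a straight-line fit themselves, here is a minimal Python sketch of least-squares regression using only the standard library. The data are made up for illustration (a steady rise plus random noise); `fit_line`, `true_slope`, and the noise level are assumptions of this example, not anything from the data shown in this post.

```python
import random

def fit_line(t, y):
    """Least-squares straight line: y is approximated by intercept + slope * t."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return slope, intercept

# Made-up data: a steady rise of 0.01 per year plus random noise.
rng = random.Random(0)
years = list(range(1970, 2018))
true_slope = 0.01
temps = [true_slope * (yr - 1970) + rng.gauss(0.0, 0.1) for yr in years]

slope, intercept = fit_line(years, temps)
print(f"estimated slope: {slope:.4f} per year (true value {true_slope})")
```

The estimated slope will not equal the true one exactly, because the noise nudges the fit; that is the reason an uncertainty range is needed.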

If you look closely you might see that the best-fit line is actually going up, but so slightly that it’s hard to see. Does that mean that there *is* a rising trend, albeit an extremely small one? Not necessarily.

In addition to the thick red solid line, there are two dashed red lines, one above and one below the trend line. They show the *range* in which the trend line probably lies. We have to give a range, because our estimate of the trend is only an estimate. It’s influenced by the noise, the fluctuations, but their influence doesn’t inform us about the real trend, it only confuses our estimate.
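
A range like that can be sketched with the textbook formula for the uncertainty of a fitted line. The Python below is a rough illustration under stated assumptions: independent noise with constant variance, and about 2 standard errors standing in for 95%. The function name `confidence_band` and the made-up data are hypothetical, and this is not necessarily how the dashed lines in the figure were computed.

```python
import math
import random

def confidence_band(t, y, z=2.0):
    """Return (fitted, half_width) lists for a least-squares line.

    half_width is roughly a 95% range for the line itself (z = 2 standard
    errors), assuming independent noise with constant variance.
    """
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    slope = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y)) / sxx
    intercept = y_mean - slope * t_mean
    fitted = [intercept + slope * ti for ti in t]
    # Residual standard deviation, with n - 2 degrees of freedom.
    s = math.sqrt(sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / (n - 2))
    # The (ti - t_mean)^2 term makes the band widen away from the middle.
    half_width = [z * s * math.sqrt(1.0 / n + (ti - t_mean) ** 2 / sxx) for ti in t]
    return fitted, half_width

# Made-up data: a steady rise plus random noise.
rng = random.Random(1)
years = list(range(1970, 2018))
temps = [0.01 * (yr - 1970) + rng.gauss(0.0, 0.1) for yr in years]

fitted, half_width = confidence_band(years, temps)
mid = len(years) // 2
print(f"half-width at start:  {half_width[0]:.4f}")
print(f"half-width at middle: {half_width[mid]:.4f}")
print(f"half-width at end:    {half_width[-1]:.4f}")
```

Under these assumptions the band is narrowest near the middle of the time span and widest at the two ends, so the limits curve rather than run parallel to the fitted line.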

I say “probably” because the dashed red lines outline a *95% confidence range*. That means there’s about a 95% chance the actual trend line is within that range. It might not be! But it probably is, and 95% confidence is kind of like the “de facto standard” in statistical analysis (not always, but it’s most common).

Linear regression not only tells us what the trend rate is (probably!), it also tells us how uncertain our estimate is. For these data, the estimated rate of change is 0.0000165 ± 0.00108 °C/year. The “±” indicates that it could be the given value *plus or minus* the uncertainty. That uncertainty defines a 95% confidence interval for the trend rate. It’s important to remember that the true rate could lie outside that interval, but it *probably* doesn’t.
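
As a rough illustration of where a “rate ± uncertainty” figure can come from, the sketch below computes a least-squares slope and an approximate 95% uncertainty as 2 standard errors. It assumes independent noise with constant variance; the function name `slope_with_uncertainty` and the data are made up, and the post does not say exactly how its own uncertainties were calculated.

```python
import math
import random

def slope_with_uncertainty(t, y, z=2.0):
    """Least-squares slope and z standard errors (z = 2 is roughly 95%)."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    slope = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y)) / sxx
    intercept = y_mean - slope * t_mean
    # Sum of squared residuals around the fitted line.
    ssr = sum((yi - (intercept + slope * ti)) ** 2 for ti, yi in zip(t, y))
    se_slope = math.sqrt(ssr / (n - 2)) / math.sqrt(sxx)
    return slope, z * se_slope

# Made-up data: a steady rise of 0.01 per year plus random noise.
rng = random.Random(2)
years = list(range(1970, 2018))
temps = [0.01 * (yr - 1970) + rng.gauss(0.0, 0.1) for yr in years]

slope, uncertainty = slope_with_uncertainty(years, temps)
low, high = slope - uncertainty, slope + uncertainty
print(f"rate: {slope:.5f} ± {uncertainty:.5f} per year")
print(f"interval: {low:.5f} to {high:.5f}")
print("interval includes zero" if low <= 0.0 <= high else "interval excludes zero")
```

If the interval includes zero, the data give no real evidence that the rate is anything other than zero.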

Note that the uncertainty is a lot bigger than the value (about 65 times as big!). The actual trend rate might be as high as +0.001098 °C/year, or it might be as low as -0.001065 °C/year. We can’t say with confidence whether it’s going up or down. The conclusion is that there’s no real evidence of a non-zero trend. We might say “no trend” or “flat trend,” but we shouldn’t forget that this result is only “probably.” That’s the nature of statistics.

In this particular case I can be supremely confident that there’s no trend. That’s because I know where these data came from: a *random number generator*. They’re random numbers, and since trend is change that isn’t random, we know there’s no trend.

Here’s some real temperature data, yearly averages from NASA:

I can fit a straight line by linear regression, just as I did before, and it gives this:

This time the trend line is statistically significant (at much better than the 95% confidence level). We conclude that there is a non-zero trend, at a rate of 0.0072 ± 0.0007 °C/year.

But wait. Is that all the trend? Visually, it seems to be doing more than just following a straight line plus random noise. One clue comes from looking at what are called *residuals*. We take each data value, and subtract the trend-line value at that same time (i.e. what it would have been if it followed the straight line exactly). Here’s a graph of those residuals:

It certainly looks like there’s still a pattern there, still changes that aren’t just random. Of course “looks like” isn’t a reliable way to draw conclusions. But in this case we can apply other statistical tests, and we can confirm that yes, there’s still trend there after subtracting the best-fit straight line.
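
The post doesn’t say which tests were used, so the following is only one simple possibility, sketched in Python on made-up data: compute the residuals from a straight-line fit, then check their lag-1 autocorrelation (how strongly each residual resembles the one before it). The function names, the curved made-up series, and the 2/√n rule of thumb for independent noise are all assumptions of this example.

```python
import math
import random

def residuals_from_line(t, y):
    """Data minus the least-squares straight line, point by point."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    slope = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y)) / sxx
    intercept = y_mean - slope * t_mean
    return [yi - (intercept + slope * ti) for ti, yi in zip(t, y)]

def lag1_autocorrelation(x):
    """Correlation between each value and the next one in the series."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1))
    den = sum((xi - mean) ** 2 for xi in x)
    return num / den

# Made-up series: a gentle curve plus noise, which a straight line
# cannot fully capture.
rng = random.Random(3)
years = list(range(1880, 2018))
temps = [0.0001 * (yr - 1880) ** 2 + rng.gauss(0.0, 0.05) for yr in years]

resid = residuals_from_line(years, temps)
r1 = lag1_autocorrelation(resid)
threshold = 2 / math.sqrt(len(resid))  # rough guide for independent noise
print(f"lag-1 autocorrelation of residuals: {r1:.2f}")
print(f"rough threshold for pure noise:     {threshold:.2f}")
```

A value well above the threshold suggests the residuals are not just random noise, meaning the straight line has left some pattern behind.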

The conclusion is that the trend in global temperature hasn’t been that simple. It’s not just a straight line at a constant rate of increase. We’ll need other ways to determine what the trend is.

I won’t go into the details of how that’s done (not in this post, anyway). But I will apply one of my favorites, called a “lowess smooth.” It gives this estimate of the trend:

We see an interesting pattern of changes, but since about 1970 the trend has been rising at a seemingly constant rate.
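
To give a feel for what a smooth of this kind does, here is a stripped-down Python sketch of locally weighted regression, the idea behind lowess: at each time, fit a straight line to nearby points only, giving the nearest points the most weight. It omits the robustness iterations of the full method, and the function name `local_linear_smooth`, the `frac=0.3` window, and the data are all made up for illustration; none of this reflects the settings used for the graph in this post.

```python
import math
import random

def local_linear_smooth(t, y, frac=0.3):
    """Locally weighted straight-line fit at each time (no robustness passes).

    Assumes at least 3 points with distinct times.
    """
    n = len(t)
    k = min(n, max(3, round(frac * n)))  # points used in each local fit
    smoothed = []
    for t0 in t:
        # Distance to the k-th nearest point sets the window width.
        h = sorted(abs(ti - t0) for ti in t)[k - 1]
        # Tricube weights: nearby points count most, points at or beyond h count zero.
        w = [(1 - (abs(ti - t0) / h) ** 3) ** 3 if abs(ti - t0) < h else 0.0
             for ti in t]
        sw = sum(w)
        t_mean = sum(wi * ti for wi, ti in zip(w, t)) / sw
        y_mean = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (ti - t_mean) ** 2 for wi, ti in zip(w, t))
        sxy = sum(wi * (ti - t_mean) * (yi - y_mean) for wi, ti, yi in zip(w, t, y))
        slope = sxy / sxx
        smoothed.append(y_mean + slope * (t0 - t_mean))
    return smoothed

# Made-up series: a slow wave plus random noise.
rng = random.Random(4)
years = list(range(1880, 2018))
temps = [0.3 * math.sin((yr - 1880) / 40) + rng.gauss(0.0, 0.1) for yr in years]

smooth = local_linear_smooth(years, temps)
for yr, raw, sm in list(zip(years, temps, smooth))[::30]:
    print(f"{yr}: data {raw:+.2f}, smooth {sm:+.2f}")
```

A larger `frac` uses more neighbours in each local fit and gives a smoother curve; a smaller one follows the data more closely.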

There’s so much more to say about the trend over time. But that is the topic of post 2.

Looking forward to part 2!

Why do the confidence limits not look like parallel straight lines?

I think it’s because of an increase in fluctuations, which leads to a more uncertain trend line. So when the confidence limits spread further apart, that represents an area of less confidence, while the confidence limits drawing closer to each other represent a more confident trend with less fluctuation.

jaimesal: Tamino can answer in detail, but the very short version is that uncertainties aren’t uniform over time. The reasons for that can vary, but since these particular limits seem highly symmetrical I’m guessing that here it has to do with the effects of ‘data boundaries’, i.e., the beginnings and endings of the time series. Near those boundaries there is literally less data to constrain the regression.

@jaimesal,

And the same kind of ballooning of uncertainty would necessarily apply if, for some reason, there was no data for some contiguous segment in the middle of the graph. Of course, how such a dropout gets dealt with in the long-run estimate of uncertainty depends upon certain side assumptions, but, basically, as @Doc Snow indicates, if there were points missing, uncertainty increases. That makes sense, right?