About half a century ago I was teaching postgraduate students about the presentation of work, the use and misuse of statistics and the difference between significance and importance. I’m not sure what my charges thought about it all — the course was voluntary, because they were all PhD students wedded to their thesis, the only aspect of their work that would actually be examined. But I had fun. There was a book that I thought they might all read, because it was short and humorous: Darrell Huff’s How to Lie with Statistics, which I had bought as an undergraduate. It had some chapters on graphs, and I must have taken some of his strictures to heart, because I still look at a graph to see if it qualifies as a technically good one.
I hadn’t thought much about graphs recently until, in January, I came across an essay on WUWT that was good-humoured and sensible, and stored it away for a later post: this one. It was by C. R. Dickson, a retired physicist and chemist, and he made two good points I thought worth passing on (actually, one good point that has a good and important consequence). And although I’ll use graphs about climate for this essay, the lessons apply wherever you see a pictorial representation of data. Graphs are important because we perceive pictures much more easily than we can interpret numbers, and that means we can be easily persuaded about something if the graph is dramatic. Let’s start with a familiar example, which I’ve taken from John Brignell’s Chartmanship.
Suppose you have a plot of temperature over time, like this one from the Climatic Research Unit of the University of East Anglia:
It seems to have two rising phases and two falling or static phases, with a rise overall. Can you make much out of it? Well, here’s how to do it, if, like the people at CRU, you want to suggest that something dramatic is taking place. You convert the temperatures into anomalies against a baseline that is useful for your purpose, and then you use colour — red for warm (and of course danger) and blue for cold. You get something like this, where the zero line is set just where the red starts to rise:
Hey! That second picture tells a story. But so do the next two, which offer the same basic data, presented in two different ways.
In the first we see temperature in degrees Fahrenheit, while in the second we see the anomalies again. What is the difference? Just the way the data are presented. The first says that average global temperature has been very similar for 120 years (though if you look very hard you can see the tiniest of rises towards the present), while the second says that there has been a sharp change in the anomalies. Can there be a sharp change in the anomalies without a sharp change in the temperature? Why, indeed there can be, because anomalies are plotted on a much finer scale, and that magnifies the change. In fact, you would get the same shape as the right-hand graph if you simply truncated the left-hand one, and made the vertical axis measure temperature, not in degrees as there, but in tenths of a degree or even hundredths of a degree. Truncating the vertical axis is almost the contemporary fashion. Sometimes you’ll see a little jagged line at the bottom of the vertical axis, which tells you that truncation has occurred. But very often the axis starts wherever the presenter wants it to start.
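For readers who like to check such things for themselves, here is a minimal sketch of the magnification effect. The temperatures are invented for illustration (they are not CRU data), and the helper name `fraction_of_axis` is my own. It simply measures how much of the vertical axis the data occupy under two presentations: a full-scale axis and an anomaly axis fitted tightly around the data.

```python
def fraction_of_axis(values, axis_min, axis_max):
    """Share of the vertical axis height covered by the data's own range."""
    spread = max(values) - min(values)
    return spread / (axis_max - axis_min)

# Hypothetical annual mean temperatures in degrees Fahrenheit (illustrative only).
temps_f = [57.0, 57.1, 56.9, 57.2, 57.4, 57.3, 57.6, 57.9, 58.1, 58.4]

# Presentation 1: a full-scale axis running from 0 to 100 degrees F.
full_axis = fraction_of_axis(temps_f, 0.0, 100.0)

# Presentation 2: anomalies against the mean of the first five values,
# with the axis limits fitted exactly to the smallest and largest anomaly.
baseline = sum(temps_f[:5]) / 5
anomalies = [t - baseline for t in temps_f]
tight_axis = fraction_of_axis(anomalies, min(anomalies), max(anomalies))

print(f"full axis:  data fill {full_axis:.1%} of the height")
print(f"tight axis: data fill {tight_axis:.1%} of the height")
```

The change in the underlying numbers is identical in both cases. Only the axis limits differ, and they alone decide whether the line looks flat or steep.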
Dr Dickson used a nice analogy to remind us of the effects of magnification: the smoothness of a sheet of glass:
We’re used to seeing glass as smooth and uniform (figure 2). But if you use a powerful microscope (figure 3) you will see that the surface of a glass plate has all sorts of small imperfections on it. It’s really quite rough! Dickson argues that we don’t drive along the freeway using a microscope and that, like the Bureau of Meteorology and the weather girls, we talk about the day’s temperature, or the average for the month or the year, in terms of whole degrees F or C. The use of magnification, as in the CRU graph above, distorts our perception of weather. I think there’s a lot in that remark. We’re having an El Niño right now, though it has passed its peak, and Sydney had a hot February. But, as the weather girls keep telling us, it’s the hottest since the last time. One did say, of one weather episode, that it was the hottest since records began. Those records are the official ones, however, and they start only early in the 20th century. There are other records in the 1890s that leave today’s weather extremes well behind.
The general point is that it is a mistake to become obsessive about the anomalies. They are used, in my view, to convey a message, and the message is that warming is bad. We don’t actually know if that is so and, on the evidence, the warming we have had over the past century has been accompanied by greater food production and the greening of arid areas. Perhaps warming is a good thing, and should be coloured green. Why do we use anomalies at all?
Dr Dickson defines them like this: A temperature anomaly is the difference obtained by subtracting an average temperature from real temperature data. Climate studies work with anomalies instead of real temperatures because anomalies are assumed to be more accurate over large geographical areas. Paul Homewood sets out some other reasons, and two of them seem significant to me: that using anomalies allows us to pick up regional trends, and that it allows us to compare time periods like months — February is usually hotter than March, where I live, but the anomaly can tell us whether they’re both hotter than ‘usual’ — if we know what ‘usual’ means, and why we have chosen that baseline.
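Dickson’s definition is easy to turn into a few lines of code. What follows is only a sketch under my own assumptions: the yearly values are made up, and the function name `anomalies` and the two baseline periods are chosen purely for illustration. It shows that moving the baseline shifts every anomaly by the same constant, so the shape of the curve is untouched, while the position of the zero line (and hence what gets coloured red or blue) changes.

```python
def anomalies(series, baseline_years):
    """Subtract the mean over baseline_years from every value in series.

    series maps year -> temperature; baseline_years is an iterable of years.
    """
    base = [series[y] for y in baseline_years]
    mean = sum(base) / len(base)
    return {year: temp - mean for year, temp in series.items()}

# Hypothetical yearly means in degrees C (not real measurements).
temps_c = {2000: 14.2, 2001: 14.3, 2002: 14.1,
           2003: 14.4, 2004: 14.6, 2005: 14.8}

early = anomalies(temps_c, range(2000, 2003))  # baseline 2000 to 2002
late = anomalies(temps_c, range(2003, 2006))   # baseline 2003 to 2005

# Every year differs by the same constant offset between the two baselines.
offset = early[2000] - late[2000]

for year in temps_c:
    print(year, f"{early[year]:+.2f}", f"{late[year]:+.2f}")
```

In this toy series the year 2003 comes out above zero against the early baseline and below zero against the late one, although the thermometer reading is the same in both cases.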
I have no objection to the use of anomalies, but it is always sensible not to be overpowered by the graph. The changes that we observe in temperature anomalies are usually very small, notwithstanding their appearance on the graph. And often the error bars, as I’ve said before, are larger than the change.
For those interested in the business of drawing technically good and effective graphs, I recommend a short paper by Dr Steve Figard, which you can read here. I think everything I have written above is covered in it. It too is well written, good-humoured and an excellent reference. Not only that, he has some entertaining Dilbert cartoons as well. The great book is Edward Tufte’s The Visual Display of Quantitative Information, which Figard refers to with great approbation. Tufte is a political scientist of some distinction, who moved to ‘information’ with this book, which he self-published (and made a motza). I read the first edition many years ago, and it is indeed very good. Graphs lend themselves to humour.
Endnote: Earth Hour seemed to take place on Saturday last, though I wasn’t aware until Sunday evening. A few years ago we were all being asked to turn our lights off for an hour, but the focus now seems to be turning off the floodlights on structures like the Sydney Opera House and the Harbour Bridge. Oh, and there was a kind of movement to get the PM to turn the lights off. I’m not sure where he was supposed to do it, and I don’t think it succeeded. If I’m wrong someone will tell me.
Supplement: A commenter via email wondered why I hadn’t used an IPCC graph to illustrate my points, so here for his benefit, and perhaps for the interest of others, is an IPCC graph about temperature. It comes from AR5 WG1 Box 2.2 Figure 1.
What is being graphed is global mean surface temperature (GMST) as measured by HadCRUT4, with 1961-1990 being the baseline. The straight black lines are least squares trends for 1901-2012, 1901-1950 and 1950-2012. The lower graph shows the same data with a smoothing spline, with 90 per cent confidence intervals around it (these intervals are omitted from the upper graph for clarity). There were many other ways in which the data could be provided with straight lines, but since the IPCC is looking at human-induced global warming, its message is reinforced by the rising black lines. If a sceptic were doing the same graph, he or she would be likely to show the obvious and separate phases in the data with some horizontal or even slightly lower black lines along with the rising ones. It’s all about presentation.
Even then, you wonder why the models can’t explain the more recent warming without CO2 being introduced, since the increases are virtually the same for both half-century periods. I’ve asked that question in the past, and never got any kind of persuasive answer.
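The point about the straight black lines in the supplement can be illustrated with a small sketch. The series below is synthetic (flat, then rising, then flat again) and is not HadCRUT4. The helper names `ols_slope` and `trend` are mine, and the periods are picked only to show how much the fitted slope depends on where the presenter chooses to start and stop the line.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Synthetic anomaly series: 20 flat years, 20 rising years, 20 flat years.
years = list(range(1900, 1960))
values = [0.0] * 20 + [0.02 * i for i in range(1, 21)] + [0.4] * 20

def trend(start, end):
    """Least-squares trend in degrees per decade for start..end inclusive."""
    pairs = [(y, v) for y, v in zip(years, values) if start <= y <= end]
    xs, ys = zip(*pairs)
    return 10 * ols_slope(xs, ys)

for start, end in [(1900, 1959), (1900, 1919), (1920, 1939), (1940, 1959)]:
    print(f"{start}-{end}: {trend(start, end):+.3f} per decade")
```

A single line through the whole period rises steadily, while lines fitted to the three phases separately show two flat stretches with one steep climb between them. Both descriptions are arithmetically correct, and which one appears on the graph is a matter of choice.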