Thursday 15 December 2011

Number games



My "career" was chock full of numbers: producing them, reporting them, making sense of them. Modern life seems to be about numbers too, but for me these numbers are very different to those I dealt with all my working life.

You see, I knew where my numbers came from. I knew in great detail how, for example, a nickel concentration was generated from a sample of river water. I knew how the sample was collected, who collected it and when, how it was digested in acid and analysed. I knew how the analytical instrument worked, what its strengths and weaknesses were, how the quality control was done, how the analyst was trained, how reagents were checked for purity and so on and so on.

It’s different when I look at economic statistics, global temperatures or statistics on social trends. These numbers can be of uncertain provenance with unknown quality standards and sometimes I can’t even be sure if I’m looking at raw data or the output of some calculation. This problem of uncertain provenance has been one of the major issues as well as a huge scandal in climate science.

The first point I’m trying to make is that we very rarely come across numbers with a provenance as solid as the one I was familiar with. Secondly, even my numbers were still uncertain. All scientific measurements have a degree of uncertainty and in my experience the uncertainty is usually underestimated – often grossly underestimated. In the real world, the level of uncertainty is itself usually uncertain.

In the lab, we calculated uncertainties for laboratory analyses, error bars which we were reasonably happy with. What about the provenance of the sample though? Sampling adds a considerable degree of uncertainty to the overall measurement and it isn’t easy to estimate for environmental samples. Was it raining at the time of sampling, was the river high or low, was the sampling point appropriate?
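To make the point about sampling concrete, here is a minimal sketch of the usual root-sum-of-squares rule for combining independent uncertainties. The figures and the function name are invented for illustration, not taken from any real analysis, and the rule itself assumes the analytical and sampling errors are independent, which is not guaranteed for environmental samples.

```python
import math

def combined_uncertainty(*components):
    """Combine independent standard uncertainties by root-sum-of-squares."""
    return math.sqrt(sum(u * u for u in components))

# Hypothetical figures for a nickel result, in micrograms per litre.
analytical_u = 3.0  # assumed laboratory (analytical) standard uncertainty
sampling_u = 4.0    # assumed sampling standard uncertainty

total_u = combined_uncertainty(analytical_u, sampling_u)
print(f"analytical only: {analytical_u:.1f} ug/l")
print(f"with sampling:   {total_u:.1f} ug/l")
```

With these made-up figures, error bars based on the laboratory work alone would understate the overall uncertainty by a wide margin, and in practice the sampling term is the one we are least able to pin down.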

It’s this long experience that makes me wary of numbers. Even if you trace the numbers back to an original published paper, there are still many things you don’t know. On the whole, if you want to understand something, then if at all possible it’s best done without numbers. Not always possible of course, but it pays to be very wary indeed when numbers are the basis of an argument. It also pays to be wary if computers have been used for anything but storing the raw data.

So what to do? For me, the best attitude to numbers is to look at what level of uncertainty the argument will stand. Do round figures and crude approximations still make the case? If not, then I'm always very wary of the argument. Of course this approach does exclude vast swathes of epidemiology, but that's just another advantage as far as I'm concerned.
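As a rough sketch of that test, suppose an argument rests on one figure being bigger than another. The check below asks whether the claim still holds when both figures could be wrong by the same relative amount. The function names and the figures are hypothetical, and it assumes both values are positive.

```python
def claim_survives(a, b, rel_uncertainty):
    """True if 'a is greater than b' still holds in the worst case,
    with a overstated and b understated by rel_uncertainty (0.3 = 30%)."""
    return a * (1 - rel_uncertainty) > b * (1 + rel_uncertainty)

def max_tolerable_uncertainty(a, b):
    """Relative uncertainty at which the claim 'a > b' stops being safe."""
    return (a - b) / (a + b)

# Hypothetical round figures: roughly 120 against roughly 60.
a, b = 120.0, 60.0
for level in (0.1, 0.3, 0.5):
    print(f"{level:.0%} uncertainty: claim survives = {claim_survives(a, b, level)}")
print(f"claim fails at about {max_tolerable_uncertainty(a, b):.0%}")
```

If the claim only survives when the figures are trusted to within a few percent, that is the sort of argument I would treat with suspicion.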

4 comments:

James Higham said...

It’s this long experience that makes me wary of numbers. Even if you trace the numbers back to an original published paper, there are still many things you don’t know. On the whole, if you want to understand something, then if at all possible it’s best done without numbers. Not always possible of course, but it pays to be very wary indeed when numbers are the basis of an argument. It also pays to be wary if computers have been used for anything but storing the raw data.

Most impressive and equally impressive was your treatment of the word treason in your yet to be published post [:)]

Sam Vega said...

Yes, a good example of this is the recent furore over inflated exam grades and blatant cheating on the part of awarding bodies. I think ordinary people are now beginning to grasp the fact that when we mess about with measurements, then we might as well give up on reality. There are so many lying liars desperate to save or advance their careers that our educational system is pretty much done for. A tragedy. To fight it or not to fight it; that is the question...

rogerh said...

All data users should read "How to Lie with Statistics" by Darrell Huff and carefully consider who is publishing the data and why. As they (don't) say in financial services 'If you don't understand something, don't buy it'.

Unfortunately these days it is only too easy to throw a fistful of numbers into a 'Hokey Cokey Double Trouble Bi-Variate Chi-Ying-Yang Analysis' package and be amazed at the result.

An approximate answer to the right question is worth much more than a precise answer to the wrong question.

A K Haart said...

JH - thanks.

SV - exam grades are a real problem. The manipulation is obvious, so decades of integrity go out the window.

rogerh - there's a lot to be said for approximation. If the argument still stands with approximations, then it may well be a sound argument.