My "career" was chock full of numbers, producing them,
reporting them, making sense of them. Modern life seems to be about numbers
too, but for me these numbers are very different to those I dealt with all my
working life.
You see, I knew where my numbers came from. I knew in great
detail how, for example, a nickel concentration was generated from a sample of
river water. I knew how the sample was collected, who collected it and when,
how it was digested in acid and analysed. I knew how the analytical instrument worked, what its strengths and weaknesses were, how the quality
control was done, how the analyst was trained, how reagents were checked for
purity and so on and so on.
It’s different when I look at economic statistics, global temperatures or statistics on social trends. These numbers can be of uncertain provenance, with unknown quality standards, and sometimes I can’t even be sure whether I’m looking at raw data or the output of some calculation. This problem of uncertain provenance has been one of the major issues, and a huge scandal, in climate science.
So what to do? Well, the first point I’m trying to make is that we very rarely come across numbers with a provenance as solid as the one I was familiar with. Secondly, even my numbers were still uncertain. All
scientific measurements have a degree of uncertainty and in my experience the
uncertainty is usually underestimated – often grossly underestimated. In the real world, the level of uncertainty is usually uncertain.
In the lab we calculated uncertainties for our analyses, error bars we were reasonably happy with. What about the provenance of the sample, though? Sampling adds a considerable degree of uncertainty to the overall measurement, and that uncertainty isn’t easy to estimate for environmental samples. Was it raining at the time of sampling, was the river high or low, was the sampling point appropriate?
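As a rough illustration of that point, here is a minimal sketch with invented figures (not anything from a real lab) of the standard way an analytical uncertainty and a sampling uncertainty are combined in quadrature, assuming the two sources are independent.

```python
# Illustration only: combining analytical and sampling uncertainty
# in quadrature, assuming the two sources are independent.
# All figures are invented for the example.
import math

nickel_ug_per_l = 12.0   # hypothetical reported nickel concentration
u_analytical = 0.6       # lab/instrument uncertainty (1 sigma)
u_sampling = 2.5         # guessed sampling uncertainty (1 sigma)

u_combined = math.sqrt(u_analytical**2 + u_sampling**2)

print(f"Result: {nickel_ug_per_l:.1f} +/- {u_combined:.1f} ug/L (1 sigma)")
# A tight analytical error bar tells you very little if the sampling
# uncertainty is several times larger.
```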
It’s this long experience that makes me wary of numbers.
Even if you trace the numbers back to an original published paper, there are still many
things you don’t know. On the whole, if you want to understand something, then
if at all possible it’s best done without numbers. Not always possible of
course, but it pays to be very wary indeed when numbers are the basis of an argument. It also pays to be wary if computers have been used for anything but storing the raw data.
So what to do? For me, the best attitude to numbers is to
look at what level of uncertainty the argument will stand. Do round figures and crude approximations still make the case? If not, then I'm always very wary of the argument. Of course this approach does exclude vast
swathes of epidemiology, but that's just another advantage as far as I'm concerned.
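As a crude illustration of this "round figures" test, with purely hypothetical numbers, one can recompute a claimed effect with the inputs rounded to a single significant figure and see whether the conclusion survives.

```python
# Hypothetical "round figures" check: does a claimed ~3% rise survive
# if the inputs are rounded to one significant figure?
import math

def one_sig_fig(x: float) -> float:
    """Round x to a single significant figure."""
    if x == 0:
        return 0.0
    return round(x, -math.floor(math.log10(abs(x))))

# Invented inputs for illustration.
before, after = 1043.7, 1074.2
precise_change = (after - before) / before
rough_change = (one_sig_fig(after) - one_sig_fig(before)) / one_sig_fig(before)

print(f"precise: {precise_change:.1%}, rough: {rough_change:.1%}")
# If the effect vanishes with rounded figures, the argument rests on
# more precision than the numbers are likely to possess.
```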
4 comments:
Most impressive, and equally impressive was your treatment of the word treason in your yet-to-be-published post [:)]
Yes, a good example of this is the recent furore over inflated exam grades and blatant cheating on the part of awarding bodies. I think ordinary people are now beginning to grasp the fact that when we mess about with measurements, then we might as well give up on reality. There are so many lying liars desperate to save or advance their careers that our educational system is pretty much done for. A tragedy. To fight it or not to fight it; that is the question...
All data users should read "How to Lie with Statistics" by Darrell Huff and carefully consider who is publishing the data and why. As they (don't) say in financial services, 'If you don't understand something, don't buy it'.
Unfortunately these days it is only too easy to throw a fistful of numbers into a 'Hokey Cokey Double Trouble Bi-Variate Chi-Ying-Yang Analysis' package and be amazed at the result.
An approximate answer to the right question is worth much more than a precise answer to the wrong question.
JH - thanks.
SV - exam grades are a real problem. The manipulation is obvious, so decades of integrity go out the window.
rogerh - there's a lot to be said for approximation. If the argument still stands with approximations, then it may well be a sound argument.