Saturday, 28 September 2013

Errors in the field

Some years ago I investigated errors generated by people collecting environmental data while out in the field. In those days we had computers and databases back at the lab, but field data was collected manually.

I’d written a range of error-trapping routines to pick up errors during data input at the lab, so all I had to do was link errors to people. The survey included several hundred field workers and hundreds of thousands of items of data. These were not major errors, by the way, but they had to be corrected.
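For anyone curious how errors can be linked to people, here is a minimal Python sketch of the general idea. It is not the original routines, which are not described here in any detail: the worker IDs, field names, plausible ranges and records are all invented for illustration, and a simple range check stands in for whatever error trapping was actually used.

```python
from collections import Counter

# Hypothetical field records: (worker_id, field_name, value).
# All names and values are made up for illustration.
RECORDS = [
    ("A", "ph", 7.1),
    ("A", "ph", 6.8),
    ("A", "temp_c", 12.5),
    ("A", "temp_c", 11.0),
    ("B", "ph", 71.0),       # e.g. a misplaced decimal point
    ("B", "temp_c", 13.0),
    ("B", "ph", 7.3),
    ("B", "temp_c", -40.0),  # implausible reading
]

# Assumed plausible range for each field (an illustrative rule only).
VALID_RANGES = {"ph": (0.0, 14.0), "temp_c": (-5.0, 40.0)}


def is_error(field_name, value):
    """Flag a value that falls outside the assumed plausible range."""
    low, high = VALID_RANGES[field_name]
    return not (low <= value <= high)


def error_rates(records):
    """Return each worker's error rate: flagged items / items collected."""
    totals = Counter()
    errors = Counter()
    for worker, field_name, value in records:
        totals[worker] += 1
        if is_error(field_name, value):
            errors[worker] += 1
    return {worker: errors[worker] / totals[worker] for worker in totals}


rates = error_rates(RECORDS)
for worker, rate in sorted(rates.items()):
    print(f"worker {worker}: {rate:.0%} of items flagged")
```

Working with a rate per worker rather than a raw count matters when people collect very different amounts of data, and it is the rate that can then be compared from one period to the next to see how consistent each person is.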

I suppose I was most surprised at how many errors were being made and how consistent each person’s error rates were.

Firstly, there were line managers working in the field to keep their hand in. They tended to generate more errors than anyone else. Many should have been locked in their offices and never allowed into the field under any circumstances.

Secondly, there were a few people who were extremely meticulous, making a very small number of errors day in, day out, but there were not many like that, maybe five or six at most.

Thirdly, there were people at the other end of the spectrum who routinely made a large number of errors.

So – not particularly surprising really, but what did strike me was the difference between the best and the worst. The worst field workers regularly made at least twenty times as many errors as the best.

Yet that did not mean that the worst couldn’t care less about the work – far from it as far as I could see. People doing environmental field work tend to be interested and conscientious.

I don’t know what became of the survey in the longer term, because I moved on. My reports were greeted with surprise and not a great deal of enthusiasm, but I always remember just how consistent people are when it comes to making mistakes in largely routine work.

Also posted at Broad Oak Magazine


Roger said...

The P45 represents a good QA measure. Then again, it seems odd to expect folk to diligently collect 100K or so data items without some ongoing process to keep them on track. The Japanese car industry addressed this problem head-on, the UK car industry did not - the rest is history. BTW, was this a public sector jobby?

Mark Wadsworth said...

I watched an episode of Science Club yesterday, and one item was a shopping centre manager who knew all about manipulating databases, data mining and so on.

She'd lost a daughter at birth, and turned her hand to applying the same general principles to babies in ICUs.

She basically did the sensible thing and recorded every single bit of data from all the machines the babies are hooked up to on a massive scale and then looked for patterns between these millions of data points and babies who died or survived.

The patterns she recognised were quite surprising, but some ICUs tried working backwards from her approach to help spot the babies at risk of dying a day in advance, and it worked fine; they got mortality rates down by half or something.

Slightly off topic, but I'm sure you get the gist.

Demetrius said...

Having spent a lot of time looking at runs of Census Returns as well as transcribing them, I find this incidental experience leads me to agree with you. Once in the past I encountered an eminent psychologist famous for his testing. On taking a good hard look at the actual database and his methodology I realised that his statistical analysis was very badly flawed. Given that policy decisions were being taken on his material, you may imagine how unwelcome I was.

A K Haart said...

Roger - yes, public sector so no P45 stimulus. Very frustrating. I did a lot of work on this and the interest was tepid.

Mark - it's an interesting thing to do in all kinds of areas. Often you don't need much in the way of stats to spot the areas of interest either.

Demetrius - it sounds as if your psychologist should have run his stats past a professional.