Sunday, 20 January 2013

God need not bring data

The International Journal of Epidemiology for December has a fascinating paper by Lynch and Stuckler on Deming's use of data for quality control and its implications for public health. I thought people might like to see some of what the authors say about data.
"Available data often go unused because they are not well enough documented, lack accessible how-to guides for their use, or knowledge about the resource is passed on informally within research groups or collaborations. Some data may also require analytical skills that are in short supply; or people may simply be unaware of their existence or unable to access them."
This may sound familiar to some readers. The research group around ICLS uses only openly available data, not the kind that is poorly documented, lacks accessible how-to guides, or is known about only through word of mouth within research groups or collaborations. We do this to protect ourselves from delays in data acquisition, from people changing their minds about whether or not we are 'allowed' to use data for a specific purpose, and from having to re-code things that have not been through the quality control of the UK Data Archive and a variety of users. When we find mistakes or derive new variables we give the code to the Archive (access to the data is free to anyone funded by UK Research Councils).

So I warmly welcome the initiative of the IJE to bring together health data sets as a public, openly shared resource. I would hope they will add the UK Data Archive to their list as it now contains some biomedical and genetic data and will soon have more.
I hope that Deming would approve of this method for data curation and exploitation. It means that work done using taxpayers' money (ESRC-funded projects are obliged to archive data) becomes a common good for the whole academic and policy community. Also, in the words of an eminent colleague: "if you are not allowed to see the data behind a paper how do you know it is not all made up?". So open data is a vital safeguard against the kind of scientific misconduct that is increasingly being noted.

A number of ruses get used to try to avoid 'sharing' data (another eminent colleague tells me not to use this term as the data is not the property of the research team in the first place). I have heard it said that 'biomedical data (like blood pressure and cholesterol) carry a bigger risk of disclosure' and some social scientists are fooled by this. Maybe they are thinking about "CSI" on the TV. In fact data on occupation and education are potentially a lot more disclosive than biomedical markers. I also hear it said that biomedical samples are a 'depletable resource'. But what gets archived is just a bunch of codes, not the samples themselves! I once heard it said that the low response rate of studies like the BioBank (10% response rate, i.e. 90% non-response) "does not matter as it is only going to be used for case-control studies". I will leave other people who know more epidemiology than me to react to this statement but the ones I speak to just roll their eyes.

Saturday, 5 January 2013

European innovation example

Because it has been the holiday I have not done too much thinking about the life course. Instead I have been using the internet to watch the Xmas equestrian sport all across mainland Europe. It comes as a huge surprise to many who think they know me that I love horses. It is the only sport where men and women compete on level terms even in the Olympics.

When I was young enough to actually ride myself, choosing a top show jumper or eventer was pretty hit and miss. But for the last 20 years or so, the 'warmblood' or 'sports horse' has emerged from breeding programmes, first in the Netherlands, France and Belgium, and now also in other European nations such as Denmark. These horses are like a dream of my youth. Talk about form and function. They are the result of rigorous testing for conformation (the physical set-up of the horse), ability and temperament. Only horses that pass strict tests are allowed to enter the breeding programmes. Having ignored the sport for 2 decades, I could hardly believe it when I started to take an interest again and saw whole shows dedicated to breeding stock, with large crowds of people going along to see them. The first one I attended was in Maastricht, where you could watch young stallions (a stallion is a breeding male; most young male horses are castrated and become 'geldings') free-jumping without a rider over large fences. Members of the audience were allowed to book a test-ride. My partner asked why I didn't do this -- a 3-year-old stallion, for heaven's sake (the reason for gelding is that a stallion is rather hot-tempered). I would not have dared even in my youth.

It made me think about economics (!). These top sport horses sell for millions of dollars in the States. One Olympic champion is said to be worth £6m. To me it is a good example of mainland European genius, a breeding programme carried out by (I guess) farmers, sports persons, and scientists (veterinary, genetic). No speculation or cheating other people out of their money would have produced such fantastic animals. People had to know what they were doing and be exacting and patient.

I don't mean to leave out the German warmblood breeds, which in some ways were way ahead, but in other ways not. If anyone is interested let me know and I can say some more about the German sports horse, descended from cavalry mounts.