Social - biological musings: 2013

Tuesday 10 December 2013

The price of freedom

For some reason a lot of things lately have been reminding me of an old idea of mine. This was that many of the 'vices' of modern life arise from the desire to escape from oppressive individual relationships. It first occurred to me in relation to the success of the supermarket. The death of the small shops was mourned. But I myself rather like supermarkets. Why is that? Well, one reason is that I like the impersonality. The person at the till is a stranger, they do not know my social, marital or parental status. They probably don't think too much about what I have bought, or habitually buy. And if they did it would not be part of the normal repertoire for their job to make any comment on it. I will not be called "luv" or "dearie".

Similarly for the success of television. How much less oppressive it is to relate to people on the TV than to people in real life! Sociology deals too little with the oppressive nature of face to face interaction, despite Goffman's comment that "interaction is dangerous". How much easier to have the trials and tribulations of various people laid out skilfully before us by script writers. We can talk about these with others, without any personal implications, without blame or even envy.

Hemingway was fascinated by the way in which bullfighters in the 1940s and 1950s were first raised to the stratosphere of praise, and then viciously condemned when, inevitably, they fell short. In this way he foresaw what we do to 'celebrities' today. But, unlike in the corrida, characters in the soap operas do not actually get gored by bulls. They do not actually have to show courage and do not risk the taint of cowardice.

And take the motor car. Why are we driving headlong towards destruction of the earth? Because we would rather do this than share our space with others. I once saw what I thought was a very sad lecture on what people do after retirement. A large number spend their newly acquired free time just driving around. Heaven forbid they would engage with other people rather than an internal combustion engine. Other people might judge them. And I don't mean to criticise by saying this. I don't blame anyone for designing their life to avoid the judgements and stereotypes imposed by others. It is just too bad that living this way destroys the planet.

It seems to me that our technology (now we can get on to the internet, social media and so forth) increasingly allows us to escape the tyranny of the judgement of other people. This has been a more or less totally ignored (correct me if I am wrong) determinant of the adoption of new technologies. Women do not have to put up with day to day, low level oppression by men. Children can escape constant judgement by their parents. People of more modest social status can avoid the contempt of those who regard themselves as superior.

But of course this comes at a price. As we detach ourselves from our fellow human beings, in some ways our lives are impoverished and made more precarious. Only in a more equal society will the retreat into virtual worlds be reversed.

Thursday 24 October 2013

Problems about "Impact"

A lot is being written now about the upcoming REF and its associated demand for impact. At the same time, independently, our research centre ICLS has started off its second 5 years with a great deal more attention to impact matters than that with which we started out in 2008. I had developed an idea about what used to be called "dissemination" during the ESRC Resilience Network that I co-ordinated between 2003 and 2007. I called this "targeted dissemination". It was perhaps closer to what then became "user engagement". I figured that if we as a research group were going to be any use to our non-academic partners, we needed to get to know each other. The project leaders needed to come to understand the interests and needs of the partners. The kind of thing I wanted to avoid was exemplified when one of my old friends from my Civil Service days said "Oh, Mel, I would love to be able to help you but I am just snowed under with work". This was not the idea at all! I had not meant to ask her to help me by agreeing to "engage with me as a user". So I said, look, the idea is that I am supposed to help you out, not the other way around.

Nowadays I hear similar things from people in 3rd sector groups that I still relate to. They are wise to the reason they suddenly receive a lot of messages from academics asking if they are interested in some project or other. They know this is because a call for proposals has gone out on a topic that is relevant to them (ageing, child health, etc). And other civil service friends told me they had a kind of standard paragraph they could shell out to importunate academics without taking too much trouble over it.

So over the years I have made myself available to our non-academic partners in whatever way they find useful. It might be advising on a tender they are drawing up to get some research done. It might be reviewing applications they have received. It might be a friendly chat. I always answered my own phone (retired now) which people used to like, though it surprised them. I always wrote personally to people.

One of our partners is a private firm. When I tried to involve them in an ESRC co-funding scheme, however, this did not work. They took one look at the forms we would all have to fill out and were horrified. "We don't pay you people to fill out forms" they said "we pay you to do research on the questions we are interested in. Can't we just agree a task and pay you?" They could not understand that getting a joint project co-funded was a competition where ESRC had to judge who should get the award. They knew what they wanted and they wanted us to do it, end of. And "overheads", forget it. Now we do things their way.

But this is not the main point of this blog. The main point is rather more serious. In the race for impact, I do not think enough attention is paid to the quality of the science. The literature is now filling up with stories about un-replicable research in any case. What quality control is in place to make sure that "impact" is not being attained with poor science? You have a 3 or even 5 year research programme and within that time impact must be demonstrated. What time does that allow for your results to be tested, replicated, critically discussed? Even clinical medicine finds itself under fire for prescribing useless drugs and procedures that have been thrust forward without full enough evaluation. This happens even in a field where clinical trials are supposed to stand as a guarantee; now we know that many negative results are hidden. One can see here where financial incentives play a powerful role. But do we want a situation in social and policy sciences where, in the absence of the profit motive, the "impact motive" threatens to create a similar form of corruption?

Monday 29 July 2013

Almost 20% of working age men are already inactive

More talk today about how we need to extend healthy life expectancy to get people to work longer before they retire. The little graph below updates the one I posted before. I wanted to see how the economic inactivity rate was getting on. And lo and behold, it has not fallen along with unemployment. Now, I do realise that a lot of the inactive men (there are fewer inactive women, that is another story) are now 'students'. But shouldn't this have created more jobs, i.e. the spaces that would have been filled by young men who are now in full time education?

Monday 15 July 2013

RCUK Policy on Data Access: Disappointing

A couple of days ago I was alerted on Twitter by, I think, Simon Hodson, to the publication of the Research Councils UK policy on access to research data. I had not been anticipating this with too much anxiety as so many conversations I have had, in person and via social media, assured me that the default position was going to be open access. I was even more reassured by an editorial by David Stuckler and John Lynch with the great title "In God we trust, all others must bring data" in the International Journal of Epidemiology that I wrote a blog about on 1 January this year. In making a case for their new initiative to bring together a depository of health data they point out that:

"Available data often go unused because they are not well enough documented, lack accessible how-to guides for their use, or knowledge about the resource is passed on informally within research groups or collaborations. Some data may also require analytical skills that are in short supply; or people may simply be unaware of their existence or unable to access them."

The UK Data Archive has, for decades, been working to make sure that this does not happen to publicly funded (and even some privately funded such as the Health And Lifestyle Survey) social science data. Any project funded by the ESRC is obliged to deposit its data in the Archive within a short period (I think it is 6 months). The UKDA practices for data curation are well established. Most of the large and complex data sets can be downloaded in about 2 minutes by any bona fide academic who has registered the title of the project for which the data will be used. The safeguards for individual confidentiality reside in the anonymisation of the records, and one of the few restrictions to this open access comes when the research requires information on area of residence. Nowadays it is possible using Geographic Information Systems to link things like temperature, rainfall, the location of certain kinds of facilities and property values to individual data. But the Archive judges that for example adding the Postcode Area to the openly available data is too risky, so this has to be done under more restrictive conditions. In all this time (at least 30 years) there has never been one single case of any individual's privacy being threatened.

However, what is clear from the recent policy document from RCUK is that it is no longer the threat to confidentiality of data that forms the major barrier to open access. This won't come as a huge surprise to a lot of people, but the big barrier is what RCUK term 'intellectual property'. I have often been asked "why should we sweat our guts out collecting data when we just have to give it away?". And this is what a lot of people who work in epidemiology feel.

Why is there this difference in attitude between people who work in the social sciences and in epidemiology? People in both disciplinary areas collect data. It is always hard work. In economics individual academics don't so often collect their own data as the 'classical' economic data is collected routinely. But economists also do a lot of 'micro-economics' using data from the British Household Panel Study and the English Longitudinal Study of Ageing as well as birth cohort studies. I have never heard one of my economics colleagues after participating in the design of these studies claim 'intellectual property' over, for example, the data on income, wealth, pensions and so on.

I have fought several battles (mostly unsuccessful) to get measures of physical functioning into various birth cohort and panel studies. But it would never cross my mind that I own 'intellectual property' in the data. The ideas behind my desire for these measures might be regarded as 'mine'. But when you know why you want to collect a certain measure you have the most enormous flying start. As long as you get going on the research question in a timely way no one is going to steal your 'property'. And if you don't get going in a timely way then other people must be allowed to do so. Anything else is a misuse of public money. The ethical underpinning of health research is that we promise the people who allow us to stick pins in them and make them blow into tubes that the results will be used to improve public health, not to advance our own careers.

So I strongly disagree with the position taken by RCUK that those who collect data should have sole ownership of it "until their major research questions have been answered". That is a charter for slowing down the use of new information for the public good.

Saturday 1 June 2013

Career planning for sociology PhD students

The Social Research Hub and I recently had a Twitter exchange about whether sociology and social research PhD studentships should be accompanied by some kind of career planning. Since that day, I have read yet more despairing tweets about the fate of people who have completed PhDs in sociology and do not find a place to fit into the workforce.

So I said that I would write a bit about the research training that we have been doing in the ESRC International Centre for Life Course Studies in Health http://www.ucl.ac.uk/icls/training.

Our wish to set up a Centre was strongly motivated by the very opposite problem to that of which twitter-writers have been complaining. Over at least 10 years, members of our research group had been struggling to find people to work with us on projects investigating health, wellbeing and resilience over the life course. The possibilities to do this are increasing at an exponential rate due to the maturing of Britain's birth cohort studies and other longitudinal studies such as the British Household Panel Study, and the English Longitudinal Study of Ageing. More recently, large amounts of funding have been awarded for a new, enormous panel study similar to BHPS but 4 times the size, and for a brand new birth cohort study which will also produce a very large amount of data. Anyone who wants to find out how to acquire these data has only to go to http://data-archive.ac.uk/find.

So here are all these data, but where are the people trained in sociology who can use them? For some reason, the tradition of data analysis was lost from sociology between the time when I studied for my BA and the time I returned to academic work in medical sociology. So much so that many people would not regard me as a medical sociologist at all but as a 'social epidemiologist'. But I never believed that to understand the social determinants of health and wellbeing you needed to be an epidemiologist. What you needed was a research group where people knew their stuff in (a) sociology (b) social history (c) human biology (d) developmental psychology and (e) statistics. What you are doing is putting together the ways in which people develop and change over their lives in terms that can only be understood with a knowledge of (a) to (d), formulated into ideas that will be tested using (e).

ICLS training does not aim to make people bigger experts in the separate disciplines than they already need to be to get a good Masters degree. But learning how to pull together evidence from different spheres and use it to formulate and test the relatively complex hypotheses you need in lifecourse studies is a very transferable skill! In a way it is quite the opposite to some of the very arcane, specific topics often followed in PhDs, both in the sciences and the social sciences. Because many of our topics are policy relevant ("Does the divorce of your parents make a difference to your mental health once you are adult?" "Is it bad for children if their mothers have paid employment?" "Does the kind of job you worked in make a difference to how healthy and independent you can remain in your older years?" "What is the effect on your mental wellbeing of losing your job?") ICLS PhD candidates also get thrown in the deep end in this respect. One of the many great things about them, which also reflects some other Twitter-exchanges I have been having recently, is that they are natural communicators. By the time I was getting told by my funders that I needed to do 'public engagement' I had become far too much of a nerd. But our students just took to relating to non-academic partners in our enterprise like ducks to water. What THIS in turn means is that they begin their careers being (a) in demand for their knowledge of what to do with a complex data set (b) used to explaining their results in clear language (c) knowing some people outside academe who will be likely to keep them in close touch with what non-academics actually want to know about.

Now, if this sounds like a load of off-putting statistical stuff, all I can say is that the first crop of ICLS PhDs came with a wide range of statistical understanding, many with very little indeed. I myself never even took the Maths "O" level (that is how old I am) and got into data analysis due to the enthusiasm and persistence of colleagues over my working life. Learning what to do with a data set is not a matter of mathematics at all. It is a bit like learning to drive a car (I never learned that either).

But what IS a problem is that there do not seem to be many sociology degrees that might be relevant to the training we give. Many sociology degrees do not seem to deal with, for example, social structure, family change, social institutions. It is hard to find anyone teaching basic Weber, Durkheim, Marx, Bourdieu, etc. Or, more up to date topics relevant to contemporary social change, but from a sociological perspective looking at institutions, identity etc. I am sure that many sociologists looking at what emerges from the large longitudinal data sets at present would be able to think of other questions, other angles. That is badly needed.

Wednesday 17 April 2013

Death of the Phillips curve?

Here is a little graph, now rather out of date, that shows what James Banks and I thought might have happened to the Phillips Curve back in the mid-2000s.

As unemployment fell, economic inactivity rose steadily. You can see that the opposite trends expected between inflation and unemployment are not there. But the opposing trend with inflation is there if you look at economic inactivity.

The paper in which this graph first appeared was considered outrageous by social policy and sociology journals and never actually got published. I am too lazy to update it now!

Friday 12 April 2013

Unemployment statistics: not the whole story

There have been many very interesting posts of economic trend data following the death of Mrs Thatcher. One statistic I have seen less often is the economic inactivity rate. For those who don't think a whole lot about employment figures, being 'economically inactive' is not the same as unemployed. In fact what is officially called 'the economically active' population consists of the employed, the self employed and the unemployed. You are 'active' in the labour market as long as you are looking for work, even if you didn't find any. The corollary of this is that the unemployment rate is not the % of the whole working age population who are unemployed, it is the % of the economically active population. Then there are the economically inactive, who are all those who are not looking for work. The most common reason to be inactive is that you are looking after the home and family, and this (still) mostly applies to women. The second most common reason is what used to be called 'permanent sickness', i.e. long term ill health bad enough to mean a person is not able to work. This 'permanent sickness' is what all the new rules about claiming out-of-work benefits are about. The third most common reason is early retirement. Over time, what has happened is unemployment fell and stayed low, while permanent sickness and early retirement rose. By 2005 or thereabouts, an unemployment rate of around 6% co-existed with an economic inactivity rate of 16% in men of working age.
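To make the difference between the two denominators concrete, here is a minimal sketch in Python. The function name and the head counts are invented for illustration (they are not official figures); the counts were simply chosen so that the two rates come out near the 6% and 16% mentioned above.

```python
def labour_market_rates(employed, unemployed, inactive):
    """Return (unemployment_rate, inactivity_rate) as percentages.

    'employed' is assumed to include the self employed.
    """
    # The economically active are those in work plus those looking for work.
    active = employed + unemployed
    # The working age population is the active plus the inactive.
    working_age = active + inactive
    # Unemployment rate: share of the ACTIVE population, not of everyone.
    unemployment_rate = 100 * unemployed / active
    # Inactivity rate: share of the WHOLE working age population.
    inactivity_rate = 100 * inactive / working_age
    return unemployment_rate, inactivity_rate


# Hypothetical population of 1000 working age men.
u_rate, i_rate = labour_market_rates(employed=790, unemployed=50, inactive=160)
print(f"unemployment rate: {u_rate:.1f}%")
print(f"inactivity rate:   {i_rate:.1f}%")
```

In this made-up population only 50 men in 1000 (5%) are unemployed, yet the unemployment rate is a little higher because the 160 inactive men are left out of its denominator. Moving people from 'unemployed' to 'inactive' lowers the unemployment rate without anyone finding a job.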

Finally got the diagram into this document, phew! What a performance. What I hope it shows is how after 1995 or so, unemployment fell dramatically while economic inactivity continued to rise. This diagram is for men only, so this has nothing to do with domestic duties. Nor has it got anything to do with serious ill health, as at this same time life expectancy in men was rising fast.

Thursday 4 April 2013

A new measure of "social class"

There you go! In the first paragraph of Mike Savage et al's paper on a new model of social class they cite Richard Wilkinson and Kate Pickett as authorities on health inequality. As pointed out in my last blog, Wilkinson's work has never been about health inequality.
Anyone who read my last blog will also anticipate my reaction to using the term 'social class' to refer to cultural dimensions of social inequality. I just think it is so much less confusing to stick to calling social class what the official UK Statistics Office has defined and validated, i.e. employment relations and conditions.
But OK, let's call the new measure a measure of social position, the general term I like to use to refer to all dimensions of inequality whether they be cultural, economic or occupational.
It is great that the authors start out from a sound knowledge of Goldthorpe and colleagues' clear definition and measurement of social class which has provided the present official UK measure (the NS-SEC). Why after all this do they think we need another measure?
Firstly, the NS-SEC does badly in predicting cultural activities and identities. The simple answer to this is that it is not supposed to. Whether or not a class measure based on occupation is related to cultural identity is an empirical matter. Secondly, the NS-SEC does not tell us who is in the 'elite'. I think anyone who uses it would agree with this. Thirdly, economists don't like the NS-SEC. I can surely attest to the truth of this: no record of occupation was recorded in the whole of the English Longitudinal Study of Ageing (ELSA) life-grid while the sociologists took their eye off the design. Economists only think of income and wealth, not social class, which is fine, that is their discipline. Another criticism of the Erikson-Goldthorpe-Portocarrero (EGP) class schema (from which the NS-SEC is derived but in a rather distant form) is that it overdoes the manual-non-manual divide. I think John Goldthorpe would agree with this, I certainly do, and it is one reason why the NS-SEC does not have a manual-non-manual divide at all.
Perhaps the ultimate irony of the study is that the authors actually used the NS-SEC to validate their sample distribution! (page 6 Table 1). And this is the correct schema too, not the old EGP, and so it does not have any 'manual workers', or indeed 'skilled' or 'unskilled' workers. Managers are divided into 3 classes and professionals into 2. On the basis that the BBC audience survey (161,000 people) was biased (compared to what I wonder?) the BBC did a face to face survey with a sample of 1026 chosen to be nationally representative. It was on this representative sample that the authors carried out their multiple classification analysis of cultural capital rather than the much bigger BBC audience sample.
Seven classes emerge from the latent class analysis, the result of allowing income, wealth, house price, number and average social prestige of social contacts, highbrow culture score and 'emerging culture' score to cluster together and seeing how many groups 'naturally' emerge. As far as I understand it, occupational title did not enter into the group of variables used to establish the classes. However, information on occupation is present in the data set so it is possible to see which occupations fall into which classes. But the classificatory criteria really are at the household level. This is different from any other social classification I know of, as all of these are based on characteristics of occupations one way or another. It would have been interesting to see a bit more information about the occupations whose members fell into several different Great British classes. Presumably it would be possible for an office worker, for example, to be in the precariat if they had no house and no savings, mixed mostly with people in low status jobs and never went to the opera, and in the established middle class if they had greater income and wealth and different cultural tastes and friends.
Most interesting to me was to see that not all the other variables had a simple graded relationship to income. Table 6 shows an interesting set of relationships. For example, the 'technical middle class' has 3rd highest income and house price, but 2nd highest savings, and lowest frequency of social contacts. Traditional working class households have a higher score on highbrow culture than do 'new affluent workers' or the 'technical middle class'. Presumably this is one of the reasons why people tell me you can change your social class by, for example, saying you like a different kind of music (Morten Wahrendorf offered to lend me some jazz records).
I hope that, for now, this blog has raised some questions in people's minds.

Monday 11 March 2013

Measuring social inequality

Some time ago I decided I would write something short and sweet about different ways of measuring social inequality. There is a chapter on this in my book on health inequality and although the book is now 10 years old in this respect little has changed. Rather disappointing in fact as we had hoped that work by us and others would soon change the ways in which inequality was measured in health studies. Not to be!

I and my colleagues still get endless papers to review that use the term 'SES' ('socio-economic status') despite this practice having been soundly criticised by Nancy Krieger in the 1990s, and despite our having taken this critique forward in our own empirical research.

What was Nancy's main point? Very simple: it is not helpful in aetiological research to mix up 3 different types of exposure: the social (class) the economic (income) and the cultural (prestige or status). It is a bit like constructing a latent variable made up of smoking, high fat diet and hours of TV watching. These factors will be found to correlate well and to predict health outcomes, but in terms of understanding how the social becomes the biological we will not be any further forward. Rather as if public health had stopped with John Snow and not bothered with microbes, or then had a category 'microbe' covering all different kinds.

What we have tried to do is use the term 'socio-economic position' (SEP) as the most general term to refer to measures of social inequality. And let's be clear here that I mean inequality between individuals not between regions, nations etc. I also get not a negligible number of papers to review that actually think Richard Wilkinson's work is something to do with health inequality (as opposed to health differences between geo-political units with different levels of income inequality). The next level down from SEP is made up of three measures: occupational social class, income, and prestige. The most carefully constructed and well validated measure of social class in use in the UK at present is known as the NS-SEC (National Statistics Socio-Economic Classification). This schema classifies occupations according to clear criteria such as whether a person is self employed or an employee, if self employed how many people they employ, if an employee the presence or absence of career opportunities, discretion over one's own work pattern, responsibility for other people's work content, and job security, summed up as "Employment relations and conditions".

Employment relations and conditions can be classified in the same way in all societies, they do not depend on local cultures. Prestige scales differ between societies and cultures. The Hindu caste system is often taught (or it was when I studied sociology) as the clearest case of a prestige system. Only people of similar caste may worship together, eat together and inter-marry. In the USA occupations have been ranked by prestige by presenting a panel of citizens with a list of occupations and asking them to assign a rank. One obvious difference between caste and the US prestige ratings is that in a caste system your prestige can emanate from your parents, caste is inherited regardless of one's occupation. So someone born into a scheduled caste (the lowest prestige) will have trouble being accepted even if they become a millionaire factory owner. I saw a similar phenomenon here in the UK when it was remarked disparagingly that the mother of Catherine Middleton had been an air stewardess. Of course we do not have caste in the UK! It is just that you can joke about the wife of the future king by calling her mother 'doors to manual', and most people will get it.

One of the sad things about mixing these dimensions of inequality up (income is obvious so I won't elaborate) is that employment relations and prestige are accompanied by aspects of daily life that have very plausible pathways to health. Castes have different practices in terms of diet for example. Smoking has become a prestige marker in UK but not in Italy or Greece. And so on. I also wonder whether the present debate on 'micro-classes' mixes up these dimensions to some extent and creates problems for research that we don't really need to have if we just keep dimensions separate in the first place.

But this has gone on quite long enough and I will leave it to other people to let me know what they think.

Monday 18 February 2013

Heritability debates

It is time to try and get to grips with what is happening in genetics/epidemiology. I tend to rely on Meena Kumari, the ICLS expert, but really should get a bit more self-sufficient before I start to write my book. For a long time I was happy that a lot of things which "look genetic" could be, as done elegantly by Debbie Lawlor, alternatively seen as evidence of the huge power of the life course. People are not just born into a home environment, they are to a large extent born into a life course. As Debbie and George Davey Smith showed a while ago, someone with a high blood level of Vitamin A might not be healthy because of the vitamin but because having a high level of the vitamin was a 'biological embedding' of a whole string of favorable circumstances stretching over years of their lives. Similarly Hilary Graham showed that being a smoker was associated in women with a string of unfavorable experiences in a similar way. I always thought these were profound insights and these papers have greatly influenced my thinking.

So I was not too surprised by the revelations from Genome Wide Association Studies around 2009-10 that only a very small amount of the similarity between people in a lot of characteristics seemed to be explicable in terms of "a gene" (or gene variant, all corrections gladly received). I must admit that only 4% of variation in height attributable to genetics did seem rather small. And if someone had done it for red hair or green eyes, that would have surprised me even more. I don't think anyone did this as hair and eye colour do not carry the same moral/ideological weight as height. But the shock was due to the wide difference between what GWAS studies showed and the estimates of heritability obtained from twin studies, which had been the classic method for assessing the influence of 'genes v. environment' before the genome was sequenced and GWAS became possible.

When people do GWAS (Genome Wide Association Studies), it has 2 phases. First you take a 'training sample'. This has to be very large, and one looks for an association of the characteristic of interest (let's say height) with a Single Nucleotide Polymorphism (SNP) given that it is now possible to include vast numbers of these on a 'SNIP-CHIP'. Because you have no hypothesis about which SNP will be associated with (e.g.) height, you need to get an enormously significant association, as I was once told "p with many many zeroes". The second phase is to take this SNP and see what proportion of the variation in (e.g.) height it explains in a new sample independent of the training sample, the "validation sample". When you do this for height you can explain around 4% of the variation in height. This was quite a shock.
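The two phases can be tried out on invented data. What follows is only a toy sketch in Python, not a real GWAS: the genotypes and 'heights' are simulated, the sample sizes and seed are arbitrary, and for simplicity it picks the single most strongly associated SNP in the training sample instead of applying a formal significance threshold. All function and variable names are made up for the illustration.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate(n_people, n_snps, n_causal, rng):
    """Invented data: each SNP is 0, 1 or 2 copies of an allele.

    The first n_causal SNPs each add one unit to 'height'; the others do
    nothing. With n_causal=50 and a noise SD of 5, the causal SNPs together
    account for about half the variation, i.e. roughly 1% each.
    """
    snps = [[rng.randint(0, 1) + rng.randint(0, 1) for _ in range(n_people)]
            for _ in range(n_snps)]
    height = [sum(snps[j][i] for j in range(n_causal)) + rng.gauss(0, 5)
              for i in range(n_people)]
    return snps, height


rng = random.Random(2013)  # fixed seed so the run is repeatable
N_SNPS, N_CAUSAL = 100, 50

# Phase 1: training sample. Find the SNP most strongly associated with height.
train_snps, train_height = simulate(2000, N_SNPS, N_CAUSAL, rng)
train_r2 = [pearson_r(snp, train_height) ** 2 for snp in train_snps]
top_snp = max(range(N_SNPS), key=lambda j: train_r2[j])

# Phase 2: validation sample. How much variation does that one SNP explain?
valid_snps, valid_height = simulate(1000, N_SNPS, N_CAUSAL, rng)
r2_valid = pearson_r(valid_snps[top_snp], valid_height) ** 2

print(f"top SNP: {top_snp}")
print(f"variation explained in validation sample: {r2_valid:.1%}")
```

In this simulation half of the variation in 'height' is genetic by construction, yet any single SNP can only account for a percent or so of it, which is the shape of the puzzle described above.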

But let's face it, what everyone really cares about is how heritable other personal characteristics are, the ones with ideological meaning such as intelligence or mental health. As intelligence and mental health are generally acknowledged to be complex characteristics, much more so than height, what hope was there?

But wait! Along came Professor Visscher and colleagues. Their paper in Twin Research and Human Genetics (vol 13(6) pp. 517-24) aims to set out their method in a way that is more comprehensible than their original paper in Nature Genetics, which showed that about 40% of the variation in height is, after all, explained by genetics. Never mind, they said, if there is no individual SNP associated significantly with height that explained more than 4% of it. What you need to do is to add up all the SNPs that are associated with height, but so weakly that they do not meet the very stringent requirements for significance used in GWAS studies. This can then be extended to intelligence with results that are summarised as follows:

Data from twin and family studies are consistent with a high heritability of intelligence, but this inference has been controversial. We conducted a genome-wide analysis of 3511 unrelated adults with data on 549 692 single nucleotide polymorphisms (SNPs) and detailed phenotypes on cognitive traits. We estimate that 40% of the variation in crystallized-type intelligence and 51% of the variation in fluid-type intelligence between individuals is accounted for by linkage disequilibrium between genotyped common SNP markers and unknown causal variants. These estimates provide lower bounds for the narrow-sense heritability of the traits. We partitioned genetic variation on individual chromosomes and found that, on average, longer chromosomes explain more variation. Finally, using just SNP data we predicted ~1% of the variance of crystallized and fluid cognitive phenotypes in an independent sample (P=0.009 and 0.028, respectively). Our results unequivocally confirm that a substantial proportion of individual differences in human intelligence is due to genetic variation, and are consistent with many genes of small effects underlying the additive genetic influences on intelligence. [my emphasis]

G Davies, A Tenesa, A Payton, J Yang, S E Harris, D Liewald, X Ke, S Le Hellard, A Christoforou, M Luciano, K McGhee, L Lopez, A J Gow, J Corley, P Redmond, H C Fox, P Haggarty, L J Whalley, G McNeill, M E Goddard, T Espeseth, A J Lundervold, I Reinvang, A Pickles, V M Steen, W Ollier, D J Porteous, M Horan, J M Starr, N Pendleton, P M Visscher and I J Deary. "Genome-wide association studies establish that human intelligence is highly heritable and polygenic". Molecular Psychiatry (2011) 16, 996–1005.

However, Makowsky et al., for example, threw some doubt on the enterprise in PLOS Genetics thus:

Recently, a large proportion of the “missing heritability” for human height was statistically explained by modeling thousands of single nucleotide polymorphisms concurrently. However, it is currently unclear how gains in explained genetic variance will translate to the prediction of yet-to-be observed phenotypes. Using data from the Framingham Heart Study, we explore the genomic prediction of human height in training and validation samples while varying the statistical approach used, the number of SNPs included in the model, the validation scheme, and the number of subjects used to train the model. In our training datasets, we are able to explain a large proportion of the variation in height (h² up to 0.83, R² up to 0.96). However, the proportion of variance accounted for in validation samples is much smaller (ranging from 0.15 to 0.36 depending on the degree of familial information used in the training dataset). While such R² values vastly exceed what has been previously reported using a reduced number of pre-selected markers (<0.10), given the heritability of the trait (~0.80), substantial room for improvement remains.

Robert Makowsky, Nicholas M. Pajewski, Yann C. Klimentidis, Ana I. Vazquez, Christine W. Duarte, David B. Allison, Gustavo de los Campos. Beyond Missing Heritability: Prediction of Complex Traits. PLOS Genetics 2011; 7(4): e1002051.

The furthest I can get today is to comment on this phrase "substantial room for improvement". It sounds as if people have already decided what % of the trait is "heritable" and are now trying to find a method that gives the answer they already expect and want to find.
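To see why the training and validation figures can differ so much, here is a rough sketch on simulated data. It is emphatically not the method used by Visscher and colleagues or by Makowsky et al.: it just adds up crude one-SNP-at-a-time regression slopes into a single score, and all effect sizes, sample sizes and function names (`simulate`, `marginal_slopes`, `score`, `r_squared`) are invented for the illustration.

```python
import random


def simulate(n_people, effects, noise_sd, rng):
    """Return (genotypes, trait) with allele counts of 0, 1 or 2 per SNP."""
    genotypes, trait = [], []
    for _ in range(n_people):
        g = [rng.randint(0, 1) + rng.randint(0, 1) for _ in effects]
        y = sum(b * x for b, x in zip(effects, g)) + rng.gauss(0, noise_sd)
        genotypes.append(g)
        trait.append(y)
    return genotypes, trait


def r_squared(x, y):
    """Squared Pearson correlation between two lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def marginal_slopes(genotypes, trait):
    """One simple regression slope per SNP, each estimated on its own."""
    n = len(trait)
    my = sum(trait) / n
    slopes = []
    for j in range(len(genotypes[0])):
        col = [g[j] for g in genotypes]
        mx = sum(col) / n
        sxy = sum((a - mx) * (b - my) for a, b in zip(col, trait))
        sxx = sum((a - mx) ** 2 for a in col)
        slopes.append(sxy / sxx)
    return slopes


def score(genotypes, weights):
    """Add up all SNPs for each person, weighted by the estimated slopes."""
    return [sum(w * x for w, x in zip(weights, g)) for g in genotypes]


rng = random.Random(2011)

# Invented set-up: 300 SNPs, each with a tiny true effect, and noise of
# roughly the same size as the total genetic contribution.
effects = [rng.gauss(0, 0.1) for _ in range(300)]
train_g, train_y = simulate(200, effects, 1.2, rng)
valid_g, valid_y = simulate(500, effects, 1.2, rng)

weights = marginal_slopes(train_g, train_y)  # estimated in training sample only

r2_train = r_squared(score(train_g, weights), train_y)
r2_valid = r_squared(score(valid_g, weights), valid_y)

print("Variance 'explained' in training sample:  ", round(r2_train, 2))
print("Variance explained in validation sample:  ", round(r2_valid, 2))
```

With more SNPs than people, the score is bound to fit the training sample well, noise included, so the training figure flatters the method. Only the validation figure says anything about prediction in people the model has not seen.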

Sunday 20 January 2013

God need not bring data

The International Journal of Epidemiology for December has a fascinating paper by Lynch and Stuckler on the use of data for quality control by Deming and its implications for public health. I thought people might like to see some of what the authors say about data.

"Available data often go unused because they are not well enough documented, lack accessible how-to guides for their use, or knowledge about the resource is passed on informally within research groups or collaborations. Some data may also require analytical skills that are in short supply; or people may simply be unaware of their existence or unable to access them."

This may sound familiar to some readers. The research group around ICLS uses only openly available data, not the kind that are "not well enough documented, lack accessible how-to guides for their use, or knowledge about the resource is passed on informally within research groups or collaborations". We do this to protect ourselves from delays in data acquisition, from people changing their minds about whether or not we are 'allowed' to use data for a specific purpose, and from having to re-code things that have not been through the quality control of the UK Data Archive and a variety of users. When we find mistakes or derive new variables we give the code to the Archive (access to the data is free to anyone funded by UK Research Councils).

So I warmly welcome the initiative of the IJE to bring together health data sets as a public, openly shared resource. I would hope they will add the UK Data Archive to their list as it now contains some biomedical and genetic data and will soon have more.
I hope that Deming would approve of this method for data curation and exploitation. It means that work done using taxpayers' money (ESRC funded projects are obliged to archive data) becomes a common good for the whole academic and policy community. Also, in the words of an eminent colleague: "if you are not allowed to see the data behind a paper how do you know it is not all made up?". So open data is a vital safeguard against the kind of scientific misconduct that is increasingly being noted.

A number of ruses get used to try and avoid 'sharing' data (another eminent colleague tells me not to use this term as the data is not the property of the research team in the first place). I have heard it said that 'biomedical data (like blood pressure and cholesterol) carry a bigger risk of disclosure' and some social scientists are fooled by this. Maybe they are thinking about "CSI" on the TV. In fact data on occupation and education are potentially a lot more disclosive than biomedical markers. I also hear it said that biomedical samples are a 'depletable resource'. But what gets archived is just a bunch of codes, not the samples themselves! I once heard it said that the low response rate of studies like the BioBank (10% response rate, i.e. 90% non-response) "does not matter as it is only going to be used for case-control studies". I will leave other people who know more epidemiology than me to react to this statement but the ones I speak to just roll their eyes.

Saturday 5 January 2013

European innovation example

Because it has been the holiday I have not done too much thinking about the life course. Instead I have been using the internet to watch the Xmas equestrian sport all across mainland Europe. It comes as a huge surprise to many who think they know me that I love horses. It is the only sport where men and women compete on level terms even in the Olympics.

When I was young enough to actually ride myself, choosing a top show jumper or eventer was pretty hit and miss. But for the last 20 years or so, the 'warmblood' or 'sports horse' has emerged from breeding programmes in the Netherlands, France and Belgium first, now also in other European nations such as Denmark. These horses are like a dream of my youth. Talk about form and function. They are the result of rigorous testing for conformation (the physical set-up of the horse), ability and temperament. Only horses who pass strict tests are allowed to enter the breeding programmes. Having ignored the sport for 2 decades, I could hardly believe it when I started to take an interest again, to see whole shows dedicated to breeding stock, and large crowds of people going along to see them. The first one I attended was in Maastricht, where you could watch young stallions (a stallion is a breeding male; most young male horses are castrated and become 'geldings') free-jumping without a rider over large fences. Members of the audience were allowed to book a test-ride. My partner asked why I didn't do this -- a 3-year-old stallion for heaven's sake (the reason for gelding is that a stallion is rather hot tempered). I would not have dared even in my youth.

It made me think about economics (!). These top sport horses sell for millions of dollars in the States. One Olympic champion is said to be worth £6m. To me it is a good example of mainland European genius, a breeding programme carried out by (I guess) farmers, sports persons, and scientists (veterinary, genetic). No speculation or cheating other people out of their money would have produced such fantastic animals. People had to know what they were doing and be exacting and patient.

I don't mean to leave out the German warm-blooded breeds, who in some ways were way ahead, but in other ways not. If anyone is interested let me know and I can say some more about the German sports horse, descended from cavalry mounts.