Gene Association Studies Typically Wrong

The first published study linking gene to disease is often far from the last word on the subject.

December 20, 2004


Strength of association is shown as an estimate of the odds ratio without confidence intervals. At top are eight topics in which the results of the first study differed beyond chance (P < 0.05) when compared with the results of the subsequent studies. The bottom shows eight topics in which the first study did not claim formal statistical significance for the genetic association, but formal statistical significance was reached by the end of the meta-analysis. (Adapted from J.P. Ioannidis et al., Nat Gen 29:306–9, 2001.)

The first published study linking gene to disease is often far from the last word on the subject. Marc-Antoine Crocq, a psychiatrist with the Centre Hospitalier de Rouffach in France, learned this firsthand after leading a 1992 study on a mutation in the dopamine D3 receptor in the brain.1 The study found that people with two copies of the mutation have a schizophrenia risk roughly two to four times higher than others. Partly because these ratios were so high, and because the finding came from two independent teams, it looked strong. It was also, as it turns out, quite likely wrong.

A flood of 50-odd follow-up studies, which piled up so thickly that they included meta-analyses of meta-analyses, gave inconsistent results. Eventually Crocq and colleagues reviewed all the data and concluded that no statistically significant link existed where they had initially found one.2 Crocq says he's puzzled as to why this happened. "Even today, I don't know exactly what to think," he says. "The bulk of evidence suggests there's no association, but I wouldn't be surprised if in 10 years someone proved there was, after all, a mild association."

Experiences like Crocq's, in which follow-up studies overturn an initial finding of a gene-disease association, are strikingly common, researchers say. Two recent studies found that when a finding linking a given gene with a complex disease is first published, there is typically only about a one-third chance that later studies will reliably confirm it. When they do, they usually find the link is weaker than initially estimated.3,4 The first finding is usually either "spurious, or it is true, but it happens to be really exaggerated," says Tom Trikalinos of the University of Ioannina, Greece. Worse, he found that there may be no way to predict which new gene-association studies will be verified by multiple replications.5


Trikalinos and other researchers are working to understand why so many studies can't be replicated, and how to change this. The problem is pressing because current trends could exacerbate it, says Sholom Wacholder, senior investigator at the National Institutes of Health biostatistics branch in Bethesda, Md. New high-throughput analysis techniques, he explains, let researchers study many gene-disease associations quickly and cheaply, but also lead to more studies on associations that don't look especially likely at a study's outset. This tends to increase the likelihood of finding spurious links through chance occurrences. By contrast, he says, "In the old days, it was a big investment to study a hypothesis, and only the best candidates had a shot."

Wacholder suggests researchers revise their statistical methods to account for "prior probability," which is a subjective but reasonable measure of how plausible the gene-disease association in question looked before the study.6 Others propose different solutions. Kirk Lohmueller, a Georgetown University undergraduate student and first author of a letter in Nature Genetics on the subject,3 suggests bigger sample sizes and more family-based studies. These avoid a confounder called population stratification, the tendency of populations to carry high frequencies of both certain genes and certain diseases owing to mere accidents of ancestry.
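The role of prior probability can be made concrete with a small calculation. The sketch below applies Bayes' rule to a statistically significant result; it is an illustration under stated assumptions, not necessarily the exact method Wacholder proposes. The function name, the significance threshold of 0.05, the statistical power of 0.8, and the example priors are all hypothetical values chosen for illustration.

```python
def false_positive_probability(prior, alpha=0.05, power=0.8):
    """Chance that a 'significant' gene-disease association is spurious.

    prior: assumed probability, before the study, that the association is real
    alpha: significance threshold (chance of a significant result with no real link)
    power: chance of a significant result when the link is real
    All three inputs are illustrative assumptions, not values from the article.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    false_positives = alpha * (1.0 - prior)  # significant, but no real link
    true_positives = power * prior           # significant, and the link is real
    return false_positives / (false_positives + true_positives)


# Compare a well-motivated candidate gene with long-shot hypotheses.
for prior in (0.25, 0.01, 0.001):
    spurious = false_positive_probability(prior)
    print(f"prior = {prior:<6} -> chance a significant finding is spurious = {spurious:.2f}")
```

Under these assumptions, the same P < 0.05 result is far more likely to be spurious when the hypothesis was a long shot at the outset, which is the concern raised about testing many unlikely associations quickly and cheaply.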

"We found that studies with family-based controls and larger sample sizes are more likely to be replicated," Lohmueller says. Trikalinos disagrees that there is any clear way to predict which studies will be replicated. He suggests that researchers should treat any finding cautiously until it's replicated, preferably more than once.

No effort to address the problem is complete, researchers say, without a renewed call to publish more negative findings, those showing no gene-disease association. Such findings often go unpublished, which leaves spurious gene-disease associations unchallenged and bolsters the false impression that they are real. "Every study provides a piece of evidence," says Wacholder, "and it needs to be made available somehow to people who are interested."
