Source of Potential Bias Widespread in Large Genetic Studies

A new statistical method finds that many genetic variants used to determine trait-disease relationships may have additional effects that GWAS analyses don’t pick up.

By Diana Kwon | May 15, 2018


Genome-wide association studies, which scan thousands of genetic variants to identify links to a specific trait, have recently provided epidemiologists with a rich source of data. By applying Mendelian randomization, a technique that leverages an individual’s unique genetic variation to recreate randomized experiments, researchers have been able to infer the causal effect of specific risk factors on health outcomes, such as the link between elevated blood pressure and heart disease.

The Mendelian randomization technique has long rested on the key assumption that horizontal pleiotropy, a phenomenon in which a single gene contributes to a disease through more than one pathway, does not occur. However, a new study published last month (April 23) in Nature Genetics finds that, among potentially causal trait-disease relationships identified from genome-wide association studies (GWAS), pleiotropy is widespread and may bias findings.

The “no pleiotropy” assumption was reasonable when scientists were examining only a few genes and much more was known about their specific biological functions, says Jack Bowden, a biostatistician at the University of Bristol’s MRC Integrative Epidemiology Unit in the U.K., who was not involved in the study. Nowadays, GWAS, which include many more genetic variants, are often conducted with little understanding of the precise mechanisms through which each gene could act on physiological traits, he adds.

Although researchers have suspected that pleiotropy exists in a large number of Mendelian randomization studies using GWAS datasets, “no one has actually tested how much of a problem this was,” says study coauthor Ron Do, a geneticist at the Icahn School of Medicine at Mount Sinai.

To address this question, Do and his colleagues developed the so-called MR-PRESSO technique, an algorithm that identifies pleiotropy in Mendelian randomization analyses by searching for outliers in the relationship between the genetic variants’ effects on the trait of interest (say, blood pressure) and the same variants’ effects on the health outcome (such as heart disease). Outliers suggest that some genetic variants may be acting on the outcome through more than that particular trait; in other words, that pleiotropy exists.
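
The outlier idea can be illustrated with a deliberately simplified sketch. It is not the published MR-PRESSO algorithm: it fits an inverse-variance-weighted line through the origin, then flags any variant whose leave-one-out residual exceeds a fixed cutoff. The function names, the cutoff of 3, and every effect size below are assumptions invented for illustration.

```python
# Simplified illustration only; NOT the published MR-PRESSO algorithm.
# All effect sizes are made-up numbers, and the cutoff is an assumption.

def ivw_slope(bx, by, se_by):
    """Inverse-variance-weighted regression of by on bx through the origin."""
    num = sum(x * y / s**2 for x, y, s in zip(bx, by, se_by))
    den = sum(x * x / s**2 for x, s in zip(bx, se_by))
    return num / den

def flag_outliers(bx, by, se_by, z_cutoff=3.0):
    """Return indices of variants with a large leave-one-out residual."""
    flagged = []
    n = len(bx)
    for j in range(n):
        keep = [i for i in range(n) if i != j]
        # Fit the slope without variant j, then see how far j sits from it.
        slope = ivw_slope([bx[i] for i in keep],
                          [by[i] for i in keep],
                          [se_by[i] for i in keep])
        z = (by[j] - slope * bx[j]) / se_by[j]
        if abs(z) > z_cutoff:
            flagged.append(j)
    return flagged

# Hypothetical variant effects on the trait (bx) and on the outcome (by).
bx = [0.10, 0.12, 0.08, 0.15, 0.11, 0.09, 0.13, 0.10]
by = [0.052, 0.057, 0.041, 0.079, 0.053, 0.048, 0.064, 0.150]  # last one is off the line
se_by = [0.01] * len(bx)

outliers = flag_outliers(bx, by, se_by)
kept = [i for i in range(len(bx)) if i not in outliers]
slope_all = ivw_slope(bx, by, se_by)
slope_clean = ivw_slope([bx[i] for i in kept],
                        [by[i] for i in kept],
                        [se_by[i] for i in kept])
print(outliers, round(slope_all, 3), round(slope_clean, 3))
```

In this toy data the single off-line variant pulls the overall slope upward, and dropping it brings the estimate back toward the trend shared by the remaining variants.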

The team used this method to test all possible trait-disease combinations generated from 82 publicly available GWAS datasets and found that pleiotropy was present in approximately 48 percent of the 191 statistically significant causal relationships they identified.

When the researchers compared the Mendelian randomization results before and after correcting for pleiotropy, they discovered that pleiotropy could lead to drastic over- or underestimations of the magnitude of a trait’s influence on a disease. Approximately 10 percent of the causal associations they found were significantly distorted, and by as much as 200 percent.
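
One way to express such a before-and-after comparison is as a percentage change in the causal estimate. The sketch below assumes the distortion is measured relative to the outlier-corrected estimate; that convention, the function name, and the example numbers are assumptions for illustration, not details taken from the study.

```python
def distortion_percent(estimate_before, estimate_after):
    """Percent change in a causal estimate, relative to the corrected value.

    Assumed convention: positive means the uncorrected estimate was too
    large in the direction of the corrected one.
    """
    if estimate_after == 0:
        raise ValueError("corrected estimate is zero; percent change undefined")
    return 100.0 * (estimate_before - estimate_after) / abs(estimate_after)

# Hypothetical estimates before and after removing outlier variants.
example = distortion_percent(0.60, 0.50)
print(f"{example:.1f}% distortion")
```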

For example, the team identified an outlier variant in one of the significant causal relationships they found using Mendelian randomization: a link between body mass index (BMI) and levels of C-reactive protein, a marker for inflammation and heart disease. Further examination revealed that this variant, found in a gene encoding apolipoprotein E (a protein involved in metabolism), was associated with several traits and diseases, including BMI, C-reactive protein, cholesterol levels, and Alzheimer’s disease. After removing this outlier, the estimated effect of BMI on C-reactive protein dropped by 12 percent but remained statistically significant.

“There is growing awareness that there’s widespread pleiotropy in the human genome in general, and I think these findings suggest that there needs to be rigorous analysis and careful interpretation of causal relationships when performing Mendelian randomization,” Do says. “I think what’s going to have the biggest impact is not just saying whether causal relationships exist, but actually showing that the magnitude of the causal relationship can be distorted due to pleiotropy.”

Bowden notes that the presence of pleiotropy does not mean that Mendelian randomization is necessarily a flawed technique. “Many research groups around the world are currently developing novel statistical approaches that can detect and adjust for pleiotropy, enabling you to reliably test whether a [gene] has a causal effect on an outcome,” he tells The Scientist. For example, he and his colleagues at the University of Bristol recently reported another method to identify and correct for pleiotropy in large-scale Mendelian randomization analyses.

“I hope that this paper will raise people’s attention to the potential problems in the assumptions behind [these studies],” says Wei Pan, a biostatistician at the University of Minnesota who was not involved in this work. “Large genetic datasets give researchers the opportunity to use a method like this to move the field forward, and as long as they use the method carefully, they can reach meaningful conclusions.”

M. Verbanck et al., “Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases,” Nature Genet, doi:10.1038/s41588-018-0099-7, 2018.
