In Search of Microarray Standards

An industry/academia/government coalition puts microarray reproducibility to the test

Apr 1, 2006
Jeffrey M. Perkel

There's no denying the popularity of DNA microarrays: 1.8 billion data points in the Stanford Microarray Database don't lie. But a lack of standards and quality control metrics for everything from RNA preparation to probe design to data analysis has led to the perception, at least, that gene expression profile concordance between, and even within, labs is, shall we say, spotty.

That technical ambivalence could be ending, however. As you read this, a team of researchers led by Leming Shi of the US Food and Drug Administration's National Center for Toxicological Research is preparing 10 research papers to be submitted jointly to Nature Biotechnology for September publication. The subject of this decology: the Microarray Quality Control (MAQC) project.

The impetus behind MAQC was a series of highly critical papers questioning microarray reliability and reproducibility. In one particularly damning study,1 Margaret Cam of the National Institute of Diabetes and Digestive and Kidney Diseases and colleagues compared two gene expression profiles on three commercial systems. Of the 2,009 genes the three arrays shared, just four were identified as differentially expressed by all three systems. The result was illustrated in figure 5, which to Roderick Jensen's recollection was dubbed "the Venn diagram that launched 100 papers."
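The arithmetic behind that figure is simple set intersection. A minimal sketch, using made-up gene lists to stand in for the three platforms' differentially-expressed calls (the real study's lists spanned the 2,009 shared genes):

```python
# Hypothetical differentially-expressed (DE) gene calls from three
# platforms; the names and lists here are illustrative only.
platform_a = {"GENE1", "GENE2", "GENE3", "GENE4", "GENE5"}
platform_b = {"GENE1", "GENE2", "GENE6", "GENE7"}
platform_c = {"GENE1", "GENE3", "GENE6", "GENE8"}

# Genes called DE by all three systems: the center of the Venn diagram.
concordant = platform_a & platform_b & platform_c
print(sorted(concordant))  # ['GENE1']
```

In the Tan et al. data, that three-way intersection held just four genes.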

The figure simply didn't reflect many researchers' experiences, says Jensen, an MAQC participant from the University of Massachusetts, Boston. "In my lab we've been using the GE arrays, the Agilent arrays, and the Affymetrix arrays routinely on the same samples and we see high concordance," he says. "That was the experience in my lab and in others, and so seeing a paper appear saying that they're not reliable demanded a response."

The project had three basic goals: to select and mass-produce a pair of publicly available standard RNA samples; to collect replicate datasets using those samples on eight array platforms and three alternative quantitative gene expression platforms; and finally to analyze those data to produce a benchmark standard. Designed explicitly to address interlab and intralab reproducibility, the project generated just under 27 million data points for analysis, says Jensen.

"We started [MAQC] one year ago, and we're basically done, which is a miracle of organization I attribute to our 'fearless leader.' It's stunning, actually, harnessing all this enthusiasm," says Jensen. Stunning, too, that industry heavyweights played along. That competitors like Affymetrix, Agilent, and Illumina could work together for the good of the industry speaks volumes; but the rising tide lifts all ships, and everyone stands to gain from wider acceptance of array data.

Once the MAQC papers are published, any lab can run the standard RNAs (available commercially from Ambion and Stratagene) and check their numbers against the published figures to see how well they do. Core facilities will be able to demonstrate the quality of their work, technology developers will have a metric against which to compare their work, and individual labs will have the ability to directly compare statistical methods, imaging settings, and so on. At the same time, researchers can use "spike-in" controls under development by the External RNA Controls Consortium to spot-check their day-to-day operations.
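One straightforward way a lab might "check their numbers" is to correlate its own measured fold changes for the two standard samples against the published benchmark values. A minimal sketch, with hypothetical log2 fold-change values standing in for both the benchmark and the lab's results:

```python
import math

# Hypothetical log2 fold changes (sample A vs. sample B) for five genes:
# published benchmark values vs. a lab's own measurements. Values are
# invented for illustration.
benchmark = [2.1, -1.4, 0.3, 3.0, -2.2]
lab       = [2.0, -1.2, 0.5, 2.8, -2.5]

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

print(f"concordance r = {pearson(benchmark, lab):.3f}")
```

A correlation near 1 would suggest the lab's pipeline tracks the benchmark closely; systematic departures would flag something to investigate in sample prep, hybridization, or analysis settings.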

That doesn't mean array facilities will soon be sporting an official "MAQC Inside" verification mark, says Laura Reid of Expression Analysis, a microarray service provider that participated in MAQC, but "I do imagine that somebody or a few people will come out with proficiency testing that can be used to verify results and give a stamp of approval."

Nor will standards alone magically make data issues disappear, says Marc Salit, who heads the Metrology for Gene Expression project at the National Institute of Standards and Technology; array platforms differ in several significant ways. But standards can provide a framework for understanding observed differences, he says: "Having the standards will let us do the science to understand why things are variable. It will let us control parts of the process."

That could be just what the field needs to finally realize its technical potential.


1. P.K. Tan et al., "Evaluation of gene expression measurements from commercial microarray platforms," Nucleic Acids Res, 31:5676-84, 2003.