In the early 1990s, my colleagues and I at Stanford University began tinkering with an interesting weed, the small flowering mustard plant,
This experiment proved valuable in a number of respects. First, we showed we could hasten or slow the rate of plant development by altering the expression of a single gene.1 Second, it prompted us to pursue an interesting "side project" aimed at developing the DNA microarray, a prospective new means of monitoring plant gene expression with high precision.
The publication of our article in 1995 generated considerable interest (racking up nearly 2,200 citations; see
FORGING A NEW BIOLOGY
The experimental value of glass substrates, fluorescence, robotics, and digital data is considerable. Impermeable flat surfaces enable small feature sizes, low reaction volumes, and rapid kinetics, and allow high-speed robotics to quickly handle reading and writing steps that would take a person countless hours. The switch from radioactive labels to fluorescent ones is also significant. Consider that it used to take perhaps two weeks to detect a single rare transcript by "northern" blotting. A microarray can measure more than 10,000 genes in approximately 10 minutes, a roughly 20 million-fold increase in per-gene throughput.
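The 20 million-fold figure can be checked with a short back-of-envelope calculation. The sketch below takes the numbers quoted above at face value (exactly 14 days per northern blot, one transcript per blot, and 10,000 genes per 10-minute microarray read); the variable names are illustrative only.

```python
# Back-of-envelope check of the throughput comparison.
# Assumptions (rounded figures from the text):
#   - a northern blot detects 1 transcript in 14 days
#   - a microarray measures 10,000 genes in 10 minutes

MINUTES_PER_DAY = 24 * 60

northern_minutes = 14 * MINUTES_PER_DAY   # time to measure 1 gene by northern blot
microarray_minutes = 10                   # time for one microarray read
genes_per_microarray = 10_000

# Per-gene time ratio: (northern minutes per gene) / (microarray minutes per gene)
fold_increase = northern_minutes * genes_per_microarray // microarray_minutes

print(f"Per-gene throughput gain: about {fold_increase:,}-fold")
```

Under these assumptions the ratio comes to a little over 20 million, in line with the figure in the text.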
Glass chips flat to within 0.1 μm facilitated advanced methods of microarray manufacture such as contact printing, ink jetting, photolithography, micromirror methods, and other approaches, some of which were developed for nonbiological applications. Miniaturization, parallelism, and automation, hallmarks of the computer-chip industry, were introduced to biology for the first time via the microarray. Chemistry and biochemistry, computer science and bioinformatics, physics, mathematics, mechanical engineering, materials science, and the full gamut of life science disciplines and subdisciplines were joining forces to forge a new biology. One colleague told me microarrays had him "walking to entirely unexplored parts of the campus," by which he meant that the technology was causing a shift away from the traditional paradigm of compartmentalizing research into discrete departments and buildings.
To date, an estimated 10 million microarrays have been used in research and clinical settings including universities, biotechnology and pharmaceutical companies, nonprofit and government institutes, hospitals, and testing clinics, generating a combined output of 1 petabyte (1 × 10^15 bytes) of data. That volume of data would have filled the 40-MB hard drives of 25 million Macintosh II computers (Apple sold approximately 500,000).
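The data-volume comparison follows from simple division. The sketch below assumes decimal units (1 MB = 10^6 bytes) and uses only the estimates quoted above; the per-microarray average is a derived figure, not one stated in the text.

```python
# Rough check of the data-volume comparison, using decimal units.
# All inputs are the estimates quoted in the text.

TOTAL_BYTES = 10**15            # 1 petabyte
MICROARRAYS_USED = 10_000_000   # estimated microarrays used to date
DRIVE_BYTES = 40 * 10**6        # 40-MB Macintosh II hard drive
MAC_II_SOLD = 500_000           # approximate units sold

drives_needed = TOTAL_BYTES // DRIVE_BYTES
bytes_per_microarray = TOTAL_BYTES // MICROARRAYS_USED
drives_per_mac_sold = drives_needed // MAC_II_SOLD

print(f"40-MB drives needed: {drives_needed:,}")
print(f"Average data per microarray: {bytes_per_microarray // 10**6} MB")
print(f"Drives needed per Macintosh II sold: {drives_per_mac_sold}")
```

On these numbers, each microarray accounts for an average of about 100 MB, and the total would require roughly 50 times as many drives as there were Macintosh II computers sold.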
This rapid explosion of data has driven the development of new hardware and software tools to acquire, quantify, normalize, transform, model, and warehouse microarray information. Bioinformatics is now a formal discipline at many universities and a burgeoning commercial opportunity in the private sector.
But microarrays have spawned two additional noteworthy trends pertaining to bioinformatics. One is the continued realization that high-quality microarray data require sound technical underpinnings upstream of the data analysis steps. Deft experimental design, sample isolation and labeling, surface chemistry, and microarray manufacture and detection are all prerequisites for obtaining high-quality raw data. With superior raw data in hand, it is then possible to bring the entire armada of bioinformatics tools to bear on data analysis and mining in a meaningful manner.
A second trend, enabled by the first, is a move towards obtaining biological and clinical "answers" automatically from the microarray data without user intervention. Extensive sequence and microarray databases, coupled with powerful computational and statistical tools, may soon allow researchers and clinicians to make discoveries and obtain diagnostic and prognostic information from microarray data completely in silico. The increasing use of Markov models, Monte Carlo methods, and supervised machine-learning algorithms such as artificial neural networks, along with other artificial-intelligence approaches, is assisting in this endeavor.3,4,5
One appealing aspect of microarrays is their remarkable versatility, and versatility is an operative term because biology demands an enormous breadth of content and application. The estimated 30 million species that make up the ecosphere are all amenable to microarray analysis by virtue of containing unique nucleic acid genomes (DNA or RNA), and each represents an interesting and instructive subject of molecular inquiry.
More than 500 different organisms have been "microarrayed" to date, and this number is expected to grow as microarrays are more widely embraced by current users, as well as by archaeologists, geologists, paleontologists, marine biologists, ecologists, anthropologists, and climatologists. Potential applications run the gamut from pure research to clinical tool: measuring messenger RNA levels (transcript profiling), scoring single-nucleotide polymorphisms (genotyping), assessing protein quantities (protein profiling), identifying protein binding partners (protein-protein studies), testing drug safety (toxicology), elucidating structural motifs (structure-function studies), determining the presence of neutralizing antibodies and antigens (serum profiling), and examining cell morphology and gene activity in tissue samples (tissue microarrays). In each case, microarrays yield many measurements in parallel, and do so economically, precisely, and safely.
HGP IN THE DRIVER'S SEAT
The importance of the Human Genome Project as a driver of microarray assays cannot be overstated. The availability of a complete human genetic blueprint facilitates the identification of unique gene target sequences, a prerequisite for accurate transcript profiling in gene-expression experiments. Candidate sequences for each gene are selected using computational tools, then "crunched" against the genomic sequence to confirm uniqueness and ensure hybridization specificity. Microarrays of gene-specific target elements representing the entire human genome allow the complete set of more than 25,000 genes and approximately 300,000 transcripts to be quantified in a single step.
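The idea of "crunching" a candidate sequence against the genome can be illustrated with a deliberately simplified sketch. It assumes exact string matching on both strands of a toy sequence; production pipelines typically rely on alignment tools such as BLAST that also tolerate mismatches. The function names and the toy genome are hypothetical and serve only to show the principle.

```python
# Toy illustration of screening candidate target sequences for uniqueness.
# Assumption: exact matching only; real probe design also has to consider
# near matches that could cross-hybridize.

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))

def count_occurrences(genome: str, probe: str) -> int:
    """Count exact matches of probe in genome, including overlapping ones."""
    count = 0
    start = genome.find(probe)
    while start != -1:
        count += 1
        start = genome.find(probe, start + 1)
    return count

def is_unique(genome: str, candidate: str) -> bool:
    """True if candidate matches exactly once across both strands."""
    hits = count_occurrences(genome, candidate)
    rc = reverse_complement(candidate)
    if rc != candidate:  # avoid double-counting palindromic sequences
        hits += count_occurrences(genome, rc)
    return hits == 1

# Hypothetical toy genome in which one candidate is repeated
toy_genome = "ATGCGTACGTTAGCATGCGTACCA"
for candidate in ("TTAGC", "ATGCGTAC"):
    status = "unique" if is_unique(toy_genome, candidate) else "not unique"
    print(f"{candidate}: {status}")
```

Only candidates that pass such a screen would be retained as gene-specific targets; a sequence that matches more than one location could hybridize to transcripts from more than one gene.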
Whole-genome expression profiling in patients opens the door to unprecedented information including disease susceptibility, disease onset and progression, autoimmunity, drug responsiveness, nutritional status, mental health, and behavior. Whole-genome microarrays, coupled with straightforward sample collection and amplification strategies, are beginning to provide prognostic and diagnostic information of enormous value from blood, needle biopsies, and other readily obtained patient specimens. Gene-expression profiling, coupled with accurate microarray-based genotyping of the population, will help usher in the long-awaited era of personalized medicine.
In terms of format, microarrays are evolving in a number of interesting ways. The trend towards miniaturization of microarray elements from 300 μm in the first microarray publication to the current size of 11 μm will continue as technological advances enable smaller and smaller features. A trend towards the use of multipatient formats, in which patient samples are configured into microarrays or microplates, allows many (e.g., 96 or 384) patients to be tested simultaneously. Miniaturization and multipatient formats are both geared towards providing greater information per microarray and increased patient throughput for clinical applications.
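To see what this miniaturization means for information content, the following sketch estimates how many features fit in a square centimeter at each size. It assumes, purely for illustration, that features sit on a square grid whose center-to-center spacing equals the feature size; real microarrays leave gaps between features, so actual densities are lower.

```python
# Rough estimate of feature density at the two sizes quoted in the text.
# Assumption: square grid with pitch equal to the feature size (no gaps).

UM_PER_CM = 10_000

def features_per_cm2(pitch_um: float) -> float:
    """Number of features per square centimeter for a given pitch in micrometers."""
    per_side = UM_PER_CM / pitch_um
    return per_side ** 2

early = features_per_cm2(300)    # first microarray publication
current = features_per_cm2(11)   # current feature size cited in the text
gain = current / early           # scales as the square of the size ratio

print(f"300 um features: about {early:,.0f} per cm^2")
print(f"11 um features:  about {current:,.0f} per cm^2")
print(f"Density gain:    about {gain:,.0f}-fold")
```

Because density scales with the square of the linear reduction, a roughly 27-fold decrease in feature size translates into a gain of several hundred-fold in features per unit area under these assumptions.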
Microarray technology is now firmly in place, and its successful transition from the research laboratory into the clinic has benefited from the support of federal regulatory agencies, disease control centers, hospitals, public-health laboratories, clinics, and the medical community.6,7 The technology has also benefited from generous funding from the federal government. That microarrays had such modest beginnings once again emphasizes the value and prudence of funding pure basic research, a lesson that we must not forget as we move increasingly toward application.
Mark Schena, visiting scholar at TeleChem International, helped found the microarray revolution with a publication in
He can be contacted at