Beyond Sanger: Toward the $1,000 Genome


By Aileen Constans | June 30, 2003

Courtesy of Solexa Total Genotyping

Without a doubt, the quarter-century-old Sanger sequencing method performed like a champ during the Human Genome Project. But with the capacity to read only a few hundred bases per reaction, it is far too slow and expensive for routine use in clinical settings. Reaping the rewards of the genomics era will clearly require faster and cheaper alternatives.

Some companies estimate that within the next five years, technical advances could drop the cost of sequencing the human genome low enough to make the "thousand-dollar genome" a reality. Whether or not that happens, new sequencing approaches could in the short term facilitate large-scale decoding of smaller genomes. In the long term, low-cost, rapid human genome sequencing could become a routine, in-office diagnostic test--the first step on the road to truly personalized medicine.

SINGLE-MOLECULE UP-STARTS Most of the players in this market are developing single-molecule-based methods as alternatives to bulk techniques such as capillary gel electrophoresis sequencing. Often, these techniques employ some form of sequencing-by-synthesis, in which the system "reads" each fluorescent building block as it is incorporated into the nascent strand. Single-molecule methods could greatly increase the speed and reduce the cost of sequencing: They require minimal sample preparation, use lower reagent volumes, and offer longer read lengths--thereby reducing the complexity of sequence assembly problems.
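To make the sequencing-by-synthesis idea concrete, the sketch below simulates a detector calling one base per incorporation event. The dye-to-base assignments and the noise-free readout are hypothetical simplifications for illustration, not any company's actual chemistry.

```python
# Toy sketch of sequencing-by-synthesis readout. The dye colors and the
# noise-free detection are hypothetical simplifications, not a vendor's
# actual chemistry.

# Hypothetical dye assignment: one fluorescent color per base.
DYE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
BASE_FOR_DYE = {color: base for base, color in DYE.items()}

def synthesize_and_read(template: str) -> str:
    """Walk along the template, 'incorporate' the complementary labeled
    nucleotide, and decode each flash of fluorescence back into a base."""
    complement = {"A": "T", "T": "A", "C": "G", "G": "C"}
    read = []
    for template_base in template:
        incorporated = complement[template_base]   # polymerase adds this base
        flash = DYE[incorporated]                  # detector sees its dye color
        read.append(BASE_FOR_DYE[flash])           # base call from the color
    return "".join(read)

print(synthesize_and_read("ACGTTGCA"))  # -> TGCAACGT (the nascent strand)
```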

But some members of the sequencing community question the short-term prospects. "The question that I ask when I see these [new techniques] is, show me the signal," says Gene Myers, professor of engineering and computer sciences, University of California, Berkeley. Without an adequate signal-to-noise ratio, large numbers of individual molecules must be analyzed and signal-averaged to obtain sequence information. "That ends up destroying the purpose of using a single molecule to begin with," explains David Cox, chief scientific officer of Mountain View, Calif.-based Perlegen Sciences.

Elaine Mardis, assistant professor of genetics, Washington University School of Medicine, St. Louis, concurs. "As a scientist, one of the things that's a little disconcerting at this time is that really all we've heard from companies is mainly the stuff that press releases are made of," she says. "To the best of my knowledge, none of these guys have partnered with truly high-throughput genome centers to start beginning proof-of-principle-type testing of the methodology and the instrumentation."

Reflecting the technical difficulties, at least one proponent of the single-molecule method has shifted course. Woburn, Mass.-based US Genomics, one of the most highly publicized "thousand-dollar-genome" startups, recently redirected its focus from whole-genome sequencing and now markets its GeneEngine™ technology for haplotyping, genotyping, and RNA and protein analysis at the single-molecule level. "We're actually looking at other molecules such as RNA and proteins ..., because we've discovered that our technology has more utility than was at first realized," says chief scientific officer Steve Gullans. "But in terms of genomics we are recasting it for the public's perspective as a mapping technology."

FORGING AHEAD Nevertheless, several companies are forging ahead with their development efforts. Last year, for example, Cambridge, UK-based Solexa claimed that its TotalGenotyping™ platform, a method that combines high-density arrays with sequencing-by-synthesis technology, would be able to deliver a thousand-dollar human genome within the next five years.1

Solexa CEO Nick McCooke concedes that the initial prototype--a "several-thousand-dollar-genome" system scheduled for completion this year--falls short of this stated goal, yet he remains optimistic: "We've made really good progress on the sequencing chemistry and the other components of the system. What we're working on now is pulling these strands together," he says.



Courtesy of US Genomics

Likewise, Susan Hardin, CEO of Houston-based VisiGen Biotechnologies, is optimistic about her company's single-molecule sequencing approach, which is based on the interaction between donor fluorophores in an engineered DNA polymerase and acceptor fluorophores on incoming nucleotides. "It's like getting into the reaction tube and watching the enzyme as it's actually incorporating bases," she explains. As each tagged nucleotide is incorporated into the growing strand, the donor transfers energy to the acceptor, releasing light whose color and intensity reveal the base's identity. Hardin anticipates a product release within the next two to four years, once VisiGen's scientists increase fluorophore stability and optimize detection sensitivity.
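As a rough illustration of how such an acceptor's emission might be decoded, the sketch below assigns each nucleotide a hypothetical peak emission wavelength and calls the base whose dye lies nearest to each observed flash; the wavelengths, tolerance, and event values are invented for the example, not VisiGen's actual parameters.

```python
# Minimal sketch of base calling from acceptor emission color, in the spirit of
# the donor/acceptor scheme described above. The emission wavelengths, tolerance,
# and simulated events are invented for illustration only.

# Hypothetical acceptor-dye peak emission (nm) for each nucleotide.
ACCEPTOR_PEAK_NM = {"A": 520, "C": 570, "G": 620, "T": 670}

def call_base(observed_nm: float, tolerance: float = 15.0):
    """Return the base whose acceptor dye best matches the observed emission,
    or None if no dye is within the tolerance window (treated as noise)."""
    base, peak = min(ACCEPTOR_PEAK_NM.items(), key=lambda kv: abs(kv[1] - observed_nm))
    return base if abs(peak - observed_nm) <= tolerance else None

# A simulated series of emission events as the polymerase incorporates bases.
events_nm = [521.3, 668.9, 570.8, 618.2, 470.0]   # last one is stray background
print([call_base(nm) for nm in events_nm])         # ['A', 'T', 'C', 'G', None]
```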

Not all of the new sequencing technologies eschew amplification steps. The PicoTiterPlate™ system from 454 Life Sciences, a Branford, Conn.-based CuraGen subsidiary, uses microscope slide-sized cartridges etched with hundreds of thousands of picoliter-sized wells. Fragments of a single strand of DNA are deposited into these wells, amplified in parallel, and subjected to simultaneous sequencing-by-synthesis. Assembly of the resulting data produces a contiguous genome sequence.
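The assembly step can be pictured with a deliberately simplified greedy merge of overlapping reads, sketched below; real assemblers, including whatever 454 uses internally, must also handle sequencing errors, repeats, and coverage statistics, which this toy ignores.

```python
# A vastly simplified sketch of merging many short, parallel reads into one
# contiguous sequence by repeatedly joining the pair with the largest overlap.
# Illustrative only; not 454's actual assembly software.

def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that matches a prefix of b."""
    for length in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:length]):
            return length
    return 0

def greedy_assemble(reads: list[str]) -> str:
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = reads[:]
    while len(reads) > 1:
        best = (0, 0, 1)  # (overlap length, index i, index j)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    o = overlap(a, b)
                    if o > best[0]:
                        best = (o, i, j)
        o, i, j = best
        if o == 0:
            break  # no overlaps left; reads cannot be joined further
        merged = reads[i] + reads[j][o:]
        reads = [r for k, r in enumerate(reads) if k not in (i, j)] + [merged]
    return max(reads, key=len)

reads = ["GATTACAGG", "ACAGGCTTA", "GCTTACCGT"]
print(greedy_assemble(reads))  # -> GATTACAGGCTTACCGT
```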

The company has successfully applied the system to viral genomes and is currently sequencing a three-megabase bacterial genome, says Rich Fisler, manager of business development. "As we make the plate larger, and we make the number of wells that we can sequence simultaneously higher, we'll be able to handle larger genomes," he adds.

OF POLONIES, NANOPORES, AND WAVEGUIDES Academic researchers also are hard at work on new sequencing methods. "The technologies that are going to succeed," says George Church, professor of genetics, Harvard Medical School, "are the ones where the system is completely integrated, the volumes are very small so the reagent costs are minimal, and the instruments are very inexpensive--about the price of a computer."

Church and former student Robi Mitra have produced a technology that meets these requirements: polony, or polymerase colony, sequencing. Single DNA molecules are separated on a polyacrylamide gel affixed to a glass microscope slide, and then subjected to PCR in situ, creating "colonies" of amplified DNA.2 These molecules are then ready for fluorescent in situ sequencing, a sequencing-by-synthesis method employing reversibly labeled nucleotides. Nearly 10 million polonies fit on a single microscope slide that can be read by a standard laboratory scanner in 20 minutes, Church says. "No [current sequencing] machine is comparable to that," he notes.


THE SANGER SEQUENCING METHOD
The National Center for Biotechnology Information's GenBank database contains tens of billions of letters of sequence information, the vast majority of which were deciphered using a method Fred Sanger developed in 1977. Unlike the chemical approach published around the same time by A.M. Maxam and Walter Gilbert, Sanger's method uses DNA polymerase to catalyze a controlled primer extension reaction.

DNA polymerization proceeds 5' to 3'; new nucleotides are incorporated by joining the alpha phosphate group of the new base with the 3'-hydroxyl moiety of the chain. But Sanger's method employs specially modified bases, 2',3'-dideoxynucleoside 5'-triphosphates, that replace the 3'-hydroxyl with a nonreactive hydrogen, thereby terminating chain extension. The result is a collection of molecules of varying lengths, each of which terminates at a specified letter--A, C, G, or T. By size-fractionating the reaction products, researchers can read the sequence of the newly polymerized strands.
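The logic of chain termination can be captured in a short, idealized simulation: generate every fragment that ends at a given base, pool the four reactions, and read the bases off in order of fragment length. The example strand below is arbitrary, and the model ignores the stochastic, length-limited nature of real reactions.

```python
# An idealized simulation of dideoxy chain termination: every possible fragment
# ending at a given base is produced, and reading the pooled fragments from
# shortest to longest recovers the newly synthesized strand's sequence.

def termination_fragments(new_strand: str, base: str) -> list[int]:
    """Lengths of extension products terminated by the ddNTP for `base`."""
    return [i + 1 for i, b in enumerate(new_strand) if b == base]

def read_gel(new_strand: str) -> str:
    """Pool the four termination reactions and read bases from shortest
    fragment to longest, as one would read up a sequencing gel."""
    pooled = []  # (fragment length, terminating base)
    for base in "ACGT":
        pooled += [(length, base) for length in termination_fragments(new_strand, base)]
    return "".join(base for _, base in sorted(pooled))

new_strand = "ATGCGTAC"      # strand synthesized from the template (arbitrary example)
print(read_gel(new_strand))  # -> ATGCGTAC: fragment lengths reveal the sequence
```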

Originally, one of the nucleotides in the reaction was radiolabeled, and the reaction was split into four tubes for the termination phase. More recently, researchers have turned to fluorescently labeled terminator nucleotides, each tagged a different color. As a result, the four terminators can be mixed in a single reaction whose products are separated not on a large, cumbersome polyacrylamide gel but in arrays of thin capillaries within automated sequencing instruments.
 


Groups at Harvard University and UC, Santa Cruz, are developing an alternative single-molecule approach: nanopore sequencing. The technique analyzes individual strands of DNA or RNA as they pass through a nanoscale membrane channel, or pore, when an electric current is applied.3 As negatively charged nucleic acids pass through the pores in single file, they block the flow of current in a manner characteristic of the polymer's length and sequence.3
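Why resolution matters can be seen in a small thought experiment: if the measured current reflects a window of several bases sitting in the pore rather than a single base, neighboring bases blur into one signal. The per-base blockade values in the sketch below are invented purely for illustration.

```python
# A cartoon of nanopore resolution: if the blockade current depends on a window
# of bases inside the pore rather than a single base, adjacent bases blur
# together. The per-base "blockade" values are invented for illustration.

BLOCKADE = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0}  # arbitrary units

def current_trace(seq: str, window: int) -> list[float]:
    """Average blockade over the `window` bases occupying the pore at each step."""
    return [
        sum(BLOCKADE[b] for b in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]

seq = "GATTACA"
print(current_trace(seq, 1))  # one value per base: unambiguous, [3.0, 1.0, 4.0, ...]
print(current_trace(seq, 4))  # 4-base window: distinct sequences can give the same level
```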

The system uses no enzymes, which makes it more robust, notes Harvard team co-leader Daniel Branton, Higgins Research Professor of Biology, as proteins can be labile and difficult to work with. Likewise, it requires neither amplification nor fluorescent detection. But signal intensity remains problematic: "We can detect differences in different bases, but we cannot yet do so with a resolution that is sufficiently great to be able to distinguish one base from its neighboring base," says Branton, who adds that, to date, nanopore technologies have a resolution of roughly 10 to 20 nucleotides. Nevertheless, Palo Alto, Calif.-based Agilent Technologies was sufficiently impressed to begin a joint research project with the Harvard group to develop the technology.

Watt W. Webb, professor of applied and engineering physics, and colleagues at Cornell University have developed a third method, which overcomes the signal-to-noise problem that plagues many single-molecule sequencing methods. Webb's strategy employs "zero-mode waveguides," which are nanofabricated chips consisting of a metal film pierced by holes an order of magnitude smaller than the wavelength of visible light.4 DNA polymerase, the DNA of interest, and fluorescently labeled bases are placed in the holes and observed under an optical microscope; the size of the holes limits the observable volume of the reaction, allowing researchers to "watch an individual enzyme working as it's making DNA," explains research associate Michael Levene.

The small observation volumes permit the use of biologically relevant concentrations. "Even though these concentrations are fairly high, you can still look at an incorporation event at the single-molecule level; you're not swamped by the background from all these labeled bases that are diffusing around," says graduate student Jonas Korlach.

Webb's method can read five to 10 bases per second, and Levene says this speed can likely be pushed even further. More importantly, the technology is intrinsically high-throughput (two million waveguides fit on a single chip), and the enzyme used by the Webb group can synthesize 100,000 bases continuously. This read length "would reduce costs dramatically, because all the computer processing that's currently necessary to put the bits and pieces together into the full sequence would be unnecessary," explains Levene.
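The arithmetic behind that claim is straightforward, as the rough calculation below shows; the genome size, coverage target, and read lengths are illustrative round numbers, and the estimate ignores error rates and repeat structure.

```python
# Back-of-the-envelope arithmetic for how read length changes the scale of the
# assembly problem. All numbers are illustrative round figures.

GENOME_BP = 3_000_000_000   # roughly a human genome
COVERAGE = 8                # an illustrative shotgun redundancy target

def reads_needed(read_length: int, genome_bp: int = GENOME_BP, coverage: int = COVERAGE) -> int:
    """Number of reads required to cover the genome `coverage`-fold."""
    return (genome_bp * coverage) // read_length

for read_length in (500, 100_000):
    n = reads_needed(read_length)
    print(f"{read_length:>7}-base reads: ~{n:,} reads for {COVERAGE}x coverage")
# 500-base reads require tens of millions of fragments to be stitched together;
# 100,000-base reads require only a few hundred thousand.
```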

Levene also notes that the nanofabricated chip at the technology's heart can be mass-produced and that the proof-of-principle sequencing experiments employed a conventional microscope. "Once we are willing to let this out of the lab ... it could turn into a sequencing machine that's widely usable in a very short period of time," Webb says.



Courtesy of Solexa Total Genotyping

THE DIRECTION OF THE FIELD Elaine Mardis notes that scientists in her laboratory and in other sequencing centers are currently working on incremental improvements to existing capillary electrophoresis-based methods, such as increasing hardware sensitivity and reducing reagent volumes. She speculates that intermediate technologies, which increase speed but may be less forward-looking than single-molecule methods, could hold the key to the future of sequencing. Indeed, a number of companies currently are developing methods that can be used in conjunction with new sequencing technologies to dramatically reduce costs and increase sequencing speeds.

George Church also notes that three currently available bulk resequencing methods are in many cases more cost-effective than capillary electrophoresis sequencing. These include: the sequencing-by-synthesis method used by Pyrosequencing AB5; the massively parallel signature sequencing method (MPSS) offered by Lynx Therapeutics6; and sequencing by hybridization to Affymetrix chips, the approach successfully used by Perlegen Sciences to resequence 50 human genomes. Efforts to microminiaturize capillary gel electrophoresis sequencing should not be discounted as a means of improving costs, adds Gene Myers. But these methods in their present states are not meant for large-scale de novo sequencing. "The technology we use can't do that because it's too expensive to resequence the genome of every individual. And it takes too long," says David Cox of Perlegen.

Mardis agrees that existing technologies cannot deliver a whole-genome sequence for under $1,000. "I think you're going to need a radically different approach if you're going to enter the realm of what they're talking about cost-wise," she says. Others in the field agree that though it is difficult at this point to predict which of the new technologies will come out ahead in the end, it is likely that at least one will triumph. "It cost us nearly $100 million to do the 50 copies of the genome, but it's all relative because that was done in a year-and-a-half, whereas to do the first copy of the genome, it took nearly $3 billion overall, and 13 years," says Cox. "What you'd expect in another 10 years is for people to do it much faster and cheaper than we do it. If that's not true then we're all in trouble."

Aileen Constans (aconstans@the-scientist.com) is a freelance writer in Pitman, NJ.

References
1. A. Constans, "To dream the not-so-impossible genomics dream," The Scientist, 16[20]:53, Oct. 14, 2002.

2. R.D. Mitra et al., "Digital genotyping and haplotyping with polymerase colonies," Proc Natl Acad Sci, e-pub ahead of print, 10.1073/pnas.0936399100, May 2, 2003.

3. D.W. Deamer, D. Branton, "Characterization of nucleic acids by nanopore analysis," Acc Chem Res, 35:817-25, 2002.

4. M.J. Levene et al., "Zero-mode waveguides for single-molecule analysis at high concentrations," Science, 299:682-6, Jan. 31, 2003.

5. H.E. Sussman, "Pyromania," The Scientist, 15[3]:22, Feb. 5, 2001.

6. A. Constans, "A new approach to gene expression analysis," The Scientist, 16[8]:44, Apr. 15, 2002.



OPTICAL GENOME SCAFFOLDS
Developers of new sequencing methods take note: The secret to the $1,000 human genome may lie in supporting technologies. Madison, Wis.-based OpGen is commercializing a technique called optical mapping that will help scientists complete de novo sequencing projects faster, more accurately, and less expensively, according to chief scientific officer Colin Dykes.

The process, originally developed by David Schwartz of the University of Wisconsin, uses restriction enzymes to digest linear strands of DNA affixed to a glass surface. The resulting fragments are stained with a fluorescent dye, and the pattern of the fragments is observed with a microscope.1 OpGen's software processes images from individual fragments, aligns those that overlap, and generates a consensus restriction map representing the genomic sequence.

Bill Spencer, director of business development, explains that OpGen's approach differs from conventional restriction-fragment analysis in that the fragments remain in the order in which they appeared on the original DNA strand. As a result, the genomes of different organisms have unique fragment patterns. These patterns, or maps, can facilitate sequencing efforts by providing a "scaffold" on which contigs (overlapping sequence fragments) can readily be aligned. As Dykes notes, optical maps of genomes can complement new sequencing technologies: "Some of the new sequencing technologies involve very short sequence reads in massively parallel fashion. ... Their challenge would be to assemble their little sequences into a big sequence. We're coming at it from the opposite angle, starting with an entire genome and providing a framework on which you can anchor these little sequences."
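The idea can be sketched in a few lines: predict a contig's restriction-fragment pattern from its sequence and slide it along the genome-wide optical map until the internal fragments match. Real optical-mapping software must tolerate fragment-sizing error and missed cut sites; the exact-match toy below, with invented fragment sizes, does not.

```python
# A toy illustration of anchoring a sequenced contig onto an ordered restriction
# map. Fragment sizes are invented; exact matching stands in for the error-
# tolerant alignment that real optical-mapping software performs.

def anchor_contig(contig_pattern: list[int], genome_map: list[int]) -> int:
    """Return the index in the genome map where the contig's internal
    fragments match, or -1 if the contig cannot be placed."""
    internal = contig_pattern[1:-1]   # end fragments are truncated by the contig boundaries
    for i in range(len(genome_map) - len(internal) + 1):
        if genome_map[i:i + len(internal)] == internal:
            return i
    return -1

genome_map = [120, 45, 300, 75, 210, 90]          # ordered fragment sizes from the optical map (kb)
contig_pattern = [20, 300, 75, 60]                # fragment sizes predicted from a contig's sequence
print(anchor_contig(contig_pattern, genome_map))  # -> 2: the contig spans the third and fourth fragments
```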

1. C. Aston et al., "Optical mapping and its potential for large-scale sequencing projects," Trends Biotechnol, 17:297-302, 1999.
 


