With little fanfare, the much-debated sequencing method known as whole-genome shotgun (WGS) has become a conventional way to sequence genomes. Two studies out this month help to confirm its importance.
Early this month, the publicly funded mouse genome project showed that WGS could yield a high-quality draft sequence — one superior to the first draft of the human genome. And in a paper published December 23 in
"The study seems to answer one of the initial criticisms of WGS, that the finishing stage would be more difficult. Turns out it is not," S. Blair Hedges, who works on vertebrate genome evolution at Pennsylvania State University, told
WGS has been around for two decades, but became controversial when Celera Genomics announced it would use the method to produce a draft human genome sequence faster than the publicly funded Human Genome Project. The latter relied heavily on a different method, usually known as clone-by-clone.
Eric Lander of the Whitehead Institute, one of the NHGRI-funded sequencing centers, blames the controversy in part on journalists. "The WGS versus clone-based sequence issue was so muddled by the press during the Human Genome Project," he told
Draft sequences are useful for many purposes, but finished sequences are essential for identifying the full set of genes and regulatory regions and getting the correct sequence of proteins, Lander pointed out. "Without this, you can't know what you're missing, what apparent genes may be non-functional pseudogenes. You also cannot study repeat sequences accurately. And it is much harder to spot new mutations."
Finished sequence also permits verification and error correction, and completes fragmented and fragmentary genes, according to Mark Blaxter, of the Institute of Cell, Animal and Population Biology in Edinburgh, UK. The completed
Lander argues that it is possible to finish a shotgun sequence in organisms with few repeats, like bacteria or even
WGS smashes a genome into millions of bits, sequences the bits, and localizes each one to a specific spot in the genome by matching genetic markers in the bit to the same markers on chromosomes. The clone-by-clone method breaks a genome into largish chunks, clones the chunks into bacterial artificial chromosomes (BACs), breaks the BAC DNA into smaller pieces, sequences those pieces, matches their end sequences via computer programs, and then localizes them in the genome with markers. It takes longer than WGS and requires sequencing thousands of BACs, each many times over, to map a genome, but it has been regarded as more accurate. WGS requires millions of sequence reads, too, but is believed to be less expensive.
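The step of matching end sequences by computer can be illustrated with a toy example. The sketch below is not the software any sequencing center used. It is a minimal greedy overlap merge, with hypothetical function names (`overlap`, `greedy_assemble`) and an invented 20-base sequence, and it assumes error-free reads that all come from the same strand.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no such overlap of at least `min_len` bases exists.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of fragments with the longest end overlap.

    Toy illustration only: assumes error-free, same-strand reads.
    """
    # Drop exact duplicates (keeping order), then reads contained in another read.
    contigs = list(dict.fromkeys(reads))
    contigs = [r for r in contigs
               if not any(r != s and r in s for s in contigs)]

    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            # No remaining pair overlaps by at least min_len bases.
            return contigs
        # Join the two fragments, writing the shared bases only once.
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)


# Invented example: three overlapping reads from a made-up 20-base sequence.
reads = ["ATGCGTAC", "TACGTTAG", "TAGCCTAGGA"]
print(greedy_assemble(reads))
```

With these three reads the merge recovers a single 20-base sequence. A greedy merge of this kind has no way to tell apart two copies of a sequence that is longer than a read, which is one way to see why repeat-rich genomes are the hard case for shotgun data.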
After all the high-profile discord, researchers have come quietly to a consensus on sequencing: they want the best of both approaches. Today's genome projects tend to combine the two into hybrid strategies that are shaped by the complexity of the genome under study and the way researchers are likely to use the sequence information.
"We think shotgun sequencing is enough, certainly for organisms whose sequence will be used primarily for comparative genomics studies," Susan Celniker told
"I believe that shotgun sequencing is great, but it doesn't give an accurate account of the 40% or so of the genome of mammals that is repetitive. So, there is definitely a need for the sequencing of BAC clones to finish the job," said Haig Kazazian, who chairs the genetics department at the University of Pennsylvania and studies retrotransposons. "Speciation, aging and other key processes may be affected by repeats," he told
Celniker and colleagues reported that, for some repeats, neither WGS nor clone-by-clone works particularly well. Large tandem repeats such as the histone cluster in
Financial calculations continue to drive individual decisions about which methods to use for which projects. "Everyone in genomics agrees that more sequence and better sequence is better," Blaxter said. But only a limited number of bases can be sequenced in a year. The result, he said, is a continual tug-of-war between those who would like to see exhaustive completion of one genome and those who would rather have draft sequences of several.