© CHRISTIAN DARKIN

Five years after publication of two drafts
of the human genome, Maynard Olson
of the University of Washington finds
himself longing for another "lurch." To
be sure, genomic scientists across the
world have chalked up many achievements since
2001, but, like many of his colleagues, Olson is feeling
more impatient than celebratory.
Progress has included a blizzard of comparisons
between the human sequence and many others,
including the chicken, the mouse, the rat, the dog, and
the chimp. The flourishing of comparative genomics,
says Olson, has changed the focus of genomics from
a single reference sequence of genes to a rich variety
of "functional elements," largely sequences that serve
as ignition switches, brakes and accelerators for gene expression. And the focus on single-base
changes has widened to an array of evolutionary
rearrangements: insertions, deletions,
reversals, and duplications. There
are new tools: new global databases of
all functional elements in genomes (e.g.,
ENCODE), small molecules for chemical
genomics (e.g., PubChem), and a raft of
protein structures.
And yet the last five years, in Olson's
view, have been "a period of a great grinding
of gears, kind of shifting of gears." In
the terms of the science historian Thomas
Kuhn, it's been "a period of consolidation
and more normal science." Others, such as
Sydney Brenner of the Salk Institute, the Nobel Prize-winning pioneer of research on the worm
Caenorhabditis elegans, go further, worrying
that the genome sequence and the
growing lists of sequences and proteins
and protein interactions and functional
elements don't get very deep into such core
problems of biology as the operations of
the cell, of development from egg to adult,
or the problem of consciousness. "We've
become very geno-centric," says Brenner.
"The cell must become the focus."
What vexes many thousands of colleagues
around the world most is that
genomics hasn't yet moved into the "real
world" of medical relevance. Olson led a
team that sequenced the principal microbe involved in lung infections in cystic fibrosis
patients, Pseudomonas aeruginosa. Referring
to changes in cells of both the patient
and the infecting organisms, says Olson,
"it's clear that mutational cascades are a
really critical aspect of disease progression,
just as is the case with cancer." To build a
genomics "bridge" into this area is going to
call for a "very large" amount of sequencing
of both patients and microbes to follow
the progression of the disease. For this, the
faster second-generation sequencing technologies
emerging from several startup
companies will be essential, Olson thinks,
just as they will be for the new National Institutes
of Health Cancer Genome Project, on which pilot work has begun. "They're
over-promising and are trying to move too
quickly, without a strong enough strategic
plan," Olson argues. "Nonetheless the scientific idea is right. These policy things
usually eventually fall into place as reality
exerts itself."
Such issues as cheaper, faster DNA
sequencing to get genomic tools into
the clinic sooner will define the field for
the next five years, and beyond. Echoing
Olson, Cold Spring Harbor Laboratory's
Lincoln Stein says the last five years have
fit Kuhn's definition of "normal science,"
although "the number of questions never
decreases."
COMPARATIVE SUCCESSES
The annotation work of the last five years,
says David R. Bentley, the former director
of human genetics at the Wellcome Trust
Sanger Institute in Hinxton, UK, has been
"changed completely" by the discovery
of microRNAs, which include the small
interfering RNAs, siRNAs. "All you had
five years ago was strings of bases.... Now
we have this beautiful... emerging colorful
picture of what each of these bases does. I
think we may build a whole new picture of
how the genome works
and how it specifies
the cell," says
Bentley, now chief scientist of Solexa, one of several startup
companies that are speeding DNA sequencing
and driving down its costs beyond those
possible with the classic Sanger method.
To those who shrug at "just another
sequence," says Phillip Sharp of MIT, a Nobel
Prize-winning leader in RNA
research for three decades,
"the widening array of
genomes means everything."
MicroRNAs were
found in C. elegans by Andrew Fire and Craig Mello in 1998. Comparing human,
worm, fruit fly, and plant genome sequences allowed microRNA
research to go fast in the past few years. Soon after microRNAs
were found in humans, researchers were calculating the number
of microRNA genes and focusing on their targets; they went on
in 2005 to use them as a signature of cancer in cells and a
potential tool for reducing the expression of genes for synthesizing
cholesterol.
"We have gone from a situation where we didn't even know
about this regulatory network in three years to being able to identify
gene systems that are being regulated," says Sharp, who contends
that the pace of microRNA discoveries is ten times faster
than the work on RNA splicing that he and others began in the
late 1970s. The speedup, in Sharp's opinion, is entirely due to
having the human sequence - and is a major example of the influence
that genomics is having on other fields. "We made progress
on the biochemistry of that process, but we made very little
progress on the big picture of how it's regulated and changes in
normal versus disease states." So now, the comparison of genome
sequences will also be harnessed to get at the specificity and regulation
of RNA splicing.
The growing menagerie of "model organisms" and what
comparative analysis of them can achieve also impresses Robert
Waterston, chairman of the genomic sciences department at the
University of Washington. With the ability to knock down every
gene in an organism like yeast, he says, a "true molecular description
of yeast... is on the table." Since yeast is a primary organism
for comparisons with human sequences to discover genes and
their controls, he adds, "I can't imagine it wouldn't have a profound
impact on how you view humans."
NEXT FIVE YEARS: SPEED UP, COSTS DOWN
Sensing "some motion" recently, Olson hopes for the success of
the new, faster sequencing techniques that are coming over the
horizon from startups like 454 Life Sciences, Solexa, Agencourt
Bioscience, and Helicos Biosciences. They claim processes 100
times faster than the "classical" machines of the 1980s and 1990s,
which now produce "reads," or sample lengths, of 800 bases,
compared to some 100 for the new processes. The workhorses
of the 2001 human drafts have kept doubling their throughput
about every 22 months over 15 years. In September,
454 reported that, in a single run, its system
did a shotgun sequence and assembly of the
microbe, Mycoplasma genitalium, in four
hours. Claire Fraser's team at the Institute
for Genomic Research took three
months to work out Mycoplasma's
sequence in 1995.
Solexa, Bentley said, has a "marketing timeline" that calls
for some of the instruments it's developing to be in the hands of
"early access customers" toward the end of the second quarter of
2006, with commercialization scheduled by the end of the year.
For the rival second-generation sequencing machines, Bentley
sees "overlapping markets, although exactly where the overlaps
are is not clear at the moment." While the new technologies have
worked with small genomes, there are challenges of accuracy,
longer "reads," and cost for larger genomes, he says.
As he has for decades, Leroy Hood, director of Seattle's Institute
for Systems Biology, focuses on the "toolbox" for a genomics
that will enable a truly personalized, predictive, and preventive
medicine. "All the big revolutions [in science] are technique-driven,"
he says. Hence, the drive for machines to sequence a
human genome for $100,000 and then $1,000, compared to
today's $3 million price, should succeed over the next 10 to 15
years. The 454 machine, he says, can handle several hundred
thousand samples at once. As to Helicos, to which he is an advisor,
he sees an "enormous advantage" to the single-strand technique
of Stephen Quake of Stanford, which Helicos uses. The result of
faster sequencing will be "an explosion of biology," with demand
for full sequences of hundreds of millions of people in Europe and
America - so, the market will be there, says the ebullient Hood,
contradicting Olson.
Francis Collins (see Delivering on the Dream), director of the National Human
Genome Research Institute, whose overall budget in 2005 was
$500 million, is betting $30 million a year for five years on
second-generation sequencing. Although he says, "Technology
development is a risky experience," he adds, "We are on the cusp
of a real paradigm shift."
And the Broad Institute in Cambridge, Massachusetts, is
testing a 454 machine. However, "we're going to be testing all
the others, too... I'm in favor of all clever ideas," says Broad director
Eric Lander. Nonetheless, "it takes quite a long time before
clarity emerges around a new technology platform. We are going
to have to see how they perform in many, many respects." Compared
to the 1990s decisions to use sequencing technology from
Applied Biosystems and its rivals, "We might have a more textured,
layered solution."
GETTING TO THE CLINIC
Such "second generation" sequencing could make
possible one of Olson's other goals: being able to
sequence, in, say, 100 patients and 100 controls, the
HLA region of chromosome 6. The HLA region
contains genes associated with autoimmune diseases
like type 1 diabetes, multiple sclerosis, rheumatoid
arthritis, and lupus. The "precise
molecular explanation" for these diseases
"has remained elusive for decades,"
Olson notes. Presently, he says, such "large-scale targeted resequencing of
genomes," going after small nucleotide changes along "substantial chunks of real estate," is beyond the
capacities of today's principal sequencing robots.
Sharp says that's true for other fields as well. "They can't afford
to do the cancer genome without dropping the cost of sequencing
by over ten-fold," he says. In general, he is optimistic on second-generation
sequencing, saying he "would bet on it without a question
that we will be at a $1,000 genome in a five-year window."
At that point, "It will be feasible to sequence everybody at a cost
that will be insignificant compared to the medical costs," opening
up the way to wide clinical application in diagnosing disease and
picking particular therapies.
One of the most dramatic efforts to push genomics into the
realm of complex, multi-genic diseases is the five-year, $138
million haplotype map (HapMap) project, involving samples
donated by Japanese, Han Chinese, Yoruba, and Americans of
European descent. The project takes advantage of the fact that the
millions of single nucleotide polymorphisms (SNPs) found in at
least one percent of humans tend to pass between generations in
blocks of DNA called haplotypes. The project announced its Phase
1 analysis in October 2005, and said that the analysis of Phase
2, already completed, would be published in 2006. Despite successes,
such as using HapMap data to pinpoint a gene for macular
degeneration, there remains controversy over HapMap's reach into
domains such as rearrangements like deletions and reversals, or
the numerous rare mutations that may be involved in diseases.
The minor variations are of central interest to Bentley of
Solexa, who has specialized in rare variations. The HapMap,
he says, has limitations, capturing only common variations in
three target populations, missing the rare mutations. But it may
provide a quick way to find more disease genes. Still, in three to
five years, he says, the new sequencing machines should open
the option of going after virtually all the many genes involved
in a disease like diabetes. To be sure, the multiple sequences of
patients and "controls" will have to square with what HapMap
has found. "Everything that a HapMap captures should also be
captured by a technology that aims to do better." Bentley, an early
proponent, calls the HapMap "a real benchmark."
Collins, who has directed NHGRI since 1993, is cheerful in
the face of the doubts. He is certain that HapMap will become
"the most powerful tool" to date "for unraveling the genetics of
common diseases." He adds, "I think you can expect... in the next,
let's say, two to three years, that the major genetic contributions to
genetic diseases, perhaps as many as a dozen... will be identified.
And that's going to be incredibly exciting."
The long road of such incremental steps to clinical relevance
creates impatience in genomics. Waterston says, "Yeah,
there's frustration. I suspect among some quarters that read
the hype and didn't know enough of the science, they would
be frustrated. Anybody who knew the science knew that it was
going to be a long time coming. These are hard problems. I
would say that the progress that has been made on them is
pretty substantial. But that's because I come in with a deep
understanding of how hard it is."