David Nance ARS Image Gallery

More than half of the world's population depends on rice as a principal source of calories and nutrition. And from a scientific perspective, the genome of this prolific grain offers clues for others, including corn, wheat, and barley. While its cereal cousins dwarf rice's 400-Mb genome, nearly all of the proteins found in these other staples have homologs in rice. As such, rice serves as a model for all cereal agriculture. Unraveling its code may enable scientists to address world hunger in new ways.

In this issue's Hot Papers, Huanming Yang and others at the Beijing Genomics Institute/Center of Genomics and Bioinformatics produced a draft sequence for the genome of Oryza sativa, subspecies indica, the most widely cultivated subspecies in China and Asia Pacific regions.1 In the same issue of Science, Stephen Goff of Syngenta (a spinoff company of Novartis)...

GOING WITH

"I still remember that for the first week, we had more than 10,000 hits to our database, let alone GenBank with the same sequence data," recalls Yang, head of the rice genome project at the Beijing Genomics Institute. "We also received numerous messages expressing excitement over the release of the data." Since then, the Beijing database has had more than 300 million hits, says Yang.

The rice genome, to Yang's surprise, comprised large, duplicated fragments, which he and his colleagues say contribute to protein diversity. Using whole-genome shotgun sequencing, they showed the 466-Mb indica genome contained roughly 46,000 to 56,000 genes. While half of the predicted rice genes were near replicas of those found in Arabidopsis thaliana, half had no homologs in the model plant's genome. Analysis of the japonica data revealed a 420-Mb genome containing 32,000 to 50,000 genes.

...AND AGAINST THE GRAIN

Before its release, Syngenta made a controversial agreement with Science magazine governing access to data. "Syngenta was willing to make the data wholly available to academic researchers, but we had both public and commercial concerns about placing the data into an international database," writes Syngenta spokesman Chris Novak in an E-mail. "As a result, we agreed with Science to make segments of the draft sequence available through our Web site and the entire sequence available via CD-ROM." Furthermore, he adds, Syngenta pledged to allow its data to be inserted into GenBank as part of any finished sequence submitted by the publicly funded International Rice Genome Sequencing Project (IRGSP), the multinational effort consisting of the United States, Japan, and several other countries aiming to produce a high-quality, completed sequence of the japonica genome. Since then, Novak writes, Syngenta has registered more than 1,400 active users, each of whom signed an agreement waiving commercialization rights in order to access its database.

"The Syngenta sequence was never made completely public, but they did make it available to international sequencers to speed up the... effort," says Susan McCouch, a rice molecular geneticist at Cornell University in Ithaca, NY. By the end of 2002, the IRGSP released its completed draft sequence, which included Syngenta data. "Syngenta has been very present in moving the public project along," says McCouch.

"It would be hard to identify what percentage was ours and what percentage was theirs," says Rod Wing, director of the Arizona Genomics Institute at the University of Arizona in Tucson and leader of the international effort's US consortium.

ADDING DATA AND DILEMMA

RICE RELATIONSHIPS

© 2002 AAAS

Functional classification of indica rice genes assigned by homology to A. thaliana genes, according to the Gene Ontology Consortium. Only 36.3% of the 25,426 predicted A. thaliana genes are classified, along with 20.4% of the 53,398 predicted rice genes.

Nonetheless, the publication deal between Syngenta and Science sparked controversy among scientists,3 reminiscent of Celera's publication of the human genome. "The Syngenta paper gave an overview of the genome data, but didn't provide [freely accessible] data to back it up," says Wing. "A lot of people were pretty upset about that."

"If we have to scan bits at a time, it's not as useful," says Venkatesan Sundaresan, chair of the Plant Biology Section in the Department of Agronomy, University of California, Davis. Shoshi Kikuchi, head of the Laboratory of Gene Expression at the National Institute of Agrobiological Sciences in Tsukuba, Japan, says the data release policy was frustrating: "The limited access to the Syngenta data from our institute blocked the complete mapping of our full-length cDNA clones to the genome sequence at that time." Kikuchi's team eventually published their mapping and annotation of 28,000 cDNAs from japonica rice last year.4

"Partial access is better than no access, but people felt it was a sellout by Science," says Sundaresan. Donald Kennedy, the journal's editor, had argued that the compromise was necessary in order to gain any access to genome data on the world's most important crop.3 But the two draft sequences posed another problem: they threatened the funding of the global rice genome project. "They took away the thunder from the public effort," says Jan Leach, a rice geneticist in the Department of Plant Pathology at Kansas State University in Manhattan, Kan. "The people thought, 'The jobs are done. So why should we put more money into it?'" Wing says that educating funding agencies was a challenge. "We had to convince the community that it was not done." Yang says his team also felt the crunch.

"The public effort, which has been going through the sequence very carefully, has sort of been hit by these publications before them," says McCouch. Since then, some countries have dropped their funding, with the United States and Japan picking up much of the financial slack.

THE NEXT HARVEST

In spite of this setback, two new polished sequences, one each for the indica and japonica genomes, are slated for publication in the immediate future. "By merging all of the available data together [for the indica genome], we have put the sequences into no more than 150 super-contigs," says Yang. "We are just now negotiating with the influential journals about the publication of our manuscripts."

According to Wing, the IRGSP's target is to finish the japonica rice genome sequence by year's end, with no more than 20 to 30 physical gaps in the genome. "The IRGSP effort is already 75% completed," Wing says.

Efforts are already underway to make sense of the existing data. Leach and her colleagues are investigating mechanisms of disease resistance and defense responses. Sundaresan's group is looking to define roles for genes of unknown function using a random transposon mutagenesis technique.

The Beijing team has begun identifying genes involved in specific metabolic pathways. They've constructed DNA chips, says Yang, containing all the known genes, predicted genes, and necessary controls on a single slide. He adds that the rice chips are available to collaborators worldwide.

In January, both Yang and Kikuchi's groups published descriptions of their databases. The Beijing Genomics Institute's Rice Information System5 (BGI-RIS) presents sequenced, annotated genomes for both indica and japonica for in-depth comparative studies. Kikuchi's group has established the Rice PIPELINE,6 pooling genome sequences, cDNAs, gene expression profiles, and other publicly available data.

The work is really only just beginning, adds McCouch. "It will be another 10 years before the genomes are completely annotated, and more than that before we understand how the genes are regulated and what they do." Achieving this will involve ongoing concerted international efforts, both private and public, involving multiple institutions. Says McCouch, "Sequencing genomes gives you a huge new perspective of the world, but it's not the end of the game. The fun is just beginning."

Nicole Johnston (nicolejohnston@yahoo.com) is a freelance writer in Hamilton, Ontario.

Data derived from the Science Watch/Hot Papers database and the Web of Science (Thomson ISI) show that Hot Papers are cited 50 to 100 times more often than the average paper of the same type and age.

"A draft sequence of the rice genome (Oryza sativa L. ssp. indica)," Yu J, Science Vol 296, 79-92 April 5, 2002. (Cited in 297 papers)

"A draft sequence of the rice genome (Oryza sativa L. ssp. japonica)," Goff SA, Science Vol 296, 92-100 April 5, 2002. (Cited in 293 papers)
