One magazine likened the study of carbohydrates, called glycobiology, to Cinderella—neglected stepsister to her two more glamorous siblings, DNA and protein.1 Momentum is building, however, to do for carbohydrates what scientists have done for genomes, and are attempting to do for proteomes: to characterize the entire complement of these sugar chains in a cell, called the "glycome." Researchers are guardedly optimistic. According to Ajit Varki, professor of medicine and cell and molecular biology, and director of the Glycobiology Research and Training Center at the University of California, San Diego, "we don't know what's going to happen to Cinderella at midnight."
Scientists acknowledge that sequencing the genome was nothing compared to solving the proteome. Likewise, the glycome will make the proteome seem like child's play. "The problem is that [the glycome is] probably thousands of times as complicated as the genome, in magnitude of complexity and level of diversity," says Varki. First of all, unlike the genetic code, there is no rigid template that accurately specifies glycosylation patterns, but rather a complex assembly-line system involving competition by hundreds of gene products. In addition, each cell, tissue, organ, and organism exhibits different glycosylation patterns, which can change based on the cell's state or activity. Further, proteins often have numerous glycosylation sites, each of which may have a different carbohydrate group attached, and these sugar chains can themselves be modified.
The sugar chains' structures are also problematic; some are linear, but many are branched. And unlike with DNA and protein, the inter-monomeric linkages are not constant. Thus, multiple iterative methods are usually needed to fully define a given glycan structure. These various factors offer "totally new challenges that most biopolymer analytical people have yet to fret with," says Vernon Reinhold, a chemistry professor at the University of New Hampshire who coined the word glycome.
Richard Cummings, professor of biochemistry and molecular biology at the University of Oklahoma Health Sciences Center and director of its Oklahoma Center for Medical Glycobiology, points out several additional complications. Unlike DNA, carbohydrates cannot be amplified. If a researcher wants to obtain more of an interesting sugar, the options are either to synthesize that structure or to isolate more of the material from tissue. But most of the interesting carbohydrates identified thus far are minor species. "It's like looking at a mountain range like the Himalayas. It's easy to see the peaks, but the problems are down in the valleys," he says. And it won't be easy to determine when the entire glycome is solved, in contrast to the human genome project, in which "you would know at some point [that] you've covered it from A to Z," observes Cummings.
CD22, which is part of the B-cell receptor (BCR) complex, is another critical carbohydrate-binding protein (CBP). CD22 affects receptor signaling pathways rather than leukocyte trafficking. This protein negatively regulates the BCR, preventing accidental activation of the cell in response to "self" antigens and inhibiting autoimmune disease. Mice that lack the sialyltransferase required to produce CD22's ligand are profoundly immunosuppressed, suggesting that in normal cells, the CD22 ligand removes the receptor from the BCR complex, enabling the BCR to activate the cell.
Several human diseases result from faulty carbohydrate metabolism. Tay-Sachs disease, Sandhoff disease, and juvenile GM2 gangliosidosis are related carbohydrate degradation disorders that result from hexosaminidase deficiency. Congenital Disorders of Glycosylation (CDG) and Leukocyte Adhesion Deficiency (LAD) are related to faulty carbohydrate synthesis. They appear to be extremely rare: the most common form of CDG, which results from a defect in the phosphomannomutase (PMM2) gene, affects perhaps 200 individuals worldwide, according to Donna Krasnewich, acting clinical director of the National Institutes of Health's National Human Genome Research Institute and a member of the medical advisory board for the CDG Family Network Foundation. LAD, an immunodeficiency that results from a defect in the way leukocytes interact with endothelial walls, is even less common, with only 20 to 30 observed cases around the world.
Krasnewich suggests two reasons glycobiology-related genetic diseases appear to be so rare. First, scientists and clinicians simply don't know how to look for them. But because glycosylation plays such a critical role in cell-cell interactions and in the immune system, it's likely that at least some of the many genetic diseases with unknown etiologies are caused by glycosylation-machinery deficiencies. Scientists have probably "only uncovered the tip of the iceberg," she says.
Another explanation for the paucity of known glycosylation-related disorders could be that many result in embryonic lethality, but that's not likely to be a major factor. Jamey Marth, professor of cellular and molecular medicine and Howard Hughes Medical Institute investigator at UCSD, has engineered and analyzed almost two dozen mice deficient in carbohydrate formation; most are viable. Fewer than 25% of these mice, and of those made by other researchers, are affected by premature lethality or failure to reproduce, he says. The rest exhibit some type of physiologic abnormality, with a portion exhibiting symptoms of various human diseases.
What makes glycobiology so challenging, says Jim Paulson, professor of molecular biology and experimental medicine at The Scripps Research Institute, is that carbohydrate biosynthesis is so incredibly complex. There are nearly 80 steps in the biosynthetic pathway affected in CDG patients, according to Krasnewich, and disruption of any one of them can lead to disease. "The first question I always get," says Paulson, "is, what's different between a protein recognizing a carbohydrate and a protein recognizing another protein?" The answer, he says, is simple: To regulate the expression of a protein, the cell need only regulate a single gene. Carbohydrate homeostasis, on the other hand, requires considerably more cellular control.
Can We Do It?
On a technical level, no single technology today can do for sugars what has been done for the genome and proteome. There are no automated sequencers and synthesizers, and few core facilities. Currently, the best and most sensitive approach to determining glycan structure is mass spectrometry.2,3 Nuclear magnetic resonance is another option, albeit a more reagent-intensive one. Scientists have also made strides in carbohydrate synthesis. Although chemical synthesis of carbohydrates is difficult, researchers can now use purified enzymes in a step-wise protocol to produce sufficient quantities of product for their work.
New technologies will certainly help advance glycomics, but for the field to reach its potential, glycobiologists must also settle on a model system to study, explains Cummings. That's what jump-started the genomics era, he says: making the decision to sequence the human genome. "You have to define targets, and you just put all your energy into it." Glycomics, in contrast, suffers from a comparative lack of focus. "It's just kind of random work going on in different laboratories."
Cummings concedes that it may not be possible to analyze every sugar in an organ or tissue in the near future. Yet he tempers that, saying "Twenty years ago, if you had told me that it would be possible to sequence the human genome, ... I would have thought that was also impossible in my lifetime, and now it's clearly doable, because the technology has caught up."
He expresses hope that by setting the challenge of glycomics, technologists will start thinking about new strategies to tackle the inherent difficulties of carbohydrate research. Then the field can progress as molecular biology did years ago, with slow, steady progress, followed by a paradigm shift induced by a new technological approach. Ram Sasisekharan, associate professor of biological engineering at the Massachusetts Institute of Technology, who teaches a course on glycomics, agrees, likening glycobiology today to molecular biology at the dawn of the biotech revolution.
Enter the Consortium
"The key word in the consortium is 'functional,'" says Paulson, the consortium's principal investigator. "We are focused on the subset of carbohydrates whose functions are involved in cell communication." As a result, the consortium consciously excludes research on two major glycoconjugate classes, intracellular glycoproteins and proteoglycans. "We had to be careful," Paulson explains. "Our scope is already huge."
The consortium's seven scientific core facilities will develop and produce reagents and information to advance glycobiological research. The mouse genetics core, for instance, will create up to 10 knockout mice per year. A gene microarray core will develop biochips containing genes for glycosyltransferases, CBPs, nucleotide sugar synthesis, and some associated families like adhesion proteins and cytokines. Two of the core facilities will collaborate to produce a complementary "glycoarray," in which various carbohydrate structures will be patterned as a way to screen for binding proteins. One of these two facilities is a carbohydrate-synthesis core, which already produces product in 10- to 100-gram quantities, says Paulson.
An information dissemination core, located at MIT, will produce and maintain a series of databases to make all of the consortium's data searchable and publicly available. One of the databases will contain about 4,000 annotated carbohydrate structures, provided by Israeli company GlycoMinds Ltd., one of the consortium's sponsors.
Hold the Saccharine
For all the excitement, whether glycomics can live up to its promise is unclear. Significant technical challenges lie ahead. For his part, Paulson is confident that the consortium can make real progress in the next five years. But, "Will we finish everything there is to be known about glycobiology? No, I don't think so," he says. Cummings concurs: "We may not define the glycome, but we're going to know a lot about the functions of carbohydrates."
And that will surely yield clinical benefits. According to Krasnewich, researchers now understand that type 1b CDG is caused by a phosphomannose isomerase deficiency. Doctors can treat these patients simply by bypassing the affected step in carbohydrate biosynthesis. And how do they do that? With a pinch of sugar—in this case, mannose.
1. S. Hurtley et al., "Cinderella's coach is ready," Science, 291:2337, March 23, 2001.
2. A. Dell, H.R. Morris, "Glycoprotein structure determination by mass spectrometry," Science, 291:2351-6, March 23, 2001.
3. G. Venkataraman et al., "Sequencing complex polysaccharides," Science, 286:537-42, 1999.
Selected Glycomics Resources