Two years ago the Human Genome Project published its final draft – a protein parts list, if you will, for human cells. Noticeably missing, though, were the instructions needed to put those pieces together.
But not for long: Researchers around the world, in both academia and pharmaceutical companies, are working to compile a first draft of those plans. The result, called an "interactome" – a complete set of cellular protein-protein interactions – could arrive by year's end. Or at least, a small fraction of it.
Marc Vidal, the Dana-Farber Cancer Institute researcher credited with assembling the
The human interactome "is the next big challenge," says Edward Marcotte of the University of Texas at Austin. "Now that we have the human genome, it's natural to try to understand how the parts work together."
But will the human interactome provide that information? Optimists argue that a complete map of protein-protein interactions gives scientists a picture of what's actually happening in a cell, exposes patterns they otherwise wouldn't see, and generates hypotheses that can be tested in more focused experiments. Pessimists counter that, due to the mind-bogglingly complex nature of the human cell, any interactome will be riddled with errors, and practically pointless, since most scientists don't have the tools to make use of the entire network anyway. And what good is a list of protein interactions without knowing what those interactions mean?
Still, journals have already published protein maps for the budding yeast, fruit fly, and nematode worm generated with high-throughput tools, and many say a human version is the next logical step. The number of interactions added to the Human Protein Reference Database (HPRD) – which has already catalogued nearly 30,000 protein-protein interactions published in peer-reviewed papers – has "really accelerated" in the last year and a half, according to Akhilesh Pandey of Johns Hopkins University, who directs the database. But when the dust from the big unveiling of the human interactome settles, will scientists be any smarter?
"The day [the human interactome] is going to be declared done by any one group, it's going to be a big story," says Pandey, who also founded and directs the Institute of Bioinformatics, in Bangalore, India, where HPRD is annotated. "But the next day, we won't be much better off."
THE BIG PICTURE
Either way, that day is fast approaching. A handful of scientists are "itching" to do the human interactome, simply because it's "doable," says Pandey. Vidal says he expects to publish a proteome-scale human interactome in the near future. Joel Bader, who leads another effort at Johns Hopkins University in Baltimore, Md., says his team has already finished gathering data for its map and is now highlighting regions that suggest new information on human biology and disease before submitting it for publication. "We're trying to get it out as soon as possible," he says. And Erich Wanker of the Max Delbrück Center for Molecular Medicine in Berlin is also in the race.
The resulting maps, says Vidal, could reveal important patterns or "organizational principles" in protein connectivity that scientists otherwise wouldn't see. They also can be used to scaffold other genome-scale information like gene expression, localization, and phenotype data, which can give the maps added richness and detail.
Because proteins tend to act in complexes rather than singularly, interactome maps can yield important functional clues for novel proteins, and reveal surprising details about known proteins, says Peter Uetz, an assistant professor at the Research Center Karlsruhe, Germany, who in 2000 helped map the yeast interactome.
Interactome maps pull data from the literature, yeast two-hybrid experiments, protein microarrays, and coimmunoprecipitation-mass spectrometry assays, to produce a list of interacting protein pairs that can then be converted into a map. This map can be enhanced using statistics, gene expression, colocalization, and phenotype data, and other metrics, to rank interactions into those that are more or less likely.
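As a rough illustration of that pipeline, the Python sketch below merges interacting pairs from several evidence sources into one map and ranks each pair by how many independent sources support it. The protein names, the source labels, and the "count the sources" scoring rule are all assumptions made for this example; none of the groups quoted here describe their scoring this way.

```python
from collections import defaultdict

# Hypothetical observations: (protein_a, protein_b, evidence_source).
# Names and sources are placeholders, not real data.
observations = [
    ("P1", "P2", "literature"),
    ("P2", "P1", "yeast_two_hybrid"),
    ("P2", "P3", "yeast_two_hybrid"),
    ("P1", "P2", "coip_mass_spec"),
    ("P3", "P4", "protein_microarray"),
]


def build_map(observations):
    """Return {pair: set of evidence sources}, treating A-B and B-A as one pair."""
    evidence = defaultdict(set)
    for a, b, source in observations:
        pair = tuple(sorted((a, b)))
        evidence[pair].add(source)
    return dict(evidence)


def rank_pairs(evidence):
    """Order pairs from most to least supported; ties are broken alphabetically."""
    return sorted(evidence, key=lambda pair: (-len(evidence[pair]), pair))


interaction_map = build_map(observations)
ranking = rank_pairs(interaction_map)
for pair in ranking:
    print(pair, sorted(interaction_map[pair]))
```

A real map would weight sources differently and fold in expression, colocalization, and phenotype data, but the basic shape is the same: a list of pairs becomes a graph, and each edge carries a confidence score.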
Uetz and his coworkers have used whole interactome maps from model organisms to predict human interactions. Now, he says, researchers could use human interactome data to learn more about other species, and vice versa. "As soon as you have a whole interactome, you immediately see connections you didn't see before," Uetz says. One recent, small-scale
The interactome could also teach biologists much about how networks have evolved over time, says Bader. Scientists are just beginning to look at model organism interactomes from this perspective, he says, studying how connections between existing network components are rewired and how new components are recruited to the network. "We're starting to learn the design patterns of biological networks, how complex systems are built from simpler, modular components," Bader says.
Yet some doubt the utility of a genome-wide human interactome. The human body contains an ever-changing protein kaleidoscope that varies over time and from cell to cell. Humans have some 25,000 to 30,000 protein-coding genes, many of which can be alternatively spliced. Each tissue and cell type expresses its own unique constellation of those genes. There may therefore be an entirely separate interactome for each cell type at each moment in time, making it practically impossible to catalog all of those interactions, experts say.
Marcotte estimates humans are home to more than 200,000 protein-protein interactions, not counting the new proteins, and new interactions, created from alternative splicing and post-translational modification. To compound the problem, the quality of gene calls is worse in humans, he says. Scientists still haven't agreed on the number of genes, and in many cases, gene boundaries are "messy," he says.
For all these reasons, scientists who say they are working on an entire human interactome are knowingly being "naïve," says Giulio Superti-Furga of the Center for Molecular Medicine of the Austrian Academy of Sciences and a founder of Cellzome, who himself is investigating human protein interactions involved in individual pathways.
Still, some argue that the prospect of errors shouldn't discredit the entire endeavor. Vidal takes issue with those who raise doubts about the benefits of the human interactome because it will likely contain mistakes. Errors are inevitable in such large projects, he says, noting that the human genome is riddled with them, yet it's the interactome that gets bad publicity. A given cell in the human body may be home to 20,000 interactions at one time, he says; it's hard to catalog that much activity without making a mistake. "If you draw a tree, you don't include every leaf," he says. "And if you have a problem with individual leaves, don't say it isn't a tree."
But the concerns don't just center on the magnitude of the human genome and the quality of its annotation. Many scientists have raised concerns about one of the high-throughput tools researchers are using to assemble the interactome, the yeast two-hybrid (Y2H) system. Y2H tests for interactions by introducing artificial fusion constructs representing the two putative binding partners – often, mere fragments of full-length proteins – into cells, forcing them into the nucleus, overexpressing them, and then monitoring the readout of a reporter gene.
The Y2H system is a valuable "starting point," Superti-Furga says, because it is easily automated, can move quickly through the map, and captures transient interactions, which can get lost when scientists extract and purify interacting proteins. But the fact that Y2H exposes transient interactions also means it's prone to false positives, he says.
It is also prone to false negatives, he says, since the technique cannot show interactions that happen at the cell membrane, the scene of much interesting biology. For that reason, Pandey says less than 5% of interactions in the HPRD come from Y2H, and over 90% from in vivo experiments – the "gold standard."
Fans of Y2H are hard at work to find ways around these hurdles. One approach is simply to repeat your experiments, says Rémi Delansorne, vice president for research and development at Hybrigenics in Paris, a company that uses Y2H technologies to catalog human protein interactions. Hybrigenics staffers often screen several presentations of the same protein, to make sure they've found the most specific part of the interactions. But Delansorne says the company typically runs these additional – and costly – experiments only for those proteins that are particularly "interesting."
Bader says he weeds out false interactions by looking at the local topology around an interacting pair of proteins – counting how many interaction partners each protein has and how many partners two proteins share, and then generating related measures of connectivity. Another trick, he says, is to review how often a pair appears to interact – the more frequent the pairing, the more confident scientists can be in the relationship.
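The kinds of measures Bader describes can be sketched in a few lines of Python. This is a minimal illustration, not his group's actual method: the screen results are invented, and the specific features (partner counts, shared partners, a Jaccard-style overlap, and an observation count) are assumptions chosen to mirror the description above.

```python
from collections import Counter, defaultdict

# Hypothetical repeated screen results; each tuple is one observed pairing.
hits = [
    ("A", "B"), ("A", "B"), ("B", "A"),
    ("A", "C"), ("B", "C"),
    ("A", "D"),
    ("E", "F"),
]


def normalize(a, b):
    """Treat A-B and B-A as the same undirected pair."""
    return tuple(sorted((a, b)))


# How often each pair was observed across all screens.
counts = Counter(normalize(a, b) for a, b in hits)

# Partner sets: the local topology around each protein.
partners = defaultdict(set)
for a, b in counts:
    partners[a].add(b)
    partners[b].add(a)


def pair_features(a, b):
    """Summarize the neighborhood of a candidate interaction."""
    shared = partners[a] & partners[b]
    union = (partners[a] | partners[b]) - {a, b}
    return {
        "degree_a": len(partners[a]),
        "degree_b": len(partners[b]),
        "shared_partners": len(shared),
        "jaccard": len(shared) / len(union) if union else 0.0,
        "times_observed": counts[normalize(a, b)],
    }


print(pair_features("A", "B"))
print(pair_features("E", "F"))
```

In this toy data the A-B pair is seen three times and shares a partner, while E-F is seen once and sits alone in the network, so a filter built on these features would trust A-B more. Where to set the cutoffs is a separate question that this sketch does not address.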
Still other scientists compare interactions with orthologous interactions and networks in other species, which can lend credence to human data. Ashwini Patil and Haruki Nakamura of Osaka University recently estimated that when researchers combine high-throughput interaction data from yeast with additional characteristics of interacting proteins, such as sequence homology, they are more likely to pinpoint true interactions.2 The paper predicts that nearly 70% of human interactions inferred from yeast are correct.
But to Dana-Farber's Vidal, false positives can be a "philosophical" problem, not just a technical one. If an interaction repeatedly appears during in vitro experiments, but never in vivo, is that a true mistake? The proteins still like each other, but perhaps never meet in the cell, he suggests. For that reason, Vidal says he defines the human interactome as all protein-protein interactions that are possible, not necessarily the interactions that happen.
PATHWAY TO SUCCESS
However flawed interactome maps prove to be, they will undoubtedly provide starting points for further study. But some researchers are narrowing their focus and concentrating their research efforts on small sections of the human interactome.
Pandey favors focusing research efforts on particular pathways within the human interactome, investigating what drives those pathways and the consequences of disturbing them – information that could have pharmaceutical implications. "Many of the breakthroughs are still going to come from the process level," he predicts. He argues that most researchers likely won't take advantage of the large datasets that would emerge from a genome-scale human interactome, nor will they have the tools to manipulate and visualize the data in many dimensions.
A. Donny Strosberg of Scripps-Florida in Jupiter, former CEO of Hybrigenics, agrees that focusing on particular pathways is the most practical way to study the human interactome. But he adds researchers should keep a very "open mind." Interactions occurring in one cell at one time may not occur in others, or under other circumstances. Thus, Strosberg says it's important to probe interactions at different times in the cell's life, or in different types of cells. This, he says, will help scientists learn whether a protein affects several cell types simultaneously, which may help explain why some drugs cause particular side effects, for instance.
The potential pharmaceutical benefits of this research have not escaped the attention of drug developers. Superti-Furga says he's seen an "almost exponential" growth in pharmaceutical interest in protein complexes in recent years, focusing on what happens at the process (pathway) level. Germany's Cellzome is collaborating with Novartis to research different pathway interactomes, according to Tewis Bouwmeester, Cellzome's vice president of biology and research alliances. And Bader, who works with New Haven, Conn.-based CuraGen, says his lab's interactome efforts focus not on producing a genome-scale map but rather on disease-related proteins. Cellzome and Hybrigenics each published pathway-level interactomes this past year, for tumor necrosis factor-alpha and transforming growth factor-beta signaling, respectively.3,4
Of course, studying pathways in isolation isn't ideal: "At the end of the day, the network is all connected," says Vidal, making it difficult to isolate one pathway from another. Nevertheless, as scientists continue to debate how best to approach the human interactome, some are already thinking about the next step: how best to use the information it provides. Superti-Furga predicts the interactome will inspire a wave of activity to design well-crafted models that help researchers predict how protein pathways behave. If a model can forecast how a system will react to three times more of a particular signaling molecule, for example, scientists will know whether it's wise to tinker with that particular region of the human interactome, before they even try. "Right now," he says, "we are like children playing with different parts."