For 50 years, biologists have concentrated on reducing life to its constituent parts, starting with the cell and then working their way down to the genome itself. Such achievements, however, created a new challenge: making sense of the huge amounts of data produced. As Professor Denis Noble of Oxford University puts it: "It took Humpty Dumpty apart but left the challenge of putting him back together again."
Systems biology attempts to reconstruct Humpty Dumpty as a series of overlapping mathematical models. It exploits all the theoretical and experimental advances of the various genome projects, allying them to computational, mathematical and engineering disciplines in an attempt to create predictive models of cells, organs, biochemical processes, and complete organisms. It has the potential to unravel how complex biochemical systems, from cells to organisms, really work, and to take a leap forward in preventing and treating disease. "Diseases ... are often a fault of a whole complex network rather than a single element," notes researcher Thomas Sauter, Stuttgart University, Germany. "Such diseases ... need the help of mathematical models and theoretical analysis."
Systems biology involves interaction between experiment and simulation, attempting to create ever more accurate models of processes, such as the functioning of an organ over a period of time. Initially, a rough working model is created and used to design experiments that will verify or refute the predictions of that model. The model is modified to incorporate the results, and new simulations are run that in turn require further experiments. In this way, both the model and experiments evolve together until a satisfactory simulation can be achieved. Systems biology has inspired interactive cooperation among various types of researchers to produce a comprehensive database and a project to predict the behavior of pheromone response in brewer's yeast.
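The cycle of simulation and experiment described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any research group's actual workflow: the first-order decay model, the `run_experiment` stand-in for laboratory work, the hidden `TRUE_RATE`, and the tolerance are all hypothetical.

```python
import math

TRUE_RATE = 0.7  # hypothetical "real biology", used only to fake lab data


def run_experiment(t: float) -> float:
    """Stand-in for a laboratory measurement at time t (hypothetical)."""
    return math.exp(-TRUE_RATE * t)


def simulate(rate: float, t: float) -> float:
    """Model prediction: first-order decay from an initial value of 1."""
    return math.exp(-rate * t)


def refit(data: list[tuple[float, float]]) -> float:
    """Least-squares estimate of the rate from log-transformed data.

    Since ln(y) = -rate * t, this is a line through the origin.
    """
    numerator = sum(-t * math.log(y) for t, y in data)
    denominator = sum(t * t for t, _ in data)
    return numerator / denominator


def refine(initial_rate: float, tolerance: float = 1e-3,
           max_rounds: int = 10) -> tuple[float, int]:
    """Alternate prediction and measurement until the two agree."""
    rate = initial_rate
    data: list[tuple[float, float]] = []
    for round_no in range(1, max_rounds + 1):
        t = float(round_no)             # design the next experiment
        predicted = simulate(rate, t)   # what the current model expects
        observed = run_experiment(t)    # the experiment confirms or refutes it
        data.append((t, observed))
        if abs(predicted - observed) < tolerance:
            return rate, round_no       # satisfactory simulation reached
        rate = refit(data)              # fold the new result into the model
    return rate, max_rounds


rate, rounds = refine(initial_rate=0.1)
print(f"estimated rate {rate:.3f} after {rounds} rounds")
```

Because the fake measurements here are noise-free, the loop settles almost immediately; with real data each round would narrow the estimate more gradually, and the choice of the next experiment would itself be guided by the model.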
FROM DARK TO LIGHT
The field is older than is sometimes thought; the term "systems biology" was coined 40 years ago.1 It was then, says systems biologist Olaf Wolkenhauer, Institute of Science and Technology at the University of Manchester, that the first attempts were made to create mathematical models of complete organs or processes, and to run them on computers to test hypotheses and make behavioral predictions on the basis of varying input data. "But the modeling effort then died as mathematical biology became detached from biologists doing experiments," says Wolkenhauer.
A dark age prevailed until the 1990s, when simultaneous advances in numerous core fields (see sidebar) made a renaissance possible, bringing practical and theoretical biologists back together, says Leroy Hood, president of the Seattle-based Institute for Systems Biology and a DNA sequencing pioneer. Of these areas, Hood cites the emergence of discovery science from the Human Genome Project (HGP) as perhaps the era's defining moment. The HGP "introduced the idea of discovery science where you can take a complex object and define all its elements, put this in a database and enrich the structure of biology," says Hood. "The genome sequence itself was a prototype example of discovery science."
There is more to systems biology, however, than discovery science, which is just a constituent part of the field. "The key difference is that discovery science is a description of a type of information, and systems biology is hypothesis-driven, integrated perturbation, and global capture of information," says Hood.
PERSUASIVE POWERS
The iterative experimental approach of systems biology is less random than earlier approaches, and more quantitative than qualitative, which is reminiscent of the physical sciences. It requires those biologists who grew up before the postgenome era to change their approach, says Wolkenhauer, himself a product of the new era. "It is a new way of thinking. I have found from my work with biologists that once we have done a simulation, we can persuade them to change their experiments and think of the system dependencies, and of the inputs and outputs."
Systems biology is a cross-disciplinary effort, however, that also requires practitioners in mathematics, computing, and engineering. In these fields, too, some shift of thinking or focus is needed. One such field is bioinformatics, which emerged to deal with the huge volumes of data generated by genomic, proteomic, and physiomic projects. This field borrowed from work done on commercial data warehouses and gave biologists tools for storing and manipulating large data sets. It led to so-called data mining techniques for finding subtle correlations and relationships that might not have been obvious, and that involved too many data values to be easily detected.
A change of thinking is needed in bioinformatics, because systems biology aims to go further than data mining to study complete functional relationships. Doing this requires regression analysis of models with input and output parameters, so that the effect of varying conditions on a system such as a cell or organ can be determined.
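As a rough illustration of regressing a model's output on its input, the Python sketch below sweeps one input across several conditions and fits a straight line to the response by ordinary least squares. Everything in it is assumed for the example: `model_output` is a hypothetical linear stand-in for a cell or organ model, and the gain, basal level, and input values are arbitrary. Real systems are usually nonlinear and call for more elaborate fitting.

```python
def model_output(input_level: float, gain: float = 2.0,
                 basal: float = 0.5) -> float:
    """Hypothetical steady-state response of a system to one input."""
    return basal + gain * input_level


def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Vary the condition (the input) and record the simulated output.
inputs = [0.0, 0.5, 1.0, 1.5, 2.0]
outputs = [model_output(x) for x in inputs]

# The fitted slope estimates how strongly the output depends on the input.
slope, intercept = fit_line(inputs, outputs)
print(f"sensitivity {slope:.2f}, baseline {intercept:.2f}")
```

The slope plays the role of a sensitivity: it says how much the output is expected to change per unit change in the condition being varied.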
HIERARCHY OF MODELS
However, a model of a whole organ or a metabolic signaling pathway that regulates processes such as cell division would be extremely complex and beyond the scope of current computation. Consequently, a hierarchy of models is needed, each capable of interacting efficiently with the others. But this creates problems with spatial and temporal scaling. In the Human Physiome Project, designed to model the whole human body, time scales range from the 70 years of a human lifetime to the one microsecond typical of Brownian motion, a difference of about 15 orders of magnitude, whereas distances span nine orders of magnitude, from the 1-meter dimension of the human body to the 1 nm pore size of an ion channel. "It requires a hierarchy of models to represent such a range, such that the parameters of one model can be understood in terms of the physics or chemistry of the level above or below," says Oxford University's Noble.
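The scale figures quoted above are easy to check with a short calculation. The Python sketch below uses only the numbers given in the text; the 365.25-day year is an assumption made for the conversion to seconds.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed year length, about 3.16e7 s


def orders_of_magnitude(largest: float, smallest: float) -> float:
    """Return log10 of the ratio between two positive quantities."""
    return math.log10(largest / smallest)


# Time: a 70-year lifetime versus one microsecond of Brownian motion.
lifetime_s = 70 * SECONDS_PER_YEAR
brownian_s = 1e-6
time_span = orders_of_magnitude(lifetime_s, brownian_s)

# Length: a 1 m body versus a 1 nm ion-channel pore.
body_m = 1.0
pore_m = 1e-9
length_span = orders_of_magnitude(body_m, pore_m)

print(f"time: {time_span:.1f} orders, length: {length_span:.1f} orders")
```

Seventy years is roughly 2.2 billion seconds, so the time ratio is about 2.2 x 10^15, a little over 15 orders of magnitude, while the length ratio is exactly 10^9.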
This raises the question of where in the hierarchy to begin simulating biological systems. Should it be from the bottom up, beginning with single genes and protein molecules, where unmanageable data volumes might be created, or from the top down, starting with large-scale physiological behavior, where there is a risk of serious error from failing to account accurately for gene and protein interactions? Noble advocates a compromise, working in both directions from the middle, and adopted this in building a mathematical model of the human heart. "Modeling the heart beautifully shows the middle-out approach, because there are at least two levels of data-rich simulation there," he says.2
Hierarchical models, and the global collaboration they require, need standard ways of exchanging data at a high level of detail specific to the systems biology field. As Hood points out, the Internet, just as much as molecular biological developments, has made systems biology possible by creating a universal medium with sufficient bandwidth as a forum for remote collaboration. Specific standards have emerged, such as the Systems Biology Markup Language (SBML). Even more specific lower level "MLs," such as CellML for describing cell models, also exist.
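To give a sense of what exchanging a model in such a format involves, the Python sketch below reads a deliberately tiny SBML-style document using only the standard library. The model itself (a compartment `cell` in which species `A` converts to `B`) is hypothetical, it carries no kinetic law and so could not be simulated as it stands, and it is written against the SBML Level 2 Version 4 namespace purely as an assumption for the example. Real projects would normally rely on a dedicated SBML library rather than generic XML parsing.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy model; element and attribute names follow SBML Level 2.
SBML_TEXT = """
<sbml xmlns="http://www.sbml.org/sbml/level2/version4" level="2" version="4">
  <model id="toy_conversion">
    <listOfCompartments>
      <compartment id="cell" size="1"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="A" compartment="cell" initialConcentration="1.0"/>
      <species id="B" compartment="cell" initialConcentration="0.0"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="A_to_B" reversible="false">
        <listOfReactants><speciesReference species="A"/></listOfReactants>
        <listOfProducts><speciesReference species="B"/></listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
""".strip()

NS = {"sbml": "http://www.sbml.org/sbml/level2/version4"}

root = ET.fromstring(SBML_TEXT)

# Map each species id to its starting concentration.
species = {
    s.get("id"): float(s.get("initialConcentration"))
    for s in root.findall(".//sbml:species", NS)
}

# List each reaction as (id, reactant ids, product ids).
reactions = []
for r in root.findall(".//sbml:reaction", NS):
    reactants = [ref.get("species") for ref in
                 r.findall("sbml:listOfReactants/sbml:speciesReference", NS)]
    products = [ref.get("species") for ref in
                r.findall("sbml:listOfProducts/sbml:speciesReference", NS)]
    reactions.append((r.get("id"), reactants, products))

print(species)
print(reactions)
```

The point of the shared vocabulary is that any group's software can recover the same species and reactions from the file without knowing which tool wrote it.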
DANGER AND IRONY
In turn, SBML is regarded as a component of the wider-ranging Systems Biology Workbench (SBW), which is designed to provide a standard application-level framework and is meant to be easier to use than similar frameworks aimed at the IT community. There is a real danger, however, that the rapid expansion of systems biology will leave the standards bodies behind and lead to incompatibility between models. This must be resisted at all costs, says Mike Hucka, a codesigner of SBML at the California Institute of Technology, to avoid not only wasted time and effort, but also total project failure through the inability to communicate and exchange models between participants worldwide.
One ironic problem in systems biology is that although models contain huge numbers of data points, there is often a sparsity of measurement data for populating the models in the first place, because it is currently difficult to make sufficient numbers of accurate experimental measurements. Consequently, researchers often seek out experimental results obtained by other teams, such as measurements of the same protein in different organisms, in order to gain additional data.
The long-term remedy to the sparse data problem, says Hood, will come from new techniques based on nanotechnology and microfluidics to obtain large numbers of accurate measurements at the cellular and molecular levels. "It is very clear we need to move to technologies that are [parallel] and miniaturized where multiple steps are integrated and fully automated," he says.
Even without these technologies, though, Hood insists that the power and potential of systems biology have already been amply demonstrated. "We've beautifully shown how the galactose system in yeast is connected to many other systems and regulates them."
Philip Hunter (email@example.com) is a freelance writer in London.
1. O. Wolkenhauer, "Systems biology: The reincarnation of systems theory applied in biology?" Brief Bioinform, 2:258-70, 2001.
2. D. Noble, "Modelling the heart: Insights, failures and progress," Bioessays, 24:1155-63, December 2002.