The first statement accompanying a talk or journal article on cellular biology is usually something like, "Life is complicated." The apparent complexity of the networks we find in living cells is sometimes quite overwhelming. Historically, many of us have shied away from tackling system-level questions, being content instead to productively study individual proteins or genes. The new mantra, systems biology, is proclaiming a change in attitude, to convince us that this overwhelming complexity is, in fact, tractable to human understanding. As a card-carrying, bona fide systems biologist, I would have to agree.
THE CELL COMPUTED
Currently, there is a community quite at home in dealing with huge complexity: modern day microchip designers. Given the statistics on modern chip design, one wonders if, in fact, cellular complexity has been surpassed. For example, with the recent move to 90-nm fabrication technology, the average transistor is now less than 50 nm in diameter – only five times bigger than the average intracellular protein. Not only are the parts getting smaller, the number of parts fabricated onto a single die is quite astounding. For example, the AMD Athlon 64 has about 106 million transistors.
Given that a single kinase/phosphatase cycle has a dynamic response similar to a transistor, with approximately 518 kinases known to be expressed in humans, we are left with the embarrassing notion that a human cell's computational capacity is significantly less than even the very first microprocessor – the Intel 4004, which had just over 2,000 transistors. This comparison is perhaps unfair, since it assumes that cellular signaling pathways "compute" digitally like human-made microprocessors. Signaling pathways more likely operate like an analog computer. Most external signals are themselves analog, and protein kinetics are eminently suitable for analog computation.1 If a single kinase/phosphatase unit behaves as a modest analog element, such as an operational amplifier, that puts human protein networks somewhere around an Intel 8086 microprocessor in terms of complexity. That's still not particularly high.
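To make the analogy concrete, here is a minimal sketch of a single kinase/phosphatase cycle, written in Python with illustrative rather than measured parameters. The steady-state phosphorylated fraction rises smoothly with kinase activity – exactly the kind of graded input/output transfer curve an analog element such as an operational amplifier provides:

```python
# Toy kinase/phosphatase ("futile") cycle, sketched as an analog element.
# Substrate S is phosphorylated by a kinase (activity `kinase`) and
# dephosphorylated by a phosphatase, both with Michaelis-Menten kinetics.
# All parameters below are invented for illustration, not measured values.

def phosphorylated_fraction(kinase, phosphatase=1.0, Km1=0.5, Km2=0.5,
                            total=1.0, steps=20000, dt=0.01):
    """Integrate dSp/dt by forward Euler until it settles to steady state."""
    sp = 0.0  # phosphorylated substrate concentration
    for _ in range(steps):
        s = total - sp                            # unphosphorylated pool
        v_kin = kinase * s / (Km1 + s)            # phosphorylation rate
        v_phos = phosphatase * sp / (Km2 + sp)    # dephosphorylation rate
        sp += dt * (v_kin - v_phos)
    return sp / total

# Sweep the input signal (kinase activity): the output rises smoothly.
curve = [phosphorylated_fraction(k) for k in (0.2, 0.5, 1.0, 2.0, 5.0)]
```

With the symmetric constants chosen here, the cycle sits at a 50% phosphorylated fraction when kinase and phosphatase activities balance; sharper, more switch-like responses appear when both enzymes operate near saturation (Km small relative to the total substrate pool).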
Even if we take into account the added complexity of gene networks, gene splicing, and the great variety of covalent states, we might still only be able to increase the complexity a little more than ten-fold – comparable to, say, a 486 processor. OK, perhaps these numbers are meaningless, but they make one pause to consider that cells may not be as functionally complicated as they seem, given the relatively small number of components.
MOVING BEYOND A PARTS LIST
The real problem facing systems biology is that we don't know the computational roles that most cellular networks play. In many cases we know the parts, and we often even know the connections, but we have very little idea as to what the networks actually do. To take a simple example, glycolysis, arguably the best-understood pathway and historically the first to be discovered more than 70 years ago, is still poorly understood in terms of its functional role. Other than some vague notions of maintaining homeostasis and supplying ATP, we still don't really understand why glycolysis has so many feedback and feed-forward loops.
What's surprising is that some researchers don't recognize the gap in our understanding. When I ask my colleagues why metabolism is not of great interest anymore, they say, "Because we understand metabolism. There is no more to discover." What they really mean is that all the parts and connections, sometimes in extreme detail, are known. But Humpty-Dumpty is still in pieces and no one has yet bothered to put him back together again.
As biologists, we have a long and illustrious tradition of collecting, be it butterflies or genes. Systems biology is trying to take us down a new route, one that requires a different mode of thinking. But many researchers do not fully understand the nature of the change. Some ignore it completely and consider it a passing fad. Others cope with the change by treating systems biology as an extension of traditional biology, equating it with the collection of vast amounts of high-throughput data. Some researchers focus on network inference as a defining attribute, while others equate systems biology with predictive biology closely tied to modeling. This last interpretation is closer to the spirit of systems biology, but modeling, though important, is still only a means to an end.
While systems biology may make use of high-throughput data, network inference, and modeling at some stage, these activities are not its defining attributes. Instead, systems biology is about understanding and rationalizing the operation of a biological system. Systems biologists are not content with just a list of parts, terabytes of high-throughput data, or even a computational model. Systems biology is about looking at the MAPK pathway or the p53/Mdm2 couple and trying to rationalize the structure and kinetics of the network in terms of function. These studies may be on small systems or large; size is not important. However, given the confusion over the interpretation of the term systems biology, a more appropriate phrase for rationalizing cellular networks might be "molecular physiology."
THE PATH AHEAD
Biology is the product of evolution and not of an intelligent mind. This makes questions of function difficult for many biologists to address. Unlike engineers who have a design to work from, as biologists we often have very little idea of what evolution had "in mind." An engineer can open up an AM radio and locate the different modules – the amplifier, the frequency detector, the demodulator, etc. – and thereby rationalize what seems, to the untrained eye, a jumble of components. To be able to do the same trick on a complex signaling network involved in cancer would be a tremendous achievement both intellectually and for facilitating our search for a cure.
So how does one proceed? My own group at the Keck Graduate Institute focuses on the understanding of large networks by functional deconstruction using both bottom-up and top-down approaches.2 We have assembled a large library of network motifs through a combination of manual design and
The end result of a successful project is a functional understanding of the problem and a reliable predictive model. Most important is the coexistence of experimentation, theory, and modeling in one package. Modeling and experimentation, in particular, must go hand in hand. Do not fall into the trap of putting the modeling effort at the end of a project; this is the worst possible scenario. If anything, put the modeling effort up front, because when you do the modeling you will know what needs to be measured, what hypotheses need to be tested, etc. Modeling should guide you and uncover gaps in knowledge.
At the end of the project you will hopefully have a reasonable quantitative model of the biology (validating that model is another story, of course). You will likely find, however, that the model is so complicated that it's not easy to understand. This is where theory helps. Two essentially equivalent theories, metabolic control analysis (MCA) and biochemical systems theory (BST), are both excellent starting points.3
These theories describe how perturbations propagate through reaction networks. Most important, they force us to think about what happens to concentration levels and reaction rates during a perturbation, something that many of us never consider (which is quite odd, considering that one of the defining attributes of life is change). In addition to these biological theories, an appreciation of engineering, particularly electrical engineering, is also indispensable. If there is any group in our community that understands signal processing better than most, it is, in my experience, the electrical engineers. Work by Brian Ingalls4 and, more recently, Chris Rao has shown that a deep connection exists between classical control theory and MCA/BST. This is good news, because we can leverage that mountain of engineering theory to our cause.
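As a toy illustration of the kind of question MCA answers, the sketch below (Python; the pathway and all rate constants are invented for illustration) computes flux control coefficients for a hypothetical two-step pathway by numerically perturbing each enzyme level. Each coefficient quantifies how a fractional change in one step's activity propagates to the steady-state flux, and together they obey the MCA summation theorem, adding up to 1:

```python
# Metabolic control analysis on a toy two-step pathway X0 -> S -> X1,
# with v1 = e1*(kf*X0 - kr*S) and v2 = e2*kc*S, where e1 and e2 are
# enzyme levels scaling each step. All constants are illustrative.

def steady_flux(e1, e2, kf=1.0, kr=0.5, kc=2.0, x0=10.0):
    """Steady-state pathway flux: solve v1 = v2 for S, then return v2."""
    s = e1 * kf * x0 / (e1 * kr + e2 * kc)
    return e2 * kc * s

def flux_control(step, e1=1.0, e2=1.0, h=1e-6):
    """Scaled sensitivity d(ln J)/d(ln e_i), by finite differences."""
    base = steady_flux(e1, e2)
    if step == 1:
        pert = steady_flux(e1 * (1 + h), e2)
    else:
        pert = steady_flux(e1, e2 * (1 + h))
    return (pert - base) / (base * h)

c1 = flux_control(1)  # flux control coefficient of step 1
c2 = flux_control(2)  # flux control coefficient of step 2
# Summation theorem: c1 + c2 == 1 for this pathway.
```

With these constants the first step carries about 80% of the flux control and the second only about 20%, so doubling the second enzyme would change the flux far less than intuition about a single "rate-limiting step" might suggest – the sort of system-level insight a parts list alone cannot provide.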
If possible, theory and reverse engineering should run alongside modeling and experimentation. Rationalizing the pathway will often generate hypotheses that one would never have considered. Finally, do not be seduced by the promise of high-throughput data. Unfortunately, much high-throughput data is inappropriate for quantitative systems biology, though it is hoped this will change in the future. Hypothesis-driven, targeted experimental measurements are better, time-honored approaches.
Currently, the people involved in the new and exciting area of synthetic biology seem to be taking the intellectual lead.5 They take an engineering approach, building quantitative models and testing them experimentally under controlled conditions. If I could, I would bring synthetic and systems biology together, using the approach from the former to solve problems in the latter.
Herbert M. Sauro is a group leader in computational and systems biology at the Keck Graduate Institute. He thanks Anastasia Deckard and Vijay Chickarmane for reading the manuscript and providing helpful discussions.
He can be contacted at