Metabolic and regulatory networks may be expanded by coupling high-throughput phenotyping and gene expression data with the predictions of a computational model. (Reprinted with permission, Nature, 429:92–6, 2004).

Most people like their predictions to pan out, but Markus Covert is glad when his fail. That's because he has developed a genome-scale mathematical model of the transcriptional interactions that regulate bacterial metabolism. The model's mistakes lead to new ideas about how the network is put together.

Covert, now a postdoctoral fellow in David Baltimore's lab at the California Institute of Technology, constructed the Escherichia coli model, described in Nature,1 when he was a doctoral student with Bernhard Palsson at UC-San Diego. "We have the bacterium's full genotype and can observe its phenotypes, but there is a gap in between," says Palsson, professor of bioengineering. "That gap is filled by the interactions of thousands of compounds...


To validate the model, Covert tapped into the ASAP database of the E. coli Genome Project at the University of Wisconsin-Madison, which aims to knock out every gene in E. coli and determine which substrates can support the growth of the resulting mutants. One evening, Covert asked iMC1010v1 to predict the results for 110 different mutants placed in 125 different growth media. By the next morning, it had correctly predicted 10,828 (79%) of the 13,750 actual outcomes.
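The arithmetic behind those figures is easy to check. The sketch below is only an illustration: the function name `score_predictions` and the toy grids are hypothetical, and it assumes each outcome is a simple growth/no-growth call, which the article does not spell out.

```python
def score_predictions(predicted, observed):
    """Return (correct, total) for two equally shaped grids of growth calls."""
    correct = total = 0
    for pred_row, obs_row in zip(predicted, observed):
        for p, o in zip(pred_row, obs_row):
            total += 1
            correct += (p == o)  # True counts as 1, False as 0
    return correct, total

# Toy example with made-up data: 2 mutants x 3 growth media
predicted = [[True, False, True], [True, True, False]]
observed = [[True, True, True], [True, True, False]]
toy_correct, toy_total = score_predictions(predicted, observed)

# Figures quoted in the article
n_mutants, n_media = 110, 125
n_outcomes = n_mutants * n_media            # every mutant in every medium
n_correct = 10_828
accuracy_pct = 100 * n_correct / n_outcomes
print(n_outcomes, round(accuracy_pct, 2))
```

The product of mutants and media matches the 13,750 outcomes reported, and the ratio rounds to the 79% quoted.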

The wrongly predicted outcomes identified five growth conditions in which unknown regulatory interactions must have produced the observed results. They also revealed that eight of the mutants require unknown enzymes or pathways to grow under various conditions.

To test the utility of his model for discovering transcriptional regulatory networks, Covert made six new strains of E. coli, each lacking a gene whose expression is affected by oxygen availability. After growing these strains and wild-type E. coli in the presence and absence of oxygen, Covert evaluated gene expression levels by extracting mRNA and analyzing it on Affymetrix chips. He also asked the model to predict the results: Its accuracy rate was 49%.

Using two-way analysis of variance to identify the differentially expressed transcription factors, Covert updated his model. The second iteration, iMC1010v2, correctly predicted 100 of 151 (66%) gene-expression changes without any false-positive predictions. "[Covert] discovered three times more links in the network than were previously known," says Palsson. "We should be able to describe the regulatory network of E. coli reasonably comprehensively after 10 to 15 iterations."
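The second-iteration score can be unpacked with a short calculation. The sketch below assumes one reading of the article: that the 151 changes are those observed experimentally, that 100 of them were also predicted by the model, and that "no false positives" means the model predicted no change that was not observed. The variable names are illustrative.

```python
observed_changes = 151      # gene-expression changes reported in the article
predicted_correctly = 100   # of those, correctly predicted by iMC1010v2
false_positives = 0         # assumed meaning of "without any false-positive predictions"

missed = observed_changes - predicted_correctly
fraction_recovered = predicted_correctly / observed_changes
precision = predicted_correctly / (predicted_correctly + false_positives)

print(f"{fraction_recovered:.1%} recovered, {missed} missed, precision {precision:.0%}")
```

Under that reading, every change the model called was real, while roughly a third of the observed changes remain unexplained and are candidates for the next iteration.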


The researchers plan eventually to replace the model's Boolean logic with quantitative data on the interactions of transcription factors with genes. They also want to write algorithms that automatically devise the most appropriate experiment after the model makes a mistake.
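To make the term concrete: in a Boolean regulatory model, each gene is treated as simply on or off according to a logical rule over environmental signals and regulator states. The sketch below is a toy illustration, not a rule taken from iMC1010; the regulator, the gene, and the rule itself are hypothetical.

```python
def regulator_active(oxygen_present: bool) -> bool:
    """Hypothetical regulator that is active only when oxygen is absent."""
    return not oxygen_present

def gene_expressed(oxygen_present: bool, substrate_present: bool) -> bool:
    """Hypothetical rule: the gene is on if its substrate is present
    AND the regulator is active."""
    return substrate_present and regulator_active(oxygen_present)

# Enumerate all four environments to see the rule's truth table
for oxygen in (True, False):
    for substrate in (True, False):
        print(oxygen, substrate, gene_expressed(oxygen, substrate))
```

A quantitative version of the same rule would return an expression level rather than a True/False value.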

Meanwhile, the model is being put to good use. "We have already started incorporating these regulatory constraints into the OptKnock computational framework, developed in our lab, for identifying gene deletions that lead to targeted overproductions," says Costas Maranas, associate professor of chemical engineering at Pennsylvania State University.

Palsson also predicts commercial applications. For example, the model could be used to uncover and modify regulatory networks of bacteria used for bioremediation or bioprocessing. He adds, "A futuristic idea is to think of these models in the context of synthetic biology, for designing new organisms."

- Linda Sage
