© 2004 National Academy of Sciences (From M.L. DeMarco, et al, PNAS, 101:2293–2298, 2004)
Figuring out how denatured proteins morph into their folded, active forms isn't just a challenge; it's one of the most elusive problems in biology. Protein chemists now have more computational power to devote to the problem, thanks to a recent award of two million processor hours on the Department of Energy's 10-teraflops IBM supercomputer at the National Energy Research Scientific Computing Center in Berkeley, Calif. The award, part of the Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program, supports a project entitled "Molecular Dynameomics," whose ultimate goal is to create a repository for molecular-dynamics data to be used in protein structure predictions.
"We want to simulate every protein fold," says project leader Valerie Daggett, professor of medicinal chemistry at the University of Washington in Seattle. Daggett's research bolsters experimental work on protein folding with computer simulations that can reveal the position and motion of every single atom in a protein.1
Her group will model the molecular dynamics of as many different protein motifs as possible in the processing time they have. So far they have simulated folding for about 150 of 1,130 known nonredundant protein-fold classes, such as the globin-like, TIM-barrel, and acid protease folds. By modeling one protein from each class, Daggett expects that her lab will be able to decipher folding dynamics for 80% to 90% of fold types by the project's end in November. She says pinning down folding strategies for so many proteins could uncover general "rules" of protein folding that have eluded researchers in the past.
"We still can't predict protein structure adequately, because we don't know all the rules," Daggett says. By "looking at the folding of all these different proteins, we want to pull out the general features that will help us to improve prediction algorithms."
Daggett and her coworkers start with the active biological structure of each protein from previous NMR spectroscopy and X-ray crystallography experiments. They choose a representative for each fold type based on the quality of experimental data on the protein's native structure and the biological relevance of the protein. The proteins are simulated as they unfold, rather than fold, for a very practical reason: Unfolding is fast and easy to induce by simply raising the temperature as high as 225°C.
The computer deduces unfolding dynamics by applying a standard set of force-field parameters. Daggett's group spent years calculating atomic interactions in proteins – including bond lengths, angle sizes, force constants, and van der Waals and ionic interactions – in order to come up with a generic force-field description that can be applied to all moving proteins.
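To give a sense of what such a force-field description involves, here is a minimal Python sketch of an energy function built from the kinds of terms listed above: bond lengths, angle sizes, force constants, van der Waals interactions, and ionic interactions. It is not the Daggett group's force field. The functional forms (harmonic bonds and angles, a 12-6 Lennard-Jones term, Coulomb's law) are common textbook choices assumed here for illustration, and every function name and parameter value is hypothetical.

```python
import math

# Coulomb constant in kcal*Angstrom/(mol*e^2); a conventional unit choice
# assumed for this sketch.
COULOMB_K = 332.0636


def bond_energy(r, r0, k):
    """Harmonic bond stretch: zero at the reference length r0."""
    return k * (r - r0) ** 2


def angle_energy(theta, theta0, k):
    """Harmonic angle bend (radians): zero at the reference angle theta0."""
    return k * (theta - theta0) ** 2


def lennard_jones(r, epsilon, r_min):
    """12-6 van der Waals term with its minimum, -epsilon, at r = r_min."""
    ratio = (r_min / r) ** 6
    return epsilon * (ratio ** 2 - 2.0 * ratio)


def coulomb(r, q1, q2):
    """Ionic (electrostatic) interaction between two point charges."""
    return COULOMB_K * q1 * q2 / r


def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.dist(a, b)


def angle(a, b, c):
    """Angle at atom b formed by atoms a-b-c, in radians."""
    v1 = [x - y for x, y in zip(a, b)]
    v2 = [x - y for x, y in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    cos_t = dot / (math.hypot(*v1) * math.hypot(*v2))
    return math.acos(max(-1.0, min(1.0, cos_t)))  # clamp against rounding


def total_energy(coords, bonds, angles, pairs):
    """Sum all terms for one configuration of atoms.

    bonds:  (i, j, r0, k)
    angles: (i, j, l, theta0, k), with the angle measured at atom j
    pairs:  (i, j, epsilon, r_min, qi, qj) for nonbonded atom pairs
    """
    energy = 0.0
    for i, j, r0, k in bonds:
        energy += bond_energy(distance(coords[i], coords[j]), r0, k)
    for i, j, l, theta0, k in angles:
        energy += angle_energy(angle(coords[i], coords[j], coords[l]), theta0, k)
    for i, j, epsilon, r_min, qi, qj in pairs:
        r = distance(coords[i], coords[j])
        energy += lennard_jones(r, epsilon, r_min) + coulomb(r, qi, qj)
    return energy


# Toy three-atom system with made-up parameters, placed at its reference geometry.
toy_coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
toy_bonds = [(0, 1, 1.0, 300.0), (1, 2, 1.0, 300.0)]
toy_angles = [(0, 1, 2, math.pi / 2, 50.0)]
toy_energy = total_energy(toy_coords, toy_bonds, toy_angles, [])
```

In a molecular dynamics run, the forces on each atom come from the gradient of an energy function like this one, and the atomic positions are advanced in small time steps. A generic parameter set matters because the same functions can then be reused for every protein in the survey.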
In this approach, "every single atom [is] represented," Daggett says. "It's the most realistic simulation method available." Daggett's group is creating a free-access online database
Analyzing protein folding by simulating unfolding at very high temperatures is a fairly novel technique that not all simulations employ, says structural biologist Vijay Pande of Stanford University. Whether unfolding at temperatures close to the boiling point of water is really the reverse of folding at physiological temperatures is "still an open question," Pande says.
The proteins that Daggett has examined, however, move through the same transition states, intermediate structures, and denatured forms whether they are folding or unfolding, she says. By examining the same proteins in high-temperature simulations and low-temperature experiments, Daggett and her colleagues have shown that simulation at extremely high temperatures "just affects the timescale and not the process."
Besides providing atomic-scale resolution, molecular dynamics simulations also allow scientists to study the fluid nature of proteins in solvent, Daggett says; solid-state techniques such as X-ray crystallography do not. Modeling a dynamic protein can reveal features of its interactions that can't be seen in its static crystal structure.
A drawback of these simulations, according to Daggett, is the limit on the size of protein that can be modeled. Because computer power is limited, Daggett tries to pick model proteins that are no longer than 300 amino acids, which usually excludes proteins made of multiple polypeptide subunits.
On the other hand, complex, multidomain proteins are sometimes just a bunch of small domains linked together, says biochemist Andreas Matouschek of Northwestern University. "If you learn how these individual domains behave, you know how parts of these big proteins behave, too."