Sciences Date: December 7, 1992
The energetic young field of computational biology is looking to the coming generation of massively parallel computers for its future, researchers say. These machines, enhanced by the contributions of computer scientists who are developing innovative programming tactics, will be crucial in allowing biological researchers to reach ambitious goals.
"Parallel computers are going to let this field take off," says Andrew McCammon, a theoretical chemist at the University of Houston and one of three scientific directors at the Keck Center for Computational Biology, a joint effort with Rice University and Baylor College of Medicine, all based in Houston.
The effective use of ever more powerful computers to simulate the actions of proteins, enzymes, and other complex molecules in computational biology relies on a remarkable marriage of disciplines, the scientists say. Biologists, chemists, physicists, computer scientists, and mathematicians are all contributing to develop new methodologies and move investigations forward.
At the same time, foundations and government agencies--as well as individuals such as William Gates III, chairman of Microsoft Corp., Redmond, Wash.--have provided key funding to help establish research and training centers for computational biology. Most of these are affiliated with university supercomputer facilities, but the substantial computational resources of the national laboratories--perhaps more available for such inquiries with the end of the Cold War--are also being called into play.
"My personal belief is that it's going to move extremely fast," says William Goddard III, a professor of chemistry at the California Institute of Technology in Pasadena. "The next two or three years are going to look like a revolution."
The pharmaceutical and biotechnology industries are also paying close attention to developments in computational biology, in the hope that the new techniques will help them step up the pace of drug design and development, according to these investigators. And, as the capabilities of the field become more defined, these companies expect to be hiring more individuals with specific computational skills along with their scientific training.
"Key companies and key people are very aware of this area, very interested in it," says Kathleen S. Matthews, chairwoman of the depart- ment of biochemistry and cell biology at Rice University. Matthews is also chairwoman of the management committee of the Keck center. "They are concerned about being positioned for the future with the appropriate personnel and resources to make the next step," she says.
Computational biology is similar in many ways to the more established field of computational chemistry. Both seek to predict the actions of atoms and molecules based on what researchers call first principles--the basic laws of physics. This approach stands in contrast to the perhaps more experimental, or descriptive, approach of traditional biological inquiry. The computational approach is also differentiated from theoretical approaches. In fact, it is sometimes referred to by practitioners as a "third methodology" for science--a new avenue for discovery lying somewhere between experiment and theory.
The difference between computational chemistry and computational biology is that computational biology attempts to manipulate, on the basis of fundamental physical laws, molecules that are usually much larger than those of interest to chemists. This leap of complexity requires significantly greater computational capacity than is generally available today.
"The fundamental problem in computational biology is to predict protein folding, or general structure, from first principles," says Caltech's Goddard.
"It's been dubbed `the second genetic code,' " says Robert Lang- ridge, a professor of pharmaceutical chemistry at the University of California, San Francisco. "If we could translate the sequence of amino acids [in a protein], which can be determined directly from the nucleic acids sequences [in DNA], into a three- dimensional structure, then we'd be in a much better position to understand what the genes are doing."
The reason this is so--and the key to its importance--is that a protein's structure almost always determines its function, according to researchers. But determining that structure is not an easy task.
Langridge and coauthors described the protein folding problem recently (E. Lander, R. Langridge, and D. Saccocio, Computer, 24:6-13, 1991): "Once synthesized, the protein chain folds according to the laws of physics into a specialized form, based on the particular properties and order of the amino acids (some of which are hydrophobic, some hydrophilic, some positively charged, and some negatively charged)."
Although the basic coding scheme by which protein chains are derived from DNA is well understood, biologists cannot accurately predict the folded protein shape.
Addressing this problem--the translation of amino acid sequences into structures through an understanding of physical laws--is a dauntingly complex computational task.
"The smallest interesting biological systems are enzymes or proteins of, say, 45 amino acids--400 to 500 atoms," says Goddard. "To predict structure from first principles for a system of that size, you would need to define 120 angles or so. If you wanted to try 10 values for each one of those, that would be 10 120.
"That's an impossible number--it's bigger than the number of grains of sand in the world."
The need to be able to do these kinds of calculations is why computer scientists and their tools--better hardware and better programming strategies--will have a greater role in answering the biological questions of the future, whether in drug design or in tracking genetic disease, say investigators.
"We need big computers, the bigger the better, and more clever methods to make everything more efficient, so we can work on bigger systems and get more accurate simulations," says Paul Bash, a computational structural biologist at Argonne National Laboratory in Illinois.
Klaus Schulten, a theoretical biophysicist at the University of Illinois, Urbana-Champaign, agrees.
"Supercomputers and massively parallel computers will play an increasing role in biology," he says. "At this point, we can really only deal with very small biological molecules over very brief periods of time. We cannot include the native environment of these molecules, like water or membranes, which are very important for the function. We are at the very beginning, and whatever speedup in technology we have will be very quickly absorbed for quite some time by the field."
"These problems are as hard as it gets," says William D. Wilson, a theoretical physicist at the branch of Sandia National Laboratories in Liv-ermore, Calif. "Biology is very, very difficult. These are challenging, first-class problems that scientists who are working on the edge recognize will be the [problems of the] next century."
Wilson notes that more physicists and chemists are learning biology, perhaps because they see greater possibilities for dramatic scientific breakthroughs in fields such as biology or ecology than in, for instance, atomic physics.
"With protein folding," he says, "you'd win the Nobel Prize if you could pull that off on a computer--a protein is such a big beast."
To address difficult biological questions like protein folding, interdisciplinary research teams are emerging at a handful of locations. Usually, the availability of impressive computational resources is of crucial importance to these researchers. This is one reason the national laboratories have been major sites for the early development of computational biology.
"It's going on here because we have big computers next to us," says Sandia's Wilson.
Joseph Lannutti, a professor of physics at Florida State University, Tallahassee, and director of the Supercomputer Computations Research Institute at FSU, sees the same trend, but adds that the national labs are currently looking for new missions.
"If there is a national laboratory emphasis [in computational biology], it's because of the need for supercomputers," Lannutti says. "But there's also what's called the `greening of the military' and `civilianizing the military.' One way to civilianize a national laboratory that normally makes weapons is to take on some major new project in research that's hot these days--and biotechnology is hot these days."
The reliance of computational biology on powerful computers might also explain the interest Microsoft Corp.'s Gates has shown in biotechnology. Earlier this year, Gates gave $12 million to the University of Washington, Seattle, to establish a department of molecular biotechnology, launched this fall under former Caltech biologist Leroy Hood (Susan L-J Dickinson, The Scientist, March 30, 1992, page 1).
Hood recently told Business Week (November 16, 1992), "Computation is the future of biology."
The Los Angeles-based W.M. Keck Foundation, perhaps best known for supporting construction of the world's largest optical telescope on Mauna Kea in Hawaii, has provided primary funding for the Keck Center for Computational Biology. The center is a joint project run by Rice University, Baylor College of Medicine, and the University of Houston.
"Now that they've developed the premier instrument for looking at the large-scale structure of the universe," says the University of Houston's McCammon, "the Keck Foundation has decided to focus on the microscopic side of nature, to see if we can understand the molecular origins and mechanisms of life. The instrument to do that is advanced parallel supercomputing. In a sense, it is the ultimate microscope."
The Keck Foundation is also funding computational biology projects at the University of Pittsburgh and at the Mayo Foundation in Rochester, Minn., according to McCammon. University supercomputer centers gearing up in the area of computational biology include the University of California, San Diego, the University of Illinois, and Cornell University in Ithaca, N.Y., he says.
In early October, University of Houston researcher Ridgway Scott, a mathematician affiliated with the Keck center, won National Science Foundation funding for a project to "coordinate a group of chemists, biophysicists, computer scientists, and mathematicians from UH who will use emerging scalable parallel computers and software to develop and implement new methods for solving critical problems in biomolecular design," according to an NSF statement.
Competition was strong for the NSF Grand Challenge Applications Group grant obtained by Houston, which is intended to promote high-performance computing techniques and resources, says McCammon.
"The reason NSF singled this project out from the hundreds it had to look at," he says, "was that it was unique in that the mathematicians really had a coequal role with the basic physical and biological scientists. We were the only ones really going in with a balanced team of mathematical computer science people and the more biophysical and chemical types."
Researchers describe the interdisciplinary character of computational biology as more than merely helpful.
"You can't do this without an interdisciplinary approach," says Wilson.
"It certainly is the case that the overall issues of computational biology have attracted quite a motley group to be interested in it," says Goddard, "people with different kinds of backgrounds--in chemistry, materials, physics, computer science."
Most researchers involved in computational biology expect that the biotechnology and pharmaceutical companies will be quick to take advantage of the techniques that are being developed. By doing preliminary work with computational modeling techniques, companies might be able to substantially reduce the time needed to design a drug. Scientists emphasize, however, that the laboratory will always be necessary to confirm the computer findings.
"It can be useful for suggesting avenues of research that may be better than others," says C. Nicholas Hodge, research manager for the computer- assisted drug design team in the biotechnol- ogy department of Du Pont Merck Pharmaceutical Co., Wilmington, Del. Hodge is currently on a two-month visit to the Keck center to better understand the new field. "It helps us narrow the focus of what we try to do in the laboratory."
He adds: "It's expensive to maintain the software, the equipment, and the people. But if you can truncate the process of getting a drug into development, you save yourself money."
McCammon believes that groups like his at the Keck center will be the source for an entirely new breed of research worker of great value to biotechnology and pharmaceutical companies.
"To really take advantage of what's coming out in the parallel computing world," he says, "they are going to have to look to these national centers for people who are getting their degrees in computer science or applied mathe- matics or electrical engineering, but who know enough chemistry and biology to be useful in a biotechnology setting."
The Keck center's Matthews also sees the need to prepare the next generation of researchers as primary.
"Training is definitely a central feature of what we're doing and what our goals are," she says.
Schulten at the University of Illinois, who agrees on the importance of training at his center, prefers to prioritize the future skill requirements of industry and research slightly differently.
"It is better to emphasize the need for modelers who are biochemists or biophysicists to learn the new [computational] methods and include them in their work," he says, "rather than saying that now companies or research groups should hire computer scientists."
But most agree that the field of computational biology will still be growing well into the future.
"When I started doing this 10 years ago as a graduate student, there were very few people working at it," says Argonne's Paul Bash. "And people who were not working at it thought we were nuts. Now, it's incredible. People are starting to jump into it in droves."
"It's a field which is expanding very rapidly," agrees Robert Langridge at the University of California, San Francisco.
"It has enormous intellectual value and, obviously, great economic interest. And, I think, we're not going to run out of problems in the near future."