Protein folding raises some of biology's greatest theoretical challenges. It also lies at the root of many diseases. For example, the fundamental question of whether a protein's final tertiary conformation, sometimes called the native state, can be predicted from its primary amino acid sequence is also of vital importance in understanding the protein's potential capacity to form disease-inducing aggregates.
Protein folding is a hierarchical process, sometimes pictured as an inverted funnel, that proceeds in three fundamental stages. All proteins begin with a primary amino acid sequence, which folds into intermediate secondary shapes comprising the well-known α helices and β sheets, and then into the final tertiary, or native, form, in which they fulfill their function. Some proteins undergo a further phase, combining with other folded proteins to form quaternary structures.
The etiology of protein folding-related diseases can be investigated by forming predictive relationships between these structural layers, and understanding how subtle changes in sequence can have dramatic consequences, not just for the final native protein but also for the organism as a whole. The difference between health and death can hinge on a single allele that often differs by just one nucleotide substitution, as in prion diseases.
Predictions aside, current research makes clear that a protein's actual fate during folding depends on its immediate biochemical environment, a fact that bedevils biotechnology. Moreover, Jonathan King of the Massachusetts Institute of Technology says that proteins fold differently in test tubes than they do in vivo; eukaryotic proteins don't necessarily fold correctly in prokaryotes.
The process of folding and reaching a stable native conformation, allowing a protein to fulfill its function without disrupting other cellular parts or the pathway in which it operates, can be understood and analyzed at several levels. One obvious level is the protein's function, as this depends on the detailed tertiary conformation. The influence of selective pressures on the folding process, such as those in sickle cell anemia, also needs to be considered in unraveling the detailed mechanisms, in particular the intermediate secondary stages. Selective pressures also may explain the existence of partially formed protein fragments that may be the detritus of evolution, says Andrei Alexandrescu, assistant professor at the University of Connecticut. "Recombination of existing architectures appears to be an important mechanism in the evolution of new protein domains. Such partially folded states could remain as vestiges of autonomous folding units that became incorporated into more complex domains."
Folding also needs to be recognized as a four-dimensional process, for timely assembly of a protein involved in a signaling pathway is just as vital as achieving the correct conformation, says genetics professor Arthur Horwich of Yale University. "In the cellular context, being able to fold in a reasonable timescale is important if the protein has a biological [as opposed to a structural] function. If an organism needs to do something in a couple of minutes, it won't do to wait two hours for a protein to fold." A protein that fails to fold in time may be dismantled before it has assumed its native state, or it may fail to be manufactured in sufficient quantities. According to Horwich, protein degradation occurs if correct folding fails after several attempts within a given time scale.
LAW AND ORDER
Substantial progress has been made during the last few years in determining the physical principles that underlie the rate at which a protein folds. One remarkable finding, from the team of David Baker, assistant investigator at the University of Washington School of Medicine, was a strong correlation between a protein's contact order and its folding rate.1 Proteins fold fastest if the residues that are in contact within the finished conformation are close to each other in the primary amino acid sequence. There was a good theoretical basis for assuming a relationship between contact order and folding rate: It is clearly more difficult for components lying far apart on a primary chain to come together during the folding process. But it was possible that interatomic interactions would be more important in determining folding rate.
It turns out they are not, according to Baker, who asserts in his article that nonnative interactions and conformations (kinetic traps) have a relatively minor impact on folding rate, at least in small proteins, which Baker defines as those fewer than 150 amino acids in length.
According to Baker, this and other results suggest that it is the structure of the native state that determines a small protein's folding rate and mechanism, not the atomic interactions that take place during folding. This, in turn, implies that the sequence's local details are less relevant than had been expected, since it is the sequence that determines the order of the atomic components and the timing of their convergence during the folding process. So, in a sense, the cart comes before the horse.
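The contact order idea described above can be made concrete with a little arithmetic. The sketch below assumes the commonly used definition of relative contact order, which this article does not spell out: the average separation along the chain between pairs of residues that touch in the native structure, divided by the chain length. The function name and the two 10-residue contact lists are invented for illustration and do not describe any real protein.

```python
from typing import Iterable, Tuple


def relative_contact_order(contacts: Iterable[Tuple[int, int]], chain_length: int) -> float:
    """Average sequence separation of contacting residue pairs, divided by chain length.

    Assumed definition (not taken from the article): each contact is a pair of
    residue positions (i, j) that touch in the native structure.
    """
    pairs = list(contacts)
    if chain_length <= 0:
        raise ValueError("chain_length must be positive")
    if not pairs:
        raise ValueError("at least one contact is required")
    # Sum how far apart each contacting pair sits along the primary sequence.
    total_separation = sum(abs(i - j) for i, j in pairs)
    return total_separation / (len(pairs) * chain_length)


# Hypothetical 10-residue chains: same number of contacts, different topology.
local_contacts = [(1, 4), (2, 5), (3, 6), (6, 9)]      # partners close in sequence
distant_contacts = [(1, 10), (2, 9), (3, 8), (4, 7)]   # partners farther apart

print(f"{relative_contact_order(local_contacts, 10):.2f}")    # 0.30
print(f"{relative_contact_order(distant_contacts, 10):.2f}")  # 0.60
```

On the correlation Baker's team reported, the first toy chain, with the lower value, would be expected to fold faster than the second. In practice the contact list would come from a solved structure using some distance cutoff between residues; that step is left out here.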
This does not contradict the fundamental tenet of protein chemistry--one sequence, one structure--but it does mean that the relationship between sequence and structure is perhaps deeper and subtler than originally thought. The University of Pennsylvania's Heinrich Roeder and colleagues discerned the complex interplay between local and nonlocal events in protein folding and how these relate to the sequence.2 The team found that with β-lactoglobulin, the local sequence might favor helical structure, but this tendency subsequently can be overridden by the tertiary interactions established during the later stages of folding. At first sight, this might suggest that the formation of temporary helical structures at the secondary stage slows down the folding process. But in the Roeder paper, this factor is dismissed as insignificant from the timing perspective, because the helical structures are very unstable, reinforcing Baker's argument that interatomic interactions are less relevant than native structure in determining folding rate.
Because their lifetimes can be as short as 100 microseconds, many such transient folding intermediates were discovered only recently. Roeder and colleagues obtained a detailed view of the secondary structure at such time scales using a rapid flow protocol that can detect the formation of marginally stable hydrogen-bonded structures with folding times down to 200 microseconds. This allows two-dimensional nuclear magnetic resonance spectra to be recorded after folding is completed, showing the location and degree of solvent protection for individual peptide nitrogen-hydrogen groups formed during the early folding stages, thereby revealing the intermediates.
FOLDING MISADVENTURES
Such techniques are essential for investigating folding intermediates and their potential role in disease. "In general, the critical issue (for folding-related diseases) is properties of partially folded intermediates," says MIT's King. His team chose to study the tailspike protein of phage P22 to reveal intracellular folding intermediates.
"We were able to identify a set of temperature-sensitive folding mutations that further destabilized the intermediates, shifting them to the inclusion body pathway. These experiments revealed clearly that problems in protein folding in the biotechnology industry were due to polymerization of partially folded or partially misfolded intermediates," says King. Inclusion bodies are formed in cells or organelles by aggregated or misfolded proteins, and as such are insoluble and a potential source of problems for protein synthesis.
The MIT team also found that a single amino acid substitution could suppress misfolding. This proved, King says, that the aggregation resulting from misfolding was a specific process driven by the amino acid sequence, even if, as described earlier, the rate/mechanism of folding is not sequence-dependent.
Folding intermediates, says King, can have entirely different properties from the native state. Intermediates may be unstable and delicately poised between behaving correctly and going astray. Further, they risk getting stuck in a kinetic trap, from which energy is required to extract the intermediate so that normal folding can resume.
Such extraction is an important role of chaperone proteins that are needed to assist the folding process, particularly for larger proteins, where there is more scope for misbehaving. Chaperones also are needed further down the line to prevent proteins from acquiring an incorrect tertiary structure, or subsequently misfolding into aggregates. This gives rise to the possibility of manufacturing chaperones to treat folding-related diseases caused by protein aggregation, such as Alzheimer and Creutzfeldt-Jakob diseases.3 (See 5-Prime | Protein Folding)
The potential for such therapies was demonstrated when Simon Hawke at London's Imperial College and Professor John Collinge at the Medical Research Council Prion Unit established that prion disease, in principle, could be prevented in mice using monoclonal antibodies (mAbs).4 It appeared that anti-PrPC (cellular prion protein) monoclonal antibodies with little or no affinity for scrapie PrPSc prions prevented PrPC incorporation into propagating prions. According to Hawke and colleagues, the anti-PrPC antibodies were attaching to PrPC and functioning as chaperones by stabilizing its configuration and preventing its conversion into diseased scrapie PrPSc prions.
THE EDGE OF THE STRAND
Until recently, many scientists presumed that disease-causing protein aggregates, formed through misfolding, were rare. While it's true that the diseases themselves are quite uncommon, recent research suggests that most polypeptides would form amyloids in the absence of stabilizing factors such as chaperones. According to Cambridge University's Chris Dobson, growing evidence shows that the ability to form highly organized and stable amyloid aggregates is a generic property of all polypeptides, and not just those few proteins associated with recognized pathological conditions.5
This suggests, as MIT's King believes, that aggregates were predominant in primitive organisms until efficient folding mechanisms evolved with inbuilt protection against amyloid plaque formation. Connecticut's Alexandrescu says that quality control mechanisms are directly encoded into the amino acid sequence, specifying structural surface features that protect against misfolding. Alexandrescu cites work from two groups on the design of edge strands.6,7 "There appear to be distinct sequence fingerprints that specify edge strands," says Alexandrescu. "The specification of an edge may be important in preventing indefinite continuation of sheets through improper association between molecules."
Even with these quality control mechanisms in place, many proteins are still unstable. Indeed, it appears that proteins have become less, rather than more, stable, if there is truth in the theory that protein aggregates were more prevalent in early life forms. "The aggregated, or more accurately, polymerized, states associated with pathology are in general irreversibly associated (incapable of reaching a healthy native state), and probably more stable than the native state," says King.
But stability has its problems as well. The organism needs to be able to dismantle proteins that have become damaged or that have served their purpose. Moreover, many globular proteins must readily unfold so that they can pass through membranes, whose pores (except nuclear pores) are usually too small to accommodate the fully folded native form. Chaperones facilitate such folding and unfolding, but clearly the need for it to happen regularly imposes constraints on the stability level that the native form can attain.
According to Alexandrescu, the marginal stability of most proteins is actually reflected in the built-in quality-control mechanisms: Small sequence changes that either increase or decrease stability prove equally undesirable for the organism. "Lowering the stability of p53, for example, can lead to loss of the protein's tumor-suppressor function. Raising the stability of p53 could presumably interfere with the clearance of this naturally short-lived protein and result in apoptosis."
Many native proteins require locally unstable surfaces to fulfill their functions; examples include the active centers of enzymes, contact surfaces between proteins in signal transduction, and antigen-binding surfaces of antibodies. Peter Csermely of the Institute of Biochemistry, Semmelweis University in Hungary,8 says such locally unstable structures are stabilized by the favorable conformation of the rest of the protein, whose inner hydrophobic core becomes packed with high-energy bonds, such as disulfide bridges or ion pairs. In any case, the unstable surface-protein segments are stabilized when they form complexes with some other molecule, which happens as the whole protein fulfills its function, such as signal transduction.
It is not yet clear to what extent detailed features of large proteins, such as local instabilities, can be predicted from their primary sequences. But as Horwich points out, important progress has been made on the principal protein-folding fronts: predicting native states from the sequence, and understanding the folding process in conjunction with its intermediate states. "I think it's increasingly clear that there are intermediates involved even in very fast folding proteins. There is also pretty good agreement now on the role of chaperones, firstly to rescue from kinetic traps and put back on the folding pathway, and secondly to bind on intermediates."
Philip Hunter (email@example.com) is a freelance writer in London.
1. D. Baker, "Prediction and design of protein-folding mechanisms, protein structures, and protein-protein interactions," Howard Hughes Medical Institute, available online at www.hhmi.org/research/investigators/bakerd.html
2. K. Kuwata et al., "Structural and kinetic characterization of early folding events in β-lactoglobulin," Nat Struct Biol, 8:151, 2001.
3. B. Maher, "Researchers reveal a new twist in torsion dystonia," The Scientist, 17:32-3, April 21, 2003.
4. A.R. White et al., "Monoclonal antibodies inhibit prion replication and delay the development of prion disease," Nature, 422:80-3, March 2003.
5. C. Dobson, "The structural basis of protein folding and its links with human disease," Philos Trans R Soc Lond B Biol Sci, 356:133-45, 2001.
6. J.S. Richardson et al., "Natural β-sheet proteins use negative design to avoid edge-to-edge aggregation," Proc Natl Acad Sci, 99:2754-9, 2002.
7. W. Wang, M.H. Hecht, "Rationally designed mutations convert de novo amyloid-like fibrils into monomeric β-sheet proteins," Proc Natl Acad Sci, 99:2760-5, 2002.
8. P. Csermely, "A nonconventional role of molecular chaperones: Involvement in the cytoarchitecture," News Physiol Sci, 16:123-6, 2001.