Protein Folding: Theory Meets Disease


Sep 8, 2003
Philip Hunter

Protein folding raises some of biology's greatest theoretical challenges. It also lies at the root of many diseases. For example, the fundamental question of whether a protein's final tertiary conformation, sometimes called the native state, can be predicted from its primary amino acid sequence is also of vital importance in understanding the protein's potential capacity to form disease-inducing aggregates.


MISS A FOLD, PROMPT A DISEASE

Here's a list of protein folding-related disease categories:

1. Amyloidoses, such as Alzheimer and Creutzfeldt-Jakob disease, involve deposits of aggregated proteins in a variety of tissues, and typically lead to degradation of cognitive or motor functions.

2. Lung diseases, such as cystic fibrosis or hereditary emphysema, entail mutations that lead to degradation of proteins that have vital respiratory functions.

3. Blood coagulation diseases also involve mutations that lead to retention and degradation of vital proteins (for example, protein C), blocking secretion and causing deficiencies in coagulation, such as blood clotting or bleeding disorders.

4. Liver diseases, in which proteins needed in signaling or enzyme regulation are retained in the endoplasmic reticulum (ER, from which hormones, antibodies, and enzymes are secreted) and degraded so that they fail to function.

5. Diabetes, in which a number of mutations occur, causing misfolding of various proteins that are exported from the ER in mutated states. The misfolded proteins disrupt carbohydrate metabolism, or even accumulate in the ER, with toxic effects to the cell.

6. Cancer, in which misfolding of key proteins can cause them to lose tumor-suppressor functions. The main victim is the p53 protein, whose tumor-suppression function is so vital that it has been described as "the guardian of the genome." Even a single DNA strand break, which could trigger uncontrolled cell division, normally activates p53, which then induces the production of other proteins that either block cell division or trigger programmed cell death.

The mutation of even a single nucleotide of the p53 gene can cause the protein to misfold and fail to recognize when action is needed, or fail to respond correctly. It's believed that such mutation-induced misfolding causes about half of all cancers, and a substantially higher proportion of some cancers, including lung.

7. Infectious diseases, in which pathogens exploit the host ER-associated degradation (ERAD) mechanism, whose normal function is to cause the destruction of terminally misfolded proteins. Viruses such as Epstein-Barr escape immunosurveillance by corrupting the ERAD machinery and degrading the antigens that the immune system needs to identify infected cells. Bacteria can invade a cell by coercing ERAD to destroy proteins that otherwise bar its entry.

Protein folding is a hierarchical process, sometimes pictured as an inverted funnel, in three fundamental stages. All proteins begin with a primary amino acid sequence, which folds into intermediate secondary shapes comprising the well-known α helices and β sheets, and then into the final tertiary, or native form, in which they fulfill their function. Some proteins undergo a further phase, combining with other folded proteins to form quaternary structures.

The etiology of protein folding-related diseases can be investigated by forming predictive relationships between these structural layers, and understanding how subtle changes in sequence can have dramatic consequences, not just for the final native protein but also for the organism as a whole. The difference between health and death can hinge on the existence of alleles often with just a single nucleotide substitution, as in prion diseases.
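To make concrete how small such a change can be, the following sketch translates the three codons around the well-known sickle cell substitution (GAG to GTG, glutamate to valine). It is only an illustration: the codon table is deliberately partial, and the fragment and function names are chosen here for the example rather than taken from any study cited in this article.

```python
from typing import List

# Partial codon table: only the codons needed for this illustration.
CODON_TABLE = {
    "CCT": "Pro",
    "GAG": "Glu",
    "GTG": "Val",
}


def translate(dna: str) -> List[str]:
    """Translate a coding-strand DNA fragment, three bases at a time."""
    if len(dna) % 3 != 0:
        raise ValueError("fragment length must be a multiple of three")
    return [CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3)]


def point_differences(a: str, b: str) -> List[int]:
    """Return the 0-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Illustrative beta-globin fragment around the sickle cell site.
normal = "CCTGAGGAG"
sickle = "CCTGTGGAG"  # A -> T in the middle base of the second codon

normal_peptide = translate(normal)
sickle_peptide = translate(sickle)
changed = point_differences(normal, sickle)
```

Only one of the nine bases differs between the two fragments, yet the encoded residue switches from a charged glutamate to a hydrophobic valine.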

Predictions aside, what's clear from current research is that a protein's actual fate during folding depends on its immediate biochemical environment, a fact that bedevils biotechnology. Jonathan King of the Massachusetts Institute of Technology says that proteins fold differently in test tubes than in vivo, and that eukaryotic proteins don't necessarily fold correctly in prokaryotes.

The process of folding and reaching a stable native conformation, allowing a protein to fulfill its function without disrupting other cellular parts or the pathway in which it operates, can be understood and analyzed at several levels. One obvious level is the protein's function, as this depends on the detailed tertiary conformation. The influence of selective pressures on the folding process, such as those in sickle cell anemia, also needs to be considered in unraveling the detailed mechanisms, in particular the intermediate secondary stages. Selective pressures also may explain the existence of partially formed protein fragments that may be the detritus of evolution, says Andrei Alexandrescu, assistant professor at the University of Connecticut. "Recombination of existing architectures appears to be an important mechanism in the evolution of new protein domains. Such partially folded states could remain as vestiges of autonomous folding units that became incorporated into more complex domains."

Folding also needs to be recognized as a four-dimensional process, for timely assembly of a protein involved in a signaling pathway is just as vital as achieving the correct conformation, says genetics professor Arthur Horwich of Yale University. "In the cellular context, being able to fold in a reasonable timescale is important if the protein has a biological [as opposed to a structural] function. If an organism needs to do something in a couple of minutes, it won't do to wait two hours for a protein to fold." A protein that fails to fold in time may be dismantled before it has assumed its native state, or it may fail to be manufactured in sufficient quantities. According to Horwich, protein degradation occurs if correct folding fails after several attempts within a given time scale.
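The race between folding and dismantling can be sketched with a toy calculation. It assumes, purely for illustration, that folding and degradation are competing first-order processes, so that the fraction of chains reaching the native state is k_fold / (k_fold + k_deg). This is not Horwich's model, and the half-times below are hypothetical numbers chosen to echo the "couple of minutes" versus "two hours" contrast.

```python
import math


def rate_from_half_time(half_time_min: float) -> float:
    """First-order rate constant (per minute) from a half-time in minutes."""
    if half_time_min <= 0:
        raise ValueError("half-time must be positive")
    return math.log(2) / half_time_min


def fraction_folded(fold_half_time_min: float, degradation_half_time_min: float) -> float:
    """Fraction of chains that fold before being degraded.

    Toy assumption: folding and degradation are independent,
    competing first-order processes acting on the unfolded chain.
    """
    k_fold = rate_from_half_time(fold_half_time_min)
    k_deg = rate_from_half_time(degradation_half_time_min)
    return k_fold / (k_fold + k_deg)


# Hypothetical numbers: degradation half-time of 30 minutes in both cases.
fast_folder = fraction_folded(2.0, 30.0)    # folds with a 2-minute half-time
slow_folder = fraction_folded(120.0, 30.0)  # folds with a 2-hour half-time
```

Under these assumptions the fast folder mostly reaches its native state, while most copies of the slow folder are degraded first, which is the qualitative point of the quotation above.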

LAW AND ORDER

Substantial progress has been made during the last few years in determining the physical principles that underlie the rate at which a protein folds. One remarkable finding, discovered by the team of David Baker, assistant investigator at University of Washington School of Medicine, was of a strong correlation between a protein's contact order and folding rate.1 Proteins fold fastest if the residues that are in contact within the finished conformation are close to each other in the primary amino acid sequence. There was a good theoretical basis for assuming a relationship between contact order and folding rate; it is clearly more difficult for components lying far apart on a primary chain to come together during the folding process, but it was possible that interatomic interactions would be more important in determining folding rate.
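Contact order can be made concrete with a short sketch. Relative contact order is commonly defined as the average sequence separation between contacting residues, divided by the chain length. The version below is a simplification that takes a ready-made list of contacting residue pairs, whereas published calculations derive contacts from atomic coordinates with a distance cutoff. The two contact lists are hypothetical toy examples, not data from any real protein.

```python
from typing import Iterable, Tuple


def relative_contact_order(contacts: Iterable[Tuple[int, int]], chain_length: int) -> float:
    """Mean sequence separation of contacting residue pairs, divided by chain length."""
    pairs = list(contacts)
    if chain_length <= 0:
        raise ValueError("chain length must be positive")
    if not pairs:
        raise ValueError("at least one contact is required")
    total_separation = sum(abs(i - j) for i, j in pairs)
    return total_separation / (len(pairs) * chain_length)


CHAIN_LENGTH = 20  # hypothetical 20-residue chain

# Toy topology 1: every contact joins residues close together in sequence.
local_contacts = [(1, 4), (2, 5), (6, 9), (10, 13), (14, 17)]

# Toy topology 2: contacts join residues from opposite ends of the chain.
nonlocal_contacts = [(1, 20), (2, 19), (3, 18), (4, 17), (5, 16)]

rco_local = relative_contact_order(local_contacts, CHAIN_LENGTH)
rco_nonlocal = relative_contact_order(nonlocal_contacts, CHAIN_LENGTH)
```

The first topology gives a much lower value than the second, so on the correlation described above it would be expected to fold faster.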

It turns out they are not, according to Baker, leading to the assertion in his article that nonnative interactions and conformations (kinetic traps) have a relatively minor impact on folding rate, at least in small proteins, defined by Baker as being less than 150 amino acids in length.

According to Baker, this and other results suggest that it is the structure of the native state that determines a small protein's folding rate and mechanism, not the atomic interactions that take place during folding. This, in turn, implies that the sequence's local details are less relevant than had been expected, since it is the sequence that determines the order of the atomic components and the timing of their convergence during the folding process. So, in a sense, the cart comes before the horse.

This does not contradict the fundamental tenet of protein chemistry--one sequence, one structure--but it does mean that the relationship between sequence and structure is perhaps deeper and subtler than originally thought. The University of Pennsylvania's Heinrich Roeder and colleagues discerned the complex interplay between local and nonlocal events in protein folding and how these relate to the sequence.2 The team found that with β-lactoglobulin, the local sequence might favor helical structure, but this tendency subsequently can be overridden by the tertiary interactions established during the latter stages of folding. At first sight, this might suggest that the formation of temporary helical structures at the secondary stage slows down the folding process. But in the Roeder paper, this factor is dismissed as insignificant from the timing perspective, because the helical structures are very unstable, reinforcing Baker's argument that interatomic interactions are less relevant than native structure in determining folding rate.

Because their lifetimes can be as short as 100 microseconds, many such transient folding intermediates were discovered only recently. Roeder and colleagues obtained a detailed view of the secondary structure at such resolutions using a rapid flow protocol that can detect the formation of marginally stable hydrogen-bonded structures with folding times down to 200 microseconds. This allows two-dimensional nuclear magnetic resonance spectra to be recorded after folding is completed, showing the location and degree of solvent protection for individual peptide nitrogen-hydrogen groups formed during the early folding stages, thereby revealing the intermediates.

FOLDING MISADVENTURES

Such techniques are essential for investigating folding intermediates and their potential role in disease. "In general, the critical issue (for folding-related diseases) is properties of partially folded intermediates," says MIT's King. His team chose to study the tailspike protein of phage P22 to reveal intracellular folding intermediates.

"We were able to identify a set of temperature-sensitive folding mutations that further destabilized the intermediates, shifting them to the inclusion body pathway. These experiments revealed clearly that problems in protein folding in the biotechnology industry were due to polymerization of partially folded or partially misfolded intermediates," says King. Inclusion bodies are formed in cells or organelles by aggregated or misfolded proteins, and as such are insoluble and a potential source of problems for protein synthesis.

The MIT team also found that a single amino acid substitution could suppress misfolding. This proved, King says, that the aggregation resulting from misfolding was a specific process driven by the amino acid sequence, even if, as described earlier, the rate/mechanism of folding is not sequence-dependent.

Folding intermediates, says King, can have entirely different properties from the native state. Intermediates may be unstable and delicately poised between behaving correctly and going astray. Further, they risk getting stuck in a kinetic trap, from which energy is required to extract the intermediate so that normal folding can resume.

Such extraction is an important role of chaperone proteins, which are needed to assist the folding process, particularly for larger proteins where there is more scope for misbehaving. Chaperones also are needed further down the line to prevent proteins from acquiring an incorrect tertiary structure, or subsequently misfolding into aggregates. This gives rise to the possibility of manufacturing chaperones to treat folding-related diseases caused by protein aggregation, such as Alzheimer and Creutzfeldt-Jakob disease.3

The potential for such therapies was demonstrated when Simon Hawke at London's Imperial College and Professor John Collinge at the Medical Research Council Prion Unit established that prion disease, in principle, could be prevented in mice using monoclonal antibodies (mAbs).4 It appeared that anti-PrPC (cellular prion proteins) monoclonal antibodies with little or no affinity for scrapie PrPSc prions prevented PrPC incorporation into propagating prions. The anti-PrPC antibodies were attaching to PrPC prions and functioning as chaperones by stabilizing their configuration and preventing their conversion into diseased scrapie PrPSc prions, according to Hawke and colleagues.

THE EDGE OF THE STRAND

Until recently, many scientists presumed that proteins capable of causing disease by misfolding into aggregates were rare. While it's true that the diseases themselves are quite uncommon, recent research suggests that, on the contrary, most polypeptides would form amyloids in the absence of stabilizing factors such as chaperones. According to Cambridge University's Chris Dobson, growing evidence shows that the ability to form highly organized and stable amyloid aggregates is a generic property of all polypeptides, and not just those few proteins associated with recognized pathological conditions.5

This suggests, as MIT's King believes, that aggregates were predominant in primitive organisms until efficient folding mechanisms evolved with inbuilt protection against amyloid plaque formation. Connecticut's Alexandrescu says that quality control mechanisms are directly encoded into the amino acid sequence, specifying structural surface features that protect against misfolding. Alexandrescu cites work from two groups on the design of edge strands.6,7 "There appear to be distinct sequence fingerprints that specify edge strands," says Alexandrescu. "The specification of an edge may be important in preventing indefinite continuation of sheets through improper association between molecules."

Even with these quality control mechanisms in place, many proteins are still unstable. Indeed, it appears that proteins have become less, rather than more, stable, if there is truth in the theory that protein aggregates were more prevalent in early life forms. "The aggregated, or more accurately, polymerized, states associated with pathology are in general irreversibly associated (incapable of reaching a healthy native state), and probably more stable than the native state," says King.

But stability has its problems as well. The organism needs to be able to dismantle proteins that have become damaged or that have served their purpose. Moreover, many globular proteins must readily unfold so that they can pass through membranes, whose pores (except nuclear pores) are usually too small to accommodate the fully folded native form. Chaperones facilitate such folding and unfolding, but clearly the need for it to happen regularly imposes constraints on the stability level that the native form can attain.

According to Alexandrescu, the marginal stability of most proteins is actually reflected in the built-in quality-control mechanisms: small sequence changes that either increase or decrease stability prove equally undesirable for the organism. "Lowering the stability of p53, for example, can lead to loss of the protein's tumor-suppressor function. Raising the stability of p53 could presumably interfere with the clearance of this naturally short-lived protein and result in apoptosis."

Many native proteins require locally unstable surfaces to fulfill their functions: for example, the active centers of enzymes, contact surfaces between proteins in signal transduction, or antigen-binding surfaces of antibodies. Peter Csermely of the Institute of Biochemistry, Semmelweis University in Hungary,8 says such locally unstable structures are stabilized by the favorable conformation of the rest of the protein, whose inner hydrophobic core becomes packed with high-energy bonds, such as disulfide bridges or ion pairs. In any case, the unstable surface-protein segments are stabilized when they form complexes with some other molecule, which happens as the whole protein fulfills its function, such as signal transduction.

It is not yet clear to what extent detailed features of large proteins, such as local instabilities, can be predicted from their primary sequences. But as Horwich points out, important progress has been made on the principal protein-folding fronts, of predicting native states from the sequence, and understanding the folding process in conjunction with its intermediate states. "I think it's increasingly clear that there are intermediates involved even in very fast folding proteins. There is also pretty good agreement now on the role of chaperones, firstly to rescue from kinetic traps and put back on the folding pathway, and secondly to bind on intermediates."

Philip Hunter is a freelance writer in London.

1. D. Baker, "Prediction and design of protein-folding mechanisms, protein structures, and protein-protein interactions," Howard Hughes Medical Institute, available online at

2. K. Kuwata et al., "Structural and kinetic characterization of early folding events in β-lactoglobulin," Nat Struct Biol, 8:151, 2001.

3. B. Maher, "Researchers reveal a new twist in torsion dystonia," The Scientist, 17[8]:32-3, April 21, 2003.

4. A.R. White et al., "Monoclonal antibodies inhibit prion replication and delay the development of prion disease," Nature, 422:80-3, March 2003.

5. C. Dobson, "The structural basis of protein folding and its links with human disease," Philos Trans R Soc Lond B Biol Sci, 356:133-45, 2001.

6. J.S. Richardson et al., "Natural β-sheet proteins use negative design to avoid edge-to-edge aggregation," Proc Natl Acad Sci, 99:2754-9, 2002.

7. W. Wang, M.H. Hecht, "Rationally designed mutations convert de novo amyloid-like fibrils into monomeric β-sheet proteins," Proc Natl Acad Sci, 99:2760-5, 2002.

8. P. Csermely, "A nonconventional role of molecular chaperones: Involvement in the cytoarchitecture," News Physiol Sci, 16:123-6, 2001.


Living cells are packed with proteins and other molecules that occupy 20% to 30% of the total volume.1 Such crowding has a significant impact on a wide variety of processes, including protein folding, and can affect stability. This is one reason why folding experiments conducted under ideal solvent conditions in the laboratory often turn out differently from the in vivo process they aim to simulate; recent studies have attempted to mimic cellular conditions by adding crowding agents.1

Crowding can cause problems, particularly when mutant or misfolded proteins fail to be disposed of correctly by the endoplasmic reticulum-associated protein degradation (ERAD) pathway. Aggregates can then accumulate, denying space to native proteins and creating toxic effects.

Crowding affects protein folding in two fundamental ways. The excluded volume effect, caused by the physical crowding itself, denies space for folding and related processes such as the chaperoning of partially folded intermediates. Molecular interaction effects, such as electrostatic attraction and repulsion, and Brownian motion, also occur. (In Brownian motion, a small suspended particle that is constantly bombarded by liquid molecules will jump when the jolts from one side are stronger than those from the other side.)

Despite these effects, or perhaps because of them, protein folding has evolved to cope with crowding, and even thrive in it. Recent research has identified a number of proteins whose folding appears to be more efficient in the cell's crowded environment than in vitro. In Escherichia coli, the chaperones GroEL and GroES prevent aggregation and promote efficient, ATP-dependent folding of polypeptides. Research at Brown University and elsewhere2 concluded that GroEL has evolved to function most efficiently under crowding.

The key point is that during folding, a fraction of partially folded protein leaks away from the chaperone and has to be recaptured. Crowding appears to increase the ability of the chaperone's hydrophobic regions to recapture these folding intermediates. The Brown University research suggested that crowding works in this case by restricting the partitioning of intermediates between chaperone molecules, making substrate protein more likely to be retained within the protective chaperonin cavity during successive cycles of the folding process.

1. A.R. Kinjo, S. Takada, "Effects of macromolecular crowding on protein folding and aggregation studied by density functional theory: Statics," Phys Rev E, 66:031911-1-9, 2002.

2. J. Martin, F.U. Hartl, "The effect of macromolecular crowding on chaperonin-mediated protein folding," Proc Natl Acad Sci, 94:1107-12, 1997.
