Paul H. Silverman
For more than 50 years scientists have operated under a set of seemingly incontrovertible assumptions about genes, gene expression, and the consequences thereof. Their mantra: One gene yields one protein; genes beget messenger RNA, which in turn begets protein; and most critically, the gene is deterministic in gene expression and can therefore predict disease propensities.
Yet during the last five years, data have revealed inadequacies in this theory. Unsettling results from the Human Genome Project (HGP) in particular have thrown the deficiencies into sharp relief. Some genes encode more than one protein; others don't encode proteins at all. These findings help refine evolutionary theory by explaining an explosion of diversity from relatively little starting material. We therefore need to rethink our long-held beliefs: A reevaluation of the genetic determinism doctrine, coupled with a new systems biology mentality, could help consolidate and clarify genome-scale data, enabling us finally to reap the rewards of the genome sequencing projects.
In the mid- and late 1980s, our testimony before the congressional committees controlling HGP purse strings relied upon our old assumptions.1 In describing the genome's potential medical value, we elevated the status of the gene in human development and, by extension, human health. At the same time, the deterministic nature of the gene entered the social consciousness with talk of "designer" babies and DNA police that could detect future criminals.
Armed with DNA determinism, scientific entrepreneurs convinced venture capitalists and the lay public to invest in multi-billion-dollar enterprises whose aim was to identify the anticipated 100,000-plus genes in the human genome, patent the nucleotide sequences, and then lease or sell that information to pharmaceutical companies for use in drug discovery. Prominent among these were two Rockville, Md.-based companies, Celera, under the leadership of J. Craig Venter, and Human Genome Sciences, led by William Haseltine.
But when the first draft of the human genome sequence was published in the spring of 2001, the unexpectedly low gene count (fewer than 30,000) prompted a hasty reevaluation of this business model. On a genetic level, humans, it seems, are not all that different from flies and worms.
Or maybe they are, if we can assume that genes are not strictly deterministic. As Venter et al. reported in their genome manuscript: "A single gene may give rise to multiple transcripts, and thus multiple distinct proteins with multiple functions by means of alternative splicing and alternative transcription initiation and termination sites."2
The industry shakeup was predictable. Celera, Human Genome Sciences, and most of the other genomic sequencing firms refocused their business plans and downsized. Venter resigned as president of Celera, and Haseltine has indicated his intention to do the same.
RETHINKING THE GENE
Maybe the gene itself needs reevaluation. Venter et al., recognizing the inadequacy of the term, proposed the phrase "transcription unit" instead.2 Consider alternative splicing, estimated to occur in at least 74% of human multi-exon genes.3 With the ability to create potentially tens of thousands of protein products per gene,4 alternative splicing could reconcile 30,000 genes with what is, in the Human Proteome Organization's estimation, a million or more proteins.5
Further confounding the predictive value of genotype are micro-RNAs. Noncoding polynucleotides approximately 22 nucleotides in length,6 micro-RNAs regulate gene expression at the mRNA level both before and after transcription, by splicing exons, silencing genes, and editing proteins before and after translation. How they regulate and coordinate these activities is unclear, yet clarity is called for. A recent study on the genomic differences between humans, chimpanzees, and mice highlights the point, concluding, "Most of the evolutionary changes [between these species] must have occurred at the level of gene regulation."7
Indeed, the gene may not be central to phenotype at all, or at least it shares the spotlight with other influences. Environmental, tissue, and cytoplasmic factors clearly dominate the phenotypic expression processes, which may, in turn, be affected by a variety of unpredictable protein-interaction events. The cell-signaling process depends heavily on extracellular stimuli to trigger nuclear DNA transduction. Even chromatin can be regulated. Transmembrane pumps, porosity, and receptor molecules all affect the signals that induce uncoiling of DNA superknots, and the status of histones and their associated enzymes contributes to the transcription process.
On the medical front, epidemiologists have long known that diet, exercise, antioxidants, and environmental factors may affect gene expression. Oncogene and tumor-suppressor mutations account for less than 5% of cancer cases, while preventive measures can abate other hereditary disease propensities.
Mina Bissell and others have demonstrated that by appropriate signaling or silencing of membrane receptors, cellular character and behavior can be altered.8 Bissell has induced malignant acinar breast cancer cells to revert to normal form and function of milk production, leading her to assert, in a recent seminar at UC-Irvine, "Phenotype overrides genotype." Yet despite these many observations, the DNA deterministic model continues to dominate molecular biology.
SHUFFLING THE GENETIC DECK
The evolutionary and developmental implications of multiple expression variants are profound and suggest a tectonic shift away from sole reliance on single mutations or nucleotide polymorphisms as a source of potential variation. Through combinatorial interactions, these expression variants increase, by a millionfold or more, the raw material for evolutionary development.
A combination of just three genes, each with a thousand possible variants, offers a billion possibilities for natural selection; 10 genes, each with 100 variants, offer a hundred billion billion (10^20), giving the evolutionary process vastly more plastic and responsive combinatorial possibilities. This tremendous increase compresses the evolutionary timeline significantly by creating more opportunities for the evolution of organismal complexity and multiple phylogenetic experiments such as those in the Burgess Shale deposits described by Stephen J. Gould.9
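The combinatorial figures above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper name count_combinations is hypothetical, and it assumes that every variant of every gene can combine freely and independently with every variant of the others, which real genomes need not allow.

```python
# Illustrative arithmetic only; count_combinations is a hypothetical helper.
# Assumption: variants of different genes combine freely and independently.

def count_combinations(num_genes: int, variants_per_gene: int) -> int:
    """Number of distinct combinations when each gene contributes one variant."""
    return variants_per_gene ** num_genes

three_genes = count_combinations(3, 1000)   # three genes, a thousand variants each
ten_genes = count_combinations(10, 100)     # ten genes, a hundred variants each

print(f"3 genes x 1,000 variants: {three_genes:,}")
print(f"10 genes x 100 variants:  {ten_genes:,}")
print(f"ratio: {ten_genes // three_genes:,}")
```

Under these assumptions the first case gives one billion combinations and the second gives 10^20, a hundred billion times more, so the growth is large but finite.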
The Central Dogma of molecular biology, as formulated in the 1950s, proposed unidirectional gene expression. The discovery of reverse transcriptase shattered this model. Post-translational protein modifications added another wrinkle. More recently, multiple factors affecting gene expression have become apparent at every stage of the process: the transcriptional level, the translational level, and the post-translational level.
Mary Jane West-Eberhard explored the potential of combinatorial evolution in a recent, stunningly comprehensive book,
WANTED: A NEW MODEL
For more than 50 years the simplistic DNA deterministic model of hereditary transmission has provided a useful and satisfying context for the development of molecular biology. Yet although it cannot account for the increased complexity and plasticity now being discovered in gene expression, the model continues to receive strong support among molecular biologists, who have been reluctant to alter or abandon it without a viable alternative.
That may not happen any time soon, unfortunately, as not all the factors that lead to specific gene expression are known, nor are they likely to be determined in the immediate future. The study of phenotypic expression must take a wide-angle view of events, from the origin of cellular stimuli, to the initiation of mRNA transcription, to the production of a protein product. Such a systems-biology approach requires multidisciplinary participation, and over the last few years, research centers dedicated to that study have been established.
Combining the disciplines of biology, mathematics, engineering, and computation, these centers seek to understand cellular behavior in totality. Giot et al. recently published a map illustrating more than 4,000
The current phenotypic expression model suggests a wasteful, inefficient overproduction of multiple protein products that results from unorganized, stochastic events. This might be a gross misinterpretation, but it illustrates the need for new data and for the attention of theoretical and experimental biologists to contribute to unraveling the conundrum.
Paul H. Silverman is Associate Chancellor, Emeritus at the University of California, Irvine. An early advocate of the Human Genome Project, he established the first Human Genome Center as a joint effort between UC-Berkeley and the Lawrence Berkeley National Laboratory. In 1994 he was elected to the 500-person World Academy of Art and Science, a UNESCO-supported group of experts, to advise on global concerns of technology developments.