In the early 2000s, geneticist linkurl:Len Pennacchio;http://www.jgi.doe.gov/research/pennacchio.html was at the Lawrence Berkeley National Laboratory in California studying coronary artery disease (CAD) and was faced with a conundrum: Despite the fact that CAD was a known heritable disorder, he and his colleagues could not identify any gene that significantly contributed to CAD risk. "It's the number one killer in Western society, yet the genetic explanations have largely remained elusive," he said.
But the completion of the Human Genome Project and the release of the first draft genome of Homo sapiens over a decade ago revealed a vast expanse of DNA that scientists hadn't yet begun searching for disease-related genes -- non-coding DNA, which is not transcribed into RNA or translated into a protein product. "The main surprise from the project is that only one percent of the genome is coding," said Pennacchio, now the head of the Joint Genome Institute's Genomics Technologies Departments -- a minuscule amount compared to the 88 percent and 24 percent of the E. coli and C. elegans genomes, respectively (1). The other 99 percent, often dubbed "junk DNA" or the "dark matter" of the genome, hadn't been well characterized, though many scientists suspected it might play a role in gene regulation and disease. "The project basically opened the whole field to ask the question: What is the function that lies in non-coding DNA?" Pennacchio said.
[Image: How does non-coding DNA affect the risk of coronary heart disease? Credit: Wikimedia Commons]
Scanning the genomes of over 23,000 people, some of whom had severe, early-onset CAD, Pennacchio and his colleagues identified a genetic pattern on the 9p21 region of chromosome 9 where few coding genes were found -- appropriately dubbed a gene desert -- that increased CAD risk by 30-40 percent in homozygous individuals (2). "It was just fascinating," Pennacchio reflected. "It's a pretty strong link and independent of anything that we knew caused heart disease before." But it was just a correlation, he added. Unraveling the function of this sequence would take a new bag of tricks.
In February, linkurl:Kelly Frazer;http://frazer.ucsd.edu/ and linkurl:Geoff Rosenfeld,;http://rosenfeldlab.ucsd.edu/cms/ genomic scientists at the University of California, San Diego, and cardiologist and geneticist linkurl:Eric Topol;http://www.scripps.org/physicians/5497-eric-topol of Scripps Genomic Medicine connected this 9p21 region to inflammatory signaling in heart cells (3). The region of non-coding DNA that conferred the increased risk of disease was unable to properly bind the transcription factor STAT1, an event which normally regulates the expression of several genes implicated in many types of cancer. The binding appeared to be influenced by cytokines, such as those produced during inflammation of the artery walls in CAD patients, suggesting that the 9p21 region may play a role in the progression of the disease.
In addition to furthering scientists' understanding of CAD, the finding adds to a growing body of literature that links non-coding DNA with changes in expression patterns of known disease genes. Over the past few years, a number of studies have identified genes whose expression correlates with specific non-coding sequences, suggesting that the so-called junk DNA acts to regulate the rate and amount of transcription of other sections of the genome.
"Disease often has to do with producing the right amount of protein at the right place at the right time," said linkurl:Aravinda Chakravarti,;http://chakravarti.igm.jhmi.edu/AravindaChakravartiLab/Home.html a molecular geneticist at Johns Hopkins Medicine, whose 2005 research identified a non-coding sequence associated with risk of Hirschsprung disease, a colon disorder with nearly 100 percent heritability (4). In addition to mutations that alter the function of disease-related proteins, "changing the amount of a protein will also create disease."
A variant of a non-coding DNA region on chromosome 8, for example, increases the risk of prostate and colorectal cancers. In 2009, scientists found that the high-risk allele disrupts the binding site for a transcription factor, affecting Wnt signaling, a major pathway in colorectal cancer pathogenesis, and upregulating the expression of the proto-oncogene MYC (5, 6, 7). Similarly, a non-coding region on chromosome 7 was found to regulate the expression of the protein sonic hedgehog (Shh) in the developing limb bud. When the non-coding sequence is mutated, Shh is expressed in abnormal parts of the mouse embryo, resulting in polydactyly, or the growth of extra fingers and toes (8). When this sequence is completely knocked out, "you lose sonic expression and all limb development," said developmental geneticist linkurl:Laura Lettice;http://www.hgu.mrc.ac.uk/people/b.hill_researchb.html of the UK's Medical Research Council Human Genetics Unit.
But the functions of many non-coding DNA regions associated with disease risk remain unknown. Because non-coding DNA has no RNA or protein product, "it's very hard to think about what are the biological reasons for these associations [with disease]," said Frazer. "It takes more energy, guesswork and intuition" to figure out its function.
"The real question is what does it affect? Which gene?" said Chakravarti. Fortunately, the advent of new technologies, such as chromatin conformation capture (3C), which can help examine how distant parts of the genome interact (9), and high-throughput sequencing coupled with chromatin immunoprecipitation to map histone marks, is making it easier for researchers to answer that question. "Without those types of technological breakthroughs to figure out how to string these things together, it couldn't have been done," said Frazer.
There is one lingering mystery, however -- the physical mechanism by which non-coding DNA effects the changes it does. Current thought holds that transcription factors bind sites embedded in these non-coding regions, initiating a conformational change in chromatin structure. The result is the formation of loops in the DNA, which bring distant points on a chromosome -- separated by up to a million base pairs or more -- into close proximity, where the non-coding region then engages a gene promoter to activate or inhibit transcription.
"You have to assume that these elements interact with the promoter somehow, and therefore the assumption is that they form loops," said Lettice. "But it's kind of a wee bit hand-wave-y."
Still, scientists in the field are optimistic. "We have not only the genome sequences and the tools, but importantly, we have the perspective of how we go about looking at these problems," said Chakravarti. "We will eventually understand what [these non-coding sequences] are, what they do, what their mutations are, and how they are associated with disease."
(1) S.A. Shabalina and N.A. Spiridonov, "The mammalian transcriptome and the function of non-coding DNA sequences," Genome Biology 5: 105, 2004. linkurl:Link;http://www.ncbi.nlm.nih.gov/pmc/articles/PMC395773/
(2) R. McPherson et al., "A Common Allele on Chromosome 9 Associated with Coronary Heart Disease," Science 316: 1488-91, 2007. linkurl:Link;http://www.sciencemag.org/content/316/5830/1488.abstract
(3) O. Harismendy et al., "9p21 DNA variants associated with coronary artery disease impair interferon-γ signalling response," Nature 470: 264-8, 2011. linkurl:Link;http://www.nature.com/nature/journal/v470/n7333/full/nature09753.html
(4) E.S. Emison et al., "A common sex-dependent mutation in a RET enhancer underlies Hirschsprung disease risk," Nature 434: 857-63, 2005. linkurl:Link;http://www.nature.com/nature/journal/v434/n7035/abs/nature03467.html
(5) O. Harismendy and K. Frazer, "Elucidating the role of 8q24 in colorectal cancer," Nature Genetics 41: 868-9, 2009. linkurl:Link;http://www.nature.com/ng/journal/v41/n8/full/ng0809-868.html
(6) M.M. Pomerantz et al., "The 8q24 cancer risk variant rs6983267 shows long-range interaction with MYC in colorectal cancer," Nature Genetics 41: 882-4, 2009. linkurl:Link;http://www.nature.com/ng/journal/v41/n8/abs/ng.403.html
(7) S. Tuupanen et al., "The common colorectal cancer predisposition SNP rs6983267 at chromosome 8q24 confers potential to enhanced Wnt signaling," Nature Genetics 41: 885-90, 2009. linkurl:Link;http://www.nature.com/ng/journal/v41/n8/abs/ng.406.html
(8) L.A. Lettice et al., "A long-range Shh enhancer regulates expression in the developing limb and fin and is associated with preaxial polydactyly," Human Molecular Genetics 12: 1725-35, 2003. linkurl:Link;http://hmg.oxfordjournals.org/content/12/14/1725.full
(9) P.J. Shaw, "Mapping chromatin conformation," F1000 Biology Reports 2: 18, 2010. linkurl:Link;http://f1000.com/reports/biology/content/2/18
**__Related stories:__**
* linkurl:First pages of regulation;http://www.the-scientist.com/news/display/53280/ [13th June 2007]
* linkurl:Epigenetics mark regulatory elements;http://www.the-scientist.com/news/display/48529/ [5th February 2007]
* linkurl:Regulatory DNAs may be missed;http://www.the-scientist.com/news/display/23246/ [24th March 2006]