Decoding DNA: New Twists and Turns

Highlights from a series of three webinars on the future of genome research, held by The Scientist to celebrate 60 years of the DNA double helix

Kerry Grens
Jun 1, 2013

Sixty years ago, on April 25, 1953, Watson and Crick’s paper, “A Structure for Deoxyribose Nucleic Acid,” appeared in Nature. In little over one page they describe the now iconic double-helical structure of DNA, concluding with the colossal understatement: “It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material.” Five weeks later, in the same journal, Watson and Crick published a more detailed description of the structure ending in the words: “We feel that our proposed structure for deoxyribonucleic acid may help to solve one of the fundamental biological problems—the molecular basis of the template needed for genetic replication. The hypothesis we are suggesting is that the template is the pattern of bases formed by one chain of the deoxyribonucleic acid and that the gene contains a complementary pair of such templates.”

For 60 years, this exciting...


What’s Next in Next-Generation Sequencing?

The transition to next-generation sequencing has opened up an exciting new world of research possibilities. Sequencing that cost $1,000 five years ago now costs a dime and is lightning fast compared with work that used earlier iterations of the technology. But scientists still aren’t satisfied. Bioinformaticist Joel Dudley says, “As the cost of sequencing drops, our ambitions and our aspirations about how we can apply the sequencing technology, of course, grow.”

“Now we can generate tremendous multiscale information using next-generation sequencing on single cells,” says Dudley. “And since we’re talking about the future going forward, it’s not hard to extrapolate that what we’ll be interested in is generating single-cell, real-time, perhaps continuous multiscale genomic information.”

In this first of three webinars, Dudley joins two other leaders in the development, advancement, and application of next-generation sequencing to offer a sneak preview into the future of this field—sharing their unpublished data, ongoing projects, and predictions about where next-gen sequencing will take biology and medicine.

George Church
Harvard Medical School
One promising method that has yet to be fully deployed is a technique called long fragment reads, or LFR (Nature, 487:190-95, 2012). “This is the ability to sequence large chunks of DNA—sort of the equivalent of bacterial artificial chromosomes, on the order of 200 kilobases—not by cloning, but just by dilution into a 384-well plate,” George Church explains. It sounds simple, and better yet, “this is, I think, our best candidate for clinical accuracy today.” LFR allows analysts to determine the haplotype phase—whether two mutations exist on the same chromosome or on two different chromosomes—which “determines whether you have one working copy of a gene or zero. So this is a pretty big deal,” he adds.
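Church’s point about phase can be illustrated with a toy example. The variant data below are hypothetical, and real phasing works from sequence reads rather than pre-labeled haplotypes, but the clinical logic reduces to a simple count:

```python
# Toy illustration of why haplotype phase matters (hypothetical variant data).
# Each damaging variant is (position, haplotype), where haplotype 0 or 1
# names one of the two copies of the chromosome.

def working_copies(damaging_variants):
    """Count intact copies of a gene given phased damaging variants.

    Two mutations in cis (same haplotype) knock out one copy, leaving
    one working copy; in trans (different haplotypes) they knock out
    both, leaving zero.
    """
    hit_haplotypes = {hap for _, hap in damaging_variants}
    return 2 - len(hit_haplotypes)

# The same two mutations, with opposite clinical meaning depending on phase:
cis = [(101, 0), (250, 0)]    # both on haplotype 0
trans = [(101, 0), (250, 1)]  # one on each haplotype

print(working_copies(cis))    # 1 working copy
print(working_copies(trans))  # 0 working copies
```

Unphased genotyping sees the same two variants in both cases; only phase distinguishes a carrier from a patient with no functional copy.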

Next to come down the pike, Church anticipates—likely to be published in the coming months—will be fluorescent in situ sequencing of transcripts performed within cells. The approach starts by producing and fixing the cDNA in cells, then amplifying each messenger RNA (mRNA), and sequencing the mRNAs through ligation, polymerase extension, or hybridization. The sequencing happens inside the cell. “These can be visualized in three dimensions with, say, a confocal setup, so you can image the full three-dimensional distribution [of mRNAs] in many cells. You can see them in multicellular contexts; you can see subcellular differentiation of polarized cells; and I think this goes beyond the single-cell assays you talk about,” Church says.

Another technology to watch in the next few years is nanopore sequencing. In this method, each of the four DNA nucleotides passing through a nanopore alters the current across the pore in a distinctive way. Church describes three “flavors” of the application: detecting natural, single-stranded polymers of DNA, natural DNA monomers, and unnatural nanotags. Church says nanotags are an interesting development. “Since these are unnatural . . . you can dial up the distinction between these in terms of their ionic conductance impacts on the protein nanopore . . . where you get an analytic accuracy of 1 in 5 × 10⁸.”

Joel Dudley
Assistant professor,
Mount Sinai School
of Medicine
“I think it’s clear to most people now that we’re living in a big-data world,” says Joel Dudley. Of course, the challenge of big data is not just storing it, but integrating it and using it in its entirety, he adds. “We believe that one of the most powerful modalities for organizing this big data that’s being generated by next-generation sequencing technologies is the network biology paradigm.”

Genes don’t operate in isolation, but in interactive networks. “By scoring these high-dimensional traits using next-generation sequencing data, we can use network biology algorithms to begin to understand in a data-driven way how the elements are related—we begin to organize them into networks; these networks begin to represent systems,” Dudley says. From there, researchers can examine networks of networks and construct a model of how the greater system functions.
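A minimal sketch of the data-driven step Dudley describes might look like the following: build a co-expression network by connecting genes whose measured profiles correlate strongly. The gene names, expression values, and correlation threshold here are all hypothetical stand-ins for the far richer multiscale data and algorithms his group actually uses:

```python
import math

# Hypothetical expression profiles: gene -> measurements across samples.
expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.1, 3.9, 6.2, 8.0],   # tracks GENE_A closely
    "GENE_C": [5.0, 1.0, 4.0, 2.0],   # unrelated pattern
}

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def coexpression_edges(profiles, threshold=0.9):
    """Connect gene pairs whose expression profiles correlate strongly."""
    genes = sorted(profiles)
    return [
        (g1, g2)
        for i, g1 in enumerate(genes)
        for g2 in genes[i + 1:]
        if abs(pearson(profiles[g1], profiles[g2])) >= threshold
    ]

print(coexpression_edges(expression))  # [('GENE_A', 'GENE_B')]
```

From edge lists like this one, graph algorithms can then pick out modules and "networks of networks," which is where the systems-level modeling begins.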

For instance, Dudley and his colleagues are using predictive network modeling to develop personalized cancer therapies. Data collection begins with patients’ clinical information. Then, using next-generation sequencing, researchers collect tumor-specific RNA and DNA and germline DNA. They analyze the data to look at somatic variation, copy number variation, and single nucleotide mutations. The patient’s individual network is projected onto a baseline model of the tumor’s genetic network, allowing the researchers to identify patient-specific subnetworks that are unique in, say, expression levels of RNA or in the accumulation of somatic mutations.

We then look at chemogenomic information and begin to integrate the entire world of chemotherapy options with the patient-specific tumor features.—­ Joel Dudley

“We then look at chemogenomic information and begin to integrate the entire world of chemotherapy options with the patient-specific tumor features,” Dudley says. His group is building the capacity to rapidly create patient-specific animal models and human cell-screening methods to test drugs. The idea is then to integrate all this information into a clinical report with tailor-made treatment options. “Going forward, when the cost is in our favor and the technology is reliable enough, what we’re going to want to do is score as many next-generation sequencing traits on all patients that come in the door at Mount Sinai, and hopefully other places, to build personalized, multiscale network models of each patient.”

George Weinstock
Washington University
in St. Louis
“We have microorganisms basically owning the entire planet as well as our bodies,” says George Weinstock. “And next-generation sequencing . . . you might think it would not have that big an effect on organisms with little genomes, but in fact it’s been completely transforming for microbiology.” One application of the technology has been in the neonatal intensive-care unit at hospitals, to determine whether a preemie’s blood infection came from the newborn’s own body or from the environment. “Is there a potential pathogen loose in the neonatal intensive-care unit that needs to be addressed?”

The researchers collect bacteria from the baby’s stool and compare them with the bacterial species causing the blood infection. They then perform shotgun sequencing of the bacterial genomes and count the number of single nucleotide polymorphisms (SNPs) when two samples are aligned. For instance, if the number of SNPs in a blood-to-blood comparison (these SNPs reflect expected errors in the analysis, not actual differences between the blood samples) is similar to those from a blood-to-stool comparison, then the organisms are considered the same and the infection must have originated from the baby’s own body. If there are a greater number of SNPs in the blood-to-stool comparison—indicating that the genomes are quite different—then the bug must have come from the environment.
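The decision rule Weinstock describes can be sketched in a few lines. The SNP counts and the tolerance factor below are hypothetical illustrations; the real analysis works from whole-genome alignments of the clinical isolates:

```python
def infection_source(blood_vs_blood_snps, blood_vs_stool_snps, tolerance=2.0):
    """Classify where a bloodstream isolate likely came from via SNP counts.

    blood_vs_blood_snps: SNPs from aligning the blood isolate against
        itself (re-sequenced) -- a baseline for sequencing/assembly error.
    blood_vs_stool_snps: SNPs from aligning the blood isolate against the
        matching species recovered from the baby's stool.

    If the stool comparison yields no more SNPs than a small multiple of
    the error baseline, the genomes are effectively identical and the
    infection came from the baby's own gut; a much larger SNP count points
    to an environmental source.
    """
    if blood_vs_stool_snps <= tolerance * max(blood_vs_blood_snps, 1):
        return "gut"
    return "environment"

print(infection_source(blood_vs_blood_snps=12, blood_vs_stool_snps=15))   # gut
print(infection_source(blood_vs_blood_snps=12, blood_vs_stool_snps=900))  # environment
```

The key design point is that the blood-to-blood comparison calibrates the expected noise of the pipeline, so "same organism" is judged relative to measurement error rather than against an absolute SNP cutoff.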

“Here we’re just doing shotgun sequencing, generating a lot of reads, doing assembly, doing very basic operations, but we’re getting a resolution of the environmental organisms in the neonatal intensive-care unit, the blood and gut organisms, and how they correlate with each other, that’s really unprecedented,” says Weinstock. In a 2-year study in the NICU using this approach, Weinstock has investigated a number of infection episodes and found that most of the time they do originate from the gut of the baby—something that conventional diagnostics might have gotten wrong. “For infection control as well as for understanding and possibly diagnosing before a bad event happens, this is going to be a very great application of next-generation sequencing that is getting very close to reality.”

George Church is a professor of genetics at Harvard Medical School, and Director of the Personal Genome Project, providing the world's only open-access information on human genomic, environmental and trait data (GET). His 1984 Harvard PhD included the first methods for direct genome sequencing, molecular multiplexing, and barcoding. These led to the first commercial genome sequence (the pathogen Helicobacter pylori) in 1994. His innovations in "next generation" genome sequencing and synthesis and cell/tissue engineering resulted in 12 companies spanning fields including medical genomics (Knome, Alacris, AbVitro, GoodStart, Pathogenica) and synthetic biology (LS9, Joule, Gen9, WarpDrive) as well as new privacy, biosafety, and biosecurity policies. He is director of the NIH Centers of Excellence in Genomic Science. His honors include election to NAS & NAE and Franklin Bower Laureate for Achievement in Science.

Joel Dudley is an assistant professor of genetics and genomic sciences and Director of Biomedical Informatics at Mount Sinai School of Medicine in New York City. His current research is focused on solving key problems in genomic and systems medicine through the development and application of translational and biomedical informatics methodologies. Dudley's published research covers topics in bioinformatics, genomic medicine, personal and clinical genomics, as well as drug and biomarker discovery. His recent work with coauthors, describing a novel systems-based approach for computational drug repositioning, was featured in the Wall Street Journal and earned designation as the NHGRI Director's Genome Advance of the Month. He is also coauthor (with Konrad Karczewski) of the forthcoming book, Exploring Personal Genomics. Dudley received a BS in microbiology from Arizona State University and an MS and PhD in biomedical informatics from Stanford University School of Medicine.

George Weinstock is currently a professor of genetics and of molecular microbiology at Washington University in Saint Louis. He was previously codirector of the Human Genome Sequencing Center at Baylor College of Medicine in Houston, Texas, where he was also a professor of molecular and human genetics. Dr. Weinstock received his BS degree from the University of Michigan (Biophysics, 1970) and his PhD from the Massachusetts Institute of Technology (Microbiology, 1977).

Unraveling the Secrets of the Epigenome

Stephen Baylin describes epigenetics as “the software that gives the genetic program, the hard drive, the capacity to work.” DNA “needs the software packaging of epigenetics to play out its long-term memory for how genes are expressed,” he says. Compared to genes themselves, epigenetic modifications have only recently come under intense scrutiny by scientists. “From my perspective,” says Victoria Richon, “I think the reason why we’re really seeing the explosion of information about this field is that we now understand the enzymes and really the machinery that’s catalyzing the . . . different modifications, which previously we didn’t. So this is a relatively new enzyme family compared to the kinases, for instance.”

Despite being a relative latecomer to the genetics field, epigenetics has already yielded considerable insight into basic cellular functioning and disease, as discussed by the panelists in this second webinar of the series.


Stephen Baylin
Johns Hopkins University
Posttranslational modifications to the tails of histones include methylation and demethylation. Those modifications are “written” by a family of enzymes called methyltransferases and “erased” by histone demethylases. Certain patterns of epigenetic modification can signal the presence of cells that are malignant. Methylation at particular sites might, for instance, turn off the expression of a tumor-suppressor gene that is ordinarily unmethylated. Knowledge of such patterns opens up the opportunity to reverse the modification through drug therapy, to return DNA to its normal configuration and presumably stop the cancer, says Stephen Baylin.

Therapies using this approach are beginning to gain traction, Baylin says. His research has focused on giving cancer patients a very low dose of a DNA demethylating agent called 5-azacytidine (Vidaza). The drug blocks and degrades DNA methyltransferases, and it has been approved by the FDA to treat a preleukemia condition called myelodysplasia. In preliminary studies, Baylin and his colleagues have been using it to treat a solid tumor, non–small cell lung cancer in 60 patients.

The response to the drug seems to shake out in one of three ways: 3 percent of patients have a long-term response to the therapy; one-third of patients seem to become primed for a better response to subsequent chemotherapy; and other patients show evidence of becoming sensitized to an immunotherapy in which lymphocytes regain their function after the tumor has rendered them incompetent. “All of this is going to be followed with clinical trials,” says Baylin.

Victoria Richon
Vice President, Discovery
and Preclinical Research, Sanofi
In another case of a methyltransferase gone bad, Victoria Richon has been overseeing the development of therapies for two types of leukemia. A proportion of patients with acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL) have a translocation in the MLL gene, which encodes a histone methyltransferase. The consequence of this genetic rearrangement is the recruitment of DOT1L, a methyltransferase that targets a modification on histone H3 at lysine 79 (H3K79). Methylation of this site is tied to active transcription. “This aberrant recruitment of DOT1L causes an increase in H3K79 methylation, increase in gene expression, and leads to leukemogenesis,” says Richon.

To stop this unwanted methylation, Richon and her colleagues have developed therapies based on DOT1L inhibition. They have found that in MLL-rearranged cell lines, a DOT1L inhibitor is antiproliferative. In normal cell lines, proliferation is unaffected by the drug.

Epigenetics is the software that gives the genetic program, the hard drive, the capacity to work.—­ Stephen Baylin

The researchers have also found that tumors regress and don’t return when animals are treated with the inhibitor. Such success prompted Epizyme, the company that has been developing this drug, to begin a clinical trial last year. “This is a very important landmark for being able to develop these histone methyltransferase inhibitors . . . in defined patient populations where we know that there are alterations that specifically cause a requirement for DOT1L,” says Richon.

Paolo Sassone-Corsi
University of California, Irvine
According to Paolo Sassone-Corsi, about 15 percent of all transcripts in cells oscillate in a circadian manner. “In order to have so many genes oscillating in this harmonic, beautiful way we need to have chromatin remodeling, possibly an epigenetic program, that is controlling this highly physiological system,” says Sassone-Corsi.

One enzyme he’s focused on is acetyl coenzyme A synthetase 1 (AceCS1), which synthesizes acetyl-CoA. Acetyl-CoA is a key player in many important biochemical reactions, donating acetyl groups, including those that modify histone proteins. AceCS1 itself is dependent upon acetylation—the enzyme is inactive when acetylated, and a deacetylase, SIRT1, renders it active by removing the acetyl group. Sassone-Corsi and his colleagues developed an antibody that recognizes acetylated AceCS1, and found that the enzyme’s activity is cyclic and controlled by the circadian clock system.

“That oscillation is gone in cells when the clock is removed or destroyed,” says Sassone-Corsi. “This tells us that the circadian clock is actually regulating the enzymatic activity of AceCS1. . . . If that’s true, that’s possibly also telling us the acetyl-CoA metabolite is cyclic.” Indeed, this is just what Sassone-Corsi has found. When the circadian clock system is disrupted, oscillations in acetyl-CoA are abolished. “So the clock system seems to be able to drive not only the cyclic oscillation of the enzyme [AceCS1], but also the cyclic synthesis of acetyl-CoA,” he says.

Now, putting this finding into an epigenetic context, Sassone-Corsi and his colleagues have shown that global acetylation of histones is affected by the clock-controlled AceCS1. In cells with either a mutated clock system or a mutated AceCS1, the normal oscillations of histone acetylation are flattened and the cycle disappears. The consequences of a disrupted clock, Sassone-Corsi points out, are myriad, ranging from diabetes and cancer to depression.

Stephen Baylin is a professor of medicine and of oncology at the Johns Hopkins University School of Medicine, where he is also Chief of the Cancer Biology Division of the Oncology Center and Associate Director for Research of The Sidney Kimmel Comprehensive Cancer Center. Together with Peter Jones of the University of Southern California, Baylin also leads the Epigenetic Therapy Stand up to Cancer Team (SU2C). He and his colleagues have fostered the concept that DNA hypermethylation of gene promoters, with its associated transcriptional silencing, can serve as alternatives to mutations for producing loss of tumor-suppressor gene function. Baylin earned both his BS and MD degrees from Duke University, where he completed his internship and first-year residency in internal medicine. He then spent 2 years at the National Heart and Lung Institute of the National Institutes of Health. In 1971, he joined the departments of oncology and medicine at the Johns Hopkins University School of Medicine, an affiliation that still continues.

Victoria Richon heads the Drug Discovery and Preclinical Development Global Oncology Division at Sanofi. Richon joined Sanofi in November 2012 from Epizyme, where she served as vice president of biological sciences beginning in 2008. At Epizyme she was responsible for the strategy and execution of drug discovery and development efforts that ranged from target identification through candidate selection and clinical development, including biomarker strategy and execution. Richon received her BA in chemistry from the University of Vermont and her PhD in biochemistry from the University of Nebraska. She completed her postdoctoral research at Memorial Sloan-Kettering Cancer Center.

Paolo Sassone-Corsi is Donald Bren Professor of Biological Chemistry and Director of the Center for Epigenetics and Metabolism at the University of California, Irvine, School of Medicine. Sassone-Corsi is a molecular and cell biologist who has pioneered the links between cell-signaling pathways and the control of gene expression. His research on transcriptional regulation has elucidated a remarkable variety of molecular mechanisms relevant to the fields of endocrinology, neuroscience, metabolism, and cancer. He received his PhD from the University of Naples and completed his postdoctoral research at CNRS, in Strasbourg, France.


The Impact of Personalized Medicine

Fifteen years ago, Marty Tenenbaum was diagnosed with metastatic melanoma. He says he bet his life on a clinical trial that ultimately failed. “But it helped some patients, and, fortunately, I was one of them,” he says. “But why? Why me?” The answer lies in personalized medicine—the approach to treating disease that relies on each patient’s particular biology. In the final webinar of the series, Tenenbaum and three researchers describe what’s necessary to make the practice of medicine truly individualized.

Jay M. ("Marty") Tenenbaum
Founder, Cancer Commons
“Thanks to genomics we’ve recognized that cancer is thousands of diseases,” says Marty Tenenbaum. To develop targeted therapies for each disease is an overwhelming task. “There simply aren’t enough patients, or dollars, or specimens,” he adds. But a wealth of experimental data that could be put to use gets tossed away instead.

Doctors treating patients with late-stage disease often look to off-label uses of drugs or novel combinations of medicines to see what might work when all conventional treatments have been exhausted. “Unfortunately, none of the learnings from these experiments ever get reported,” says Tenenbaum.

Tenenbaum proposes a method of capturing these “N of 1” trials. The vision for rapid learning in oncology is to start with a reference model of how to treat various subtypes of cancer. Doctors use that model to guide treatment, analyze how it affects the patient, and learn as much as possible from that particular case. “We can get almost a terabyte of information per patient nowadays,” he says. Then, as the successes and failures and multiple treatment attempts in each case are recorded, the treatment model gets tweaked. These results need to be systematically organized so that leads can be explored. An individual’s experience should not be disregarded by science, Tenenbaum says. “We can’t afford to do this anymore.”

Amy P. Abernethy
Director, Cancer Care
Research Program,
Duke University
Treating disease has become increasingly complicated with the advent of personalized medicine. “We used to bucket out our choices of treatments by specific disease,” says Amy Abernethy. For instance, there were breast cancer therapies, lung cancer therapies, and so on. Then treatments for each disease became refined based on patients’ biomarkers, and this further subdivided the categories of treatment approaches within each cancer.

But patterns have begun to emerge linking various cancers. “Many times . . . we start seeing the same biomarker present across multiple cancers,” she says. Then, the treatment buckets for each cancer spill over into one another. “What we see now is an overlay of more and more characteristics and more and more biomarkers, and the usual buckets that we’re used to assigning for tumors and treatments becoming quite blurry as we try and figure out what to do.”

Biomarkers are not the only factor that determines a patient’s treatment—other health conditions, age, family history of disease, and the time course of the cancer care also make a difference. What’s needed next are streamlined ways to analyze data and communicate treatment options. “We’re going to need a whole suite of decision-support tools to figure out how to take care of individual patients and pull these data together,” says Abernethy.

Abhijit ("Ron") Mazumder
Johnson & Johnson
Essential to using biomarkers for tailoring therapies is a companion diagnostic that accompanies a particular treatment, says Abhijit “Ron” Mazumder. These tools categorize patients based on a disease biomarker and shuttle them into the appropriate therapy. Developing these biomarker assays takes about 2 to 4 years, he says, and clinical trials testing the effectiveness of biomarkers can follow one of two routes.

One pathway involves patients with and without the particular biomarker, and patients within each group are randomly assigned to receive a treatment in development or the standard care. The downside to this route is that it demands larger trial sizes, but it does test for a biomarker’s usefulness in sending patients toward a targeted therapy.

The alternative pathway to approval involves trials that enroll only patients with a particular biomarker, “in other words, those who you have a better confidence level will respond,” says Mazumder. This trial design requires compelling data from early trials indicating that the biomarker is indeed a useful indicator on which to base enrollment. Patients are then randomly assigned to receive a standard therapy or a treatment in development. Researchers can’t determine whether the treatment might work in other populations, but they can get away with smaller numbers of participants.

We’re going to need a whole suite of decision-support tools to figure out how to take care of individual patients and pull these data together.—­ Amy P. Abernethy

The challenge in conducting these trials is that, to screen patients for entry into a clinical trial, all biopsies must be sent to a central lab for testing. For one study on the BRAF biomarker in melanoma, for instance, 675 patients were seen at 104 clinical trial sites, but all diagnostic testing was conducted at just five central locations. This ensures there is less bias in the clinical validation of the diagnostic, but it adds a level of complexity to the study. Ultimately, though, “one of the requirements for personalized medicine is a companion diagnostic,” Mazumder says.


Geoffrey S. Ginsburg
Duke University
Geoffrey Ginsburg describes new approaches to capturing the wealth of information to be gleaned from any individual. One is through the National Human Genome Research Institute’s eMERGE network, in which genomic information from specimens is linked to electronic medical records and shared among a number of research sites. This could lead to “potentially genome-based outcome measures,” says Ginsburg.

In addition to genetic data, phenotypic information can also be harnessed to help make clinical decisions. Ginsburg mentions a movement called “the quantified self,” which he says is “often emblematic of people who are obsessed with recording lots of details about their daily lives,” including what foods they eat and how often they exercise. People will then post these logs online for public view and possible data mining.

Sensors to capture phenotypic data are becoming more common. “I think it’s important to recognize that sensor technologies are actually going to give us some new insights into phenotypes that we’ve never had a chance to measure on a longitudinal basis,” Ginsburg says. These sensors can keep track of movement, calorie expenditure, and diet. Couple this with expanding access to genome-wide sequencing and transcriptome and metabolome profiling, and “that may be one of the manifestations of truly personalized medicine.”

Jay M. ("Marty") Tenenbaum is founder and chairman of Cancer Commons. Tenenbaum’s background brings a unique perspective of a world-renowned Internet commerce pioneer and visionary. He was founder and CEO of Enterprise Integration Technologies, the first company to conduct a commercial Internet transaction. Tenenbaum joined Commerce One in January 1999, when it acquired Veo Systems. As chief scientist, he was instrumental in shaping the company's business and technology strategies for the Global Trading Web. Tenenbaum holds BS and MS degrees in electrical engineering from MIT, and a PhD from Stanford University.

Amy P. Abernethy, a palliative care physician and hematologist/oncologist, directs both the Center for Learning Health Care (CLHC) in the Duke Clinical Research Institute, and the Duke Cancer Care Research Program (DCCRP) in the Duke Cancer Institute. An internationally recognized expert in health-services research, cancer informatics, and delivery of patient-centered cancer care, she directs a prolific research program (CLHC/DCCRP) which conducts patient-centered clinical trials, analyses, and policy studies. Abernethy received her MD from Duke University School of Medicine.

Abhijit “Ron” Mazumder obtained his BA from Johns Hopkins University, his PhD from the University of Maryland, and his MBA from Lehigh University. He worked for Gen-Probe, Axys Pharmaceuticals, and Motorola, developing genomics technologies. Mazumder joined Johnson & Johnson in 2003, where he led feasibility research for molecular diagnostics programs and managed technology and biomarker partnerships. In 2008, he joined Merck as a senior director and Biomarker Leader. Mazumder rejoined Johnson & Johnson in 2010 and is accountable for all aspects of the development of companion diagnostics needed to support the therapeutic pipeline, including selection of platforms and partners, oversight of diagnostic development, support of regulatory submissions, and design of clinical trials for validation of predictive biomarkers.

Geoffrey S. Ginsburg is the Director of Genomic Medicine at the Duke Institute for Genome Sciences & Policy. He is also the Executive Director of the Center for Personalized Medicine at Duke Medicine and a professor of medicine and pathology at Duke University Medical Center. His work spans oncology, infectious diseases, cardiovascular disease, and metabolic disorders. His research is addressing the challenges for translating genomic information into medical practice using new and innovative paradigms, and the integration of personalized medicine into health care. Ginsburg received his MD and PhD in biophysics from Boston University and completed an internal medicine residency at Beth Israel Hospital in Boston, Massachusetts.

Correction (June 10): The original version of this article incorrectly stated the institutional affiliation of Abhijit ("Ron") Mazumder. Mazumder is with Johnson & Johnson, not with Merck & Co. The Scientist regrets the error.


Thank you to the companies who sponsored The Scientist’s celebration
of the 60th anniversary of the discovery of the double-helical structure of DNA.
