A growing toolbox for surveying the activity of entire genomes
January 1, 2016
© HENNING DALHOFF/SCIENCE SOURCE
A cell packs its genome as if our lives depended on it, and they do. If you could unwind the DNA within the nucleus of a single cell, it would stretch two meters. The 2–3 percent of the genome revealed at any one time performs an essential function: transcription. “Assaying the parts that are being used is a very powerful way to try to understand gene-expression regulation at the level of DNA,” says William Greenleaf of Stanford University. And probing that regulation process is key for understanding health and disease.
Large consortia-led projects such as ENCODE (Encyclopedia of DNA Elements) have made great strides in identifying various functional elements of the genome. These include enhancers, activators, and promoters, all of which are regions of DNA that bind proteins that control transcription. Studies have also tapped into the nature of DNA’s primary packing material: protein spools called histones around which genomes wind to form nucleosomes. Nucleosomes, which are often compared to beads on a string of DNA, further stack as chromatin folds and winds, forming some 10,000 loops within the cell’s nucleus (Cell, 159:1665-80, 2014). This brings distant regions of the genome into close contact and ensures that genes aren’t unintentionally transcribed.
Which parts of the genome are available for transcription at a given moment? ENCODE helped answer this question by using DNase-seq, a technique that digests and sequences nucleosome-free regions of the genome. Similar methods have come along in recent years, including ATAC-seq and MNase-seq, expanding researchers’ options for taking snapshots of available (or unavailable) DNA.
Surveying the whole genome using these methods can be a helpful first step toward cataloging potential functional elements of transcription. ChIP-seq (or its myriad variations) may then provide more mechanistic insights, by using antibodies to pinpoint specific transcription factors, notes senior investigator Keji Zhao of the National Heart, Lung, and Blood Institute.
The Scientist talked to developers and users about the pros and cons of each of these commonly used techniques. Here’s what they said.
Background: Deoxyribonuclease (DNase) has long been paired with Southern blotting to reveal exposed regions of DNA, known as DNase hypersensitive sites, and such studies found that these regions are indeed active. Next-generation sequencing has allowed researchers to probe exposed regions across entire genomes, and the ENCODE project alone has generated more than 400 data sets using DNase-seq.
How it works: DNase-seq takes advantage of the fact that exposed regions of the genome are naturally more prone to degradation by DNases. The method employs the enzyme DNase I to cleave DNA at sites along the genome that are not wrapped around nucleosomes, which become displaced by the binding of transcription factors. These small fragments, which are thought to indicate the presence of transcription factors, are then sequenced and mapped to the genome.
Getting started: Check out the two main protocols: Cold Spring Harb Protoc, doi:10.1101/pdb.prot5384, 2010; Curr Protoc Mol Biol, Supplement 103:Unit 21.27, 2013.
Considerations: Recent research has revealed how DNase I’s cutting bias may limit the method’s usefulness for the identification of DNA footprints. Analyzing supposed binding of 36 different transcription factors, the researchers showed that DNase-seq data were not useful for illuminating footprints for many of them (Nat Methods, 11:73-78, 2014).
Because the enzyme’s cut sites are sequence-dependent, researchers should use naked DNA (i.e., DNA with no associated proteins) as a control in DNase-seq (and in ATAC-seq) footprint analysis, says Clifford Meyer, research scientist in X. Shirley Liu’s lab at Harvard University and a coauthor on the Nature Methods study. “If you see a pattern in the naked DNA, then you know it’s got nothing to do with transcription-factor binding,” he adds.
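One way to picture the naked-DNA control is as a table of the enzyme's sequence preferences. The sketch below is a minimal illustration, not the method used in the Nature Methods study. It assumes that each cut has already been labeled with the short sequence (k-mer) surrounding it; the function names and the toy counts are hypothetical. Dividing the chromatin cut rate by the naked-DNA cut rate makes patterns that appear in both cancel out.

```python
from collections import Counter


def kmer_cut_rates(cut_kmers):
    """Return the fraction of all cuts that fall on each k-mer."""
    counts = Counter(cut_kmers)
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def bias_corrected_enrichment(chromatin_kmers, naked_kmers, pseudo=1e-6):
    """Ratio of chromatin cut rate to naked-DNA cut rate for each k-mer.

    A ratio near 1 suggests the pattern is explained by the enzyme's
    sequence preference alone. `pseudo` avoids division by zero.
    """
    chrom = kmer_cut_rates(chromatin_kmers)
    naked = kmer_cut_rates(naked_kmers)
    kmers = set(chrom) | set(naked)
    return {
        k: (chrom.get(k, 0.0) + pseudo) / (naked.get(k, 0.0) + pseudo)
        for k in kmers
    }


# Toy data (hypothetical): the k-mer context of each observed cut.
naked = ["ACGT"] * 60 + ["TTAA"] * 40

# Chromatin sample with the same pattern as naked DNA: ratios stay near 1.
same = bias_corrected_enrichment(["ACGT"] * 60 + ["TTAA"] * 40, naked)

# Chromatin sample with a different pattern: ratios move away from 1.
shifted = bias_corrected_enrichment(["ACGT"] * 30 + ["TTAA"] * 70, naked)
```

In the first toy sample the apparent preference for ACGT is fully accounted for by the control, which is the situation Meyer describes. Only the second sample leaves a signal after correction.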
Figure credit: Based on Epigenetics & Chromatin, 7:33, 2014; redrawn with permission.
Single cells?: Just a month ago, Keji Zhao’s group described single-cell DNase-seq (scDNase-seq), using the technique to identify exposed regions of DNA in tumor cells that they had manually scraped from fixed-tissue slides of thyroid cancer biopsies. The team also analyzed exposed genome regions of single living cells isolated using fluorescence-activated cell sorting (Nature, 528:142-46, 2015).
Background: In collaboration with Howard Chang at Stanford University, Greenleaf’s group introduced the Assay for Transposase-Accessible Chromatin (ATAC)-seq in 2013 (Nat Methods, 10:1213-18, 2013).
How it works: ATAC-seq inserts sequencing adapters directly into accessible DNA using the enzyme Tn5 transposase. The bits captured between the adapters are then amplified by PCR and sequenced.
Getting started: Greenleaf has started a forum to field questions from an expanding user group. You can request access to it at sites.google.com/site/atacseqpublic/home?pli=1. Those experienced with molecular biology techniques can generate a sequencing library in a day, Greenleaf says.
Tips: Every cell is different, so you will need to adjust the cell number and the lysis conditions for your particular situation. “Ideally you want to gently lyse cells to get the transposase in but not disrupt the chromatin state,” says Greenleaf.
Using too many cells leads to fewer sequencing adapters being inserted, and thus larger DNA fragments; too few cells will lead to shorter bits. The optimal number of cells can vary, depending on the tissue or organism from which the cells originate.
It’s always good to do some preliminary analysis before you run your samples on a sequencer, or do light sequencing to start, says Greenleaf. You could run a preliminary gel to check out fragment distributions, or run the sample through a machine that quantifies DNA and measures its quality (e.g., Agilent 2100 Bioanalyzer). For production-level sequencing, Greenleaf recommends using paired-end sequencing for the best results.
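A preliminary check of the fragment distribution can be as simple as the following sketch. It assumes you already have a list of fragment lengths in base pairs, for example from a light paired-end run. The size cutoffs (under 100 bp for short fragments, 180 to 250 bp for roughly one nucleosome) are illustrative assumptions, not values given by Greenleaf, and the function name is hypothetical.

```python
from statistics import median


def summarize_fragments(lengths, short_max=100, mono_min=180, mono_max=250):
    """Summarize a fragment-length distribution.

    The cutoffs are assumed, illustrative values in base pairs.
    """
    if not lengths:
        raise ValueError("no fragment lengths supplied")
    n = len(lengths)
    n_short = sum(1 for x in lengths if x < short_max)
    n_mono = sum(1 for x in lengths if mono_min <= x <= mono_max)
    return {
        "n_fragments": n,
        "fraction_short": n_short / n,
        "fraction_mono": n_mono / n,
        "median_length": median(lengths),
    }


# Toy fragment lengths (hypothetical), in base pairs.
toy_lengths = [50, 60, 75, 90, 190, 200, 210, 400]
summary = summarize_fragments(toy_lengths)
```

Comparing these summaries across a few cell numbers is one way to see the trend described above: a library that shifts toward long fragments may have had too many cells, and one dominated by very short fragments may have had too few.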
Single cells?: Two groups recently published different methods for single-cell ATAC-seq. Jay Shendure’s group at the University of Washington and his collaborators tagged cell nuclei with barcodes and separated them using fluorescence-activated cell sorting (Science, 348:910-14, 2015). In contrast, Greenleaf’s lab uses microfluidic approaches for cell isolation (Nature, 523:486-90, 2015). Much of the challenge for both methods comes down to data analysis, Greenleaf says, because the data are sparse. “In a single cell, there’s either zero, one, or two loci that are open at any specific region of the genomic sequence,” he says.
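The sparsity Greenleaf describes can be made concrete with a toy example. The sketch below assumes each cell has been reduced to the set of regions in which at least one read was observed; the cell and region names are hypothetical, and this is not either group's published analysis. It measures how empty the cell-by-region table is and shows that pooling cells recovers a per-region accessibility fraction.

```python
def aggregate_accessibility(cells):
    """Fraction of cells in which each region was observed as open.

    `cells` maps a cell ID to the set of regions with at least one read.
    """
    n_cells = len(cells)
    if n_cells == 0:
        raise ValueError("no cells supplied")
    counts = {}
    for regions in cells.values():
        for region in regions:
            counts[region] = counts.get(region, 0) + 1
    return {region: c / n_cells for region, c in counts.items()}


def sparsity(cells, all_regions):
    """Fraction of (cell, region) pairs with no observed read."""
    if not cells or not all_regions:
        raise ValueError("need at least one cell and one region")
    filled = sum(len(regions & all_regions) for regions in cells.values())
    return 1 - filled / (len(cells) * len(all_regions))


# Toy data (hypothetical): four cells, four candidate regions.
all_regions = {"peakA", "peakB", "peakC", "peakD"}
cells = {
    "cell1": {"peakA"},
    "cell2": {"peakA", "peakC"},
    "cell3": set(),
    "cell4": {"peakB"},
}

pooled = aggregate_accessibility(cells)
empty_fraction = sparsity(cells, all_regions)
```

Here three quarters of the table is empty, yet pooling still shows peakA open in half the cells. Real data sets are far larger and sparser, which is why the analysis is the hard part.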
Background: Researchers have used micrococcal nuclease (MNase), from Staphylococcus aureus, for digesting and studying chromatin for at least 40 years. In 2010, they started pairing it with high-throughput sequencing.
How it works: MNase works by chewing up exposed stretches of the genome; the DNA associated with nucleosomes is recovered and sequenced. That makes MNase-seq the inverse of ATAC-seq and DNase-seq, at least conceptually.
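The inverse relationship can be illustrated with a small coverage calculation. This is a sketch under stated assumptions: fragments are given as 0-based, half-open (start, end) intervals on a single toy region, and the coordinates are hypothetical. Positions covered by recovered fragments are candidates for nucleosome-protected DNA, while zero-coverage gaps are where an accessibility assay would be expected to give signal.

```python
def coverage(fragments, region_length):
    """Per-base count of recovered fragments across a region.

    `fragments` holds 0-based, half-open (start, end) intervals.
    """
    # Difference array: +1 where a fragment starts, -1 just past its end.
    diff = [0] * (region_length + 1)
    for start, end in fragments:
        start = max(start, 0)
        end = min(end, region_length)
        if start < end:
            diff[start] += 1
            diff[end] -= 1
    cov = []
    running = 0
    for d in diff[:region_length]:
        running += d
        cov.append(running)
    return cov


# Toy MNase-protected fragments (hypothetical coordinates).
fragments = [(0, 147), (5, 152), (300, 447)]
cov = coverage(fragments, region_length=500)

# Bases with no recovered fragment: candidate exposed (digested) DNA.
exposed = [i for i, c in enumerate(cov) if c == 0]
```

In this toy region, bases 152 through 299 and 447 through 499 have no coverage, so they would be read as exposed DNA that MNase digested.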
Getting started: Researchers have developed a protocol that takes into account the shorter reads produced by MNase digestion and generates base-pair resolution mapping (PNAS, 108:18318-23, 2011). Michael Buck’s group has described methodology aimed at standardizing digestion and data analysis steps (BMC Mol Biol, 13:15, 2012).
Tips: DNase-seq and MNase-seq are not perfect opposites: studies could, for example, suggest that a given site on the genome could be both DNase I hypersensitive and nucleosomal, says Lieb. “Just imagine that a site is open half the time and nucleosomal half the time,” he says. “It’s theoretically possible to get [DNase] hypersensitive signaling and a nucleosome. Kinetics are still a challenge which none of these methods has addressed completely.” Averaging over many populations of cells also muddles the data, he adds.
An alternative to MNase-seq, called NOMe-seq, generates genome-wide information about both nucleosome positioning and the state of DNA methylation (Genome Res, 22:2497-506, 2012).
Single cells?: Nothing published yet.
A word about data analysis
In probing DNase-seq, ATAC-seq, and MNase-seq data, most researchers use programs that were originally developed for ChIP-seq, says Michael Buck. It’s simple enough to recognize places in the genome that are open, “but if you want to do more analysis, that’s where people get bogged down,” he adds.
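To show what recognizing open places can mean at its simplest, the sketch below bins cut-site positions into fixed windows and flags windows with far more cuts than average. It is a deliberately naive illustration and not how the ChIP-seq programs Buck mentions work; the window size, the fold threshold, and the function name are all assumptions.

```python
def call_open_windows(cut_positions, genome_length, window=200, fold=3.0):
    """Flag windows whose cut count is at least `fold` times the mean.

    Returns (start, end, count) tuples with 0-based, half-open coordinates.
    """
    n_windows = (genome_length + window - 1) // window
    counts = [0] * n_windows
    for pos in cut_positions:
        if 0 <= pos < genome_length:
            counts[pos // window] += 1
    mean = sum(counts) / n_windows
    if mean == 0:
        return []
    return [
        (i * window, min((i + 1) * window, genome_length), c)
        for i, c in enumerate(counts)
        if c >= fold * mean
    ]


# Toy cut sites (hypothetical) on a 1,000-bp region.
cuts = [10, 450, 455, 460, 465, 470, 480, 490, 900]
open_windows = call_open_windows(cuts, genome_length=1000)
```

With these toy positions, only the window from 400 to 600 is flagged. Anything beyond this, such as modeling background or comparing samples, is where the more involved analysis begins.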
Sophisticated analysis is required to get more meaningful results, and for that you will need some programming abilities, Zhao says. “It doesn’t matter what programming language you use—R, Perl, C++—but programming ability is important.”
That doesn’t mean you have to be a bioinformatician. It’s relatively easy for molecular biologists to pick up enough Perl, for example, to do data analysis themselves, or at least be able to communicate with a bioinformatician about the analysis. Core facilities and collaborators who specialize in data analysis can be key resources, Zhao says. In addition, Buck says, new assay-specific analysis tools are in the works and the tools for these methods should improve in the near future.
January 19, 2016
“If you see a pattern in the naked DNA, then you know it’s got nothing to do with transcription-factor binding,” he adds.
Is he claiming that hydrogen-atom transfer in DNA base pairs cannot be linked to the patterns neo-Darwinists assumed linked mutations to evolution?
If so, what does that tell serious scientists about the assumptions the neo-Darwinists made?
Without those assumptions, is there a way for a new species to evolve via an accumulation of mutations? If not, who will be the first neo-Darwinian teleophobe to tell the others that they were taught to believe in pseudoscientific nonsense?
May 11, 2016
"Recent research has revealed how DNase I’s cutting bias may limit the method’s usefulness for the identification of DNA footprints."
We have recently revisited computational methods for digital footprinting in DNase-seq data on a large compendium with > 80 TFs (Nat Methods, 13, 303-308). Among other aspects, we investigated the influence of cleavage bias on the footprint predictions. Our analysis indicates that a few advanced digital footprinting methods were not affected by cleavage bias. Moreover, we showed that correcting DNase-seq signals for cleavage bias virtually removes the effects of this artifact.
Our message is