The DNA base cytosine has a tendency to play dress-up, gaining and shedding chemical modifications. For more than 40 years, scientists have known that methyl groups attached to cytosine’s fifth carbon atom can alter gene expression. These epigenetically marked bases, called 5-methylcytosines (5mCs), help to determine how hundreds of cell types in the human body differentiate and maintain their identities, despite having the same genetic backgrounds.

Recently, researchers have rediscovered a mostly ignored epigenetic variant that results when a methyl group on a cytosine takes on a hydroxyl group to form 5-hydroxymethylcytosine (5hmC). The favored method for detecting methylation is bisulfite sequencing, which converts unmodified cytosine to uracil, which then reads as thymine following PCR amplification. Modified cytosines continue to read as cytosines. This technique fails to distinguish between 5mC and 5hmC, however, and researchers are beginning to understand that the two gene marks sometimes play different roles.
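To see why standard bisulfite sequencing cannot separate the two marks, it can help to write the read-out as a lookup table. The short Python sketch below is only a toy model: the state labels and the function name are invented for illustration, and every chemical conversion is assumed to go to completion.

```python
# Toy model of the bisulfite sequencing read-out (illustrative only).
# The state labels "C", "5mC", and "5hmC" are hypothetical names for this sketch,
# and every conversion is assumed to be 100% efficient.
BISULFITE_READOUT = {
    "C": "T",     # unmodified cytosine -> uracil -> reads as thymine after PCR
    "5mC": "C",   # methylated cytosine still reads as cytosine
    "5hmC": "C",  # hydroxymethylated cytosine also reads as cytosine
}

def bisulfite_read(states):
    """Return the base call for each cytosine state after bisulfite sequencing."""
    return [BISULFITE_READOUT[s] for s in states]

template = ["C", "5mC", "5hmC", "C"]
print(bisulfite_read(template))
```

Because 5mC and 5hmC map to the same base call, nothing in the resulting sequence tells them apart.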


Scientists first described 5hmC in mammals several decades ago (Biochem J, 126:781-90, 1972), but the community paid little attention to the modification until 2009, when researchers noticed its enrichment in the brain (Science, 324:929-30, 2009) and demonstrated how the modification is formed (Science, 324:930-35, 2009). These days, researchers suspect 5hmC helps determine developmental fates and, when present in aberrant patterns, contributes to disease. “I’m convinced that many of the developmental changes we see either connected with normal development or abnormal development will have one or another connection to hydroxymethylation,” says Jörn Walter, an epigeneticist at Saarland University in Germany.

As interest in 5hmC has surged, researchers have developed approaches for mapping its areas of enrichment across the genome and, most recently, determining its location at single-base resolution. The Scientist profiles some of those methods here.

Powerful Affinities

AFFINITY-BASED PROFILING: Antibodies for 5hmC or for its post-bisulfite conversion form CMS can capture 5hmC-containing DNA fragments. Alternatively, researchers can first modify 5hmC with an azide-glucose moiety, then tag the complex with biotin and enrich for tagged DNA fragments using bacterial proteins. DNA fragments are sequenced following enrichment.
Some relatively easy and cheap techniques for determining genome-wide distribution of 5hmCs involve using antibodies or enzymes to tag fragments of DNA that contain epigenetic marks. Since this approach involves cutting DNA into pieces spanning 50 to a few hundred base pairs, it is inexact; researchers can only determine which neighborhoods of the genome contain 5hmCs, not which bases have been modified. Affinity-based methods also do poorly at quantifying what proportion of DNA in a sample has 5hmCs in a given region.

On the plus side, however, affinity techniques provide a great starting point, and they are relatively cheap, says Peng Jin, who studies epigenetic marks in cancer and neurological disease at Emory University. Jin uses an affinity-based method to get an overview of the 5hmC distribution in the diseased cells he studies, followed by a higher-resolution method to take a closer look.

Skirmantas Kriaucionis, a principal investigator at the University of Oxford branch of Ludwig Cancer Research and one of the researchers who recently discovered the importance of 5hmC, favors a method called hMe-Seal. Developed by Chuan He, a chemical biologist at the University of Chicago, along with Jin and colleagues, the technique takes advantage of the activity of the bacteriophage enzyme β-glucosyltransferase. The enzyme adds an azide-modified glucose moiety to 5hmC on genomic DNA that’s been cleaved into small fragments. Researchers can then tag the 5hmC locations with biotin, which binds to the azide group. Bacterial proteins of the avidin family bind biotin, allowing biotin-labeled DNA fragments to be pulled down for sequencing (Nat Biotech, 29:68-72, 2011). Kriaucionis says that biotin-pulldown methods not only are relatively affordable, but also can enrich for modifications that have a low frequency at a given locus.

Another tactic is to create antibodies to 5hmC itself. Gerd Pfeifer, an epigeneticist at City of Hope in Duarte, California, recently used antibodies against 5hmC to immunoprecipitate hydroxymethylated fragments of DNA to generate whole-genome 5hmC profiles of mouse cells undergoing neurogenesis (Cell Reports, 3:291-300, 2013). However, areas with multiple 5hmCs are more likely to be immunoprecipitated than sparsely modified regions, making the method somewhat biased.

An alternative antibody-based method, developed by Anjana Rao of the La Jolla Institute for Allergy and Immunology and colleagues, takes advantage of the fact that bisulfite conversion changes 5hmC to cytosine 5-methylenesulphonate (CMS). Researchers can then use anti-CMS antibodies to immunoprecipitate fragments of hydroxymethylated DNA (Nat Protoc, 7:1897-1908, 2012). Anti-CMS antibodies may be less biased towards areas high in 5hmC, according to He.

Epigeneticists don’t know yet how important it is to describe methylation at single-base resolution, notes Gary Hon, a postdoc who studies DNA modification in stem cells at the San Diego branch of Ludwig Cancer Research. For many types of experiments, determining that an epigenetic modification sits in a particular regulatory region might be enough. “Maybe you find that pulldown or affinity methods are good enough in the vast majority of your cases,” he says. “I don’t think a lot of people really need to know which particular cytosine is hydroxymethylated.”

Cost: Active Motif sells Hydroxymethyl Collector kits, based on the enzymatic hMe-Seal approach, for $395 for 25 reactions. A kit from Diagenode for performing 5hmC DNA immunoprecipitation (hMeDIP) costs $495 for 16 reactions. A similar kit from Active Motif costs $375 for 10 reactions. Antibodies for CMS are not yet on the market, though Rao and colleagues have provided them to interested collaborators. Labs will have to factor in the additional cost of DNA sequencing—generally a few hundred dollars per genome profiled.
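For a quick comparison, the kit prices quoted above can be converted to a per-reaction figure. The sketch below uses only the prices and reaction counts given in this section; the dictionary keys and the function name are made up for the example, and sequencing costs are left out.

```python
# Per-reaction cost of the affinity kits quoted above (sequencing not included).
# Keys are informal labels for this sketch, not official product names.
kits = {
    "Active Motif Hydroxymethyl Collector (hMe-Seal)": (395, 25),
    "Diagenode hMeDIP": (495, 16),
    "Active Motif hMeDIP-style kit": (375, 10),
}

def cost_per_reaction(price_usd, reactions):
    """Kit price divided by the number of reactions it supports."""
    return price_usd / reactions

for name, (price, n) in kits.items():
    print(f"{name}: ${cost_per_reaction(price, n):.2f} per reaction")
```

On these numbers the enzymatic kit works out cheapest per reaction, though the few hundred dollars of sequencing per genome will outweigh the difference for most labs.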

Pros:
• Economical

Cons:
• No absolute quantification
• Some antibody-based methods are biased towards areas with high levels of modification
• Low resolution

Keeping TABs on Modifications

TAB-Seq METHOD: Bisulfite sequencing alone (near right) cannot determine whether a base is 5mC or 5hmC. With TAB-Seq (far right), 5hmCs are protected from oxidation by glucosylation, while 5mCs are oxidized using a Tet enzyme, which converts 5mC to 5-carboxylcytosine (5caC). Bisulfite treatment then converts the 5caCs and unmodified cytosines to read as thymines following PCR. The 5hmCs continue to read as cytosines, revealing their locations.
SOURCE: YU ET AL., CELL, 149:1368-80, 2012
For researchers interested in 5hmC’s dynamics and effects at a single-base level, a rough map isn’t enough. Tet-assisted bisulfite sequencing (TAB-Seq) is one of two currently available methods that can give a close-up tour of 5hmC marks across an entire genome (Nat Protoc, 7:2159-70, 2012).

Hydroxymethylation arises naturally in cells when enzymes from the Tet family oxidize cytosine methyl groups. TAB-Seq, developed by a team that included He, Jin, and Hon, uses a Tet enzyme to differentially oxidize 5mCs and 5hmCs, thereby distinguishing them from each other.

The first step of the technique involves protecting all of the 5hmCs in the sample from oxidation by glucosylating them using β-glucosyltransferase, as in the biotin-assisted pulldown method described in the previous section. Next, you’ll use a Tet enzyme to repeatedly oxidize the 5mCs until they convert to 5-carboxylcytosines (5caCs). Finally, bisulfite treatment followed by PCR converts the 5caCs and unmodified cytosines to read as thymines, while the 5hmCs continue to read as cytosines. All cytosines left in the sequence represent locations of 5hmC.
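The three steps can be traced base by base in a few lines of Python. This is a simplified sketch rather than the published protocol: the state labels (including "g5hmC" for glucosylated 5hmC) and the function names are hypothetical, and each reaction is assumed to be complete.

```python
# Toy walk-through of the TAB-Seq steps (illustrative; assumes complete reactions).
# State labels such as "g5hmC" (glucosylated 5hmC) are hypothetical names.

def glucosylate(states):
    """Step 1: beta-glucosyltransferase protects every 5hmC."""
    return ["g5hmC" if s == "5hmC" else s for s in states]

def tet_oxidize(states):
    """Step 2: a Tet enzyme oxidizes 5mC to 5caC; protected 5hmC is untouched."""
    return ["5caC" if s == "5mC" else s for s in states]

# Step 3: bisulfite treatment plus PCR. Only protected 5hmC still reads as C.
POST_BISULFITE = {"C": "T", "5caC": "T", "g5hmC": "C"}

def bisulfite_pcr(states):
    return [POST_BISULFITE[s] for s in states]

def tab_seq(states):
    """Run the three steps in order and return the base calls."""
    return bisulfite_pcr(tet_oxidize(glucosylate(states)))

print(tab_seq(["C", "5mC", "5hmC"]))
```

In this idealized model the only position that still reads as C is the one that started as 5hmC.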

Using TAB-Seq, researchers can also determine what proportion of cells in a sample are hydroxymethylated at specific cytosines. For example, cancerous and noncancerous cells lumped into a single sample might not all have 5hmCs in the same positions. An affinity-based method would simply show the neighborhoods of the genome where 5hmCs tend to occur, without revealing whether each location is modified in just a few of the cells in the sample or in most of them.
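In practice, that proportion comes from counting reads. A minimal sketch, assuming a hypothetical pile of base calls covering one cytosine position after TAB-Seq and ignoring incomplete conversion and sequencing error:

```python
def hmc_fraction(base_calls):
    """Fraction of reads at one cytosine position that still read as C.

    After TAB-Seq, a C call marks 5hmC and a T call marks 5mC or unmodified C.
    Calls other than C or T are ignored in this sketch.
    """
    informative = [b for b in base_calls if b in ("C", "T")]
    if not informative:
        raise ValueError("no C/T calls at this position")
    return informative.count("C") / len(informative)

# Hypothetical example: 10 reads cover the position, 2 of them still read as C.
calls = list("CCTTTTTTTT")
print(hmc_fraction(calls))
```

The more reads that cover a position, the tighter this estimate becomes, which is one reason single-base methods demand high sequencing coverage.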

Moreover, TAB-Seq can be used to assess the 5hmC status of single pieces of DNA. Examining 5hmC at the single-cell level “can reveal the precise mechanisms of how 5hmC can be maintained or dynamically changed from cell to cell,” says Wolf Reik of the Babraham Institute in Cambridge, U.K. Reik is one of the creators of a competing single-base method (described below). However, TAB-Seq is unable to determine locations of 5mCs on single molecules. Looking at 5hmCs, 5mCs, and unmodified cytosines in relation to one another is key to understanding the modifications’ functions, but no method can currently determine the distribution of both modifications on a single molecule of DNA.

It’s possible to produce your own Tet enzyme using High Five insect cell lines. If a lab is already set up to do insect cell expression, says He, this might be an economical route. But in most cases, purchasing a kit is the way to go. “It’s maybe a little more expensive to have the company do it, but then you have confidence in the enzyme,” says Lucy Godley, an associate professor of medicine at the University of Chicago who is using TAB-Seq to study the dynamics of hydroxymethylation during hematopoietic stem cell differentiation.

Getting single-base resolution across an entire genome can be prohibitively expensive because it requires high sequencing coverage, so many researchers use single-base methods like TAB-Seq to analyze partial genomes. Hon additionally suggests that researchers who only need to analyze 5hmC with resolution at the regulatory-element level could get away with much lower sequencing coverage.

Cost: TAB-Seq kits, sold by WiseGene, cost $969 apiece and contain sufficient reagent amounts for three genome-wide reactions or six loci-specific reactions. Depending on labor costs, producing the enzymes in the lab could cost about one-third to one-half the price of the kit, He estimates. (He shares a joint Small Business Innovation Research grant with WiseGene and continues to collaborate with the company.) Sequencing costs can vary depending on coverage needed but can run up to several tens of thousands of dollars for a full genome sequenced at single-base resolution.

Pros:
• Gives absolute quantities of the modification
• Single-base resolution
• Can determine location of 5hmC on individual molecules of DNA

Cons:
• Expensive
• Enzymes must be produced and handled carefully and have limited activity

It’s Chemical

CHEMICAL CONVERSION: The oxBS-Seq method converts 5hmCs to 5-formylcytosines (5fCs) using chemical oxidation, while leaving 5mCs alone. Bisulfite treatment converts 5fCs and unmodified cytosines to read as thymines following PCR. The 5mCs continue to read as cytosines. Comparing an oxBS-Seq run with a run of traditional bisulfite sequencing pinpoints the precise locations of 5hmCs.
TAB-Seq’s primary single-base resolution competitor is oxidative bisulfite sequencing, or oxBS-Seq, another modified form of bisulfite sequencing that achieves absolute quantification (Science, 336:934-37, 2012). Developed at the Babraham Institute and the University of Cambridge by Reik, Shankar Balasubramanian, and colleagues, oxBS-Seq depends on chemical oxidation rather than enzymatic oxidation. “We like the beauty of the chemistry,” says Reik. “It’s so conceptually simple.”

The method works as follows: First, oxidize all the 5hmC in one DNA sample to 5-formylcytosine (5fC) using chemical reagents. Next, with the help of bisulfite sequencing, convert the cytosines and 5fCs to read as thymines while leaving the 5mCs to continue to read as cytosines. Simultaneously, run a round of ordinary bisulfite sequencing on a second DNA sample. Finally, compare the sequences of the two samples after treatment. The differences reveal the locations of the 5hmCs.
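The comparison step can be pictured as two lookup tables applied to the same positions. The Python sketch below is a toy model with hypothetical state labels and function names, and it assumes every conversion is complete.

```python
# Toy comparison of ordinary bisulfite (BS) and oxidative bisulfite (oxBS) read-outs.
# State labels and names are hypothetical; conversions are assumed to be complete.
BS_READOUT = {"C": "T", "5mC": "C", "5hmC": "C"}
OXBS_READOUT = {"C": "T", "5mC": "C", "5hmC": "T"}  # 5hmC -> 5fC -> reads as T

def call_5hmc(states):
    """Return indices that read as C after BS but as T after oxBS."""
    bs = [BS_READOUT[s] for s in states]
    oxbs = [OXBS_READOUT[s] for s in states]
    return [i for i, (b, o) in enumerate(zip(bs, oxbs)) if b == "C" and o == "T"]

print(call_5hmc(["C", "5mC", "5hmC", "5hmC"]))
```

Only positions whose call flips from C to T between the two runs are flagged, which is why the method needs both samples.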

Initially, many outside researchers struggled to replicate the oxBS-Seq method, says Jörn Walter. It wasn’t until he used a kit from the UK-based company Cambridge Epigenetix as part of an alpha trial that he got the method to work. Walter, Reik, and their colleagues recently coauthored the first publication using oxBS-Seq, which describes demethylation dynamics in developing stem cells (Cell Stem Cell, 13:351-59, 2013).

Cambridge Epigenetix, for which Reik is an advisor, began selling the oxBS-Seq kits, which it calls TrueMethyl kits, in August 2013, and representatives say purchasers are now successfully using the method. Kriaucionis, who participated in the kits’ beta trials, says he now plans to compare oxBS-Seq and TAB-Seq head-to-head. “When you bring the kit into day-to-day life, and have a variety of samples, then you really experience a range of biological conditions,” he says.

Unlike TAB-Seq, oxBS-Seq does not directly yield a sequence showing the locations of 5hmCs. TAB-Seq converts all 5mC locations and all unmodified cytosines to read as thymines, so that the cytosines remaining at the end of the treatment must represent locations of 5hmC. But following oxBS treatment and sequencing, 5hmCs are indistinguishable from unmodified cytosines—they both read as thymines. To determine the locations of 5hmCs versus unmodified cytosines, users must compare a sample that has undergone oxBS conversion and sequencing to a sample that has undergone regular bisulfite conversion and sequencing, which causes cytosines to read as thymines while allowing 5hmCs to continue to read as cytosines. The comparison can reveal the proportion of cells in a sample that have 5hmCs at a given locus, but not which of those cells have the modifications.
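At the level of a single locus, the comparison amounts to a subtraction. The sketch below assumes that the fraction of reads still reading as C has already been measured in each run; the function name is hypothetical, and clamping small negative differences to zero is a simplification for this example, not part of the published method.

```python
def estimate_levels(bs_c_fraction, oxbs_c_fraction):
    """Estimate modification levels at one locus from the two runs.

    bs_c_fraction:   fraction of reads reading C after ordinary bisulfite (5mC + 5hmC)
    oxbs_c_fraction: fraction of reads reading C after oxBS (5mC only)
    """
    hmc = bs_c_fraction - oxbs_c_fraction
    return {
        "5mC": oxbs_c_fraction,
        "5hmC": max(hmc, 0.0),  # sampling noise can push the difference below zero
        "C": 1.0 - bs_c_fraction,
    }

# Hypothetical locus: 75% of reads read C after BS, 50% after oxBS.
print(estimate_levels(0.75, 0.5))
```

The result is a population-level estimate: it gives the share of molecules carrying each mark, not which individual molecules carry them.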

Kriaucionis recommends that users of both TAB-Seq and oxBS-Seq check the accuracy of their experiments by using controls. “The most important thing is to have the right controls to make sure the method performs as expected.”

Like TAB-Seq, oxBS-Seq is expensive to perform on whole genomes. Toby Ost, R&D manager at Cambridge Epigenetix, suggests that currently it’s smartest to analyze only a fraction of the human genome. “In the future when [sequencing] is dirt cheap, whole hydroxymethylomes will be commonplace,” he says.

Cost: Cambridge Epigenetix sells kits with reagents for six oxBS reactions and six ordinary bisulfite reactions for $1,000, and kits with reagents for 24 of each type of reaction for $3,000. One reaction is sufficient for quantifying hydroxymethylation across an entire mammalian genome. Sequencing costs are about the same as for TAB-Seq.

Pros:
• Gives absolute quantities of the modification
• Single-base resolution
• Slightly cheaper than TAB-Seq

Cons:
• Expensive compared to affinity-based methods
• Tricky to get good results without a kit

Correction (February 10): The text of the article has been changed to reflect the fact that no single method currently in use can determine the locations of all three states of cytosine on a single molecule, and to point out that both TAB-Seq and oxBS-Seq are indirect methods in different regards: TAB-Seq indirectly measures 5mC, while oxBS-Seq indirectly measures 5hmC.
