Human Epigenome Project Maps MHC Locus


Mar 28, 2005
Melissa Phillips

Figure courtesy of the Human Epigenome Consortium. The Human Epigenome Project's pilot study included the development of novel approaches to mapping methylation across the genome. Researchers hope the so-called GOOD assay for epigenotyping, diagrammed in the original figure, can be used to identify particular CpG positions whose methylation status is indicative of the larger region's genomic state.

Methylation at regulatory regions, especially promoters, correlates with transcriptional activity: Sequences near silent genes generally are methylated, whereas those near active regions are not. Scientists traditionally have measured these modifications on a gene-by-gene basis, but a team at the Wellcome Trust Sanger Institute in Cambridge, UK, has been attacking the question on a genomic scale. This project, dubbed the Human Epigenome Project, released its first results in December 2004.1

Vardhman Rakyan led the pilot study. He and his colleagues examined DNA methylation patterns in the human major histocompatibility complex (MHC) in seven human tissues: adipose, brain, breast, liver, lung, muscle, and prostate. They chose the MHC because it is the most gene-dense region of its size in the human genome, Rakyan says. The MHC is also highly polymorphic, which means the researchers could expect detectable differences in methylation between individuals. At about four megabases, the MHC is the largest region to have its DNA methylation pattern mapped to date. "This was a technical tour de force," says Arthur Riggs of City of Hope National Medical Center in Duarte, Calif.

In the mammalian genome, methyl groups attach to DNA at CpG dinucleotides, where a cytosine base is followed by a guanine. By examining methylation patterns at CpGs, researchers can infer which regions of the genome are active in a particular cell.

The authors examined methylation patterns in likely regulatory regions of MHC genes, as well as in CpG-dense regions within each gene. They isolated 253 DNA fragments, representing 90 genes, which is more than 70% of all expressed genes in the MHC.

To sequence these fragments, the researchers used a method called bisulfite sequencing. When DNA is treated with sodium bisulfite, unmethylated cytosines are converted to uracil, but methylated cytosines remain untouched. The DNA is then subjected to PCR and sequenced. During PCR each uracil is copied as thymine, so an unmethylated cytosine appears as a T in the final sequence, while a methylated one still reads as C. "We can compare the sequence with the original sequence and see where the changes have occurred," Rakyan explains. "We can get information on every single CpG site."
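
As a rough illustration of the comparison Rakyan describes, the following Python sketch simulates bisulfite conversion on a made-up sequence and then checks each CpG site against the original. The sequence, the methylated positions, and the function names are invented for illustration and are not taken from the study. The sketch also ignores complications such as the second strand and incomplete conversion.

```python
def find_cpg_sites(seq):
    """Return 0-based positions of the C in every CpG dinucleotide."""
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite treatment followed by PCR on one strand.

    Unmethylated C becomes U and is amplified as T.
    Methylated C is protected and still reads as C.
    """
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def call_methylation(reference, converted):
    """Map each CpG position to True if the converted read still shows C."""
    return {pos: converted[pos] == "C" for pos in find_cpg_sites(reference)}


# Hypothetical example: three CpG sites, two of them methylated.
reference = "ATCGGCTACGTTCGA"
converted = bisulfite_convert(reference, methylated_positions=[2, 12])
calls = call_methylation(reference, converted)
print(converted)
print(calls)
```

A C that survives at a CpG position is read as methylated, and a T in its place is read as unmethylated. Cytosines outside CpG sites, like the one at position 5 here, are simply converted.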

Researchers traditionally have sequenced multiple subclones of bisulfite PCR products. Instead, Rakyan and his colleagues sequenced PCR products directly, using a program they developed called epigenetic sequencing methylation (ESME) analysis software. The program calculates methylation levels by comparing the C signal with the T signal at CpG sites. Using ESME to sequence DNA methylation directly "is better for high-throughput purposes" than is sequencing subclones, Riggs says.
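
The article does not spell out how ESME processes sequencing traces, so the sketch below is only a simplified stand-in for the idea of comparing C and T signals. It assumes the two signal intensities at a CpG site are already available as numbers and estimates the methylated fraction as C / (C + T). The function name and the example values are hypothetical.

```python
def methylation_level(c_signal, t_signal):
    """Estimate the methylated fraction at one CpG site.

    c_signal: intensity attributed to C (methylated template)
    t_signal: intensity attributed to T (unmethylated, converted template)
    Simplified ratio only; this is not ESME's actual procedure.
    """
    total = c_signal + t_signal
    if total <= 0:
        raise ValueError("no usable signal at this site")
    return c_signal / total


# Hypothetical intensities for one CpG site in a directly sequenced PCR product.
level = methylation_level(c_signal=300.0, t_signal=100.0)
print(level)
```

Because a PCR product sequenced directly is a mixture of molecules from many cells, a value between 0 and 1 reflects the share of templates that were methylated at that site.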

Rakyan found that more than 90% of the fragments were either hypomethylated or hypermethylated. This makes sense, says Rakyan, because "the epigenetic state of a genome has to be tightly regulated. So that means either you keep it methylated or you keep it unmethylated."

But fourteen amplicons in the MHC produced heterogeneous data: In the same tissue type, some of the amplicons were methylated while others were not. Many researchers think this pattern could arise from aberrant methylation in some cells, Rakyan says, and might underlie the etiology of certain diseases, especially cancers. Alternatively, this heterogeneity might be found if different cell types with different methylation profiles are present in the same tissue.
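
One way to picture this sorting of amplicons is a simple threshold rule on the average methylation level across an amplicon's CpG sites. The Python sketch below does that; the cutoffs of 0.3 and 0.7 and the example values are assumptions chosen for illustration, not the criteria the authors used.

```python
from statistics import mean

# Hypothetical cutoffs; the study's own criteria are not given in this article.
HYPO_MAX = 0.3
HYPER_MIN = 0.7


def classify_amplicon(cpg_levels):
    """Label an amplicon from the methylation levels (0 to 1) of its CpG sites."""
    average = mean(cpg_levels)
    if average <= HYPO_MAX:
        return "hypomethylated"
    if average >= HYPER_MIN:
        return "hypermethylated"
    return "heterogeneous"


# Made-up amplicons, each a list of per-CpG methylation levels.
amplicons = {
    "amplicon_A": [0.05, 0.10, 0.00],
    "amplicon_B": [0.90, 0.95, 0.85],
    "amplicon_C": [0.40, 0.60, 0.50],
}
labels = {name: classify_amplicon(levels) for name, levels in amplicons.items()}
print(labels)
```

An intermediate average cannot by itself distinguish aberrant methylation in some cells from a mixture of cell types, which is why the article treats both as open explanations.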

Rakyan and his colleagues also found that methylation profiles varied somewhat between tissues and individuals. These differences mean that the epigenome really exists in hundreds of different forms, says Adrian Bird of the Wellcome Trust Center for Cell Biology at the University of Edinburgh. Thus, he says, "Even with the best high throughput one could imagine, I think doing every epigenome in a person is going to be a massive amount of work."

Recently, however, the authors developed an alternative method, using matrix-assisted laser desorption/ionization (MALDI) mass spectrometry, to identify shortcuts that could reveal regional methylation patterns and make it easier to determine a person's epigenomic status.2

"Some sites essentially give you the same information as an entire region," Rakyan explains. "So if a particular site ... is methylated, then all the others will be methylated as well." Identifying these "methylation-variable positions" will allow fast, automated epigenotyping of biological samples, Rakyan says.

Rakyan's group is currently using these sequencing and epigenotyping methods to work its way through the next phase of the project: determining methylation profiles for human chromosomes 6, 13, 20, and 22.
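
The idea behind the "methylation-variable positions" Rakyan describes, that a single site can stand in for a whole region, can be sketched in a few lines of Python. The example below scores each CpG site by how often its methylation call matches the majority call for the region across samples. The data, the majority rule, and the function names are all hypothetical; this is not the GOOD assay or the authors' selection method.

```python
def regional_state(sample):
    """Majority methylation call (True = methylated) across one sample's sites."""
    return sum(sample) * 2 > len(sample)


def indicator_scores(samples):
    """For each site, the fraction of samples where it matches the regional call."""
    states = [regional_state(sample) for sample in samples]
    n_sites = len(samples[0])
    return [
        sum(bool(sample[i]) == state for sample, state in zip(samples, states))
        / len(samples)
        for i in range(n_sites)
    ]


# Made-up calls: rows are samples, columns are CpG sites (1 = methylated).
samples = [
    [1, 1, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = indicator_scores(samples)
print(scores)
```

Sites that score close to 1.0 track the region's overall state in every sample, so measuring one of them alone would be a candidate shortcut for epigenotyping.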