Throughout the genomes of mammals and plants, certain genes carry marks that indicate whether they came from mom or dad. Typically, these marks are methyl groups that regulate gene expression so that one parent’s allele is selectively expressed. Together, these imprinted genes make up the imprintome.
Scientists used to search for imprinted genes one by one, but thanks to modern sequencing techniques, they can now scan entire genomes. The precise size of the imprintome is uncertain. Estimates suggest there are roughly 100 to 150 imprinted genes in humans and in mice, and 90 or more in the plant model Arabidopsis thaliana. Many imprinted regions of the genome contain sequence variants linked to human diseases, such as diabetes. Because only one copy of an imprinted gene is expressed, there is no second active copy to compensate, so a loss-of-function mutation is more likely to cause problems in an imprinted gene than in a gene expressed from both alleles.
Identifying a full list of imprinted genes for humans and model organisms will give scientists a springboard to characterize the mechanisms and functions of imprinting, says Ian Morison of the University of Otago in Dunedin, New Zealand. That’s an ongoing effort, and there have been plenty of hurdles along the way. Piroska Szabó of the Van Andel Research Institute in Grand Rapids, Michigan, was once excited to think she’d discovered a new gene that expressed only the maternal allele, until she realized that the RNA sequences she was looking at came from a mitochondrial gene that had been misannotated as a nuclear gene. Mitochondria are inherited from the mother, which explained the maternal-only pattern.
There are other ways, too, of being misled. Some papers have identified upwards of 1,000 potentially imprinted genes, only to have most shot down later as false positives. These false hits can arise when cells select one allele for other reasons—it could be random or specific to the allele’s sequence, rather than its parent of origin.
The need to accurately evaluate sequences makes a bioinformatician a key member of any imprintome team, advises David Monk of the Bellvitge Institute for Biomedical Research in Barcelona, Spain.
The Scientist asked imprinting experts to share their techniques for teasing out the imprintomes of mouse, Arabidopsis, and human, and to propose some ideas for new technologies and investigations that would move the field forward.
RESEARCHER Piroska Szabó, Associate Professor, Center for Epigenetics, Van Andel Research Institute
In 2014, Szabó and colleagues reported how they’d used mouse embryonic fibroblasts to test whether the relatively new method of RNA sequencing could reveal known and novel imprinted genes. The team started with two different strains of mice, which had known genomic differences. They crossed these strains and sequenced the RNA of the resulting offspring.
For any gene in which the maternal and paternal genomes differed in sequence, the researchers could look at the RNAs for that gene, and ask which alleles were transcribed. For most genes, they’d see a 50:50 split between maternal and paternal codes. But for imprinted genes, they’d expect to see mostly maternal, or mostly paternal, codes represented in the RNA.
FINDINGS The researchers identified 32 known imprinted genes, but no new ones, implying that the list of imprinted genes in the mouse—at least in embryonic fibroblasts—is nearly complete, says Morison, who was not involved in the study (Nucleic Acids Res, 42:1772-83, 2014).
- Performing both versions of the cross, with each strain standing as the mother or father, helps confirm the imprinting.
- The team used paired-end sequencing, which starts reads from both ends of a cDNA, and can help identify genes in which only certain splice variants are imprinted.
- Genes expressed at low levels are susceptible to being falsely identified as imprinted, because with only a few transcripts, the reads might lean toward one parental allele or the other by chance. Szabó and colleagues required a minimum of 10 sequence reads to call a gene imprinted.
- Even dubbing highly expressed genes as imprinted is fraught with uncertainty, as some imprinted genes aren’t expressed in an all-or-nothing manner. The researchers set a cutoff, 80 percent expression of one allele or the other, for calling a gene imprinted. Different groups choose different cutoffs, and a too-lenient cutoff could yield false positives.
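The calling logic described in these points can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the 10-read minimum and the 80 percent cutoff come from the text, while the names `AlleleCounts`, `parental_bias`, and `call_imprinting`, and the extra "strain-specific bias" label for genes that favor the same strain's allele in both crosses, are assumptions made for the example.

```python
from dataclasses import dataclass

MIN_READS = 10       # minimum informative reads per cross (from the text)
CUTOFF_PERCENT = 80  # share of reads from one parent (from the text)


@dataclass
class AlleleCounts:
    """Reads matching each strain's allele in one cross (hypothetical container)."""
    strain_a: int
    strain_b: int


def parental_bias(counts: AlleleCounts, mother: str) -> str:
    """Label one cross as 'maternal', 'paternal', 'balanced', or 'low'."""
    total = counts.strain_a + counts.strain_b
    if total < MIN_READS:
        return "low"
    maternal = counts.strain_a if mother == "A" else counts.strain_b
    paternal = total - maternal
    # Integer arithmetic avoids floating-point edge cases at exactly 80 percent.
    if maternal * 100 >= CUTOFF_PERCENT * total:
        return "maternal"
    if paternal * 100 >= CUTOFF_PERCENT * total:
        return "paternal"
    return "balanced"


def call_imprinting(a_mother: AlleleCounts, b_mother: AlleleCounts) -> str:
    """Combine the two reciprocal crosses (strain A as mother, then strain B)."""
    first = parental_bias(a_mother, mother="A")
    second = parental_bias(b_mother, mother="B")
    if "low" in (first, second):
        return "insufficient reads"
    if first == second == "maternal":
        return "maternally expressed"
    if first == second == "paternal":
        return "paternally expressed"
    # Same strain's allele favored in both crosses: sequence effect, not parent of origin.
    if {first, second} == {"maternal", "paternal"}:
        return "strain-specific bias"
    return "no consistent parental bias"


# Strain A allele dominates when A is the mother, strain B allele when B is the mother.
print(call_imprinting(AlleleCounts(90, 10), AlleleCounts(8, 92)))
```

Requiring agreement between the reciprocal crosses is what separates a parent-of-origin effect from an allele that is simply expressed more strongly because of its sequence.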
WISH Szabó would like to see techniques for single-cell imprintome analysis, as well as more studies of different tissues at different times during development, which might still yield more imprinted genes.
ONE-PARENT SAMPLE SET
RESEARCHERS Kazuhiko Nakabayashi, Division Chief, Department of Maternal-Fetal Biology, National Research Institute for Child Health and Development, Tokyo, Japan; David Monk, Principal Investigator, Epigenetics and Cancer Biology Program, Bellvitge Institute for Biomedical Research
METHODS Bisulfite-seq; bisulfite-chip
Methylation is typically associated with the silencing of the nonexpressed allele, making it a convenient marker for imprinted genes, though it’s possible for patterns of differential methylation to exist in tissues where both alleles are expressed. Nakabayashi, Monk, and collaborators studied methylation patterns in adult and umbilical blood and placenta cells from healthy volunteers; brain tissue from a brain bank; and a cultured liver cell line. The team treated genomic DNA with bisulfite, which converts only unmethylated cytosines to uracils (read as thymines after amplification and sequencing), leaving methylated cytosines unaltered. By sequencing, they could determine gene methylation patterns. The researchers figured that genes that were consistently half-methylated across a variety of tissues were possibly imprinted.
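To make the bisulfite logic concrete, here is a toy Python sketch. It simulates reads in which unmethylated cytosines show up as T, then estimates the methylated fraction at one cytosine. The function names and the 0.35 to 0.65 window used to flag a site as "half-methylated" are assumptions chosen for illustration; the study's actual thresholds and alignment steps are not described here.

```python
def bisulfite_convert(seq: str, methylated_positions: set) -> str:
    """Simulate a sequenced read after bisulfite treatment.

    Unmethylated C is read as T; methylated C (index in methylated_positions) stays C.
    """
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def methylation_fraction(reference: str, reads: list, pos: int):
    """Fraction of reads that retain C at a reference cytosine, or None if uncovered."""
    if reference[pos] != "C":
        raise ValueError("position is not a cytosine in the reference")
    calls = [read[pos] for read in reads if read[pos] in "CT"]
    if not calls:
        return None
    return calls.count("C") / len(calls)


def is_half_methylated(fraction, low: float = 0.35, high: float = 0.65) -> bool:
    """Flag sites near 50 percent methylation (window is an illustrative assumption)."""
    return fraction is not None and low <= fraction <= high


reference = "ACGTCGA"
# One allele methylated at both cytosines, the other unmethylated, five reads each.
reads = [bisulfite_convert(reference, {1, 4})] * 5 + [bisulfite_convert(reference, set())] * 5
fraction = methylation_fraction(reference, reads, pos=1)
print(fraction, is_half_methylated(fraction))
```

A site where one parental copy is always methylated and the other never is should hover near one half in every tissue, which is the signature the researchers looked for.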
To confirm imprinting and identify the parent of origin, the authors then compared those methylation patterns to methylation in tissues affected by a phenomenon called uniparental disomy, in which both copies of a genome (or a chromosome or partial chromosome) come from one parent. One such sample was from growths called hydatidiform moles that develop in unviable pregnancies, when an egg lacking a nucleus is fertilized by two sperm, or by one sperm that has its genome duplicated. The other samples were from people who carried blood cells with duplicated chromosomes from either their mother or their father.
The researchers used Illumina microarrays to identify methylated spots in the disomy tissue samples, and compared them to methylation patterns from blood cells with typical chromosome sets. In most cases, these should match up, but the methylation patterns would differ in imprinted genes, where one copy would be methylated in the normal blood cells, but neither or both copies would be methylated in the disomy tissues. For example, in a hydatidiform mole, all genes come from the father, so genes that are normally methylated only on dad’s copy would be methylated on both alleles within these tissues.
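The comparison described above amounts to a simple decision rule, sketched here in Python. Methylation is expressed as a fraction between 0 and 1, the form in which array data are commonly summarized; the function name `methylated_parent` and all numeric thresholds are hypothetical choices for this example, not values from the study.

```python
def methylated_parent(normal_level: float, disomy_level: float, disomy_parent: str,
                      half_window=(0.35, 0.65), low: float = 0.2, high: float = 0.8):
    """Infer which parental copy carries the methyl mark, or None if inconclusive.

    normal_level:  methylated fraction in tissue with one copy from each parent
    disomy_level:  methylated fraction in tissue where both copies come from disomy_parent
    disomy_parent: "paternal" or "maternal"
    """
    if disomy_parent not in ("paternal", "maternal"):
        raise ValueError("disomy_parent must be 'paternal' or 'maternal'")
    # An imprinted site should be about half-methylated in normal tissue.
    if not half_window[0] <= normal_level <= half_window[1]:
        return None
    other_parent = "maternal" if disomy_parent == "paternal" else "paternal"
    if disomy_level >= high:
        return disomy_parent   # both copies methylated: the mark sits on this parent's allele
    if disomy_level <= low:
        return other_parent    # neither copy methylated: the mark sits on the absent parent's allele
    return None                # disomy tissue looks like normal tissue: no evidence of imprinting


# Hydatidiform mole (all paternal): near-complete methylation points to a paternal mark.
print(methylated_parent(0.5, 0.95, "paternal"))
```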
FINDINGS The authors picked up 21 novel sites of differential methylation, 15 of which occur only in the placenta—and none of which were imprinted in mouse crosses they conducted (Genome Res, 24:554-69, 2014). “Imprinted loci are newly gained, and probably lost, during evolution,” says Nakabayashi.
- Monk prefers the bisulfite-sequencing approach because most imprinted genes are differentially methylated, even if those genes aren’t expressed in the tissue under analysis. “We use methylation as a sort of flag for where to look in the genome, rather than going directly to gene expression, which can be very complicated.”
- Uniparental disomy tissues help to confirm the identity of imprinted genes.
- Sequencing is “hugely expensive,” says Monk, estimating the cost at $6,000 per sample.
- The Illumina Infinium HumanMethylation450 BeadChip arrays only contain probes for 450,000 possible methylated regions, so some imprinted genes could be missed. The newer MethylationEPIC kit contains probes for 850,000 sites.
- Given that alleles may be differentially methylated but not differentially expressed in some tissues, methylation sequencing doesn’t confirm imprinting at the RNA level.
WISH With human tissues hard to come by and the mouse imprintome not matching the human one, Nakabayashi would like to see more research on primate imprinting.
“Scalability is the next question,” adds Monk. He’d like a way to perform single-cell analysis using microarrays, instead of full sequencing, but chip-based bisulfite methods require more nucleic acid—about a microgram—than one cell can provide.
RESEARCHER Mary Gehring, Member, Whitehead Institute and Associate Professor of Biology, MIT, Cambridge, Massachusetts
ORGANISM Arabidopsis thaliana and A. lyrata
METHODS RNA-seq and bisulfite-seq
In plants, imprinting only occurs in the endosperm, the triploid seed component that nourishes an embryonic plant. Many scientists suspect that imprinting, in both animals and plants, happens because the paternal genome promotes growth of the biggest possible offspring, while the maternal genome promotes conservation of limited resources. Gehring and colleagues were curious whether less imprinting would occur in A. thaliana, a self-fertilizing plant in which the parental interests ought to be aligned, than in the outcrossing A. lyrata. They crossed two A. lyrata strains, dissected the seeds by hand, and performed paired-end RNA-seq and bisulfite-seq on the resulting endosperms to identify the species’ imprintome. They compared this to the imprintome they had previously determined for A. thaliana (Nat Plants, 2:16145, 2016).
FINDINGS In fact, the list of imprinted genes was mostly conserved between the two species. But the authors did observe a difference in the placement of the silencing methyl groups between the two species, suggesting their imprinting mechanisms differ.
- As with Szabó’s study, the researchers know the parental genotypes, so they can identify those alleles in the seeds.
- If there’s no genetic difference between the two parental strains at a given gene, any imprinting will be invisible.
- “A main challenge is understanding what’s significantly different,” says Gehring. Is 80 percent or 90 percent maternal expression evidence for imprinting? Different research groups set different cutoffs.
WISH Gehring would like to isolate endosperm without dissecting tiny seeds by hand; she hopes that the triploid nature of endosperm tissue (with two maternal and one paternal genomes) will help her to isolate it from other, diploid cells by flow cytometry.
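Because endosperm carries two maternal genomes and one paternal genome, a gene that is not imprinted would be expected to yield roughly two-thirds maternal reads rather than half. The Python sketch below tests read counts against that 2:1 baseline with an exact binomial tail. The choice of a binomial test, the 0.01 significance level, and the function names are assumptions for illustration; the article does not specify which statistics any group uses.

```python
from math import comb

EXPECTED_MATERNAL = 2 / 3  # two maternal genomes to one paternal genome in endosperm


def binom_tail_ge(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def binom_tail_le(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(0, k + 1))


def classify_endosperm_ratio(maternal_reads: int, paternal_reads: int,
                             alpha: float = 0.01) -> str:
    """Compare allele-specific read counts with the 2:1 expectation."""
    n = maternal_reads + paternal_reads
    if n == 0:
        return "no informative reads"
    if binom_tail_ge(maternal_reads, n, EXPECTED_MATERNAL) < alpha:
        return "maternal excess"
    if binom_tail_le(maternal_reads, n, EXPECTED_MATERNAL) < alpha:
        return "paternal excess"
    return "consistent with 2:1"


print(classify_endosperm_ratio(67, 33))  # close to the unimprinted baseline
print(classify_endosperm_ratio(95, 5))   # far more maternal reads than expected
```

One consequence of the shifted baseline is that an even 50:50 split in endosperm already represents a paternal excess, which is part of why choosing a cutoff is harder in plants than in diploid animal tissue.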
RESEARCHER Tuuli Lappalainen, Junior Investigator and Core Member, New York Genome Center and Assistant Professor, Department of Systems Biology, Columbia University
METHOD Mining sequencing data
Thanks to widely available transcriptome databases, some scientists don’t even have to collect new tissue to search for imprinted genes. Lappalainen is a collaborator on the Genotype-Tissue Expression (GTEx) database (see the “Resources for Imprinting” table), which includes genotyping and RNA sequencing from postmortem samples of multiple tissue types. In a recent study, she and colleagues used data from that collection to hunt for imprinted genes. In any case where an individual was heterozygous at a given gene, the researchers could look for allele-specific expression. They did their best to filter out genes known to have random monoallelic expression, single-allele expression due to sequence variants, or RNA patterns that might look like monoallelic expression due to technical issues with the sequencing.
FINDINGS The researchers identified imprinting in 42 genes, 12 of which were novel (Genome Res, 25:927-36, 2015).
- GTEx includes samples from a variety of tissues, covering systems including circulatory, nervous, and gastrointestinal.
- Having a large sample—1,582 tissues from 178 people—made it easier to confirm that imprinting occurs across the population.
- From just the GTEx data, the scientists don’t know which parent a given allele came from. They had to use additional data, such as family samples, to determine which genes were paternally or maternally imprinted.
- This large-scale approach works when imprinting patterns are the same across many individuals, but it’s possible the strength of imprinting varies across the population, Lappalainen cautions.
WISH The ideal way to do this kind of analysis would be with large family data sets with diverse tissue types, so that the researchers know the parental genotypes, but Lappalainen says only data sets from blood samples are currently available.
RESOURCES FOR IMPRINTING
Genomic Imprinting website
Geneimprint includes a list of imprinted genes, by species, as well as articles, reviews, and lectures on the topic.
Catalogue of Parent of Origin Effects
Users can search for genes impacted by parent of origin, including imprinting and other effects such as differing mutation rates in each parent, in a variety of species. (Nucleic Acids Res, 29:275-76, 2001)
This list of genes imprinted in the mouse is based on literature search or microarray expression data. (Epigenetics, 3:89-96, 2008)
MouseBook Imprinting Resource
MouseBook, which lays out the stock strains at MRC Harwell, includes lists and maps of imprinted genes. (Nucleic Acids Res, 38:D593-99, 2010)
Genotype-Tissue Expression (GTEx)
A consortium provides genome and RNA sequences from human postmortem tissues. (Nat Genet, 45:580-85, 2013)
Correction: The original version of this article stated that patterns of genetic imprinting vary between tissues and developmental stages. In fact, imprinting patterns are consistent across tissues and development; what varies is how cell types read those imprinting marks and whether they express the imprinted alleles differently. This incorrect statement has been removed from the article. The Scientist regrets the error.