A defining shift in molecular biology over the past decade has been the application of whole genome and whole transcriptome sequencing methods to single cells. With advances in cell isolation and next generation sequencing, researchers no longer need to average out the signal from multiple cells in a population, but can instead study the DNA, RNA, proteins, and chromatin cell by cell.
Single-cell genomics, epigenomics, transcriptomics, and proteomics studies have revealed just how much variation there is in gene and protein expression even between genetically identical cells in the same tissue. But most such studies examine only a single layer of information from each cell, which may give a skewed picture, says Pier Federico Gherardini, a biologist at the Parker Institute for Cancer Immunotherapy in San Francisco. “You cannot just measure RNA and assume that things will look the same with proteins.”
Researchers have started to combine multiple layers of information at single-cell resolution. These “multi-omics” techniques can provide a closer look at the variability between cells and more clearly identify specific cells and their functions. Analyzing genomic DNA reveals the single-cell genome, methylome, or chromatin, while analyzing RNA and proteins yields transcriptome and proteome data, respectively.
“Multi-omics is much more powerful than a single layer alone,” says Lia Chappell, a molecular biologist at the Wellcome Sanger Institute in the UK. “You begin to untangle what all that heterogeneity really means and are able to dig deeper into biological mechanism.”
Single-cell multi-omics is particularly useful for examining cells undergoing rapid changes, such as activated immune cells, or cells in very heterogeneous tissues, including tumors, says Christoph Bock, a genome researcher at the CeMM Research Center for Molecular Medicine in Vienna, Austria.
The approach can also identify rare but biologically important cells that are masked in a large population. “A classic example would be those few drug-resistant cells that were already there, but [that] you cannot see with bulk techniques because they’re drowned out a thousandfold to one by more abundant cells,” says Chappell.
Single-cell multi-omics is no cakewalk. There are no commercial kits available yet for any single-cell multi-omics techniques, and many are technically challenging. Researchers must modify existing single-cell protocols so that they’re compatible with multiple types of molecules and take great care to minimize the loss or contamination of samples. Yet the extra effort is worth it, experts say.
The Scientist asked researchers developing single-cell multi-omics techniques to guide us through the available options.
Genome and transcriptome
Simultaneously sequencing both DNA and RNA from the same cell can reveal how genomic variation between single cells might explain variations in their transcript levels. Doing so can also detect DNA mutations with greater accuracy.
In DR-seq (DNA-mRNA sequencing), single cells are lysed and the DNA and RNA in the lysate are simultaneously amplified. The lysate is then split into two halves, one for RNA sequencing (RNA-seq) and the other for genome sequencing. Keeping DNA and RNA together during amplification minimizes the loss of nucleic acids, but risks cross-contamination between the two.
G&T-seq (genome and transcriptome sequencing) physically separates mRNA and DNA from a fully lysed cell using magnetic beads coated with a short oligonucleotide sequence that binds mRNA. DNA and mRNA are then amplified and sequenced separately. Keeping mRNA and DNA separate allows researchers to use their protocol of choice for analyzing each, but some nucleic acids may be lost during the separation. G&T-seq has been automated and is relatively high throughput.
Epigenome and transcriptome
Techniques that assay a cell’s epigenome and transcriptome can reveal how methylation and chromatin accessibility regulate gene expression. “During complex biological processes such as tumorigenesis, heterogeneity will exist in genome, epigenome, and transcriptome simultaneously, and profiling them separately may not work,” says Fuchou Tang, a molecular biologist at Peking University. Genetically identical tumor cells may have different DNA methylation or gene expression patterns, and multi-omics techniques may be needed to unambiguously classify them into subpopulations, he adds.
scM&T-seq (simultaneous single-cell methylome and transcriptome sequencing) is based on G&T-seq, and uses the same procedures to isolate DNA and RNA from a single cell and to amplify and sequence RNA. Genomic DNA is subjected to bisulfite treatment to convert unmethylated cytosines to uracils, and is then amplified and sequenced to assay the methylome.
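The logic of reading a methylome from bisulfite-treated DNA can be illustrated with a minimal Python sketch. After amplification, converted uracils are sequenced as thymines, so a cytosine in the reference that still reads as C was protected by methylation, while one that reads as T was not. The function names here are hypothetical, and the sketch assumes a read that is already aligned to the reference, of equal length, and from the same strand; real pipelines handle alignment, both strands, and sequencing errors.

```python
def call_methylation(reference: str, bisulfite_read: str) -> dict:
    """Return {position: is_methylated} for each cytosine in the reference.

    Assumption: the read is pre-aligned, same length, same strand.
    """
    if len(reference) != len(bisulfite_read):
        raise ValueError("reference and read must be the same length")
    calls = {}
    pairs = zip(reference.upper(), bisulfite_read.upper())
    for pos, (ref_base, read_base) in enumerate(pairs):
        if ref_base != "C":
            continue  # only cytosines carry methylation information here
        if read_base == "C":
            calls[pos] = True   # survived bisulfite treatment: methylated
        elif read_base == "T":
            calls[pos] = False  # converted C -> U, read as T: unmethylated
        # any other base is a mismatch or error, so no call is made
    return calls


def methylation_fraction(calls: dict) -> float:
    """Fraction of called cytosines that are methylated (0.0 if none called)."""
    return sum(calls.values()) / len(calls) if calls else 0.0


if __name__ == "__main__":
    calls = call_methylation("ACGTCCGA", "ACGTTCGA")
    print(calls)
    print(methylation_fraction(calls))
```

In this toy example the reference has three cytosines, and only the one that reads as T in the treated sequence is called unmethylated.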
scNMT-seq (single-cell nucleosome, methylation, and transcription sequencing) builds on scM&T-seq, but single cells are isolated and treated to also probe genome-wide chromatin accessibility. How accessible or protected different genomic locations are can affect gene expression, and researchers using scNMT-seq have uncovered new associations between the epigenome and transcriptome in mouse embryonic stem cells.
In scMT-seq (another method of simultaneously sequencing single cells’ methylomes and transcriptomes) and scTrio-seq (single-cell triple omics sequencing), the cell membrane is selectively lysed to separate mRNA in the cytosol from genomic DNA in the intact nucleus. In scMT-seq, the cell nucleus is collected using a micropipette, whereas in scTrio-seq it is separated by centrifugation. In both cases, genomic DNA is subjected to a modified bisulfite treatment and sequencing method to reveal the methylome, while mRNA from the cell lysate is amplified and sequenced in parallel. “Essentially there is no cross-contamination between the genome and transcriptome data,” says Tang, who helped develop scTrio-seq. scTrio-seq uses the methylome sequence data to computationally assess genomic copy number variation, and the technique has already been used to analyze heterogeneity in human colorectal cancer samples.
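One common way to infer copy number from sequencing data is to compare read depth across genomic bins. The Python sketch below illustrates that general idea only; it is not the scTrio-seq pipeline, which the developers describe in their paper. The function name, the fixed-size bins, the reference counts, and the default ploidy of 2 are all assumptions made for illustration.

```python
from statistics import median


def estimate_copy_number(cell_counts, reference_counts, ploidy=2):
    """Estimate integer copy number per genomic bin from read depth.

    cell_counts and reference_counts are read counts per bin, in the same
    bin order. Bins with no reference coverage are returned as None.
    """
    if len(cell_counts) != len(reference_counts):
        raise ValueError("both inputs must cover the same bins")
    # Depth ratio per bin; skip bins the reference cannot inform.
    ratios = [
        cell / ref if ref > 0 else None
        for cell, ref in zip(cell_counts, reference_counts)
    ]
    valid = [r for r in ratios if r is not None]
    if not valid:
        raise ValueError("no bins with reference coverage")
    # Assume the median bin sits at the baseline ploidy.
    center = median(valid)
    if center == 0:
        raise ValueError("median depth ratio is zero")
    return [
        None if r is None else round(ploidy * r / center)
        for r in ratios
    ]


if __name__ == "__main__":
    cell = [100, 98, 150, 102, 51]
    reference = [100, 100, 100, 100, 100]
    print(estimate_copy_number(cell, reference))
```

With these made-up counts, the third bin has about 1.5 times the typical depth and the last bin about half, which the sketch reads as a gain and a loss of one copy, respectively.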
Proteome and transcriptome
Several techniques can simultaneously assay transcripts and proteins from a single cell. These approaches offer scientists a look at post-transcriptional processes that can result in differences between protein and transcript levels.
In PLAYR (proximity ligation assay for RNA), proteins are labeled with antibodies conjugated to distinct metal isotopes. At the same time, RNA transcripts are bound by isotope-labeled probes. A mass-spectrometry-based method known as mass cytometry is used to measure the isotopes and can simultaneously quantify more than 40 different mRNAs and proteins in thousands of individual cells per second.
CITE-seq (cellular indexing of transcriptomes and epitopes by sequencing) uses oligonucleotide-tagged antibodies to target cell-surface proteins. Single cells are isolated and lysed, and their mRNA and oligo-tagged antibodies are bound to magnetic beads coated with short oligonucleotide sequences. The RNA and antibody tags are amplified and separated by size, and proteins and transcripts are quantified using sequencing. CITE-seq can simultaneously quantify about 100 proteins along with tens of thousands of RNA transcripts, says Marlon Stoeckius, a molecular biologist at the New York Genome Center who helped develop the technique. Stoeckius is working on extending the method to intracellular proteins, although that will require fixing and permeabilizing the cells, which may degrade RNA quality or cause it to leak out of cells.
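Because the antibody tags are read out by sequencing, protein abundance comes down to counting tags per cell. The Python sketch below shows one way that counting step might look. It assumes each read has already been parsed into a cell barcode, a unique molecular identifier (UMI), and an antibody tag sequence; the barcodes, UMIs, tag sequences, and function name are hypothetical and are not taken from the CITE-seq protocol.

```python
from collections import defaultdict

# Hypothetical tag sequences mapped to the surface proteins they label.
TAG_TO_PROTEIN = {"AAGGCT": "CD3", "TTCCGA": "CD19"}


def count_protein_tags(reads, tag_to_protein):
    """Count antibody tags per cell, collapsing PCR duplicates by UMI.

    reads: iterable of (cell_barcode, umi, tag_sequence) tuples.
    Returns {cell_barcode: {protein: count}}.
    """
    seen = set()
    counts = defaultdict(lambda: defaultdict(int))
    for cell, umi, tag in reads:
        protein = tag_to_protein.get(tag)
        if protein is None:
            continue  # tag does not match any known antibody
        key = (cell, umi, tag)
        if key in seen:
            continue  # same molecule amplified more than once
        seen.add(key)
        counts[cell][protein] += 1
    return {cell: dict(proteins) for cell, proteins in counts.items()}


if __name__ == "__main__":
    reads = [
        ("cell1", "u1", "AAGGCT"),
        ("cell1", "u1", "AAGGCT"),  # duplicate of the read above
        ("cell1", "u2", "AAGGCT"),
        ("cell1", "u3", "TTCCGA"),
        ("cell2", "u1", "TTCCGA"),
        ("cell2", "u2", "GGGGGG"),  # unknown tag, ignored
    ]
    print(count_protein_tags(reads, TAG_TO_PROTEIN))
```

The resulting per-cell protein counts can then sit alongside the transcript counts for the same cell barcodes.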
REAP-seq (RNA expression and protein sequencing assay) is similar to CITE-seq, and also uses oligonucleotide-conjugated antibodies to measure both cellular protein and transcript levels using a sequencing-based readout. Compared to PLAYR, both REAP-seq and CITE-seq can assay a larger number of transcripts, but on fewer cells at a time. “I think they’re complementary approaches,” says Gherardini, a lead developer of PLAYR. “If you have a big cohort or a clinical study, something like PLAYR is much, much more cost effective,” he says.
One advantage of CITE-seq is that the protein quantification is performed on a fraction that is normally discarded in single-cell RNA sequencing prep, so “there is no detriment to the quality of the RNA sequencing library,” says Peter Smibert, manager of the Technology Innovation Lab at the New York Genome Center. “We think that any sort of situation where people are using RNA-seq as a readout, we see no reason not to use CITE-seq instead.”
Multiple choices of multi-omics
With so many assays to choose from, researchers will have to decide which ones to use based on the biological question they’re asking, as well as on how expensive, labor-intensive, and technically demanding a given technique is. “It typically takes a team of technologists, a computational person, and a biologist who knows the experimental system to do these types of projects well,” says Bock.
|Technique||Omics layers||Cell isolation||Potential loss of nucleic acids||Cell Throughput||Automation||Website/Paper|
|DR-seq||Genome, Transcriptome||Mouth pipette||Low risk of|
|G&T-seq||Genome, Transcriptome||Flow cytometry||Some risk of mRNA and|
|Flow cytometry||Some risk of mRNA and|
|Microcapillary pipette||Loss of some cytoplasmic and all nuclear mRNA molecules|
methylation, Copy number
|Mouth pipette||Loss of nearly half of and all nuclear mRNA molecules|
|Flow cytometry||Some risk of mRNA and|
|CITE-seq||Transcriptome and Proteome||Drop-seq and 10x Genomics Chromium||Low||High||No||citeseq.com; www.nature.com/articles/nmeth.4380; satijalab.org/seurat/|
|REAP-seq||Transcriptome and Proteome||Flow cytometry and 10x Genomics Chromium||Low||High||No||https://www.nature.com/articles/nbt.3973|
|PLAYR||Transcriptome and Proteome||Mass cytometry||Low||Very High||No||www.nature.com/articles/nmeth.3742|
Protocols vary in how they dissociate tissues into single cells. Mouth pipetting and serial dilution are relatively quick, easy, and low-cost methods that can minimize the risk of RNA or protein degradation, but they’re also relatively low throughput. FACS, robotic manipulation, and microfluidics are high throughput, but they’re expensive and can be rougher on cells.
Cost is another factor. The techniques outlined above can range anywhere from a few dollars to a few hundred dollars per sample based on the protocol and the volume of reagents—such as enzymes and antibodies—needed. That doesn’t include the costs of sequencing, which can be limiting as well. “Not many scientists will be able to afford to sequence 100,000 single genomes from single cells,” says Chappell.
Single-cell multi-omics assays can take anywhere from a little more than 24 hours to nearly a week for one batch of samples. Protocols that involve manually separating the nucleus from the cytoplasm tend to be slower and more labor-intensive, whereas FACS-based methods are more convenient and higher throughput.
Bioinformatics expertise is a plus. “For all these types of new assays, you need a little bit of a computational background and little bit of experience with R,” a programming language and software environment for statistical computing and data analysis, says Stoeckius. That may soon change, as more people start to use such assays and companies develop easy-to-use software for analyzing single-cell multi-omics data. For now, computational packages such as Seurat and MOFA have been built to integrate data from two or more omics layers for a single cell.
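At its simplest, integrating two layers means pairing measurements from the same cell and asking how well they agree. The Python sketch below does this for one gene by computing the Pearson correlation between its transcript and protein levels across cells. It is an illustration only and does not reproduce what Seurat or MOFA do; the function names, the nested-dictionary data layout, and the example values are assumptions.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant values")
    return cov / sqrt(var_x * var_y)


def rna_protein_correlation(rna, protein, gene):
    """Correlate RNA and protein levels of one gene across shared cells.

    rna and protein: {cell_id: {gene: level}} from the same experiment.
    Only cells measured in both layers for this gene are used.
    """
    cells = sorted(
        cell for cell in set(rna) & set(protein)
        if gene in rna[cell] and gene in protein[cell]
    )
    rna_levels = [rna[cell][gene] for cell in cells]
    protein_levels = [protein[cell][gene] for cell in cells]
    return pearson(rna_levels, protein_levels)


if __name__ == "__main__":
    rna = {"c1": {"GENE_A": 1.0}, "c2": {"GENE_A": 2.0}, "c3": {"GENE_A": 3.0}}
    protein = {"c1": {"GENE_A": 10.0}, "c2": {"GENE_A": 20.0}, "c3": {"GENE_A": 30.0}}
    print(rna_protein_correlation(rna, protein, "GENE_A"))
```

A correlation well below 1 for a given gene would be one hint that post-transcriptional regulation is decoupling protein from transcript levels, which is the kind of signal a single-layer assay cannot show.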
As single-cell single-omics techniques become more sensitive, accurate, and high throughput, the corresponding multi-omics techniques are likely to improve as well. Researchers are also working on integrating single-cell techniques with additional layers of data, such as functional assays and spatial information from methods like CODEX. Linking a cell’s spatial information to other omics layers could help researchers map different cell types and functions within a tissue.
The ultimate goal is to capture information from all molecules in a single cell—a sort of “omni-omics”—but that’s still several years away. For now, single-cell multi-omics may be well on its way to changing how scientists approach molecular biology. “The take-up [of these techniques] has been impressively quick, so I think in five years’ time this will be seen as obvious and the standard thing to do,” says Chappell.