© ISTOCK.COM/KMLMTZ66; © ISTOCK.COM/MACROVECTOR
Winston Yan’s graduate school project involved using a variant of the CRISPR-Cas9 genome editing system to knock down a gene that regulates cholesterol in mice. “The real goal was to eventually pave the way for therapeutic uses,” says Yan, who recently completed his graduate work in Feng Zhang’s laboratory at MIT. That’s when he encountered, firsthand, the problem of CRISPR’s off-target effects.
CRISPR allows researchers to quickly and efficiently make targeted cuts to genomes. Its specificity and ease of use give the gene-editing tool great potential for removing defective genes and treating genetic diseases or cancer, or for editing the genomes of crop plants to increase their yield or disease resistance.
To wield this power, however, researchers will first have to overcome one of CRISPR’s main limitations—its propensity to cut not just at its target site, but also at unintended sites with similar...
Developing methods to detect CRISPR off-target mutations has been a challenge, but over the past few years, researchers have come up with a variety of new approaches. Here, The Scientist leads a guided tour of when and how to use them.
In silico prediction vs. unbiased detection
The specificity of CRISPR editing can vary widely. A key driver of CRISPR’s precision is guide RNA, an RNA sequence that guides the Cas9 nuclease to cut at a specific location on the genome. “Even in the same organism, choosing one guide RNA might let you get away with no off-target mutations at all, while a different guide RNA might give you up to 150 off-targets,” says Julia Jansing, a PhD student in Luisa Bortesi’s lab at RWTH Aachen University in Germany.
Researchers have also found or engineered other nucleases that are more specific than the commonly used version of Cas9 and cut at fewer off-target sites. These include the Cpf1 nuclease and a high-fidelity version of Cas9.
But CRISPR, even with optimized guide RNA or nucleases, can still cause inadvertent changes to the genome. One way to predict CRISPR’s off-target activity is with computational algorithms that identify likely off-target sites based on the sequence of the guide RNA. Researchers can then use targeted sequencing after they attempt a genomic cut to check for mutations at those predicted off-target sites. Each algorithm has its own secret sauce, and their results don’t always align. “Depending on the quality of your prediction tool, you get a varying degree of reliability,” says Jansing. In silico prediction tools have not been systematically compared, so researchers could choose one based on their preferred user interface or whether it supports the genome of their species of interest—researchers working on mice or humans have more tools to pick from than those working on tomatoes, for example.
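To make the idea of sequence-based prediction concrete, here is a minimal Python sketch. It is not the algorithm of any tool named in this article. It assumes the simplest possible model: a candidate off-target is any 20-nucleotide stretch that differs from the guide by at most a few mismatches and sits next to an NGG motif (the PAM that the commonly used Cas9 recognizes). The function names, the mismatch cutoff, and the toy sequences are all hypothetical, and real tools also weigh where the mismatches fall.

```python
from typing import Iterator, List, NamedTuple, Tuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")
PAM_LENGTH = 3  # assumed NGG PAM, read 5'->3' right after the protospacer


class Hit(NamedTuple):
    position: int    # 0-based start of the site (protospacer + PAM) in the input
    strand: str      # "+" or "-"
    site: str        # protospacer as read 5'->3' on its own strand
    mismatches: int  # number of positions that differ from the guide


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def scan_strand(guide: str, seq: str, max_mismatches: int) -> Iterator[Tuple[int, str, int]]:
    """Yield (start, protospacer, mismatches) for windows followed by NGG."""
    n = len(guide)
    for i in range(len(seq) - n - PAM_LENGTH + 1):
        pam = seq[i + n:i + n + PAM_LENGTH]
        if pam[1:] != "GG":
            continue
        protospacer = seq[i:i + n]
        mm = hamming(guide, protospacer)
        if mm <= max_mismatches:
            yield i, protospacer, mm


def find_candidate_sites(guide: str, genome: str, max_mismatches: int = 3) -> List[Hit]:
    """Scan both strands; return hits sorted by mismatch count, then position."""
    guide, genome = guide.upper(), genome.upper()
    window = len(guide) + PAM_LENGTH
    hits = [Hit(i, "+", site, mm)
            for i, site, mm in scan_strand(guide, genome, max_mismatches)]
    # Scan the reverse complement and convert back to input coordinates.
    for i, site, mm in scan_strand(guide, reverse_complement(genome), max_mismatches):
        hits.append(Hit(len(genome) - i - window, "-", site, mm))
    return sorted(hits, key=lambda h: (h.mismatches, h.position))


if __name__ == "__main__":
    guide = "GACGTTACCGGATCAGTCAA"      # made-up 20-nt guide
    off_target = "CTCGTTACCGGATCAGTCAA"  # same sequence with two mismatches
    toy_genome = "TTT" + guide + "AGG" + "TTTT" + off_target + "TGG" + "TTT"
    for hit in find_candidate_sites(guide, toy_genome):
        print(hit)
```

A list produced this way only tells a researcher where to look. Each candidate still has to be checked by targeted sequencing, and a site that fits none of the assumed rules will never appear on the list.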
For basic research applications, such as making a mutant cell line, in silico prediction may be the most practical option. It is relatively quick, easy, and cheap, while still generally being accurate enough to ensure that off-target mutations aren’t too numerous and don’t confound the interpretation of experimental results.
But such prognostication is far from foolproof. “We’re putting this intrinsic bias on where we’re looking because we assume we know how Cas9 cuts,” says Yan. “But from many different studies, we know that this isn’t the case.” In addition, computational predictions can be overly broad. “You’re never going to do a PCR on thousands of sites to look for a minor amount of off-targets,” he adds.
To overcome the current limitations of in silico prediction, researchers have developed multiple in vitro and cell-based techniques to detect CRISPR off-target mutations in an unbiased, genome-wide fashion. Such approaches are crucial when developing therapeutics, or even in preclinical studies, because these methods can detect rare and unforeseen off-target edits that could have potentially harmful effects on a patient, for instance by activating an oncogene. “The regulators are probably going to ask for some kind of genome-wide, off-target analysis,” says Keith Joung, a pathologist at Massachusetts General Hospital and professor of pathology at Harvard Medical School.
In vitro genome-wide assays
When Cas9 and similar nucleases cut the genome, they create double-stranded breaks. Most in vitro assays that pinpoint off-target effects use Cas9 or other nucleases to cleave cell-free genomic DNA, then use software to detect double-stranded breaks in sequencing data.
These assays are generally extremely sensitive and can detect off-target sites that are mutated at frequencies lower than 0.1 percent. They can be used to run large-scale screens for off-target effects, or be adapted to a clinical setting by extracting genomic DNA from patients. But because they use cell-free genomic DNA, they can’t predict mutations that occur inside cells.
Digested genome sequencing, or Digenome-seq, is an in vitro assay that has become increasingly popular since its introduction in 2015. It has a simple two-step protocol: in vitro Cas9 cleavage followed by next-gen sequencing. Two newer methods, CIRCLE-Seq and SITE-Seq, are slightly more complex but are also more sensitive, as they enrich for nuclease-cleaved genomic DNA before sequencing. Researchers are continuing to improve the accuracy and throughput of these methods. “I think in the long term, in vitro is going to be the way to go, but that’s an evolving space right now,” says Joung.
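As a rough illustration of how software can find breaks in sequencing data, the sketch below rests on one assumption: randomly sheared DNA yields reads that start at scattered positions, whereas a nuclease cut yields many reads that begin at exactly the same position. The Python code counts read 5′ ends and flags positions where they pile up. The function name, the input format, and the threshold are hypothetical, and published pipelines use more elaborate scoring than this.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Site = Tuple[str, int]  # (chromosome, 0-based position of a read's 5' end)


def find_cleavage_candidates(read_starts: Iterable[Site],
                             min_reads: int = 5) -> List[Tuple[Site, int]]:
    """Return positions where at least `min_reads` reads share a 5' end.

    Results are sorted by pile-up size (largest first), then by position.
    """
    counts = Counter(read_starts)
    candidates = [(site, n) for site, n in counts.items() if n >= min_reads]
    return sorted(candidates, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Toy data: scattered background reads plus one pile-up at chr1:1000.
    background = [("chr1", p) for p in range(0, 50)]
    pile_up = [("chr1", 1000)] * 6
    for site, n in find_cleavage_candidates(background + pile_up):
        print(site, n)
```

In practice the read positions would come from aligned sequencing data, and the threshold would have to be tuned to sequencing depth. The value of 5 here is arbitrary.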
Cell-based genome-wide assays
Cell-based assays for off-target detection use different techniques to identify where Cas9 or other nucleases cleave genomic DNA in cells and create double-stranded breaks. One advantage of this approach over in vitro methods is that it can identify off-target sites in a specific cell type and under particular experimental conditions.
Cell-based assays detect double-stranded breaks that occur endogenously in addition to those created by the CRISPR nuclease. Their sensitivity can vary depending on the characteristics of the cells involved, including how easy it is to culture and transfect them and how efficiently they repair double-stranded breaks.
Genome-wide Unbiased Identification of Double-stranded breaks Enabled by Sequencing (GUIDE-Seq) is a widely used and highly sensitive cell-based assay that can detect off-target sites that occur at a frequency of 0.1 percent in a cell population. Small, double-stranded oligonucleotides are used to tag double-stranded breaks created by Cas9 or other nucleases. The tagged genomic sites are then PCR amplified and sequenced to map the double-stranded breaks. One drawback of GUIDE-Seq is that some primary cells can be difficult to transfect with the oligonucleotides.
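To get a feel for what a 0.1 percent frequency means in terms of sequencing reads, consider a back-of-the-envelope Python sketch. It does not describe how GUIDE-Seq arrives at its sensitivity. It assumes a simpler situation: a site is sequenced to a given depth, each read independently carries the edit with probability equal to the edit frequency, and sequencing errors are ignored. The function names and the 95 percent confidence level are illustrative choices.

```python
import math


def prob_at_least(k: int, depth: int, freq: float) -> float:
    """P(X >= k) for X ~ Binomial(depth, freq): chance of seeing k or more edited reads."""
    below = sum(math.comb(depth, i) * freq**i * (1 - freq)**(depth - i)
                for i in range(k))
    return 1.0 - below


def depth_for_one_read(freq: float, confidence: float = 0.95) -> int:
    """Smallest depth at which at least one edited read appears with the given confidence.

    Solves 1 - (1 - freq)**depth >= confidence for depth.
    """
    return math.ceil(math.log(1 - confidence) / math.log(1 - freq))


if __name__ == "__main__":
    freq = 0.001  # 0.1 percent of cells carry the off-target edit
    depth = depth_for_one_read(freq)
    print(depth, prob_at_least(1, depth, freq))
    # Requiring several supporting reads, as a guard against noise, costs more depth.
    print(prob_at_least(3, depth, freq))
```

Under these assumptions, roughly 3,000 reads are needed just to see a single edited read with 95 percent confidence. Asking for three supporting reads at that same depth lowers the odds considerably, which is one reason rare off-target events are expensive to confirm.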
Linear amplification–mediated high-throughput genome-wide translocation sequencing (LAM-HTGTS) detects more than just breaks; it identifies genomic rearrangements resulting from these breaks. Most double-stranded breaks created by the Cas9 nuclease are repaired by the cell, but some fuse to ends created by other double-stranded breaks, resulting in chromosomal translocations between off-target and on-target breaks. By detecting these translocations, LAM-HTGTS can provide a sense of the collateral damage from accumulating off-target edits, including their effect on genome instability.
“With this assay, you’re measuring the end events from a biological process,” says Richard Frock, a molecular geneticist at Stanford University who helped develop this method as a postdoc in the laboratory of Frederick Alt at Harvard Medical School.
Breaks Labeling In Situ and Sequencing (BLISS), developed by Magda Bienko and Nicola Crosetto of the Karolinska Institute in Sweden in collaboration with Zhang and Yan at MIT, profiles off-target edits by biochemically labeling double-stranded breaks in fixed cells, thus directly capturing the number of breaks in primary cells. The sensitivity of this assay can depend on when the cells are harvested, and different tissues might have different optimal harvest points.
“I think our assay has a lot of potential for use in the clinic because it interrogates breaks in the natural environment,” says Crosetto. “I’m very optimistic that we will see useful translational applications.”
Deciding on an assay
There’s still no gold standard among these techniques, and for now researchers just have to pick the one that makes the most sense for their research. “It’s a question of which method you have the equipment and the know-how for,” says Jansing. Those who want to be extra thorough could use a combination of methods.
None of these methods require special reagents for their pre-sequencing steps and all generally take only a few days to a week to finish. Assays can cost from a few hundred to a few thousand dollars depending on the number of samples, and the cost of the different assays is broadly comparable, says Crosetto.
But researchers will need access to a next-gen sequencing service or machine. Sequencing costs can vary widely depending on the number of samples and the sequencing depth, and costs are going down. But sequencing is still the most expensive part of these assays. “It’s definitely in the thousands of dollars” to perform the sequencing involved in one of these assays, says Jansing. “It’s not something you do for fun.”
Researchers employing these assays will also need bioinformatics expertise to analyze the next-gen sequencing data. “The biggest challenge is the bioinformatics analysis, because there is no off-the-shelf, commercial software package to do the analysis,” says Joung. He and his collaborators wrote their own code to analyze GUIDE-Seq and CIRCLE-Seq results. Similarly, Frock’s lab wrote custom Perl scripts for their LAM-HTGTS pipeline. “Each assay is a little bit different in its approach, so coming up with a universal pipeline to analyze these things is going to be a little bit challenging,” he says.
Meanwhile, Joung and his colleagues are working on a commercial solution. They set up a company, Beacon Genomics (now called Monitor Biotechnologies), that plans to offer GUIDE-Seq and CIRCLE-Seq on a fee-for-service basis. Joung says he hopes that the company can make these assays easier to use by allowing researchers to outsource the next-gen sequencing and bioinformatics analysis steps.
Such commercialized assays could promote the widespread adoption of unbiased, off-target detection—paving the way for improving CRISPR and making it safer for medical and biotech applications. “Having a tool for researchers just out of the box, used within a couple of hours just like a miniprep kit or something,” says Yan, “that would be fantastic.”
|HOW THE ASSAYS STACK UP|
|Selected in silico prediction assays||Resources|
|CRISPR Design Tool||http://crispr.mit.edu/|
|Selected in vitro genome-wide assays||Description||Resources|
|Digenome-seq||Purified genomic DNA is digested with a nuclease and subjected to whole genome sequencing. Off-targets are computationally identified.||Nat Methods, 12:237-43, 2015; Web tool: http://www.rgenome.net/digenome-js/#!|
|CIRCLE-Seq||Purified genomic DNA is sheared and circularized, and residual linear DNA is degraded. The Cas9 nuclease is used to linearize circular DNA containing a Cas9 cleavage site, and the cleaved ends are PCR-amplified and sequenced to identify off-targets.||Nat Methods, 14:607-14, 2017|
|SITE-Seq||Purified genomic DNA is cleaved using Cas9, and Cas9 cleavage sites are biochemically tagged and enriched. Next-gen sequencing and bioinformatics analysis are then used to identify off-target cleavage sites.||Nat Methods, 14:600-606, 2017; Protocol: Protocol Exchange, doi:10.1038/protex.2017.043|
|Selected cell-based genome-wide assays||Description||Resources|
|GUIDE-Seq||Double-stranded breaks created by the Cas9 nuclease are tagged using small double-stranded oligonucleotides, PCR amplified, and sequenced to map the double-stranded breaks.||Nat Biotechnol, 33:187-97, 2015|
|LAM-HTGTS||Chromosomal translocations of off-target and on-target breaks are PCR amplified and analyzed by next-gen sequencing.||Nat Biotechnol, 33:179-86, 2015; Protocol: Nat Protoc, 11:853-71, 2016|
|BLISS||Double-stranded breaks are biochemically labeled, and their downstream sequences are amplified using in vitro transcription and analyzed using next-gen sequencing.||Nat Commun, 8:15058, 2017|