Ancestry Bias Could Cause CRISPR Screens to Veer Off-target

Current CRISPR guide RNAs are designed based on European reference genomes, leading to false-negative results in cells from people with African ancestry.

Written by Rohini Subrahmanyam, PhD
Illustration depicting DNA genetic engineering, purple background with white drawings of internal organs like liver, lung, kidney, brain, intestine, stomach and a red heart featuring two orange DNA helices being cut by labcoat-covered arms and hands, which are holding different instruments to cut the helices.

CRISPR-Cas9 enables researchers to make precise, targeted edits in the genome to determine gene function. To do this, scientists use guide RNAs, short RNA sequences that lead the Cas9 enzyme to its target region of the genome. The enzyme makes a double-stranded DNA break, which turns off gene function when the cell cannot correctly repair the break.

In 2017, Jesse Boehm, a genomics scientist at the Broad Institute of MIT and Harvard, and his team used multiple guide RNAs to perform a genome-wide screen in cancer cell lines from thousands of patients.1 They used the CRISPR-Cas9 system to target each gene and identified many “cancer dependencies,” genes essential for the cancer cells’ survival that could serve as potential drug targets. Using their findings, they built a cancer dependency map (DepMap). But upon deeper analysis of the DepMap, they found that they had missed many cancer dependencies: CRISPR-Cas9’s targeted DNA-modifying machinery didn’t seem to work equally well across different cell lines.

In a new study, Boehm teamed up with Rameen Beroukhim, a medical oncologist at Dana-Farber Cancer Institute, to investigate guide efficiency. They found that about 1.8 percent of the guides don’t reach their target genes in individual cell lines, with 2.17 percent of guides affected in cell lines of African ancestry, compared to 1.78 percent in cell lines of other ancestry groups.2 Their findings, reported in Nature Communications, demonstrate how ancestry biases can lead scientists to misidentify potentially life-saving cancer drug targets.
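
To put the two reported percentages side by side, the short Python sketch below computes the relative difference between them and converts each to an expected count per 10,000 guides. Only the 2.17 and 1.78 percent figures come from the study as reported here; the function names and the per-10,000 framing are illustrative assumptions, not part of the study's analysis.

```python
# Percentages as reported in the article (share of guides affected per cell line).
african_pct = 2.17
other_pct = 1.78


def relative_increase(a: float, b: float) -> float:
    """Return how much larger a is than b, as a fraction of b."""
    return (a - b) / b


def per_10k(pct: float) -> float:
    """Convert a percentage into an expected count per 10,000 guides."""
    return pct / 100 * 10_000


rel = relative_increase(african_pct, other_pct)
print(f"Relative increase: {rel:.1%}")
print(f"Per 10,000 guides: {per_10k(african_pct):.0f} vs {per_10k(other_pct):.0f}")
```

The absolute gap is under half a percentage point, but in relative terms the rate in cell lines of African ancestry is roughly a fifth higher.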

“We assumed that all organizations, when designing CRISPR guides, paid a reasonable amount of attention to the variation present across patients,” said Boehm. “Because if you don't design your guides well, those guides can't cut, and they can't produce a signal.”

“Even though there had been some attention to this problem, we were surprised at the magnitude of the bias that remained,” he added.

Boehm, Beroukhim, and others have worked for many years on understanding cancer dependencies, the “Achilles’ heel” of different types of cancers, using cell lines from cancer patients. Initially, they looked into how somatic variations in the genome, the genetic mutations that can arise in a person’s DNA over the course of their life, can make some patients more vulnerable to cancer. More recently, they investigated germline variations, which are inherited genetic variations found in every cell of the body, to see if those influence cancer dependencies. When they systematically analyzed the thousands of patient-derived cell lines, they found that the patients’ ancestries (whether they were of African, European, or Asian descent) seemed to influence the cancer dependencies found in their cell lines. In particular, European or East Asian ancestries were associated with many more cancer dependencies than African ones.

But upon delving deeper, they realized that most of these associations between ancestry and cancer dependencies were not real. They were due to experimental artifacts.

“The methods that are being used to determine dependencies of cancers, [they] can often miss dependencies in people of non-European ancestries because they're designed, mostly, for people of European ancestry,” explained Beroukhim. “And so, you get negative results in people of non-European ancestries, and that is particularly marked for people of recent African ancestry. The methods go wrong the most in that population because people of African ancestry have the largest amount of diversity in their genomes.”

One such method is guide RNA design, which is mostly performed using European genomes as a reference. Because the guide RNAs don’t account for the large genetic variation across different ancestries, they do not work as well in cell lines from patients of non-European ancestries as in those from patients of European ancestries. The resulting false negatives can hide real dependencies, misleading scientists as they design drugs or choose patients for clinical trials.
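
The mechanism can be illustrated with a toy example. The Python sketch below is a minimal illustration, not the study's pipeline: it compares a guide designed against a reference sequence with the same site carrying a single inherited variant. The 20-letter sequence, the variant position, and the helper names `count_mismatches` and `apply_snv` are all hypothetical, and the guide is written in the DNA alphabet for simplicity.

```python
def count_mismatches(guide: str, site: str) -> int:
    """Count positions where the guide differs from the genomic target site."""
    if len(guide) != len(site):
        raise ValueError("guide and site must be the same length")
    return sum(g != s for g, s in zip(guide.upper(), site.upper()))


def apply_snv(site: str, pos: int, alt: str) -> str:
    """Return the site with a single-nucleotide variant at 0-based position pos."""
    return site[:pos] + alt + site[pos + 1:]


# Hypothetical 20-nucleotide guide designed against the reference genome.
guide = "GACCTGAAGTTCATCTGCAC"
reference_site = guide  # a perfect match by construction

# Hypothetical germline variant: the C at index 17 is a T in this cell line.
variant_site = apply_snv(reference_site, 17, "T")

print(count_mismatches(guide, reference_site))  # 0
print(count_mismatches(guide, variant_site))    # 1
```

A guide that matches the reference perfectly can still carry a mismatch in a cell line whose genome differs from the reference at that site, and a screen that relies on that guide may then report no effect for a gene the cells actually depend on.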

Another issue is that more than 90 percent of the cell lines used to test cancer dependencies also come from European and East Asian patients. According to Boehm, this is a call for the preclinical research community to partner with patients from African and Middle Eastern ancestries to ensure that the cell lines are reflective of patients everywhere.

“When we use CRISPR or other genome engineering tools, we need to make sure that our reagents are ancestry agnostic and don't produce a source of bias,” Boehm added. “We have to look at—not only the old Eurocentric reference genome—but the newer pan-genome that's comprised of multiple ethnicities and multiple populations.”3

Boehm, Beroukhim, and their team have also designed a website called Ancestry Garden, based on data from the Genome Aggregation Database (gnomAD), to help users design guide RNAs that have the least ancestry bias. “It allows you to look at guide RNAs [from standard libraries] and [check] how often they are mismatched on people's DNA across different ancestries,” said Beroukhim. They also developed their own guide RNA library, which they strive to keep as bias-free as possible.
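
One way such a per-ancestry check could be approximated is sketched below in Python. This is an assumption-laden illustration and not Ancestry Garden's actual method: it treats the variants overlapping a guide's target site as independent and estimates the chance that a single chromosome copy carries at least one of them. The allele frequencies and the function name `haplotype_mismatch_prob` are hypothetical, not real gnomAD values.

```python
from math import prod


def haplotype_mismatch_prob(allele_freqs: list[float]) -> float:
    """Probability that one chromosome copy carries at least one variant
    inside the guide's target site, assuming the variants are independent."""
    return 1 - prod(1 - af for af in allele_freqs)


# Hypothetical allele frequencies of variants overlapping one guide's target
# site, keyed by ancestry group. These are made-up numbers for illustration.
site_variant_afs = {
    "African": [0.12, 0.01],
    "European": [0.005],
    "East Asian": [],
}

for group, afs in site_variant_afs.items():
    print(f"{group}: {haplotype_mismatch_prob(afs):.4f}")
```

Under these made-up inputs, the same guide would be mismatched on about 13 percent of chromosomes in one group and almost none in another, which is the kind of imbalance a designer would want to screen out before building a library.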

“[The study] is a good example of how these kinds of biases in reference databases, that scientists use routinely while designing experiments or thinking about treatments, affects research,” said Jian Carrot-Zhang, a computational geneticist at Memorial Sloan Kettering Cancer Center. “How it creates biases in not just CRISPR screening, but both basic and translational research.”

  1. Tsherniak A, et al. Defining a cancer dependency map. Cell. 2017;170(3):564-576.e16.
  2. Misek SA, et al. Germline variation contributes to false negatives in CRISPR-based experiments with varying burden across ancestries. Nat Commun. 2024;15(1):4892.
  3. Liao WW, et al. A draft human pangenome reference. Nature. 2023;617(7960):312-324.

Meet the Author

  • Rohini Subrahmanyam, PhD

    Rohini Subrahmanyam completed her PhD in Biology from the National Center for Biological Sciences in Bangalore, India. During PhD, she studied neuronal defects in a rat model of autism. For postdoctoral research at Harvard University, she used human embryonic stem cells to study cortex development using brain organoids. Now back in Bangalore, she likes writing about biology, from interesting, absurd creatures to important medical discoveries.