Opinion: Greater Diversity Is Needed in Human Genomic Data

Researchers must ensure that the inequality seen in most of today’s genomic studies and databases is corrected.

Written by Charles Lee

Since the Human Genome Project was completed, scientists around the world have worked tirelessly to populate the sequence and variant databases that have become the crown jewels of genomics research. These databases are now brimming with genomic information, but unfortunately, they are greatly biased towards individuals of European descent. For example, 70 percent of the data stored in the Genome-wide Association Study (GWAS) Catalog, a publicly available resource that contains manually curated array-based data from more than 2,800 published studies, is from individuals of European descent. The other 30 percent comes from individuals with Asian ancestry. Similarly, the database of Genotypes and Phenotypes (dbGaP) and the Genome Aggregation Database (gnomAD) are lacking data from individuals hailing from the Middle East, Central Asia, Oceania, and Africa.

European countries such as Iceland, Estonia, and the UK are among the first to launch countrywide whole genome sequencing efforts. Hence, it’s no surprise that ...
