A New Human Reference Genome Represents the Most Common Sequences

Researchers create a “consensus genome” that halves the number of errors when mapping transcripts, although they say the current standard is still a good tool.

Written by Ashleen Knutsen




ABOVE: © ISTOCK.COM, NOBI_PRIZUE

The human reference genome is a DNA blueprint used as a standard for comparison in basic research and clinical settings. Despite years of improvements in accuracy and completeness, it still has limitations that can lead to erroneous findings.

In the current version of the reference, called GRCh38 or Build 38, 93 percent of the sequence comes from just 11 individuals and 70 percent from just one man, resulting in a lack of diversity and at least 300 million missing letters of DNA. In addition, a small percentage of the genes in the reference genome are represented by alleles that are not the most common forms of the genes.

To address these issues, some scientists are developing a new reference, called the pangenome or graph genome, that contains a vast collection of genomes representing all possible DNA sequences for any given locus. ...
