Minding the human genome gap


Written by Elie Dolgin

Many of the unsequenced gaps in the human genome arise because their DNA sequences cannot be read by the bacteria used in traditional sequencing methods, according to a linkurl:paper;http://genomebiology.com/2009/10/6/R60 published last week in __Genome Biology__. In the paper, a team of Broad Institute researchers report a simple, new way to fill in those gaps using next-generation sequencing technology.
"There are some regions of the genome which bacteria don't like," linkurl:Lincoln Stein,;http://www.oicr.on.ca/Research/stein.htm a bioinformatician at the Ontario Institute for Cancer Research in Toronto who was not involved in the research, told __The Scientist__. "Those have been unsequenceable until recently when the next generation of sequencing machines, which do not require a cloning step, became available."
The human genome was declared "finished" nearly a decade ago, but the 3 billion nucleotide sequence is still riddled with holes. Broad computational biologist linkurl:Manuel Garber;http://broad.harvard.edu/compbio/team.html and his colleagues set out to fix this. They focused on special types of gaps in the human genome called non-structural gaps -- recalcitrant stretches of DNA that represent around half of the incomplete sections of the genome's gene-rich euchromatic regions. Unlike so-called structural gaps such as duplications with near-identical sequences, these unfinished portions are not amenable to the bacterial clone-based sequencing approaches taken by the Human Genome Project.
Garber's team restricted themselves to three such gaps on chromosome 15, designed six primer pairs enveloping the genomic gulfs, and applied large-scale, parallel, shotgun sequencing approaches developed by Roche's linkurl:454 Life Sciences;http://www.454.com/ to correctly piece together the missing bits. The same "very straightforward" method can now be used to plug many of the remaining holes dotted throughout the rest of the human genome, said Stein.
However, linkurl:Ian Dunham,;http://www.ebi.ac.uk/Information/Staff/person_maintx.php?s_person_id=914 a genome researcher at the European Bioinformatics Institute in Cambridge, UK, who linkurl:last year;http://genomebiology.com/2008/9/5/R78 used a combination of shotgun sequencing and bacterial cloning techniques to close many of the outstanding gaps on chromosome 22, noted that "this method will only touch [around] 50% of the gaps. The remainder will require alternate, quite detailed mapping approaches," he wrote in an email.
As of the human genome's most recent iteration -- the linkurl:37th "build,";http://www.ncbi.nlm.nih.gov/mapview/stats/BuildStats.cgi?taxid=9606&build=37&ver=1 which was released in February and includes Garber's then-unpublished results -- the sequence now contains fewer than 200 euchromatic gaps. A mix of technologies should bring this number to zero, as well as allow researchers to polish off other near-finished genomes, such as mouse and dog, Garber said.
Garber found that all three uncloneable gaps had DNA sequences that alternated between pyrimidine and purine nucleotides. This arrangement is known to make DNA twist in a reverse orientation and form a left-handed helix, which forces the strand to be unwound and rewound before it can be read. The probability of finding three such regions by chance alone was vanishingly small, Garber's team found. Thus, he suspects that the hitherto unsequenced regions were impervious to clone-based approaches because the sequence composition made the bacterial vector "allergic" to the foreign human DNA. "These [sequences] are definitely toxic to bacteria." Stein called this "a plausible explanation of why these regions aren't cloneable."
But is dotting the i's and crossing the t's of the Human Genome Project worth the effort? It's an "intellectually satisfying goal or milestone with pretty symbolic significance," said Stein.
But he questioned the ever-dwindling utility of completing a composite genome built from many different people's DNA. "Finishing this genome isn't telling us anything about all the other human genomes," such as the diploid sequences from individuals including Craig Venter and James Watson as well as the many anonymous members of the linkurl:1000 Genomes Project.;http://www.1000genomes.org/page.php "Finishing the genome is a symbolic step that I think we should continue to strive for, but we need to do it quickly before it becomes irrelevant," he said.
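For readers who want to look for the alternating pyrimidine-purine pattern described above in a sequence of their own, here is a minimal Python sketch. The function name and the choice of "longest alternating run" as the measure are illustrative assumptions; this is not the analysis used in the Genome Biology paper.

```python
# Purines are A and G; pyrimidines are C and T.
PURINES = set("AG")
PYRIMIDINES = set("CT")


def longest_alternating_run(seq: str) -> int:
    """Return the length of the longest stretch of seq in which
    purines and pyrimidines strictly alternate (hypothetical helper)."""
    best = 0
    run = 0
    prev = None  # class of the previous base: "R", "Y", or None
    for base in seq.upper():
        if base in PURINES:
            cls = "R"
        elif base in PYRIMIDINES:
            cls = "Y"
        else:
            cls = None  # N or any other symbol breaks the run
        if cls is None:
            run = 0
        elif prev is not None and cls != prev:
            run += 1  # the alternation continues
        else:
            run = 1  # a new run starts at this base
        prev = cls
        best = max(best, run)
    return best
```

A long run relative to the sequence length would flag a stretch worth a closer look; deciding what counts as "long" is left open here.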
**__Related stories:__**
*linkurl:Refining the genome;http://www.the-scientist.com/article/display/22461/ [21st October 2004]
*linkurl:Finishing fourteen;http://www.the-scientist.com/article/display/20977/ [2nd January 2003]
*linkurl:Shotgun sequencing comes of age;http://www.the-scientist.com/article/display/20975/ [31st December 2002]