Structural variations common in human genome

At least 12 percent of the genome is made of regions that vary in number across individuals, according to four new papers

Regions where large segments of DNA are gained or lost cover at least 12 percent of the human genome, far more than previously thought, an international consortium of scientists reports in four papers in three journals this week.

The findings could help scientists identify new traits with medical or other phenotypic relevance and understand human evolution, Wan Lam at the British Columbia Cancer Research Center in Vancouver, who did not participate in the studies, told The Scientist. "The work is beautiful," added Evan Eichler at the University of Washington in Seattle, who was not a coauthor.

The consortium focused on copy number variable regions (CNVRs), which are DNA segments 500 base pairs or larger found in varying numbers in different people. In their Nature paper, coauthor Matthew Hurles at the Wellcome Trust Sanger Institute in Cambridge, England, and his colleagues report the first comprehensive map of CNVRs.

The researchers analyzed DNA samples from 270 individuals from four populations with ancestry in Africa, Europe or Asia who were part of the International HapMap Project. Specifically, they screened for copy number variants (CNVs) using two complementary technologies detailed in the consortium's new papers in Genome Research: single-nucleotide polymorphism (SNP) genotyping arrays and clone-based comparative genomic hybridization (CGH). SNP genotyping arrays assayed samples for more than 500,000 known SNPs, looking for stretches of adjacent SNPs that occurred at levels that differed from expected ratios. Clone-based CGH compared samples from 269 of the HapMap individuals against DNA from one HapMap individual chosen as a reference standard, looking for differences in copy number among more than 26,000 large-insert cloned segments that span nearly all of the currently sequenced portion of the genome.
The reference individual was male, to allow detection of CNVs on the Y chromosome, and was the son of two HapMap participants, to maximize prior information on his CNVs.

SNP genotyping arrays and clone-based CGH found CNVs with an average size of 206 and 341 kilobases, respectively. While SNP genotyping was better at detecting smaller CNVs, clone-based CGH had large specific targets, leading to less background noise in results and making it better at detecting CNVs in more complex regions of the genome, such as those where two or more segments are duplicated.

The two technologies combined found 1,447 CNVRs, covering an eighth of the human genome. "We could have culled more CNVs from our data," Hurles told The Scientist via email. However, he said they aimed conservatively "so that investigators are not overwhelmed by the false positives that are inevitable in any study of this nature."

The CNVRs contained 2,908 genes, 285 of which are linked to disease. They also contained 67 non-coding RNAs, 50 ultraconserved elements and 130,353 conserved non-coding sequences. Notably, CNVRs encompass about two to three times as much nucleotide content per genome as SNPs do, Hurles said.

"I suspect our paper will be a wakeup call to a lot of scientists that they need to start incorporating a 'CNV-analysis step' in their study designs if they want to fully understand their data," Stephen Scherer at the Hospital for Sick Children in Toronto, coauthor on all four papers, told The Scientist via email.

The consortium found that only about 10 to 15 percent of copy number variation occurred between populations. "This is as a result of our recent common ancestry in Africa," Hurles said. The researchers suggest these differences could explain the increased prevalence of some diseases in certain populations.
For instance, prior studies have shown that one CNV the consortium confirmed, UGT2B17, is a gene linked to an increased risk of prostate cancer in populations of African and European descent.

The consortium is now expanding its studies to thousands of healthy individuals from populations outside the HapMap collection.

"We think that we are at the stage where we can confidently detect CNVs of 50 kilobases or more, but we think that the overall aim must be to increase resolution by two orders of magnitude such that we can detect CNVs of 500 base pairs. Our consortium is currently pursuing this objective," Hurles said.

A large number of CNVs remain to be found, Lam agreed in an email to The Scientist. Lam's team has found many CNVs that are not seen in the consortium's new papers, and vice versa.

"The important next steps will be to identify the specific variants within these regions, to learn what the alleles are -- zero copies? three copies? -- and to develop ways to type these variants in large patient cohorts so that researchers can see whether these variants are associated with disease risk," Steven McCarroll at Massachusetts General Hospital in Boston, who was not a coauthor, told The Scientist via email.

Scherer added that the researchers would "also like to have a much better idea of the precise start and end points of the CNVs so we can better understand the underlying mechanism -- that is, are they random events or DNA sequence-driven?" The scientists aim to better understand "the new mutation rate of CNVs and how this might be dependent on the region of the genome involved," he said.

In the future, the most sensitive way to identify all kinds of DNA variation, including SNPs, CNVs and inversions of sequences, may be to directly compare whole genomes, according to the consortium.
In their Nature Genetics paper, the consortium computationally compared whole genome sequences assembled by the two human genome projects and confirmed more than 1.5 million SNPs and 240 variable regions, including CNVs and inversions.

Charles Q. Choi
cchoi@the-scientist.com

Links within this article:

Wan Lam http://www.bccrc.ca/cg/people_wanlam.html

Evan Eichler http://eichlerlab.gs.washington.edu

J.P. Roberts. "Looking at Variation in Numbers," The Scientist, March 14, 2005. http://www.the-scientist.com/article/display/15302/

R. Redon et al. "Global variation in copy number in the human genome," Nature 444: 444-54, Nov. 23, 2006. http://www.nature.com/nature/journal/v444/n7118/full/nature05329.html

Matthew Hurles http://www.sanger.ac.uk/Teams/Team29

A. Constans. "A Practical Guide to the HapMap," The Scientist, Feb. 1, 2006. http://www.the-scientist.com/article/display/23052

D. Komura et al. "Genome-wide detection of human copy number variations using high density DNA oligonucleotide arrays," Genome Research, published online ahead of print Nov. 22, 2006. http://www.genome.org/cgi/content/abstract/gr.5629106v1

H. Fiegler et al. "Accurate and reliable high-throughput detection of copy number variation in the human genome," Genome Research, published online ahead of print Nov. 22, 2006. http://www.genome.org/papbyrecent.shtml

J.L. Peirce. "Following Phylogenetic Footprints," The Scientist, September 27, 2004. http://www.the-scientist.com/article/display/14954

Stephen Scherer http://www.the-scientist.com/article/display/21257

R. Khaja et al. "Genome assembly comparison identifies structural variants in the human genome," Nature Genetics, published online ahead of print Nov. 22, 2006. http://www.nature.com/ng/journal/vaop/ncurrent/abs/ng1921.html

V.K. McElheny. "The Human Genome Project +5," The Scientist, Feb. 1, 2006. http://www.the-scientist.com/article/display/23065