Structural variations common in human genome

Regions where large segments of DNA are gained or lost cover at least 12 percent of the human genome, far more than previously thought, an international consortium of scientists reports in four papers in three journals this week.

The findings could help scientists identify new traits with medical or other phenotypic relevance and understand human evolution, Wan Lam at the British Columbia Cancer Research Center in Vancouver, who did not participate in the studies, told The Scientist. "The work is beautiful," added Evan Eichler at the University of Washington in Seattle, who was not a coauthor.

The consortium focused on copy number variable regions (CNVRs), which are DNA segments 500 base pairs or larger found in varying numbers in different people. In their Nature paper, coauthor Matthew Hurles at the Wellcome Trust Sanger Institute in Cambridge, England, and his colleagues report the first comprehensive map of CNVRs.

The researchers analyzed DNA samples from 270 individuals from four populations with ancestry in Africa, Europe or Asia who were part of the International HapMap Project. Specifically, they screened for copy number variants (CNVs) using two complementary technologies detailed in the consortium's new papers in Genome Research: single-nucleotide polymorphism (SNP) genotyping arrays and clone-based comparative genomic hybridization (CGH). SNP genotyping arrays assayed samples for more than 500,000 known SNPs, looking for stretches of adjacent SNPs that occurred at levels different from the expected ratios. Clone-based CGH compared samples from 269 of the HapMap individuals against DNA from one HapMap individual chosen as a reference standard, looking for differences in copy number among more than 26,000 large-insert cloned segments that span nearly all of the currently sequenced portion of the genome.
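As a rough illustration of the comparison idea behind clone-based CGH, the sketch below takes a signal per clone for a test sample and for the reference, computes the log2 ratio, and groups consecutive clones that show a gain or a loss. It is a toy under stated assumptions, not the consortium's pipeline: the function names, the 0.3 threshold and the sample intensities are all hypothetical.

```python
import math

def call_copy_number(test, reference, threshold=0.3):
    """Label each clone 'gain', 'loss' or 'neutral' from log2(test / reference).

    The threshold is an illustrative assumption, not a published cutoff.
    """
    calls = []
    for t, r in zip(test, reference):
        ratio = math.log2(t / r)
        if ratio >= threshold:
            calls.append("gain")
        elif ratio <= -threshold:
            calls.append("loss")
        else:
            calls.append("neutral")
    return calls

def merge_runs(calls):
    """Group consecutive clones sharing a non-neutral call into (first, last, call)."""
    regions = []
    start = 0
    for i in range(1, len(calls) + 1):
        # Close the current run at the end of the list or when the call changes.
        if i == len(calls) or calls[i] != calls[start]:
            if calls[start] != "neutral":
                regions.append((start, i - 1, calls[start]))
            start = i
    return regions

# Hypothetical intensities for five adjacent clones.
test_sample = [100, 150, 160, 100, 50]
reference_sample = [100, 100, 100, 100, 100]

calls = call_copy_number(test_sample, reference_sample)
regions = merge_runs(calls)
print(regions)
```

Real analyses also have to normalize the raw signal and model noise, which this sketch leaves out.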
The reference individual was male, to allow detection of CNVs on the Y chromosome, and was the son of two HapMap participants, to maximize prior information on his CNVs.

SNP genotyping arrays and clone-based CGH found CNVs with average sizes of 206 and 341 kilobases, respectively. While SNP genotyping was better at detecting smaller CNVs, clone-based CGH had large specific targets, leading to less background noise in results and making it better at detecting CNVs in more complex regions of the genome, such as those where two or more segments are duplicated.

The two technologies combined found 1,447 CNVRs, covering an eighth of the human genome. "We could have culled more CNVs from our data," Hurles told The Scientist via email. However, he said they aimed conservatively "so that investigators are not overwhelmed by the false positives that are inevitable in any study of this nature."

The CNVRs contained 2,908 genes, 285 of which are linked to disease. They also contained 67 non-coding RNAs, 50 ultraconserved elements and 130,353 conserved non-coding sequences. Notably, CNVRs encompass about two to three times as much nucleotide content per genome as SNPs do, Hurles said.

"I suspect our paper will be a wakeup call to a lot of scientists that they need to start incorporating a 'CNV-analysis step' in their study designs if they want to fully understand their data," Stephen Scherer at the Hospital for Sick Children in Toronto, coauthor on all four papers, told The Scientist via email.

The consortium found that only about 10 to 15 percent of copy number variation occurred between populations. "This is as a result of our recent common ancestry in Africa," Hurles said. The researchers suggest these differences could explain the increased prevalence of some diseases in certain populations.
For instance, one CNV the consortium confirmed involves UGT2B17, a gene that prior studies have linked to an increased risk of prostate cancer in populations of African and European descent. The consortium is now expanding its studies to thousands of healthy individuals from populations outside the HapMap collection.

"We think that we are at the stage where we can confidently detect CNVs of 50 kilobases or more, but we think that the overall aim must be to increase resolution by two orders of magnitude such that we can detect CNVs of 500 base pairs. Our consortium is currently pursuing this objective," Hurles said. A large number of CNVs remain to be found, Lam agreed in an email to The Scientist. Lam's team has found many CNVs that are not seen in the consortium's new papers, and vice versa.

"The important next steps will be to identify the specific variants within these regions, to learn what the alleles are -- zero copies? three copies? -- and to develop ways to type these variants in large patient cohorts so that researchers can see whether these variants are associated with disease risk," Steven McCarroll at Massachusetts General Hospital in Boston, who was not a coauthor, told The Scientist via email.

Scherer added that the researchers would "also like to have a much better idea of the precise start and end points of the CNVs so we can better understand the underlying mechanism -- that is, are they random events or DNA sequence-driven?" The scientists aim to better understand "the new mutation rate of CNVs and how this might be dependent on the region of the genome involved," he said.

In the future, the most sensitive way to identify all kinds of DNA variation, including SNPs, CNVs and inversions of sequences, may be to directly compare whole genomes, according to the consortium.
In their Nature Genetics paper, the consortium computationally compared whole genome sequences assembled by the two human genome projects and confirmed more than 1.5 million SNPs and 240 variable regions, including CNVs and inversions.

Charles Q. Choi
cchoi@the-scientist.com

Links within this article:
Wan Lam http://www.bccrc.ca/cg/people_wanlam.html
Evan Eichler http://eichlerlab.gs.washington.edu
J.P. Roberts. "Looking at Variation in Numbers," The Scientist, March 14, 2005. http://www.the-scientist.com/article/display/15302/
R. Redon et al. "Global variation in copy number in the human genome," Nature 444: 444-54, Nov. 23, 2006. http://www.nature.com/nature/journal/v444/n7118/full/nature05329.html
Matthew Hurles http://www.sanger.ac.uk/Teams/Team29
A. Constans. "A Practical Guide to the HapMap," The Scientist, Feb. 1, 2006. http://www.the-scientist.com/article/display/23052
D. Komura et al. "Genome-wide detection of human copy number variations using high density DNA oligonucleotide arrays," Genome Research, published online ahead of print Nov. 22, 2006. http://www.genome.org/cgi/content/abstract/gr.5629106v1
H. Fiegler et al. "Accurate and reliable high-throughput detection of copy number variation in the human genome," Genome Research, published online ahead of print Nov. 22, 2006. http://www.genome.org/papbyrecent.shtml
J.L. Peirce. "Following Phylogenetic Footprints," The Scientist, September 27, 2004. http://www.the-scientist.com/article/display/14954
Stephen Scherer http://www.the-scientist.com/article/display/21257
R. Khaja et al. "Genome assembly comparison identifies structural variants in the human genome," Nature Genetics, published online ahead of print Nov. 22, 2006. http://www.nature.com/ng/journal/vaop/ncurrent/abs/ng1921.html
V.K. McElheny. "The Human Genome Project +5," The Scientist, Feb. 1, 2006. http://www.the-scientist.com/article/display/23065
