Human genome sequences from 50 ethnolinguistic groups across more than a dozen African countries reveal insights that studies focused mainly on Europeans are unable to capture.
Scientists studying human genetics, particularly in the context of disease, often supplement their large European samples with a few African genomes. But the new research, presented this week (October 15) at the annual meeting of the American Society of Human Genetics (ASHG), shows how “woefully inadequate” this approach is, Baylor College of Medicine genomicist Neil Hanchard, who led one of the studies, tells STAT. “There is so much genetic diversity across the African continent, if you sample from just one or two ethnolinguistic groups you know something about one or two groups.”
Lack of ethnic diversity in global genome databases has long been a source of discussion in the scientific community. Africans in particular are underrepresented in these datasets, limiting the conclusions that can be drawn about human health and disease on the continent. Now, many groups are looking to buck the trend by diversifying genomics research. Working as part of Human Heredity and Health in Africa (H3Africa), a consortium devoted to increasing African representation in genetics research, Hanchard and his colleagues sequenced the genomes of 426 African people, providing “an unprecedented, in-depth cataloging of the genetic diversity of people across the African continent,” University of Pennsylvania medical geneticist Kiran Musunuru, who chaired the ASHG program committee but was not involved in Hanchard’s study, tells STAT.
The researchers found that each of the 50 ethnolinguistic groups examined had unique genetic variants, and that more than 3 million of these variants in total are not seen in European genomes. Moreover, the team linked the genetic variants that differed the most from non-African sequences to infectious disease, and to viral infection in particular.
These findings validate concerns that a dearth of genetics data on African individuals could inhibit efforts to treat them. For example, if a sick patient were sequenced, leading to the identification of one of these African group–specific variants, doctors might conclude that it causes disease, Hanchard tells STAT, because “one of the things often used to infer pathogenicity is that a variant is very rare.” But, he adds, “with better information on African genomes, we can say, this variant is common among Africans and so is probably not a big deal.”
The sequence data also revealed clues about the history of African people. For instance, the findings hinted at early migrations from East Africa to central Nigeria, differentiating the Nigerian Yoruba people from other West African groups.
In another study presented this week at ASHG, a group of researchers from the University of California, San Francisco, tackled the lack of diversity in human genome databases from another angle. Sequencing the genomes of 220 individuals from around the globe, the team found 7 million base pairs’ worth of DNA sequences missing from the human reference genome. “It turns out that we’re still missing important pieces” of the human genome, Musunuru tells STAT.
Jef Akst is the managing editor of The Scientist.