Genetic Cartography


Josh Roberts (jroberts@the-scientist.com)
Jan 30, 2005
A TREE GROWS IN REYKJAVIK:

Above is a family tree of more than 100 current-day Icelandic asthma patients, going back eleven generations to their common ancestors born in the 17th century.

While the DNA sequence is the ultimate fine-scale physical map of the human genome, working out that sequence – as was the goal of the Human Genome Project – has hardly put an end to mapping studies. The sequence itself is still continuously being updated and annotated; the NIH's National Center for Biotechnology Information released Build 35.1 in November. Moreover, other types of maps that are no less important for the positional cloning and candidate mapping methods geneticists use to search for disease-causing genes are still being crafted.

Linkage maps, which describe the frequency with which genetic markers are coinherited, are one example, as are single-nucleotide polymorphism (SNP) maps, which locate variable genetic sequences on the physical map. Linkage-disequilibrium...

FINDING ASSOCIATED GENES

Data derived from the Science Watch/Hot Papers database and the Web of Science (Thomson Scientific) show that Hot Papers are cited 50 to 100 times more often than the average paper of the same type and age.

"A high-resolution recombination map of the human genome," Kong A, Nat Genet, 2002, Vol 31, 241-7 (Cited in 262 papers)

"A first-generation linkage disequilibrium map of human chromosome 22," Dawson E, Nature, Vol 418, 544-8, Aug. 1, 2002 (Cited in 102 papers)

A genetic map relies on crossover events to separate markers that would otherwise be inherited together. The frequency with which two markers are separated by recombination during meiosis determines their genetic distance (measured in centiMorgans). If this frequency is lower than what would be seen by chance, the markers are considered linked, like the genes for blue eyes and blond hair.
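As a rough illustration of that arithmetic, the sketch below estimates a recombination fraction from counts of recombinant meioses and scores the evidence for linkage with a LOD score (the log10 ratio of the likelihood at the estimated fraction to the likelihood at 0.5, the value expected for unlinked markers). It assumes phase-known meioses in which recombinants can be counted directly; the function names and the example counts are hypothetical, and a LOD of about 3 is only the conventional rule of thumb for declaring linkage.

```python
import math

def recombination_fraction(recombinants: int, meioses: int) -> float:
    """Share of informative meioses in which the two markers were separated."""
    if meioses <= 0 or not 0 <= recombinants <= meioses:
        raise ValueError("need 0 <= recombinants <= meioses and meioses > 0")
    return recombinants / meioses

def lod_score(recombinants: int, meioses: int) -> float:
    """log10 likelihood ratio: estimated recombination fraction vs. 0.5 (no linkage)."""
    k, n = recombinants, meioses
    # Recombination fractions above 0.5 are not meaningful, so cap the estimate.
    r = min(recombination_fraction(k, n), 0.5)
    log_likelihood = 0.0
    if k > 0:
        log_likelihood += k * math.log10(r)
    if n - k > 0:
        log_likelihood += (n - k) * math.log10(1.0 - r)
    return log_likelihood - n * math.log10(0.5)

# Hypothetical counts: 2 recombinants observed in 20 informative meioses.
r_hat = recombination_fraction(2, 20)
approx_cm = 100.0 * r_hat  # for small fractions, 1% recombination is roughly 1 cM
print(f"r = {r_hat:.2f}, ~{approx_cm:.0f} cM, LOD = {lod_score(2, 20):.2f}")
```

Markers that recombine in half of all meioses give a LOD of zero, which is the "seen by chance" baseline described above.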

Reykjavik, Iceland's deCODE Genetics was founded to take advantage of the small island nation's unique resources: a fairly homogeneous population with genealogical data dating back 1,100 years, and a modern medical system with good record-keeping. To construct its map, the biotechnology firm examined the transmission of 5,136 markers in 146 Icelandic families, collecting information on 1,257 meioses.1 This, and comparison of the data to the draft sequence of the human genome (to aid in ordering the markers), resulted in a map with about five times the resolution of the previously standard Marshfield map (which had examined 188 meioses).3

The completed sequence provided reliable markers throughout the genome, so that scientists could track associations between markers and populations. "The likelihood that a marker will be inherited with a gene that you're trying to find depends on how close it is genetically, not on how close it is physically," explains Michael Nachman, an evolutionary geneticist at the University of Arizona. "What [the deCODE map] did was lay the foundation [for association studies] by giving us more precise estimates of variation in recombination rate in different regions of the genome. This is helping us find disease genes."

"Basically everyone who is working on mapping human genes uses this," says Kári Stefánsson, CEO and cofounder of deCODE. "It's wonderful to know that we have been able to contribute to the work of so many." He also says he's pleased that deCODE's work has allowed the HGP to correct many errors. "Just the publication of a high-resolution genetic map allowed people to increase the quality of the sequence dramatically."

But helping the work of others was not deCODE's sole motivation for creating the state-of-the-art genetic map. With genotype data and the genealogical and medical histories of some 60% of the Icelandic adult population at its disposal, the company has been able to use its linkage map to isolate genes involved with longevity, myocardial infarction, asthma, atherosclerosis, schizophrenia, obesity, and diabetes, many of which have led to pharmaceuticals currently being tested in clinical trials.

LINKAGE DISEQUILIBRIUM

Genetic variation is probably responsible for much heritable disease susceptibility and variable drug response. University of Southampton genetic epidemiologist Newton Morton points out, however, that the human genome sequence, as valuable as it is, doesn't represent genetic variation so much as the average genome. "Whereas LD, in a sense, is a representation of variability," he says.

Linkage disequilibrium is a measure of how often markers are inherited together. It has long been known that certain genetic variations (such as the human leukocyte antigens, which give rise to HLA haplotypes) tend to be passed along with little or no recombination between them, yet the extent of LD in other parts of the genome was generally not well documented.
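For readers who want to see how LD between two markers is commonly quantified, here is a minimal sketch that computes the standard pairwise statistics D, D' and r² from haplotype counts. It assumes two biallelic markers (alleles A/a and B/b) and phased haplotypes; the function name and the counts are hypothetical, and the sketch is not a description of how any study discussed here computed LD.

```python
def ld_stats(n_AB: int, n_Ab: int, n_aB: int, n_ab: int):
    """Return (D, D_prime, r_squared) from counts of the four two-marker haplotypes."""
    n = n_AB + n_Ab + n_aB + n_ab
    if n <= 0:
        raise ValueError("need at least one haplotype")
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n   # frequency of allele A at the first marker
    p_B = (n_AB + n_aB) / n   # frequency of allele B at the second marker

    # D: how far the AB haplotype frequency departs from independence.
    D = p_AB - p_A * p_B

    # D' rescales |D| by the largest value it could take given the allele frequencies.
    if D >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(D) / d_max if d_max > 0 else 0.0

    # r^2: squared correlation between the alleles at the two markers.
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r_squared = D * D / denom if denom > 0 else 0.0
    return D, d_prime, r_squared

# Hypothetical sample of 100 haplotypes in which A almost always travels with B.
print(ld_stats(n_AB=48, n_Ab=2, n_aB=2, n_ab=48))
```

Both D' and r² run from 0 (the markers are independent) to 1; which one a map reports is a design choice, since D' is less sensitive to allele frequency while r² relates more directly to the power of an association test.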

A group led by Ian Dunham at the Wellcome Trust Sanger Institute and statistical geneticist Lon Cardon at the University of Oxford genotyped three unrelated sets of individuals for 1,504 SNPs and small insertions/deletions on chromosome 22, and calculated the LD among them, to create the first chromosome-wide LD map.2 Great variability was seen, with areas of high assortment interspersed among stretches of high LD (called haplotype blocks or LD blocks). In one 800-kilobase (KB) stretch, for example, they found only five different sets of 25 markers among the 32 million sets possible, suggesting that this segment of the chromosome had been inherited as a block since an ancient ancestral crossover event. If so, LD blocks could be used to create pedigrees, essentially representing thousands of generations, for association studies.
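The contrast between observed and possible marker sets can be sketched in a few lines. The example below counts the distinct haplotypes in a sample and compares that with the number of combinations that could exist, assuming every marker has exactly two alleles (so n markers allow 2^n combinations). The function name and the toy haplotype strings are hypothetical, not data from the chromosome 22 study.

```python
def haplotype_diversity(haplotypes):
    """Return (distinct haplotypes observed, haplotypes possible) for biallelic markers.

    Each haplotype is a string with one character ('0' or '1') per marker.
    """
    lengths = {len(h) for h in haplotypes}
    if len(lengths) != 1:
        raise ValueError("all haplotypes must cover the same, non-empty set of markers")
    n_markers = lengths.pop()
    return len(set(haplotypes)), 2 ** n_markers

# Toy sample: 4 markers typed on 6 chromosomes.
sample = ["0000", "0000", "1111", "1111", "0101", "0000"]
observed, possible = haplotype_diversity(sample)
print(f"{observed} of {possible} possible haplotypes observed")

# With 25 biallelic markers the number of possible combinations is in the tens of millions.
print(f"{2 ** 25:,}")
```

Seeing only a handful of the possible combinations, as in the stretch described above, is what marks a region as a block of high LD.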

The paper is highly cited, Cardon speculates, because the work that went into it (including some previously featured Hot Papers4) served as proof of principle for what has "since turned into this very, very large international project called the International HapMap." That project has created a database boasting a density of about 1 SNP per 5 KB (compared to Cardon's 1 SNP per 15 KB) on every chromosome for their Caucasian sample set, with plans to go as dense as 1 SNP per KB. They expect to post similar data for African and Asian sample sets as well. HapMap "has taken it to a whole other level [from the preliminary study of chromosome 22], in terms of the number of markers examined and the different populations looked at," Cardon remarks.

TOWARD A TRUE LD MAP

The term "LD map" is generally used to mean a physical map annotated to reflect some aspect of LD. But to purists such as Morton: "You don't have an LD map unless the distances are additive, comparable in logical terms to the linkage map." That is, an LD unit map (as it's sometimes called) will indicate the number of times two markers will segregate together in a given number of meioses, rather than merely indicating how many base pairs separate them.
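To make "additive" concrete for the linkage map that Morton uses as his benchmark, consider three markers in the order A, B, C. The sketch below assumes no crossover interference (Haldane's model, one common choice among several map functions) and shows that recombination fractions do not simply add across adjacent intervals, while the map distances derived from them do. The function names and numbers are illustrative only, and the sketch says nothing about how LD units themselves are calculated.

```python
import math

def haldane_morgans(r: float) -> float:
    """Haldane map distance (in Morgans) for a recombination fraction 0 <= r < 0.5."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -0.5 * math.log(1.0 - 2.0 * r)

def combine_fractions(r_ab: float, r_bc: float) -> float:
    """Recombination fraction between A and C, given A-B and B-C, with no interference.

    A and C end up recombinant only if exactly one of the two intervals recombines.
    """
    return r_ab + r_bc - 2.0 * r_ab * r_bc

r_ab, r_bc = 0.10, 0.20
r_ac = combine_fractions(r_ab, r_bc)

print(f"sum of fractions: {r_ab + r_bc:.2f}, actual A-C fraction: {r_ac:.2f}")
print(f"A-B + B-C distance: {100 * (haldane_morgans(r_ab) + haldane_morgans(r_bc)):.2f} cM")
print(f"A-C distance:       {100 * haldane_morgans(r_ac):.2f} cM")
```

The two distances printed at the end agree, which is the property a map needs if intervals are to be summed along a chromosome.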

Yet while a traditional linkage map is based upon recombination events that occur during a single generation, explains Morton, "in LD, you're talking about something like 1,500 generations." In any given meiosis, nearby events can interfere with each other, which is a reason why linkage maps often cannot discern the order of very close markers. LD, because it averages over so many generations, does not have this problem, and so can yield a much more finely discriminating map.

Morton and his colleagues have created an LD unit map based on data from deCODE and the HapMap Project, which they hope to publish soon. Meanwhile, a group led by Tara Matise at Rutgers University has just published a "combined linkage-physical map" that makes use of both the deCODE genotype data and the data used to create the Marshfield map, all correlated with the genome sequence.5 "We basically used the physical map to allow us to know the order of many more markers than we would be able to position just using linkage data alone," explains Matise.

ALL MAPPED OUT?

The human genome sequence was a great leap forward, both in its own right and as a way of informing genetic maps. The latter are critical for most methods that are used to locate the genes responsible for both rare and common diseases, notes Matise, yet "none is guaranteed to be correct, and they all have their little caveats." Use a map that has the wrong order of markers, or gross misestimates of distances, she points out, and a researcher might miss the linkage or chase after a false lead.

Only time will tell whether the HapMap, or a hybrid map like Morton's or Matise's, will replace deCODE's genetic map as the primary tool for association studies. For now, the Icelandic company's data are still the gold standard for researchers and mapmakers alike. But the map is not as good as it can be, and deCODE plans to increase both the density of the markers and the number of pedigrees, Stefánsson says. Even so, the "ultimate genetic map" would require thousands and thousands of pedigrees, Cardon notes. So mapmaking, in one form or another, is likely to continue for some time to come.