Only a small fraction of the thousands of described inherited diseases have been linked to a specific gene. In an Advance Online Publication in Nature Genetics, Carolina Perez-Iratxeta and colleagues at the EMBL in Heidelberg describe a bioinformatics approach for linking genes to diseases (Nat Genet 2002, DOI:10.1038/ng895).

Their data-mining system is based on 'fuzzy set theory', which can make inferences from the complex scientific literature. They integrated information from multiple databases to establish relationships between Medical Subject Headings (MeSH terms) for diseases or drugs and Gene Ontology terms. After a series of computational steps they defined a 'score' for known genes in the RefSeq database. They then used the score to rank candidate genes in a given disease-associated region. When this approach was tested against known disease-linked genes, the score could predict promising candidate genes.
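
The chain of associations described above can be illustrated with a small Python sketch. It is not the authors' implementation: the max-min rule used here is the standard textbook way to compose fuzzy relations and is only assumed to be a reasonable stand-in, averaging over a gene's annotations is likewise an assumption, and every disease, drug, Gene Ontology and gene name, along with every weight, is hypothetical toy data.

```python
from typing import Dict, List

# A fuzzy relation maps each source item to target items with a strength in [0, 1].
Relation = Dict[str, Dict[str, float]]


def compose_max_min(r1: Relation, r2: Relation) -> Relation:
    """Standard max-min composition: a link a -> c is as strong as the best
    intermediate path a -> b -> c, where a path is as strong as its weakest step."""
    out: Relation = {}
    for a, row in r1.items():
        out[a] = {}
        for b, w1 in row.items():
            for c, w2 in r2.get(b, {}).items():
                strength = min(w1, w2)
                if strength > out[a].get(c, 0.0):
                    out[a][c] = strength
    return out


def score_genes(disease: str, disease_to_go: Relation,
                gene_annotations: Dict[str, List[str]]) -> Dict[str, float]:
    """Score each gene as the mean disease-to-GO strength over its annotations
    (an assumed aggregation, chosen for simplicity)."""
    weights = disease_to_go.get(disease, {})
    scores: Dict[str, float] = {}
    for gene, terms in gene_annotations.items():
        values = [weights.get(term, 0.0) for term in terms]
        scores[gene] = sum(values) / len(values) if values else 0.0
    return scores


def rank_candidates(scores: Dict[str, float], region_genes: List[str]) -> List[str]:
    """Order the genes in a disease-associated region by descending score."""
    return sorted(region_genes, key=lambda g: scores.get(g, 0.0), reverse=True)


# Hypothetical toy data: disease term -> drug term -> GO term -> gene.
disease_to_drug: Relation = {"disease_X": {"drug_A": 0.8, "drug_B": 0.3}}
drug_to_go: Relation = {
    "drug_A": {"GO_a": 0.6, "GO_b": 0.9},
    "drug_B": {"GO_b": 0.5, "GO_c": 0.7},
}
gene_annotations = {"gene1": ["GO_a", "GO_b"], "gene2": ["GO_c"], "gene3": ["GO_d"]}

disease_to_go = compose_max_min(disease_to_drug, drug_to_go)
scores = score_genes("disease_X", disease_to_go, gene_annotations)
ranking = rank_candidates(scores, ["gene3", "gene2", "gene1"])
print(ranking)
```

In this toy setting the gene whose annotations are most strongly tied to the disease term comes first in the ranking, which mirrors the idea of prioritizing candidates within a mapped region.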

This type of strategy may be useful for prioritizing...
