Predicting promoters

Nov 28, 2001
Jonathan Weitzman (jonathanweitzman@hotmail.com)

Finding where genes begin within genomic sequence is a formidable challenge for projects annotating the human genome. In the Advanced Online Publication of Nature Genetics, Ramana Davuluri and colleagues at Cold Spring Harbor Laboratory in New York describe a bioinformatic strategy for predicting gene promoters and first exons (Nat Genet 2001, DOI: 10.1038/ng780).

They developed a new program, called FirstEF, that attempts to predict the starts of genes. They collected more than two thousand first exons to use as a training dataset and characterized those that were associated with a CpG island. FirstEF is designed to recognize CpG islands, promoter regions and first splice-donor sites.
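
To give a rough sense of what recognizing a CpG island involves, the sketch below applies the widely used Gardiner-Garden and Frommer criteria (length of at least 200 bp, GC content above 50%, and an observed/expected CpG ratio above 0.6) to a single sequence. This is an illustration only and not FirstEF's own method, which is not detailed here; the function names and default thresholds are assumptions chosen for the example.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # CpG dinucleotides, read 5' to 3'
    gc_fraction = (c + g) / n
    # Expected CpG count under independence is (c * g) / n.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Hypothetical helper: apply the three classic thresholds to one window."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction > min_gc and obs_exp > min_obs_exp
```

In practice such a test would be run over sliding windows along a chromosome rather than on one fixed sequence, and a real predictor would combine it with promoter and splice-donor signals.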

The program predicted 86% of all first exons (92% of CpG-related first exons and 74% of non-CpG-related first exons), with about 17% false positives. FirstEF performed similarly when tested against the finished sequences of human chromosomes 21 and 22.