A dictionary for genomes

August 31, 2000

With sequence information in hand, the search for regulatory sites in promoters can be done by computer rather than by cloning. But the primary analysis tools, multiple-alignment algorithms, can handle only a small amount of sequence data. In the August 29 Proceedings of the National Academy of Sciences, Bussemaker et al. introduce an alternative algorithm that they dub 'MobyDick' (Proc Nat Acad Sci USA 2000, 97: 10096-10100). MobyDick treats DNA sequence as text in which allthewordshavebeenruntogether. It attempts to build a dictionary of 'words' by first finding over-represented pairs of letters. Letter frequencies are used to estimate the probability that a given pair occurs by chance, and this estimate guides how larger fragments are then built up. Bussemaker et al. test their algorithm on a space-less version of the first ten chapters of the novel Moby Dick, then apply it to the full set of upstream regions in the yeast genome. For yeast, approximately 500 dictionary entries fall above a plausible significance level. These include 114 of the 443 experimentally confirmed sites, as well as good matches to approximately half of the motifs found in previous analyses of co-regulated genes, the cell cycle, and sporulation.
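
The first step described above, finding letter pairs that occur more often than single-letter frequencies would predict, can be illustrated with a short Python sketch. This is not the MobyDick implementation: the function name `overrepresented_pairs`, the `min_z` cutoff, and the binomial z-score used as the significance measure are all assumptions made for illustration, and the statistical model in the paper may well differ. The sketch also stops at pairs and does not attempt the later step of building longer words.

```python
from collections import Counter
from math import sqrt


def overrepresented_pairs(text, min_z=3.0):
    """Return adjacent letter pairs seen more often than chance predicts.

    Hypothetical helper: the chance expectation is the product of the two
    single-letter frequencies, and significance is a simple binomial
    z-score. Both choices are assumptions, not the paper's method.
    """
    n = len(text)
    if n < 2:
        return []

    # Single-letter frequencies estimated from the text itself.
    letter_freq = {ch: count / n for ch, count in Counter(text).items()}

    # Counts of every adjacent (overlapping) pair of letters.
    pair_counts = Counter(text[i:i + 2] for i in range(n - 1))
    n_pairs = n - 1

    results = []
    for pair, observed in pair_counts.items():
        # Probability of this pair if letters were drawn independently.
        p = letter_freq[pair[0]] * letter_freq[pair[1]]
        expected = n_pairs * p
        sd = sqrt(n_pairs * p * (1 - p))
        z = (observed - expected) / sd if sd > 0 else 0.0
        if z >= min_z:
            results.append((pair, observed, expected, z))

    # Most strongly over-represented pairs first.
    return sorted(results, key=lambda r: r[3], reverse=True)


if __name__ == "__main__":
    # Toy input in the spirit of the article: text with the spaces removed.
    sample = "allthewordshavebeenruntogether" * 10
    for pair, observed, expected, z in overrepresented_pairs(sample):
        print(f"{pair}: observed={observed}, expected={expected:.1f}, z={z:.1f}")
```

On real promoter sequence the alphabet would be A, C, G and T, and the pairs that pass the cutoff would be candidates for extension into longer dictionary entries.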
