Protein binding sites are often represented by "consensus sequences," such as TATA(A/T)A(A/T), which report the most common nucleotide at any given position but eliminate much of the possible variability. In 1991, National Institutes of Health research biologist Tom Schneider developed an alternative, graphical approach, called sequence logos.

In a sequence logo, the total height of each position measures how well conserved it is, while the height of each character within that position reflects its relative frequency. Thus, where a consensus sequence might mark a position as C/T, a sequence logo could indicate that C is actually observed five times more often.
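To make those two heights concrete, here is a minimal sketch in Python of how they might be computed for one column of an alignment. It assumes conservation is measured as information content in bits (the maximum possible entropy minus the observed entropy) and leaves out the small-sample correction that logo software typically applies. The function name logo_heights and the sample column are illustrative only and are not taken from WebLogo or from Schneider's software.

```python
import math
from collections import Counter


def logo_heights(column, alphabet="ACGT"):
    """Return (total_height, per_letter_heights) for one alignment column.

    Assumption: total height is the information content in bits,
    log2(alphabet size) minus the Shannon entropy of the column.
    No small-sample correction is applied.
    """
    counts = Counter(column)
    total = len(column)
    # Relative frequency of each letter at this position
    freqs = {base: counts.get(base, 0) / total for base in alphabet}
    # Shannon entropy in bits; letters that never occur contribute nothing
    entropy = -sum(f * math.log2(f) for f in freqs.values() if f > 0)
    # A perfectly conserved position reaches log2(4) = 2 bits for DNA
    info = math.log2(len(alphabet)) - entropy
    # Each letter gets a share of the stack proportional to its frequency
    heights = {base: f * info for base, f in freqs.items()}
    return info, heights


# Hypothetical column in which C is observed five times as often as T
info, heights = logo_heights("CCCCCT")
for base, height in sorted(heights.items(), key=lambda kv: -kv[1]):
    print(f"{base}: {height:.3f} bits")
print(f"stack height: {info:.3f} bits")
```

In this sketch the C in the example column is drawn five times taller than the T, and the whole stack is shorter than the 2 bits a fully conserved position would reach, which is the information a plain "C/T" in a consensus sequence discards.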

Steven Brenner, an associate professor at the University of California, Berkeley, says Schneider's logo-generation software "was very, very hard for typical biologists to make use of." So in 1994, while a graduate student at the University of Cambridge, UK, Brenner developed a Web version called WebLogo. It would take another...

1. G.E. Crooks et al., "WebLogo: a sequence logo generator," Genome Res, 14:1188-90, 2004. (Cited in 131 papers)
