Courtesy of Roland Brosch
The regions absent from the attenuated vaccine strain Mycobacterium bovis BCG Pasteur relative to the M. tuberculosis H37Rv genome are shown as gray boxes. Open reading frames (ORFs) are represented as pointed boxes showing the direction of transcription, with colors reflecting the functional classification of the ORFs, similar to those used on the TubercuList server.
Not long after the genome sequence of
Surprising many, Institut Pasteur scientist Roland Brosch and colleagues overturned a widely accepted theory about how
STRAINING FOR COMPARISONS
Looking to clarify genomic differences evident in a much smaller evolutionary window, Claire Fraser's group at the Institute for Genomic Research (TIGR) in Rockville, Md., examined differences between laboratory and clinical
Fraser and her group hoped to clarify this issue; they sequenced the clinical strain CDC1551 using the whole-genome shotgun approach. They then used genome-aligning software to find regions in which the genomes differed. Fraser's group also used a number of loci that differed between the lab strain and CDC1551 as markers to screen a much larger group of clinical isolates. The PE/PPE gene family, for example, appeared more polymorphic than the bulk of the genome. Fraser suggests that these genes may encode antigenic membrane proteins. The higher degree of polymorphism, she speculates, may represent a mechanism whereby
"This paper was the first look at whole sequence, to see how are these differences borne out," says Issar Smith, a TB researcher at PHRI. According to Smith, most TB researchers suspected differences between clinical and lab strains, but weren't certain prior to this paper. "I think it's been very valuable," says Smith. But, he adds, what the specific differences are and how they affect virulence aren't yet evident. Behr notes that the data in the paper weren't particularly surprising; much was already available via online databases.
Kreiswirth disagrees with the Fraser group's conclusions based on his own work3; he says he doesn't believe that lab and clinical strains of
© 2002 National Academy of Sciences
Proposed evolution of the tubercle bacilli illustrating successive loss of DNA in certain lineages (gray boxes). The scheme is based on the presence or absence of conserved deleted regions and on sequence polymorphisms in five selected genes. Note that the distances between certain branches may not correspond to actual phylogenetic differences calculated by other methods. (Reprinted with permission from R. Brosch et al., Proc Natl Acad Sci, 99:3684–9, 2002.)
Rob Fleischman, first author from the TIGR group, points out that comparisons made with independent data show that 91% of the SNPs are genuine. He says that the belief that SNPs are more common than previously thought remains accurate. Fleischman notes, however, that he doesn't have an objective measure to determine whether the differences observed between strains are significant.
Many TB researchers are using the insights from comparative genomics to search for chinks in the armor of the TB bacterium. Brosch and colleagues are focused, in part, on the RD1 region, which is missing in the BCG strain and in
Knocking out RD1 decreases virulence. When it's knocked into BCG, virulence increases. RD1 encodes a secreted protein called ESAT-6 that appears to have a role in virulence and immunogenicity, based on comparisons among several members of the
The paper from Brosch's lab also provided genomic tags for individual bacterial species, which could facilitate diagnostics in the clinic. "Once you put the correct name on things you can determine their distribution," says Behr. "And once you can determine their distribution, you can have a rational understanding of whether these are restricted pathogens or are they spreading among species."
Looking back at her paper, Fraser is reminded that just two years ago, having the sequence of two strains was considered a big deal. Now, researchers might have ten or more sequences from strains of their human pathogen of interest, each often providing additional information about biochemical and metabolic potential. Researchers need to consider how many genome sequences are sufficient for understanding diversity, cautions Fraser. "Obviously it will differ depending on the species you're looking at and the range of variability that it can display," she says. "... we have a ways to go."