DNA-RNA mismatch

There may be widespread, nonrandom differences between DNA sequences and their corresponding RNA transcripts in human cells.

Megan Scudellari
May 18, 2011



A new paper, published today in Science, identifies widespread differences between DNA sequences and their corresponding RNA transcripts in human cells, and demonstrates that these differences result in proteins that do not precisely match the genes that encode them.

The finding challenges the assumption that RNA is a perfect one-to-one match to its corresponding DNA sequence and may open the door to an unexplored area of variation in the human genome.

"Most people assume the information in DNA is faithfully transferred to RNA and then the RNA is translated into proteins," said Jin Billy Li, a geneticist at Stanford University who was not involved in the research. If additional research confirms the results, "the central dogma will have to be revised. You can't assume the DNA [code] is transferred to RNA without any changes."

Vivian Cheung and colleagues at the University of Pennsylvania School of Medicine used...

Cheung was surprised to find more than 10,000 sites where RNA bases did not match the corresponding DNA sequence. "We didn't really expect anything, and then we saw a lot of differences," says Cheung. Her initial reaction was to blame the differences on technical errors or artifacts, so the team performed numerous experiments to rule out a technical fluke.

In the process, they noticed that many of the differences were not random. Time and again, a single RNA base was changed in exactly the same way from cell to cell. A site that should be AA might be edited to AC, for example, and every individual would have either the original AA or the edited AC, but none of the other possible modifications, such as AG or AT.
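The reasoning behind "not random" can be made concrete with a small sketch. The Python below is an illustration only, not the authors' pipeline: the function name `consistent_sites`, the site labels, and the toy observations are all hypothetical. It keeps a site only when every individual that differs from the DNA shows the same change.

```python
from collections import defaultdict

def consistent_sites(observations):
    """Return sites where all RNA-DNA differences are the same change.

    observations: iterable of (site_id, dna_call, rna_call) tuples,
    one per individual. All names and values here are hypothetical.
    """
    changes_by_site = defaultdict(set)
    for site, dna, rna in observations:
        if rna != dna:
            # Record each distinct DNA -> RNA change seen at this site.
            changes_by_site[site].add((dna, rna))
    # A site counts as "consistent" if exactly one kind of change was seen.
    return {
        site: next(iter(changes))
        for site, changes in changes_by_site.items()
        if len(changes) == 1
    }

# Toy data: site1 is always AA -> AC; site2 varies, as random error might.
toy = [
    ("site1", "AA", "AC"),
    ("site1", "AA", "AA"),
    ("site1", "AA", "AC"),
    ("site2", "AA", "AG"),
    ("site2", "AA", "AT"),
]
result = consistent_sites(toy)
print(result)
```

In this toy input only `site1` survives the filter, mirroring the pattern described above: the original AA or the edited AC, and nothing else.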

Then the team looked at the resulting proteins and found that they reflected the edited RNA sequences, meaning the DNA did not directly encode its protein products. In some cases the changes were minor, but in others they were not. In one case, an RNA variant led to the loss of a stop codon, and the resulting protein was 55 amino acids longer than the one encoded by the DNA.
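To see how a single base change can lengthen a protein, consider a minimal sketch that translates a coding sequence with the standard genetic code and stops at the first stop codon. The sequences below are invented for illustration and are not from the study; the helper name `translate` is likewise an assumption, and the sequence is written with T rather than U for simplicity.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

def translate(seq):
    """Translate a coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# Hypothetical toy sequences: the third codon TAA (stop) becomes CAA (Gln).
dna_encoded = "ATGGCTTAAGGTTGCTGA"
rna_edited = "ATGGCTCAAGGTTGCTGA"

short_protein = translate(dna_encoded)
long_protein = translate(rna_edited)
print(short_protein, long_protein)
```

Here the one-base change removes the first stop codon, so translation continues to the next one and the toy protein grows by three residues. The case reported in the paper is the same kind of event on a larger scale.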

Past research has identified several post-transcriptional mechanisms that result in RNA editing, but these mechanisms account for less than half of the differences uncovered in the new study, the authors write. Cheung and the team do not yet know the mechanisms that might be causing the systematic RNA modifications, nor how the resulting protein changes might affect protein function.

Still, "this is one source of genomic variation we didn't know about," says Cheung. "We always think of DNA sequences as the causes or reasons why some of us are more or less prone to certain diseases, but we certainly didn't think RNA sequences as being the possible cause. And now we have all these forms of proteins we didn't know existed."

But additional research needs to be done to confirm the findings, said Li, who is performing similar studies in his lab. "Mapping these differences is not a trivial thing," he cautioned. Many genes exist two or more times in the human genome, and researchers could mistakenly map an RNA sequence to the wrong region. "If that happens, you may miscall that as RNA editing," he said.
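The mapping problem Li describes can be sketched in a few lines. This is a simplified illustration, not a real aligner: the two "gene copies", the read, and the helper names `mismatches` and `best_origin` are all hypothetical. A read that truly comes from one copy looks like an RNA-DNA difference when it is compared against the other copy.

```python
def mismatches(read, reference):
    """List (position, reference_base, read_base) where the two differ."""
    return [
        (i, ref_base, read_base)
        for i, (ref_base, read_base) in enumerate(zip(reference, read))
        if ref_base != read_base
    ]

def best_origin(read, copies):
    """Name of the gene copy with the fewest mismatches to the read."""
    return min(copies, key=lambda name: len(mismatches(read, copies[name])))

# Hypothetical near-identical gene copies that differ at one position.
copies = {
    "copy_a": "ACGTTGCAAGGC",
    "copy_b": "ACGTTGCACGGC",
}
read = "ACGTTGCACGGC"  # RNA read that actually derives from copy_b

apparent_edit = mismatches(read, copies["copy_a"])
origin = best_origin(read, copies)
print(apparent_edit, origin)
```

Compared against `copy_a` alone, the read shows an apparent A-to-C change; checked against both copies, it matches `copy_b` exactly and no editing needs to be invoked.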

Li, M., et al., "Widespread RNA and DNA sequence differences in the human transcriptome," Science, doi: 10.1126/science.1207018, 2011.
