Your DNA sequence may not be as secure as you think.

DNA databases contain troves of information about people’s genetic makeup, including mutations that may put them at higher risk of certain diseases and, more generally, a genetic barcode that is unique to each individual. For this reason, access to such data is restricted to protect identity and health information. RNA data, on the other hand, reside in publicly available databases, which house the results of thousands of genomic studies from the last several years.

Now, Eric E. Schadt from Mount Sinai School of Medicine and his colleagues have figured out a way to infer DNA sequence from RNA data, which reflect gene-expression levels in a variety of tissues. The technique, published this week (April 8) in Nature Genetics, marks the first time RNA levels have been translated to DNA sequence, and may threaten the privacy of DNA...

“We need to accept the reality that it is difficult—if not impossible—to shield personal information from others,” Schadt said in a press release. The technique even has implications for forensic science, he added. “For example, barcodes derived from individuals who participated in a research study, where RNA levels were monitored and deposited into publicly available data bases, could be tested against DNA samples left at a crime scene as a way of identifying persons of interest.”

The National Institutes of Health (NIH), which hosts one of the largest DNA databases, is not especially concerned, however, because the technique requires complex calculations that are not easy to master, ScienceInsider reported. While it "will be reviewing the finding" and its implications, according to a statement released by the agency, "NIH sees no need to modify its data sharing practices at this time."
