Infographic: Writing with DNA

Researchers devise numerous strategies to encode information into nucleic acids.

Sep 30, 2017
Catherine Offord

To encode text alone, one approach is to assign each letter of the alphabet a three-base code. Using three bases, such as A, C, and T, gives 3 × 3 × 3 = 27 combinations, enough for the English alphabet plus a space, with a code such as AAA = A, AAC = B, and so on (1 in graphic below). However, researchers often want to encode more than just text, so most current methods instead first translate data into binary code, the language of 1s and 0s used in electronic media. Because DNA has four bases, each nucleotide could theoretically store up to two bits of information, with a code such as A = 00, C = 01, and so on (2).
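The two schemes described above can be sketched in a few lines of Python. This is only an illustration: the article gives just the first entries of each table (AAA = A, AAC = B; A = 00, C = 01), so the remaining assignments here (triplets in alphabetical order over A, C, T, with the space as the 27th symbol, and G = 10, T = 11) are assumptions, and the function names are hypothetical.

```python
from itertools import product

# --- Scheme 1: three-base code for text (27 triplets over A, C, T) ---
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "  # 26 letters plus a space
# Assumed ordering: AAA, AAC, AAT, ACA, ... (only AAA and AAC come from the article)
TRIPLETS = ["".join(t) for t in product("ACT", repeat=3)]
TEXT_TO_DNA = dict(zip(ALPHABET, TRIPLETS))
DNA_TO_TEXT = {codon: letter for letter, codon in TEXT_TO_DNA.items()}

def encode_text(text: str) -> str:
    """Map each letter (or space) to its three-base codon."""
    return "".join(TEXT_TO_DNA[ch] for ch in text.upper())

def decode_text(dna: str) -> str:
    """Read the strand three bases at a time and look up each codon."""
    return "".join(DNA_TO_TEXT[dna[i:i + 3]] for i in range(0, len(dna), 3))

# --- Scheme 2: two bits per base for arbitrary binary data ---
# A = 00 and C = 01 come from the article; G = 10 and T = 11 are assumed.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

def encode_bytes(data: bytes) -> str:
    """Write each byte as 8 bits, then each pair of bits as one base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))

def decode_bytes(dna: str) -> bytes:
    """Turn each base back into 2 bits, then regroup the bits into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in dna)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
```

Under these assumed tables, `encode_text("AB")` gives `AAAAAC`, and `encode_bytes(b"A")` gives `CAAC`, since the byte 01000001 splits into the pairs 01, 00, 00, 01. Neither scheme does anything to prevent long runs of the same base.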

In reality, though, biochemical features of nucleic acids make some combinations of bases more desirable than others. Particularly problematic are homopolymers (long runs of the same nucleotide), which are difficult to write and read using current methods. One way to avoid homopolymers is to allocate two bases to each binary digit; long runs of the same digit can then be encoded by alternating between the two bases assigned to that digit (3). A more efficient method is to convert text or other data into a code that employs three digits (0, 1, and 2) rather than two, and then write bases so that no base is used twice in a row, for example by encoding 0, 1, and 2 as C, G, and T after an A, but as G, T, and A after a C (4). Newer methods include more complex codes, as well as error-correcting techniques, to pack as much information as possible into DNA while maximizing the accuracy of information retrieval.
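Both homopolymer-avoiding ideas can be sketched as follows. These are simplified illustrations, not the published implementations. In the first, the pairing of 0 with A/C and 1 with G/T, and the rule of simply switching to the other base when a repeat would occur, are assumptions. In the second, the rows for A and C come from the example in the text, while the rows for G and T and the choice of A as the imaginary "previous base" at the start continue the same cyclic pattern and should be read as assumptions.

```python
# --- One bit per base, two candidate bases per digit ---
ZERO_BASES = "AC"  # assumed: 0 is written as A or C
ONE_BASES = "GT"   # assumed: 1 is written as G or T

def encode_bits(bits: str) -> str:
    """Encode a string of '0'/'1', never repeating the previous base."""
    strand = []
    for bit in bits:
        first, second = ZERO_BASES if bit == "0" else ONE_BASES
        # Use the first candidate unless it would repeat the previous base.
        base = second if strand and strand[-1] == first else first
        strand.append(base)
    return "".join(strand)

def decode_bits(dna: str) -> str:
    """Either base of a pair decodes to the same binary digit."""
    return "".join("0" if base in ZERO_BASES else "1" for base in dna)

# --- Three digits (0, 1, 2), with the base chosen relative to the previous one ---
# NEXT_BASE[previous][digit]; rows A and C follow the article's example,
# rows G and T are assumed continuations of the same rotation.
NEXT_BASE = {"A": "CGT", "C": "GTA", "G": "TAC", "T": "ACG"}

def encode_trits(trits, start="A"):
    """Encode base-3 digits so that no base ever follows itself."""
    strand, prev = [], start
    for trit in trits:
        prev = NEXT_BASE[prev][trit]
        strand.append(prev)
    return "".join(strand)

def decode_trits(dna, start="A"):
    """Recover each digit from the (previous base, current base) pair."""
    trits, prev = [], start
    for base in dna:
        trits.append(NEXT_BASE[prev].index(base))
        prev = base
    return trits
```

With these tables, `encode_bits("0000")` yields `ACAC` rather than `AAAA`, and `encode_trits([0, 0, 0, 0])` yields `CGTA`. The three-digit code carries more information per base (log2 of 3, about 1.58 bits) than the one-bit-per-base scheme, which is why the text calls it more efficient.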

Sources for methods depicted: 1. Bancroft et al., 2001; 3. Church et al., 2012; 4. Goldman et al., 2013.

Storage Cycle

After an encoding method is chosen, researchers write the DNA message into a series of long oligonucleotides. In earlier methods, these fragments were each tagged with a unique address sequence to aid reassembly, as well as common flanking sequences that allow amplification by PCR (1). Newer methods incorporate selective retrieval of specific sections of stored data, known as random access, by combining the address and PCR sequences into unique codes on either side of every oligonucleotide. Appropriate primers allow researchers to select and amplify only a sequence of interest (2).
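As a rough sketch of the tagging step, the snippet below splits an already-encoded DNA message into chunks and wraps each chunk with an address and a pair of flanking primer sequences. Everything specific here is hypothetical: the function names, the four-base address length, the chunk length, and the primer sequences are placeholders chosen for illustration, not values from any published scheme. For simplicity the address is a plain base-4 counter, so it does not itself avoid homopolymers.

```python
BASES = "ACGT"
ADDRESS_LEN = 4  # assumed: 4 bases allow 4**4 = 256 distinct addresses

def index_to_address(index: int, length: int = ADDRESS_LEN) -> str:
    """Write a chunk index as a fixed-length base-4 number using A, C, G, T."""
    digits = []
    for _ in range(length):
        index, remainder = divmod(index, 4)
        digits.append(BASES[remainder])
    return "".join(reversed(digits))

def build_oligos(message: str, chunk_len: int, fwd: str, rev: str) -> list:
    """Split the message and wrap each chunk as fwd + address + chunk + rev."""
    oligos = []
    for start in range(0, len(message), chunk_len):
        chunk = message[start:start + chunk_len]
        address = index_to_address(start // chunk_len)
        oligos.append(fwd + address + chunk + rev)
    return oligos

# Hypothetical primer sequences; a different file would get a different pair,
# which is what would make selective amplification (random access) possible.
FWD_PRIMER = "ACGTAC"
REV_PRIMER = "TGCATG"

oligos = build_oligos("TTCCGGTTAC", chunk_len=4, fwd=FWD_PRIMER, rev=REV_PRIMER)
```

Here the ten-base message becomes three oligonucleotides with addresses `AAAA`, `AAAC`, and `AAAG`; the last one simply carries a shorter chunk.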

These oligonucleotides are synthesized and either kept in tiny test tubes or printed onto DNA microchips, which are stored in a cold, dry, dark place. When the message needs to be read, researchers rehydrate the sample and add primers corresponding to the addresses of the sequences of interest. The amplified product is then sequenced and decoded in order to retrieve the original message.
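The retrieval step can be imitated in software, with the caveat that this is a toy model: real PCR works through primer hybridization on complementary strands, whereas the sketch below just matches strings at either end of each oligonucleotide. The primer sequences, the two-base address length, and the pool contents are all made up for illustration.

```python
def select_by_primers(pool, fwd, rev):
    """Stand-in for PCR: keep only oligos flanked by the chosen primer pair."""
    return [oligo for oligo in pool if oligo.startswith(fwd) and oligo.endswith(rev)]

def address_to_index(address: str) -> int:
    """Read an address as a base-4 number with A=0, C=1, G=2, T=3."""
    index = 0
    for base in address:
        index = index * 4 + "ACGT".index(base)
    return index

def reassemble(oligos, fwd, rev, address_len):
    """Strip primers, sort the chunks by address, and join them."""
    records = []
    for oligo in oligos:
        core = oligo[len(fwd):len(oligo) - len(rev)]
        records.append((address_to_index(core[:address_len]), core[address_len:]))
    return "".join(chunk for _, chunk in sorted(records))

# Hypothetical primer pairs identifying two stored files.
FWD_A, REV_A = "ACGTAC", "TGCATG"
FWD_B, REV_B = "GATCGA", "CTAGTC"

# A mixed, unordered pool, as it might sit in one tube (addresses are 2 bases).
pool = [
    FWD_A + "AC" + "GGTT" + REV_A,  # file A, chunk 1
    FWD_B + "AA" + "CCAA" + REV_B,  # file B, chunk 0
    FWD_A + "AA" + "TTCC" + REV_A,  # file A, chunk 0
    FWD_B + "AC" + "TGTG" + REV_B,  # file B, chunk 1
]

file_a = reassemble(select_by_primers(pool, FWD_A, REV_A), FWD_A, REV_A, address_len=2)
```

In this example `file_a` comes out as `TTCCGGTT`, with file B's oligonucleotides left untouched in the pool. The recovered strand would then be passed to whichever decoder matches the encoding scheme used when writing.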

THE SCIENTIST STAFF
