Infographic: Writing with DNA

Researchers devise numerous strategies to encode information into nucleic acids.

Sep 30, 2017
Catherine Offord

To encode text alone, one approach is to convert each letter of the alphabet into a three-base codeword. Using three bases, such as A, C, and T, gives 3 × 3 × 3 = 27 combinations, enough for the English alphabet plus a space, with a code such as AAA = A, AAC = B, and so on (1 in graphic below). However, researchers often want to encode more than just text, so most current methods instead first translate data into binary code, the language of 1s and 0s used in electronic media. Using binary, the four bases of DNA could theoretically store up to two bits of information per nucleotide, with a code such as A = 00, C = 01, and so on (2).
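Both mappings are simple lookup tables, and a short Python sketch makes them concrete. The ordering of the 27 triplets (alphabetical over A, C, T, with the space last) and the assignments G = 10 and T = 11 are assumptions made here for illustration; the text only specifies AAA = A, AAC = B, A = 00, and C = 01.

```python
from itertools import product

# Assumed codebook: 26 letters followed by a space.
SYMBOLS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "

# All 27 triplets over A, C, T in alphabetical order: AAA, AAC, AAT, ACA, ...
TRIPLETS = ["".join(p) for p in product("ACT", repeat=3)]
TEXT_TO_DNA = dict(zip(SYMBOLS, TRIPLETS))


def encode_text(message: str) -> str:
    """Replace each letter or space with its three-base codeword."""
    return "".join(TEXT_TO_DNA[ch] for ch in message.upper())


# Two bits per base. A = 00 and C = 01 come from the text;
# G = 10 and T = 11 are assumed to complete the table.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}


def encode_bits(bits: str) -> str:
    """Translate a bit string into DNA, two bits per nucleotide."""
    if len(bits) % 2:
        raise ValueError("bit string must have an even length")
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


print(encode_text("ab"))        # AAAAAC
print(encode_bits("00011011"))  # ACGT
```

Note that neither table does anything to prevent runs of a single base: a string of zero bits becomes a string of As.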

In reality, though, biochemical features of nucleic acids make some combinations of bases more desirable than others. Particularly problematic are homopolymers, long runs of the same nucleotide, which are difficult to write and read using current methods. One way to avoid homopolymers is to allocate two bases to each binary digit; a long run of the same digit can then be encoded by alternating between that digit's two bases (3). A more efficient method is to convert text or other data into a ternary code, which employs three digits (0, 1, and 2) rather than two, and then write bases so that no base is used twice in a row, for example by encoding 0, 1, and 2 as C, G, and T after an A, but as G, T, and A after a C (4). Newer methods include more complex codes, as well as error-correcting techniques, to pack as much information as possible into DNA while maximizing the accuracy of information retrieval.
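The two homopolymer-avoiding schemes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the published implementations: the pairing 0 = A or C and 1 = G or T, the deterministic alternation rule, the rows of the rotating table after G and after T (filled in by continuing the cyclic pattern the text gives for A and C), and the starting "previous base" of A are all choices made here.

```python
from typing import List

# Scheme (3): one bit per base, two candidate bases per bit value (assumed pairing).
CHOICES = {"0": "AC", "1": "GT"}


def encode_one_bit_per_base(bits: str) -> str:
    """Pick whichever of the bit's two bases differs from the previous base."""
    out: List[str] = []
    for bit in bits:
        first, second = CHOICES[bit]
        out.append(second if out and out[-1] == first else first)
    return "".join(out)


# Scheme (4): rotating ternary code. After base X, the trits 0, 1, 2 map to
# the next three bases in the cycle A -> C -> G -> T -> A, never to X itself.
ORDER = "ACGT"


def encode_trits(trits: List[int], prev: str = "A") -> str:
    """Encode base-3 digits so that no base appears twice in a row."""
    out: List[str] = []
    for trit in trits:
        prev = ORDER[(ORDER.index(prev) + 1 + trit) % 4]
        out.append(prev)
    return "".join(out)


def decode_trits(dna: str, prev: str = "A") -> List[int]:
    """Invert encode_trits by measuring each step around the cycle."""
    trits: List[int] = []
    for base in dna:
        trits.append((ORDER.index(base) - ORDER.index(prev) - 1) % 4)
        prev = base
    return trits


print(encode_one_bit_per_base("0000"))  # ACAC
print(encode_trits([0, 0, 0, 0]))       # CGTA
print(decode_trits("CTG"))              # [0, 1, 2]
```

The trade-off is visible in the arithmetic: the first scheme stores one bit per base, while the ternary scheme stores one trit (about 1.58 bits) per base, which is why the text calls it more efficient.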

Sources for methods depicted: 1. Bancroft et al., 2001; 3. Church et al., 2012; 4. Goldman et al., 2013.

Storage Cycle

After an encoding method is chosen, researchers write the DNA message into a series of long oligonucleotides. In earlier methods, these fragments were each tagged with a unique address sequence to aid reassembly, as well as common flanking sequences that allow amplification by PCR (1). Newer methods incorporate selective retrieval of specific sections of stored data, known as random access, by combining the address and PCR sequences into unique codes on either side of every oligonucleotide. Appropriate primers allow researchers to select and amplify only a sequence of interest (2).
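As a rough sketch of this layout, the Python below cuts an encoded message into fragments and wraps each one as forward primer site + address + payload + reverse primer site. The fragment length, the fixed-width base-4 address, and the example primer sequences are hypothetical; real designs also keep addresses free of homopolymers and take the reverse-complement orientation of the reverse primer into account, which this toy version ignores.

```python
from typing import List

BASES = "ACGT"


def address(index: int, width: int = 4) -> str:
    """Write a fragment index as a fixed-width base-4 number in DNA letters."""
    if not 0 <= index < 4 ** width:
        raise ValueError("index does not fit in the address width")
    letters: List[str] = []
    for _ in range(width):
        index, digit = divmod(index, 4)
        letters.append(BASES[digit])
    return "".join(reversed(letters))


def make_oligos(payload: str, fwd: str, rev: str, chunk: int = 20) -> List[str]:
    """Split a DNA message into addressed fragments flanked by primer sites."""
    return [
        fwd + address(i // chunk) + payload[i:i + chunk] + rev
        for i in range(0, len(payload), chunk)
    ]


# Hypothetical primer sites. Giving each stored file its own pair is what
# would allow that file alone to be amplified later.
oligos = make_oligos("ACGT" * 10, fwd="CTACGG", rev="GGTCAA", chunk=16)
for oligo in oligos:
    print(oligo)
```

With a 40-base message and 16-base fragments, this yields three oligonucleotides carrying the addresses AAAA, AAAC, and AAAG.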

These oligonucleotides are synthesized in tiny test tubes or printed onto DNA microchips, which are then stored in a cold, dry, dark place. When the message needs to be read, researchers rehydrate the sample and add primers corresponding to the addresses of the sequences of interest. The amplified product is then sequenced and decoded in order to retrieve the original message.
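The read step can be imitated in software as a filter followed by a sort. The sketch below is only a stand-in for the wet-lab process: it assumes each stored strand has the hypothetical layout forward primer site + 4-base address + payload + reverse primer site, treats "amplification" as simple prefix and suffix matching, and assumes error-free sequencing.

```python
from typing import Dict, List


def retrieve(pool: List[str], fwd: str, rev: str, addr_width: int = 4) -> str:
    """Select strands carrying the given primer sites and reassemble them."""
    fragments: Dict[str, str] = {}
    for oligo in pool:
        # Stand-in for PCR: only strands flanked by both primer sites survive.
        if not (oligo.startswith(fwd) and oligo.endswith(rev)):
            continue
        body = oligo[len(fwd):len(oligo) - len(rev)]
        fragments[body[:addr_width]] = body[addr_width:]
    # With fixed-width addresses over A < C < G < T, sorting the address
    # strings alphabetically restores the original fragment order.
    return "".join(fragments[a] for a in sorted(fragments))


# A mixed pool holding two hypothetical files, stored out of order.
pool = [
    "TTGACC" + "AAAC" + "GGGG" + "CCAGTT",
    "CTACGG" + "AAAC" + "TTAA" + "GGTCAA",
    "CTACGG" + "AAAA" + "ACGT" + "GGTCAA",
    "TTGACC" + "AAAA" + "CCCC" + "CCAGTT",
]
print(retrieve(pool, "CTACGG", "GGTCAA"))  # ACGTTTAA
print(retrieve(pool, "TTGACC", "CCAGTT"))  # CCCCGGGG
```

The recovered string is still DNA; it would then be passed through the inverse of whichever encoding was used to write it.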

THE SCIENTIST STAFF
