
Infographic: Writing with DNA

Researchers devise numerous strategies to encode information into nucleic acids.

Sep 30, 2017
Catherine Offord

When the goal is to encode only text, one approach is to assign each letter of the alphabet a code made of three bases. Drawing those bases from a set of three, such as A, C, and T, gives 27 combinations (3 × 3 × 3), enough for the English alphabet plus a space, with a code such as AAA = A, AAC = B, and so on (1 in graphic below). However, researchers often want to encode more than just text, so most current methods first translate data into binary code, the language of 1s and 0s used in electronic media. In binary, the four bases of DNA could theoretically store up to two bits of information per nucleotide, with a code such as A = 00, C = 01, and so on (2).
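The two schemes above can be sketched in a few lines of Python. This is only an illustration: the article gives AAA = A, AAC = B and A = 00, C = 01, and everything past those entries (the order of the remaining triplets, TTT for the space, G = 10, T = 11) is an assumption made here to complete the tables. The function names are hypothetical.

```python
from itertools import product

# Method 1: one base triplet per character (26 letters plus a space).
# Only AAA = A and AAC = B come from the article; the rest of the
# ordering is an assumption (simple lexicographic order over A, C, T).
SYMBOLS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "
TRIPLETS = ["".join(p) for p in product("ACT", repeat=3)]  # 27 triplets
TEXT_CODE = dict(zip(SYMBOLS, TRIPLETS))


def encode_text(message: str) -> str:
    """Encode letters and spaces as base triplets (KeyError on other characters)."""
    return "".join(TEXT_CODE[ch] for ch in message.upper())


# Method 2: two bits per base.
# A = 00 and C = 01 come from the article; G = 10 and T = 11 are assumed.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}


def encode_bytes(data: bytes) -> str:
    """Encode arbitrary bytes at two bits per nucleotide."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


print(encode_text("AB"))    # AAAAAC
print(encode_bytes(b"A"))   # CAAC (0x41 = 01 00 00 01)
```

The triplet code spends three bases per character, while the binary code spends four bases per byte but can carry any kind of file.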

In reality, though, biochemical features of nucleic acids make some combinations of bases more desirable than others. Particularly problematic are homopolymers (long runs of the same nucleotide), which are difficult to write and read using current methods. One way to avoid homopolymers is to allocate two bases to each binary digit; a long run of the same digit can then be encoded by alternating between the two bases assigned to that digit (3). A more efficient method is to convert text or other data into a ternary code, which uses three digits (0, 1, and 2) rather than two, and then write bases so that no base is used twice in a row, for example by encoding 0, 1, and 2 as C, G, and T after an A, but as G, T, and A after a C (4). Newer methods include more complex codes, as well as error-correcting techniques, to pack as much information as possible into DNA while maximizing the accuracy of information retrieval.
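Both homopolymer-avoiding schemes can be sketched as follows. Several details are assumptions made for illustration and are not stated in the article: which two bases stand for 0 and which for 1, the rows of the ternary table after G and T (extrapolated here from the A and C rows by the same cyclic pattern), the use of A as the starting "previous base", and the fixed-width base-3 conversion of a byte, which is chosen for simplicity and may differ from what the cited work does.

```python
# Scheme 3: one bit per base, two candidate bases per bit.
# Assumed assignment: 0 -> A or C, 1 -> G or T.
BIT_BASES = {"0": "AC", "1": "GT"}


def encode_bits_alternating(bits: str) -> str:
    """Pick whichever candidate base differs from the previous base."""
    out = []
    for bit in bits:
        first, second = BIT_BASES[bit]
        out.append(second if out and out[-1] == first else first)
    return "".join(out)


# Scheme 4: ternary digits, each written relative to the previous base.
# After A: 0,1,2 -> C,G,T; after C: 0,1,2 -> G,T,A (both from the article).
# The same cyclic rule is assumed after G and T.
ORDER = "ACGT"


def byte_to_trits(value: int, width: int = 6) -> list[int]:
    """Fixed-width base-3 digits of a byte (3**6 = 729 covers 0..255)."""
    digits = []
    for _ in range(width):
        value, remainder = divmod(value, 3)
        digits.append(remainder)
    return digits[::-1]


def encode_trits(trits: list[int], prev: str = "A") -> str:
    """Write each trit as one of the three bases that differ from prev."""
    out = []
    for trit in trits:
        prev = ORDER[(ORDER.index(prev) + 1 + trit) % 4]
        out.append(prev)
    return "".join(out)


def decode_trits(dna: str, prev: str = "A") -> list[int]:
    """Invert encode_trits by measuring the cyclic step from the previous base."""
    trits = []
    for base in dna:
        trits.append((ORDER.index(base) - ORDER.index(prev) - 1) % 4)
        prev = base
    return trits


print(encode_bits_alternating("0000"))   # ACAC
print(encode_trits(byte_to_trits(65)))   # CGCTAT
```

The alternating scheme stores one bit per base, while the ternary scheme stores one trit (about 1.58 bits) per base, which is why the article calls it more efficient.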

Sources for methods depicted: 1. Bancroft et al., 2001; 3. Church et al., 2012; 4. Goldman et al., 2013.

Storage Cycle

After an encoding method is chosen, researchers write the DNA message into a series of long oligonucleotides. In earlier methods, these fragments were each tagged with a unique address sequence to aid reassembly, as well as common flanking sequences that allow amplification by PCR (1). Newer methods incorporate selective retrieval of specific sections of stored data, known as random access, by combining the address and PCR sequences into unique codes on either side of every oligonucleotide. Appropriate primers allow researchers to select and amplify only a sequence of interest (2).
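The tagging and random-access idea can be illustrated with a toy model. This is a sketch under strong simplifying assumptions, not a description of any published protocol: primer binding is modeled as plain string matching on a single strand (real PCR primers bind complementary strands), each stored file gets one hypothetical primer pair, and a short base-4 index inside each oligonucleotide records the fragment order. All names, primer sequences, and lengths are invented for the example.

```python
BASES = "ACGT"


def index_to_bases(n: int, width: int = 4) -> str:
    """Write a fragment number as a fixed-width base-4 string of nucleotides."""
    out = []
    for _ in range(width):
        n, remainder = divmod(n, 4)
        out.append(BASES[remainder])
    return "".join(reversed(out))


def bases_to_index(seq: str) -> int:
    """Invert index_to_bases."""
    n = 0
    for base in seq:
        n = n * 4 + BASES.index(base)
    return n


def write_file(payload: str, fwd: str, rev: str,
               chunk_len: int = 12, index_width: int = 4) -> list[str]:
    """Split an encoded payload into oligos: primer + index + chunk + primer."""
    return [
        fwd + index_to_bases(i, index_width) + payload[start:start + chunk_len] + rev
        for i, start in enumerate(range(0, len(payload), chunk_len))
    ]


def read_file(pool: list[str], fwd: str, rev: str, index_width: int = 4) -> str:
    """Select oligos flanked by the given primers and reassemble them in order."""
    hits = []
    for oligo in pool:
        if oligo.startswith(fwd) and oligo.endswith(rev):
            body = oligo[len(fwd):len(oligo) - len(rev)]
            hits.append((bases_to_index(body[:index_width]), body[index_width:]))
    return "".join(chunk for _, chunk in sorted(hits))


# Two "files" share one pool; only the requested one is recovered.
file_a = "ACGTACGTACGTTTGACATCGATCGA"
file_b = "TGCATGCATGCA"
pool = write_file(file_a, "CTAG", "GGAT") + write_file(file_b, "AATC", "CCTA")
pool.reverse()  # storage keeps no order
assert read_file(pool, "CTAG", "GGAT") == file_a
```

Because selection depends only on the flanking sequences, the rest of the pool never has to be sequenced to retrieve one file, which is the point of random access.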

These oligonucleotides are synthesized into tiny test tubes or printed onto DNA microchips, which are stored in a cold, dry, dark place. When the message needs to be read, researchers rehydrate the sample and add primers corresponding to the addresses of the sequences of interest. The amplified product is then sequenced and decoded in order to retrieve the original message.

THE SCIENTIST STAFF

