Benching Bases

By Kelly Rae Chi

How to do heavy computational lifting in genomes and transcriptomes

You've unpacked your next-generation sequencing system and popped in some DNA or RNA. Five days later, you've sequenced 50 million tiny strings of nucleotides. Then what?

Based on their sequences, you have to align all the fragments, called "reads," with the help of a reference genome, a fully assembled sequence from the same species. In the absence of a reference, you're left with assembling the genome based solely on the portions of the reads that overlap with each other. For both alignment and assembly, "computation becomes a big issue," says Steven Salzberg, director of the University of Maryland's Center for Bioinformatics and Computational Biology. "That's a huge amount of data, and in fact even streaming the data off the machine onto other computers causes network bandwidth problems."
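To make the two tasks concrete, here is a toy Python sketch, not taken from the article or from any real aligner or assembler. It assumes error-free reads and exact string matching, and the function names (`align_read`, `overlap_length`), the `min_overlap` parameter, and the sequences are all made up for illustration. Alignment asks where a read sits in a known reference; assembly asks how the end of one read overlaps the start of another when no reference exists.

```python
def align_read(read, reference):
    """Return every 0-based position where the read matches the reference exactly."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        # Continue searching one base further along to catch repeated matches.
        start = reference.find(read, start + 1)
    return positions


def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    # Try the longest possible overlap first and shrink until one fits.
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


# Hypothetical data: a 16-base "genome" and three 6-base reads.
reference = "ACGTTGCATGACCGTA"
reads = ["TTGCAT", "GACCGT", "CATGAC"]

# Alignment: place each read on the reference.
alignments = {read: align_read(read, reference) for read in reads}

# Assembly without a reference: measure how two reads overlap each other.
overlap = overlap_length("TTGCAT", "CATGAC")
```

Scanning the whole reference for each read, as this sketch does, would be far too slow for 50 million reads against a full genome, which is one way to see why the computation becomes the bottleneck Salzberg describes.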

That's because most newer technologies generate shorter reads—roughly 25 to 50 nucleotides in length—than those generated using traditional Sanger sequencing. The newer methods create smaller ...

Published In

May 2025, Issue 1
