Opinion: Text Mining in the Clinic

Despite increasing use of electronic medical records, much patient data remains in text form, requiring text-mining techniques to make full use of patient information.

Written by Min Song

Patient records in the patient administration department at Naval Medical Center San Diego. WIKIMEDIA, AMANDA L. KILPATRICK

The increased use of electronic medical records (EMRs) is supporting widespread data-mining efforts to uncover trends in health, disease, and treatment response data. But a significant chunk of information in EMRs remains stored as text, unusable by conventional data-mining methods. These semi-structured or unstructured data include clinical notes, certain categories of test results such as echocardiograms and radiology reports, and other important documentation. To take full advantage of EMRs, we need to utilize both data- and text-mining techniques to explore patient outcomes.

Text and data mining have much in common; underlying each is the assumption that knowledge lies buried in a scattered mass of information. But whereas data mining predominantly relies on statistical methods to uncover trends in structured data, text-mining techniques seek to make sense of information that is unstructured, such as a doctor’s scribbles on a patient’s chart. Much of the available clinical data, for example, are in narrative form as a result of transcription of dictations, direct entry by providers, or use of speech-recognition applications. This “free-text” form is convenient for expressing concepts and events but is difficult to search, summarize, and analyze. Fortunately, text-mining techniques can help code these data for analysis.
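As a rough illustration of what "coding" free text can mean, the sketch below pulls a few numeric values out of a narrative note and stores them as named fields that conventional data-mining tools could then use. It is a minimal sketch under stated assumptions, not a method described in this article: the sample note, the field names, and the regular-expression patterns are all hypothetical, and real clinical text would call for far more robust handling.

```python
import re

# Hypothetical free-text note; the wording and values are invented for illustration.
NOTE = "Pt reports chest pain x3 days. BP 142/90, HR 88. Echo shows EF 45%."

# Hypothetical patterns that map narrative phrases to named numeric fields.
PATTERNS = {
    "systolic_bp": re.compile(r"\bBP\s+(\d{2,3})/\d{2,3}\b"),
    "diastolic_bp": re.compile(r"\bBP\s+\d{2,3}/(\d{2,3})\b"),
    "heart_rate": re.compile(r"\bHR\s+(\d{2,3})\b"),
    "ejection_fraction_pct": re.compile(r"\bEF\s+(\d{1,2})%"),
}


def code_note(note: str) -> dict:
    """Return the numeric fields found in the note; fields not found are None."""
    record = {}
    for field, pattern in PATTERNS.items():
        match = pattern.search(note)
        record[field] = int(match.group(1)) if match else None
    return record


if __name__ == "__main__":
    print(code_note(NOTE))
```

Once notes are reduced to records like this, they can be searched, summarized, and analyzed alongside the structured portion of the record. Simple patterns of this kind miss misspellings, abbreviations, and negations.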

Text mining normally requires a pre-processing phase such as spell checking, ...
