With today’s high-throughput technologies and state-of-the-art tools, laboratories around the world are generating mountains of data at unprecedented rates. The traditional approaches to data management (notes jotted in lab notebooks, multiple spreadsheet files tucked away in computer folders, and images of gels and computer printouts stashed in 3-ring binders) no longer suffice. As a result, many researchers are turning to computerized laboratory information management systems (LIMS): database applications that can help collect, organize, and track information about the samples being analyzed and the data being generated in the lab.
The first LIMS came on the scene more than 30 years ago as custom-made applications designed to increase productivity and to reduce the errors associated with routine laboratory functions. Today, a plethora of commercially licensed and open-source options are available, ranging from application-specific tools to multipurpose solutions. “A LIMS today is defined by what it can achieve,” says Tom Dolan, Director...
Many LIMS can be configured to communicate with laboratory equipment, including analytical instruments and liquid-handling robots. This not only allows data to flow directly into the LIMS as it is generated, but also enables the system to direct the workflow with specifically tailored instructions. Features like these can improve efficiency by saving researchers the task of manually recording and entering data, and can reduce data transcription errors. Once the analysis is done, the LIMS can compile and generate custom data reports using information generated by multiple instruments and personnel. Many LIMS are also equipped with data-mining and trending tools that can provide unique insights into the data.
Since there is no “standard” LIMS, choosing a system can be difficult. In addition to evaluating each LIMS against the features your lab requires, Dolan recommends that buyers learn as much as possible about a prospective vendor. Of particular importance are the frequency and quality of services such as system updates, bug fixes, and new-feature rollouts, he says. LIMS shoppers should also find out how much customization will be required, which influences the cost and time needed to get a LIMS up and running, and should consider the system’s capacity to adapt as the lab’s requirements change over the years.
The Scientist spoke with three researchers about how they are using LIMS to manage the particular type of data generated in their labs. This is what we learned.
SEQUENCING AN ENTIRE ECOSYSTEM
Researcher: Christopher Meyer,
Research Zoologist, Smithsonian Institution National Museum of Natural History;
Director, Moorea Biocode Project
Project: Since 2007, the Moorea Biocode Project has been obtaining DNA barcode sequences—short sequences from standard and widely agreed-upon regions of the genome—for every animal, plant, and fungal species living on and in the waters surrounding the tropical island of Moorea, French Polynesia. “It’s an attempt to build a digital signature, or phone book, for every species so we can monitor this ecosystem and understand it better,” Meyer says.
Problem: Multiple labs were processing the genetic data using different equipment, protocols, and workflows, which made it difficult to track all the project data. Furthermore, collaborators around the world needed to be able to view, edit, and share the data. The team also needed a tool to help automate some of the routine tasks associated with DNA barcoding, such as sequence analyses and uploading the data to public repositories.
Solution: The Moorea Biocode Project contracted Biomatters Ltd., a New Zealand–based bioinformatics company, to develop the open-source Biocode LIMS plug-in for use with Biomatters’ Geneious sequence analysis software (Methods Mol Biol, 858:269-310, 2012). The LIMS keeps a record of all of the samples, reagents, primers, procedures, and data generated during the barcoding process, from DNA extraction to PCR to DNA sequencing. After the DNA barcode sequence data have been obtained and deposited into the LIMS, Geneious performs automated sequence alignments (trimming low-quality data from sequence ends in the process) as well as base calling. Together, these features accelerate the workflow, reduce personnel demands, and make it easier to trace problems identified during sequence analysis back to the relevant reactions and reagents.
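To make the idea of trimming low-quality sequence ends concrete, here is a minimal Python sketch. It is not Geneious’s algorithm, which the article does not describe; it assumes a simple rule in which bases are dropped from each end until one meets a quality threshold. The function name, the threshold of 20, and the example read are all invented for illustration.

```python
from typing import List, Tuple


def trim_low_quality_ends(
    bases: str, quals: List[int], min_q: int = 20
) -> Tuple[str, List[int]]:
    """Drop bases from both ends until a base with quality >= min_q is reached.

    Illustrative rule only; real tools use more sophisticated criteria.
    """
    if len(bases) != len(quals):
        raise ValueError("bases and quals must have the same length")
    start, end = 0, len(bases)
    # Walk inward from the 5' end past low-quality calls.
    while start < end and quals[start] < min_q:
        start += 1
    # Walk inward from the 3' end past low-quality calls.
    while end > start and quals[end - 1] < min_q:
        end -= 1
    return bases[start:end], quals[start:end]


# Invented example read with poor-quality calls at both ends.
read = "NNACGTACGTNN"
read_quals = [5, 8, 30, 35, 40, 40, 38, 36, 33, 30, 9, 4]
trimmed, trimmed_quals = trim_low_quality_ends(read, read_quals)
print(trimmed)  # ACGTACGT
```

A read whose bases all fall below the threshold comes back empty, which a downstream step could flag for resequencing.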
The plug-in also supports connections to other databases, Excel spreadsheets, and Google Fusion Tables. For example, researchers can integrate the molecular data stored in the LIMS with a database that stores each specimen’s field metadata, such as the collector’s name, date, geospatial coordinates, and taxonomic information.
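Linking molecular records to field metadata amounts to joining two data sets on a shared specimen identifier. The sketch below shows that idea using Python’s built-in sqlite3 module as a stand-in for the external databases. The table names, column names, and all rows are hypothetical and are not taken from the Biocode LIMS schema.

```python
import sqlite3

# In-memory database standing in for the two linked data sources.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE field_specimens (      -- hypothetical field-metadata table
        specimen_id  TEXT PRIMARY KEY,
        collector    TEXT,
        collected_on TEXT,
        latitude     REAL,
        longitude    REAL,
        taxon        TEXT
    );
    CREATE TABLE lims_sequences (       -- hypothetical LIMS sequence table
        sequence_id  INTEGER PRIMARY KEY,
        specimen_id  TEXT,
        marker       TEXT,
        length_bp    INTEGER
    );
    """
)

# Invented example rows.
conn.executemany(
    "INSERT INTO field_specimens VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("EXAMPLE-001", "Collector A", "2010-01-15", -17.5, -149.8, "Taxon A"),
        ("EXAMPLE-002", "Collector B", "2010-01-16", -17.5, -149.9, "Taxon B"),
    ],
)
conn.executemany(
    "INSERT INTO lims_sequences VALUES (?, ?, ?, ?)",
    [(1, "EXAMPLE-001", "COI", 658), (2, "EXAMPLE-002", "COI", 640)],
)

# Join each sequence to the field record of the specimen it came from.
rows = conn.execute(
    """
    SELECT s.specimen_id, f.taxon, f.collector, s.marker, s.length_bp
    FROM lims_sequences AS s
    JOIN field_specimens AS f ON f.specimen_id = s.specimen_id
    ORDER BY s.specimen_id
    """
).fetchall()

for row in rows:
    print(row)
```

Keeping the specimen identifier as the only shared key lets the field database and the LIMS evolve separately, which is one plausible reason to link them rather than merge them.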
Biomatters also developed a second plug-in that enables the Moorea project researchers to quickly and easily transfer their sequence data from the LIMS to GenBank, a large public repository for DNA sequence information. “Our attitude has been that if we’re going to do this, let’s try to do it right and make it as broadly useful for this community as possible,” says Meyer.
Pros:
• Any lab using Sanger-based sequencing methods should be able to use the Biocode LIMS to track their workflow.
• The Biomatters customer service team is very responsive and routinely incorporates customer requests into new versions of their software.
• The Biocode LIMS and GenBank plug-ins are available for free download, enabling anyone to view the data collected by the Moorea Biocode Project.
Cons:
• The LIMS can integrate only with MySQL databases, Excel spreadsheets, Google Fusion Tables, and TAPIR (a database system commonly used by museums). Steven Stones-Havas, a developer at Biomatters, says the company is working on integration with additional types of databases.
• To add and analyze data, users will need to purchase a license for Geneious, which ranges from $395 for students to $795 for academic nonprofit researchers and $1,995 for commercial operations.
Cost: Meyer estimates that it cost approximately $500,000 to develop the Biocode LIMS and GenBank plug-ins.
GENOMIC ANALYSES OF RESPIRATORY DISEASES
Researcher: Scott Weiss,
Director of Systems Genetics and Genomics, Channing Division of Network Medicine, Brigham and Women’s Hospital; Professor of Medicine, Harvard Medical School
Project: Examine the environmental and genetic risk factors for the development of asthma and chronic obstructive pulmonary disease (COPD) using a genomics-driven approach.
Problem: The Systems Genetics and Genomics division is a 37-investigator research group that maintains more than 500,000 clinical samples from 150,000 subjects involved in 107 studies. For a decade, the group used an in-house LIMS to track all of their patient-derived samples, but as the number of samples continued to grow, the group recognized that their homemade system wasn’t efficient enough to meet their needs. Furthermore, the group frequently sends samples to other labs, and needed a system that could track the locations of samples in their freezers to enable more efficient sample retrieval.
Solution: The lab selected LabVantage Solutions, Inc.’s Sapphire BioBanking LIMS for its ability to migrate the data from their existing LIMS, and for its compatibility with their existing sample-processing workflow. Jody Sylvia, head of Bioinformatics at the Channing Division of Network Medicine, says that the new system has greatly cut down the time needed to log new samples, whose volume has grown from about 20 per day to upwards of 150 per day. Sapphire can enter these samples in batches, accelerating the process.
In addition, the homemade LIMS was designed in a very linear fashion, and couldn’t be expanded or reconfigured easily. It would often prompt researchers to complete processes that were no longer relevant to the lab’s workflow, but because of the linearity of the system, such prompts could not be bypassed. Sapphire comes with a configuration tool that enables the team to easily and inexpensively customize the LIMS to their needs.
Sapphire also keeps track of the freezer location for each sample and aliquots removed from samples, enabling the lab to control and manage their freezer inventory more efficiently. “You always know what’s in your freezers and where it is,” Weiss says. With the homemade LIMS, tracking samples as they moved to different labs required researchers to pull each sample out of a box and scan its barcode, Sylvia says. But Sapphire can automatically collect information about the location of individual samples within a box without scanning each sample.
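One way to picture box-level tracking is a data model in which each sample is assigned to a slot in a box and only the box carries a physical location. Moving the box then relocates every sample in it with a single update, with no per-sample scan. The Python sketch below illustrates that idea only. It does not describe how Sapphire is implemented, and every class, method, and identifier in it is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Box:
    box_id: str
    location: str                                        # e.g. "Lab A / Freezer 2"
    slots: Dict[str, str] = field(default_factory=dict)  # slot -> sample id


class FreezerInventory:
    """Hypothetical inventory: samples live in box slots; boxes have locations."""

    def __init__(self) -> None:
        self.boxes: Dict[str, Box] = {}
        self.index: Dict[str, Tuple[str, str]] = {}      # sample id -> (box id, slot)

    def add_box(self, box_id: str, location: str) -> None:
        self.boxes[box_id] = Box(box_id, location)

    def store(self, sample_id: str, box_id: str, slot: str) -> None:
        box = self.boxes[box_id]
        if slot in box.slots:
            raise ValueError(f"slot {slot} in {box_id} is already occupied")
        if sample_id in self.index:
            raise ValueError(f"sample {sample_id} is already stored")
        box.slots[slot] = sample_id
        self.index[sample_id] = (box_id, slot)

    def move_box(self, box_id: str, new_location: str) -> None:
        # A single update relocates every sample held in the box.
        self.boxes[box_id].location = new_location

    def locate(self, sample_id: str) -> str:
        box_id, slot = self.index[sample_id]
        return f"{self.boxes[box_id].location} / {box_id} / {slot}"


inventory = FreezerInventory()
inventory.add_box("BOX-7", "Lab A / Freezer 2")
inventory.store("S-001", "BOX-7", "A1")
inventory.store("S-002", "BOX-7", "A2")
inventory.move_box("BOX-7", "Lab B / Freezer 1")
print(inventory.locate("S-002"))  # Lab B / Freezer 1 / BOX-7 / A2
```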
Pros:
• The LIMS preserves patient anonymity and conforms to other regulatory guidelines for research involving human specimens.
• Sapphire is loaded on a central application server, rather than individual computers, and can be accessed through a Web browser without requiring the installation of additional software.
Cons:
• The ability to track samples in 96- or 384-well plates was somewhat limited in version 5.0, before the Channing Division’s bioinformatics team customized the LIMS. The system was unable to keep track of how samples were arranged on multiwell plates, and couldn’t keep track of the hierarchy among plates, such as when samples are removed from one plate and used to create daughter plates.
• Although the LIMS is very customizable, unless the lab has dedicated IT staff to do the technical work, LabVantage will have to do the customization, which can incur additional fees.
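The plate-hierarchy problem mentioned above can be sketched as a small data structure: each well records which sample it holds and, if it was filled by a transfer, which parent well it came from. The Python below is an illustration under those assumptions, not the Channing team’s customization, which the article does not detail. All names and plate identifiers are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Well = Tuple[str, str]  # (plate id, well name such as "A1")


@dataclass
class Plate:
    plate_id: str
    rows: int = 8    # 8 x 12 = 96 wells; use 16 x 24 for a 384-well plate
    cols: int = 12
    wells: Dict[str, str] = field(default_factory=dict)    # well -> sample id
    source: Dict[str, Well] = field(default_factory=dict)  # well -> parent well

    def valid(self, well: str) -> bool:
        """Check that a name like 'C5' falls inside this plate's grid."""
        row, col = well[:1], well[1:]
        return (
            len(row) == 1
            and "A" <= row < chr(ord("A") + self.rows)
            and col.isdecimal()
            and 1 <= int(col) <= self.cols
        )


def transfer(parent: Plate, parent_well: str, daughter: Plate, daughter_well: str) -> None:
    """Copy a sample into a daughter well and remember where it came from."""
    if not daughter.valid(daughter_well):
        raise ValueError(f"{daughter_well} is not a well on {daughter.plate_id}")
    sample = parent.wells[parent_well]  # KeyError if the parent well is empty
    daughter.wells[daughter_well] = sample
    daughter.source[daughter_well] = (parent.plate_id, parent_well)


def trace(plates: Dict[str, Plate], plate_id: str, well: str) -> List[Well]:
    """Follow parent links from a well back to the plate where the sample began."""
    path = [(plate_id, well)]
    while well in plates[plate_id].source:
        plate_id, well = plates[plate_id].source[well]
        path.append((plate_id, well))
    return path


master = Plate("MASTER-1")
master.wells["A1"] = "S-100"
daughter_1 = Plate("DAUGHTER-1")
daughter_2 = Plate("DAUGHTER-2", rows=16, cols=24)
transfer(master, "A1", daughter_1, "C5")
transfer(daughter_1, "C5", daughter_2, "P24")

plates = {p.plate_id: p for p in (master, daughter_1, daughter_2)}
lineage = trace(plates, "DAUGHTER-2", "P24")
print(lineage)
```

Storing only the immediate parent of each well keeps transfers cheap to record, while the full lineage can still be rebuilt on demand by walking the links.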
Cost: Weiss estimates that his lab spent between $100,000 and $200,000 to implement the LabVantage system, but the cost depends on a lab’s particular needs. Weiss recommends that researchers also budget for the personnel and time needed to maintain and modify the system.
PROBING THE PROTEOME
Researcher: Martin Eisenacher,
Head of Bioinformatics and Biostatistics, Medical Proteome Center,
Ruhr University, Bochum, Germany
Project: Laboratories at the Medical Proteome Center and collaborating institutions perform quantitative proteomics and gene expression profiling on patient tissue and fluid samples. The goal is to identify molecular targets and biomarkers for the diagnosis and clinical management of diseases such as bladder and liver cancers, and chronic liver, Alzheimer’s, and Parkinson’s diseases.
Problem: Mapping the proteome involves many procedures, such as liquid chromatography, gel electrophoresis, and mass spectrometry, which often produce data in incompatible formats. Moreover, several laboratories worldwide are performing the experiments, which can also lead to heterogeneous data due to differences in experimental techniques and equipment.
Solution: To manage these problems, the Medical Proteome Center and its partnering laboratories adopted Massachusetts-based Bruker Corporation’s proteomics-specific bioinformatics tool, Proteinscape, which stores all of the proteomics-related data generated by each laboratory, and merges these heterogeneous data with all the necessary experimental annotations into a meaningful format that facilitates the evaluation of results (Proteomics, 10:1230-49, 2010).
According to Eisenacher, the system tracks all aspects of the proteomics workflow, including information about species, tissue type, and cell type analyzed; the separation method applied to the sample (i.e., 1-D gel separation, 2-D separation, or liquid chromatography); and the measured mass spectra. Once mass-spectra data are uploaded into the system, Proteinscape processes the spectral data, using various search engines that automatically identify peptides, and generates protein lists.
Pros:
• Proteinscape supports all typical proteomics processes.
• The program allows users to import and annotate gel images with the current workflow step, as well as mark up the identified protein.
Cons:
• As a Web-based client/server application that is designed for concurrent user access, it may slow down in very active high-throughput labs if IT infrastructure is not optimized.
• Proteinscape is streamlined for use with Bruker mass spectrometry instrumentation; establishing workflows for instruments from other vendors may require some effort.
Cost: The list price for Proteinscape starts at €18,000 (or $23,000, as of December 2012), but can be higher depending on the license type, service agreements, and a lab’s specific needs.