One Million Genomic Datasets

Publicly accessible databases now store nearly 1 million gene-expression datasets, giving researchers a robust resource for discovery.

Bob Grant

Bob Grant is Editor in Chief of The Scientist, where he started in 2007 as a Staff Writer.

Jul 23, 2012

As more researchers run experiments that tease apart the functions of the thousands of genes in the genomes of mice, rats, and humans, the number of gene-expression datasets deposited in publicly accessible databases will soon reach 1,000,000, according to an analysis by Nature. The combined holdings of the two major public data repositories, the National Center for Biotechnology Information's Gene Expression Omnibus and the gene-expression database at the European Bioinformatics Institute, should reach the milestone within the next month.

"Some time in the next few weeks, the number of deposited data sets will top one million," Monya Baker wrote in Nature last week.

Gene-expression data can help scientists test preliminary hypotheses about which genes may contribute to the development of certain diseases, leading to potential drug targets. For example, a researcher could comb public data from several studies of people...
