As more researchers run experiments to tease apart the functions of the thousands of genes in the genomes of mice, rats, and humans, the number of gene-expression datasets deposited in publicly accessible databases will soon reach 1,000,000, according to an analysis by Nature. The combined total in the two major public data repositories, the National Center for Biotechnology Information's Gene Expression Omnibus and the gene-expression database at the European Bioinformatics Institute, should reach that milestone within the next month.
"Some time in the next few weeks, the number of deposited data sets will top one million," Monya Baker wrote in Nature last week.
Gene-expression data can help scientists test preliminary hypotheses about which genes may contribute to the development of certain diseases, leading to potential drug targets. For example, a researcher could comb public data from several studies of people...