This article is part of the supplement: Proceedings of the First International Conference on Toxicogenomics Integrated with Environmental Sciences (TIES-2007)
Investigation of reproducibility of differentially expressed genes in DNA microarrays through statistical simulation
1 National Center for Toxicological Research (NCTR), US Food and Drug Administration, 3900 NCTR Rd., Jefferson, AR 72079, USA
2 Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310027, PR China
3 Z-tech Corporation, an ICF International, National Center for Toxicological Research (NCTR), US Food and Drug Administration, 3900 NCTR Rd., Jefferson, AR 72079, USA
BMC Proceedings 2009, 3(Suppl 2):S4 doi:10.1186/1753-6561-3-S2-S4
Published: 10 March 2009
Recent publications have raised concerns about the reliability of microarray technology because of the lack of reproducibility of differentially expressed genes (DEGs) from highly similar studies across laboratories and platforms. The rat toxicogenomics study of the MicroArray Quality Control (MAQC) project empirically revealed that DEGs selected using a fold change (FC)-based criterion were more reproducible than those derived solely by statistical significance, such as the P-value from a simple t-test. In this study, we generate a set of simulated microarray datasets to compare gene selection/ranking rules, including P-value, FC, and their combinations, using the percentage of overlapping genes between the DEG lists from two similar simulated datasets as the measure of reproducibility. The results support the MAQC's conclusion that DEG lists are more reproducible across laboratories and platforms when FC-based ranking coupled with a nonstringent P-value cutoff is used for gene selection than when selection relies on P-value-based ranking alone. We conclude that the MAQC recommendation should be considered when reproducibility is an important study objective.
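The comparison described above can be illustrated with a minimal Python sketch. It is not the authors' simulation code: the function names, the Gaussian noise model, the number of genes, the effect sizes, and the list size are all hypothetical choices made for illustration. Because every gene has the same degrees of freedom, ranking by the absolute t statistic is equivalent to ranking by P-value, so the sketch uses |t| in place of P; the default cutoff of 2.306 corresponds to a two-sided P < 0.05 with 8 degrees of freedom (5 samples per group). The overlap is computed here as shared genes divided by the longer list, which is one possible definition among several.

```python
import math
import random
import statistics


def simulate_dataset(true_effects, n_per_group, noise_sd, rng):
    """Simulate one two-group study; return (log2 FC, t statistic) per gene."""
    results = []
    for effect in true_effects:
        control = [rng.gauss(0.0, noise_sd) for _ in range(n_per_group)]
        treated = [rng.gauss(effect, noise_sd) for _ in range(n_per_group)]
        log_fc = statistics.fmean(treated) - statistics.fmean(control)
        # Pooled-variance two-sample t statistic (equal group sizes).
        pooled_var = (statistics.variance(control) + statistics.variance(treated)) / 2.0
        t_stat = log_fc / math.sqrt(pooled_var * 2.0 / n_per_group)
        results.append((log_fc, t_stat))
    return results


def select_genes(results, n_top, rule, t_cutoff=2.306):
    """Return the indices of the top n_top genes under the given rule.

    rule "p":         rank by |t| (equivalent to P-value ranking at fixed df)
    rule "fc":        rank by |log2 FC|
    rule "fc_with_p": keep genes with |t| >= t_cutoff, then rank by |log2 FC|
    """
    candidates = list(range(len(results)))
    if rule == "p":
        score = 1
    elif rule == "fc":
        score = 0
    elif rule == "fc_with_p":
        candidates = [g for g in candidates if abs(results[g][1]) >= t_cutoff]
        score = 0
    else:
        raise ValueError(f"unknown rule: {rule}")
    ranked = sorted(candidates, key=lambda g: abs(results[g][score]), reverse=True)
    return set(ranked[:n_top])


def percent_overlap(list_a, list_b):
    """Percentage of genes shared by two DEG lists, relative to the longer list."""
    if not list_a or not list_b:
        return 0.0
    return 100.0 * len(list_a & list_b) / max(len(list_a), len(list_b))


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical truth: 100 of 1000 genes carry a log2 fold change of +/-1.
    effects = [rng.choice([-1.0, 1.0]) for _ in range(100)] + [0.0] * 900
    study_a = simulate_dataset(effects, 5, 0.5, rng)
    study_b = simulate_dataset(effects, 5, 0.5, rng)
    for rule in ("p", "fc", "fc_with_p"):
        overlap = percent_overlap(
            select_genes(study_a, 50, rule), select_genes(study_b, 50, rule)
        )
        print(f"{rule}: {overlap:.1f}% overlap")
```

Varying `noise_sd`, `n_per_group`, and the list size in this sketch is one way to explore how each rule's overlap responds to noise and sample size; the numbers it prints depend entirely on these assumed settings and should not be read as the study's results.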