Ranking analysis of F-statistics for microarray data
1 College of Life Sciences, Hunan Normal University, Changsha, 410081, China
2 Institute of Molecular Medicine, the University of Texas-Houston, Houston, Texas 77030, USA
3 Department of Biostatistics, Medical College of Georgia, Augusta, Georgia, 30912, USA
BMC Bioinformatics 2008, 9:142 doi:10.1186/1471-2105-9-142
Published: 6 March 2008
Microarray technology provides an efficient means for globally exploring physiological processes governed by the coordinated expression of multiple genes. However, identifying differentially expressed genes in microarray experiments is challenging because such experiments carry a potentially high type I error rate. Methods for large-scale statistical analysis have been developed, but most of them apply only to two-sample or two-condition data.
We developed a large-scale, multiple-group method based on the F-test, named ranking analysis of F-statistics (RAF), which extends ranking analysis of microarray data (RAM) for the two-sample t-test. In this method, we proposed a novel random splitting approach to generate the null distribution instead of using permutation, which may not be appropriate for microarray data. We also implemented a two-simulation strategy to estimate the false discovery rate (FDR). Simulation results suggested that RAF finds differentially expressed genes among multiple classes more efficiently and at a lower false discovery rate than some commonly used methods. Applying our method to experimental data, we found 107 genes with significantly differential expression among 4 treatments at <0.7% FDR; of these, 31 are expressed sequence tags (ESTs) and 76 are unique genes that have known functions in the brain or central nervous system and belong to six major functional groups.
Our method is suitable for identifying differentially expressed genes among multiple groups, particularly when the sample size is small.
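
To make the idea of ranking genes by a multiple-group F-statistic concrete, the following is a minimal sketch that computes a standard one-way ANOVA F-statistic for each gene and orders the genes from largest to smallest F. It is not the RAF procedure itself: the random splitting null distribution and the two-simulation FDR estimate are not specified in this abstract and are not reproduced here. The function names `f_statistics` and `rank_genes`, the data layout (genes in rows, samples in columns), and the toy values are assumptions made for illustration only.

```python
import numpy as np


def f_statistics(data, groups):
    """One-way ANOVA F-statistic for each gene (row) of `data`.

    data   : (n_genes, n_samples) array of expression values (assumed layout)
    groups : length n_samples sequence of group labels, one per column
    Note: a gene with zero within-group variance yields inf or nan.
    """
    data = np.asarray(data, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    n_genes, n_samples = data.shape
    k = labels.size

    grand_mean = data.mean(axis=1, keepdims=True)
    ss_between = np.zeros(n_genes)
    ss_within = np.zeros(n_genes)
    for label in labels:
        block = data[:, groups == label]          # samples in this group
        group_mean = block.mean(axis=1, keepdims=True)
        # Between-group sum of squares: group size times squared mean shift.
        ss_between += block.shape[1] * (group_mean - grand_mean).ravel() ** 2
        # Within-group sum of squares: spread around the group mean.
        ss_within += ((block - group_mean) ** 2).sum(axis=1)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_samples - k)
    return ms_between / ms_within


def rank_genes(f_values):
    """Return gene indices ordered from largest to smallest F."""
    return np.argsort(f_values)[::-1]


if __name__ == "__main__":
    # Toy data (made up): 2 genes, 3 groups of 3 samples each.
    toy = [[1, 2, 3, 2, 3, 4, 3, 4, 5],   # group means differ
           [1, 2, 3, 1, 2, 3, 1, 2, 3]]   # group means identical
    labels = ["A"] * 3 + ["B"] * 3 + ["C"] * 3
    f_vals = f_statistics(toy, labels)
    print(f_vals, rank_genes(f_vals))
```

In this toy case the first gene, whose group means differ, ranks above the second, whose group means are identical. Deciding how far down such a ranking to call genes significant is the part that needs a null distribution and an FDR estimate, which is what the method described above addresses.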