Open Access | Highly Accessed | Methodology article

Ranking analysis of F-statistics for microarray data

Yuan-De Tan1,2, Myriam Fornage2 and Hongyan Xu3

1College of Life Sciences, Hunan Normal University, Changsha, 410081, China

2Institute of Molecular Medicine, the University of Texas-Houston, Houston, Texas 77030, USA

3Department of Biostatistics, Medical College of Georgia, Augusta, Georgia, 30912, USA


BMC Bioinformatics 2008, 9:142 doi:10.1186/1471-2105-9-142

Published: 6 March 2008

Abstract

Background

Microarray technology provides an efficient means for globally exploring physiological processes governed by the coordinated expression of multiple genes. However, identifying differentially expressed genes in microarray experiments is challenging because testing many genes at once can yield a high type I error rate. Methods for large-scale statistical analysis have been developed, but most of them are applicable only to two-sample or two-condition data.

Results

We developed a large-scale, multiple-group, F-test-based method named ranking analysis of F-statistics (RAF), which extends ranking analysis of microarray data (RAM), a method built on the two-sample t-test. In this method, we propose a novel random splitting approach to generate the null distribution instead of permutation, which may not be appropriate for microarray data. We also implemented a two-simulation strategy to estimate the false discovery rate (FDR). Simulation results suggest that RAF finds differentially expressed genes among multiple classes more efficiently, and at a lower false discovery rate, than some commonly used methods. Applying our method to experimental data, we found 107 genes with significantly different expression among 4 treatments at an FDR below 0.7%. Of these, 31 are expressed sequence tags (ESTs) and 76 are unique genes that have known functions in the brain or central nervous system and belong to six major functional groups.
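As a rough illustration of the quantity being ranked, the sketch below computes an ordinary one-way ANOVA F-statistic per gene and sorts genes by it. This is only the textbook F-statistic, not the RAF procedure: the random splitting null distribution and the two-simulation FDR estimate are not specified in this abstract and are not attempted here. The function names (`f_statistic`, `rank_genes`) and the toy expression values are assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def f_statistic(groups: Sequence[Sequence[float]]) -> float:
    """Textbook one-way ANOVA F-statistic for one gene measured in k groups.

    Assumes at least two groups and non-zero within-group variance;
    a zero within-group mean square would raise ZeroDivisionError.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-group sum of squares: spread of replicates around their group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


def rank_genes(
    data: Dict[str, List[List[float]]]
) -> List[Tuple[str, float]]:
    """Return (gene, F) pairs sorted from largest to smallest F."""
    scores = {gene: f_statistic(groups) for gene, groups in data.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: 3 genes, 4 treatments, 3 replicates each.
toy = {
    "gene_A": [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]],
    "gene_B": [[1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3]],
    "gene_C": [[1, 2, 3], [1, 2, 3], [1, 2, 3], [7, 8, 9]],
}

ranking = rank_genes(toy)
for gene, f_value in ranking:
    print(gene, f_value)
```

In this toy set, the gene whose fourth treatment differs sharply from the other three ranks first, and the gene with identical groups ranks last. Deciding how far down such a ranking to call genes significant is the part that requires a null distribution and an FDR estimate.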

Conclusion

Our method is suitable for identifying differentially expressed genes among multiple groups, particularly when sample sizes are small.


© 1999-2008 BioMed Central Ltd unless otherwise stated