Generalized shrinkage F-like statistics for testing an interaction term in gene expression analysis in the presence of heteroscedasticity
1 Department of Preventive Medicine, Stony Brook University, Stony Brook, NY 11794, USA
2 Department of Statistics, University of Florida, Gainesville, FL 32611, USA
3 Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL 32611, USA
4 The Genetics Institute, University of Florida, Gainesville, FL 32611, USA
BMC Bioinformatics 2011, 12:427 doi:10.1186/1471-2105-12-427
Published: 1 November 2011
Many analyses of gene expression data involve hypothesis tests of an interaction term between two fixed effects, typically tested using a residual variance. In expression studies, heteroscedasticity has received much attention, and previous work has focused on either between-gene or within-gene heteroscedasticity. However, in a single experiment, heteroscedasticity may exist both within and between genes. Here we develop flexible shrinkage error estimators that account for both between-gene and within-gene heteroscedasticity and use them to construct F-like test statistics for testing interactions, with cutoff values obtained by permutation. Permutation tests for an interaction term are complicated, and several permutation procedures are investigated here.
Our proposed test statistics are compared with existing shrinkage-type test statistics through extensive simulation studies and a real data example. The results show that the choice of permutation procedure has far more influence on detection power than the choice of F or F-like test statistic. When both types of heteroscedasticity exist, our proposed test statistics control the preselected type-I error rate and are more powerful. Raw data permutation is not valid in this setting. Whether unrestricted or restricted residual permutation should be used depends on the specific type of test statistic.
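To make the idea of residual permutation concrete, the following is a minimal Python sketch for a single gene in a balanced two-way design. It is an illustration under stated assumptions, not the procedure evaluated in this paper: it uses the ordinary interaction F statistic rather than a shrinkage F-like statistic, it takes residuals from the additive (no-interaction) model, and it shuffles them freely across all observations. The function names `interaction_f` and `residual_permutation_pvalue` and the array layout `(levels of A, levels of B, replicates)` are hypothetical choices made for this example.

```python
import numpy as np


def interaction_f(y):
    """Ordinary F statistic for the A x B interaction in a balanced design.

    y has shape (a, b, n): a levels of factor A, b levels of factor B,
    n replicates per cell (n >= 2).
    """
    a, b, n = y.shape
    cell = y.mean(axis=2)                    # cell means, shape (a, b)
    row = cell.mean(axis=1, keepdims=True)   # factor A means, shape (a, 1)
    col = cell.mean(axis=0, keepdims=True)   # factor B means, shape (1, b)
    grand = cell.mean()
    ss_int = n * np.sum((cell - row - col + grand) ** 2)
    ss_err = np.sum((y - cell[:, :, None]) ** 2)
    df_int = (a - 1) * (b - 1)
    df_err = a * b * (n - 1)
    return (ss_int / df_int) / (ss_err / df_err)


def residual_permutation_pvalue(y, n_perm=999, seed=0):
    """Permutation p-value for the interaction using additive-model residuals.

    Assumption for this sketch: residuals from the main-effects-only fit
    are permuted across all observations without restriction.
    """
    rng = np.random.default_rng(seed)
    cell = y.mean(axis=2)
    row = cell.mean(axis=1, keepdims=True)
    col = cell.mean(axis=0, keepdims=True)
    additive = row + col - cell.mean()       # fitted values without interaction
    resid = y - additive[:, :, None]
    f_obs = interaction_f(y)
    exceed = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(resid.ravel()).reshape(y.shape)
        y_star = additive[:, :, None] + shuffled
        if interaction_f(y_star) >= f_obs:
            exceed += 1
    # Count the observed statistic as one of the permutations.
    return f_obs, (exceed + 1) / (n_perm + 1)


# Simulated 2 x 2 design with 5 replicates and an interaction in one cell.
rng = np.random.default_rng(1)
y = rng.normal(size=(2, 2, 5))
y[1, 1, :] += 5.0
f_obs, p_value = residual_permutation_pvalue(y)
print(f_obs, p_value)
```

Permuting residuals rather than raw observations keeps the estimated main effects fixed, so the permutation distribution reflects only the interaction under test. In a multi-gene analysis the same loop would be run with a shrinkage-based statistic in place of `interaction_f`.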
The F-like test statistic that uses the proposed flexible shrinkage error estimator, which accounts for both types of heteroscedasticity, together with unrestricted residual permutation provides a statistically valid and powerful test. We therefore recommend that it always be applied when testing an interaction term in the analysis of real gene expression data.