Open Access Research article

Generalized shrinkage F-like statistics for testing an interaction term in gene expression analysis in the presence of heteroscedasticity

Jie Yang1*, George Casella2,4 and Lauren M McIntyre2,3,4

Author Affiliations

1 Department of Preventive Medicine, Stony Brook University, Stony Brook, NY 11794, USA

2 Department of Statistics, University of Florida, Gainesville, FL 32611, USA

3 Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL 32611, USA

4 The Genetics Institute, University of Florida, Gainesville, FL 32611, USA

BMC Bioinformatics 2011, 12:427  doi:10.1186/1471-2105-12-427

Published: 1 November 2011

Abstract

Background

Many analyses of gene expression data involve hypothesis tests of an interaction term between two fixed effects, typically tested using a residual variance. In expression studies, heteroscedasticity has received much attention, and previous work has focused on either between-gene or within-gene heteroscedasticity. However, in a single experiment, heteroscedasticity may exist both within and between genes. Here we develop flexible shrinkage error estimators that account for both between-gene and within-gene heteroscedasticity and use them to construct F-like test statistics for testing interactions, with cutoff values obtained by permutation. Permutation testing of an interaction term is not straightforward, so several permutation procedures are investigated here.
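To make the idea of an F-like statistic concrete, the following is a minimal sketch, not the estimator proposed in the article. It assumes a balanced a x b design with n replicates per cell, and it shrinks each gene's pooled error variance toward the across-gene mean using a fixed weight `lam`. The function names, the weight `lam`, and the simple convex-combination form are illustrative assumptions, and the sketch handles only between-gene variance differences.

```python
import numpy as np

def interaction_ms_and_error_var(y):
    """Per-gene interaction mean square and pooled error variance.

    y has shape (genes, a, b, n): a balanced a x b layout with n replicates
    per cell (an assumed layout for illustration).
    """
    g, a, b, n = y.shape
    cell = y.mean(axis=3)                           # cell means, (g, a, b)
    row = cell.mean(axis=2, keepdims=True)          # row means, (g, a, 1)
    col = cell.mean(axis=1, keepdims=True)          # column means, (g, 1, b)
    grand = cell.mean(axis=(1, 2), keepdims=True)   # grand means, (g, 1, 1)
    inter = cell - row - col + grand                # interaction effects
    ms_int = n * (inter ** 2).sum(axis=(1, 2)) / ((a - 1) * (b - 1))
    resid = y - cell[..., None]                     # full-model residuals
    s2 = (resid ** 2).sum(axis=(1, 2, 3)) / (a * b * (n - 1))
    return ms_int, s2

def f_like(y, lam=0.5):
    """F-like statistic with a shrunken error variance in the denominator.

    lam is a hypothetical fixed shrinkage weight: lam=0 gives the ordinary
    per-gene F statistic, lam=1 uses one common variance for all genes.
    """
    ms_int, s2 = interaction_ms_and_error_var(y)
    shrunk = lam * s2.mean() + (1.0 - lam) * s2
    return ms_int / shrunk
```

Because the shrunken denominator no longer follows the usual chi-square scaling, the resulting statistic does not have an exact F reference distribution, which is why cutoff values are obtained by permutation.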

Results

Our proposed test statistics are compared with other existing shrinkage-type test statistics through extensive simulation studies and a real data example. The results show that the choice of permutation procedures has dramatically more influence on detection power than the choice of F or F-like test statistics. When both types of gene heteroscedasticity exist, our proposed test statistics can control preselected type-I errors and are more powerful. Raw data permutation is not valid in this setting. Whether unrestricted or restricted residual permutation should be used depends on the specific type of test statistic.
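The abstract does not spell out the permutation schemes, so the following is only a sketch of one common reading of unrestricted residual permutation for a single gene: residuals from the additive (no-interaction) fit are shuffled freely across all observations and added back to the additive fit to generate null data. The function names, the balanced a x b layout, and the use of the ordinary F statistic are assumptions made for illustration.

```python
import numpy as np

def interaction_f(y):
    """Ordinary interaction F statistic; y has shape (a, b, n), balanced."""
    a, b, n = y.shape
    cell = y.mean(axis=2)
    row = cell.mean(axis=1, keepdims=True)
    col = cell.mean(axis=0, keepdims=True)
    inter = cell - row - col + cell.mean()
    ms_int = n * (inter ** 2).sum() / ((a - 1) * (b - 1))
    ms_err = ((y - cell[..., None]) ** 2).sum() / (a * b * (n - 1))
    return ms_int / ms_err

def residual_permutation_pvalue(y, n_perm=999, seed=0):
    """Permutation p-value for the interaction term.

    Residuals of the additive model are permuted without restriction
    (across all cells), then added back to the additive fit.
    """
    rng = np.random.default_rng(seed)
    cell = y.mean(axis=2)
    row = cell.mean(axis=1, keepdims=True)
    col = cell.mean(axis=0, keepdims=True)
    # Additive fit in a balanced design: row mean + column mean - grand mean.
    additive = (row + col - cell.mean())[..., None]   # shape (a, b, 1)
    resid = y - additive                              # reduced-model residuals
    f_obs = interaction_f(y)
    exceed = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(resid.ravel()).reshape(y.shape)
        if interaction_f(additive + shuffled) >= f_obs:
            exceed += 1
    # Count the observed statistic itself so the p-value is never zero.
    return (exceed + 1) / (n_perm + 1)
```

A restricted variant would shuffle residuals only within the levels of one factor. Permuting the raw observations instead of residuals also destroys the main effects, which is one plausible reason raw data permutation fails for an interaction test.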

Conclusions

The F-like test statistic that uses the proposed flexible shrinkage error estimator, which accounts for both types of gene heteroscedasticity, together with unrestricted residual permutation can provide a statistically valid and powerful test. Therefore, we recommend that it always be applied when testing an interaction term in the analysis of real gene expression data.