A two-sample Bayesian t-test for microarray data
1 Codexis, Inc., Redwood City, CA 94063, USA
2 Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA
3 Divergence, Inc., St. Louis, MO 63141, USA
BMC Bioinformatics 2006, 7:126 doi:10.1186/1471-2105-7-126Published: 10 March 2006
Determining whether a gene is differentially expressed in two different samples remains an important statistical problem. Prior work in this area has featured the use of t-tests with pooled estimates of the sample variance based on similarly expressed genes. These methods do not display consistent behavior across the entire range of pooling and can be biased when the prior hyperparameters are specified heuristically.
A two-sample Bayesian t-test is proposed for use in determining whether a gene is differentially expressed in two different samples. The test method is an extension of earlier work that made use of point estimates for the variance. The method proposed here explicitly calculates in analytic form the marginal distribution for the difference in the mean expression of two samples, obviating the need for point estimates of the variance without recourse to posterior simulation. The prior distribution involves a single hyperparameter that can be calculated in a statistically rigorous manner, making clear the connection between the prior degrees of freedom and prior variance.
The test is easy to understand and implement and application to both real and simulated data shows that the method has equal or greater power compared to the previous method and demonstrates consistent Type I error rates. The test is generally applicable outside the microarray field to any situation where prior information about the variance is available and is not limited to cases where estimates of the variance are based on many similar observations.