Open Access research article

A two-sample Bayesian t-test for microarray data

Richard J Fox1* and Matthew W Dimmic2,3

Author Affiliations

1 Codexis, Inc., Redwood City, CA 94063, USA

2 Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA

3 Divergence, Inc., St. Louis, MO 63141, USA

BMC Bioinformatics 2006, 7:126  doi:10.1186/1471-2105-7-126

Published: 10 March 2006

Abstract

Background

Determining whether a gene is differentially expressed in two different samples remains an important statistical problem. Prior work in this area has featured the use of t-tests with pooled estimates of the sample variance based on similarly expressed genes. These methods do not display consistent behavior across the entire range of pooling and can be biased when the prior hyperparameters are specified heuristically.

Results

A two-sample Bayesian t-test is proposed for determining whether a gene is differentially expressed in two different samples. The test extends earlier work that relied on point estimates of the variance. The method proposed here calculates, in explicit analytic form, the marginal distribution of the difference in mean expression between the two samples. This removes the need for point estimates of the variance and requires no posterior simulation. The prior distribution involves a single hyperparameter that can be calculated in a statistically rigorous manner, making clear the connection between the prior degrees of freedom and the prior variance.
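To illustrate how an analytic marginal distribution can replace a point estimate of the variance, the following is a minimal sketch of a textbook conjugate model, not necessarily the exact formulation derived in the article. It assumes both samples share a common variance with a scaled inverse chi-square prior (prior degrees of freedom `nu0`, prior variance `sigma0_sq`) and flat priors on the two means. Under those assumptions the standardized difference in means follows a Student t distribution with `nu0 + n1 + n2 - 2` degrees of freedom. The function name `bayes_t_statistic` and its parameters are hypothetical and chosen here for illustration.

```python
import math
from statistics import mean, variance


def bayes_t_statistic(x, y, nu0, sigma0_sq):
    """Return (t, df) for the difference in means of x and y.

    Assumed model (illustrative, not taken from the article):
      - both samples are normal with a common unknown variance,
      - the variance has a scaled inverse chi-square prior with
        nu0 degrees of freedom and scale sigma0_sq,
      - the two means have flat priors.
    Each sample needs at least two observations.
    """
    n1, n2 = len(x), len(y)

    # Posterior degrees of freedom: prior df plus the usual pooled df.
    df = nu0 + n1 + n2 - 2

    # Within-sample sums of squares (statistics.variance uses n - 1).
    ss = (n1 - 1) * variance(x) + (n2 - 1) * variance(y)

    # Posterior scale: prior and data contributions weighted by their df.
    post_var = (nu0 * sigma0_sq + ss) / df

    # Standardized difference in sample means.
    t = (mean(x) - mean(y)) / math.sqrt(post_var * (1.0 / n1 + 1.0 / n2))
    return t, df


if __name__ == "__main__":
    control = [1.0, 2.0, 3.0]
    treated = [2.0, 3.0, 4.0]
    # nu0 = 0 recovers the classical pooled-variance t statistic.
    print(bayes_t_statistic(control, treated, nu0=0, sigma0_sq=1.0))
    # A prior with nu0 = 4 adds four degrees of freedom.
    print(bayes_t_statistic(control, treated, nu0=4, sigma0_sq=1.0))
```

In this sketch, setting `nu0` to zero reduces the statistic to the classical pooled-variance two-sample t-test, while larger values of `nu0` pull the variance toward `sigma0_sq` and add degrees of freedom. How `nu0` and `sigma0_sq` should be chosen is the subject of the article itself and is not addressed here.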

Conclusion

The test is easy to understand and implement. Application to both real and simulated data shows that the method has power equal to or greater than that of the previous method and maintains consistent Type I error rates. The test is generally applicable outside the microarray field to any situation where prior information about the variance is available, and it is not limited to cases where estimates of the variance are based on many similar observations.