Open Access Methodology article

Bayesian optimal discovery procedure for simultaneous significance testing

Jing Cao1, Xian-Jin Xie2, Song Zhang2, Angelique Whitehurst3 and Michael A White3

1Department of Statistical Science, Southern Methodist University, Dallas, Texas, USA

2Department of Clinical Sciences, University of Texas Southwestern Medical Center, Dallas, Texas, USA

3Department of Cell Biology, University of Texas Southwestern Medical Center, Dallas, Texas, USA


BMC Bioinformatics 2009, 10:5 doi:10.1186/1471-2105-10-5

Published: 6 January 2009

Abstract

Background

In high-throughput screening, such as differential gene expression screening, drug sensitivity screening, and genome-wide RNAi screening, tens of thousands of tests need to be conducted simultaneously. However, the number of replicate measurements per test is extremely small, rarely exceeding 3. Several current approaches demonstrate that test statistics with shrunken variance estimates have more power than the traditional t statistic.
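To illustrate the shrinkage idea, the following minimal Python sketch compares the ordinary one-sample t statistic with a variance-shrunken version. The weighted-average form of the shrunken variance and the inputs `prior_var` and `prior_df` are assumptions chosen for illustration; they are not taken from any specific method compared in this article.

```python
import math
from statistics import mean, variance

def t_statistic(x):
    """Ordinary one-sample t statistic for H0: mean == 0."""
    n = len(x)
    return mean(x) / math.sqrt(variance(x) / n)

def shrunken_t_statistic(x, prior_var, prior_df):
    """t-like statistic whose variance is pulled toward prior_var.

    The shrunken variance is a degrees-of-freedom weighted average of the
    test's own sample variance and a common value prior_var. Both prior_var
    and prior_df are hypothetical inputs (for example, values estimated
    from all tests together).
    """
    n = len(x)
    df = n - 1
    s2 = variance(x)
    s2_shrunk = (prior_df * prior_var + df * s2) / (prior_df + df)
    return mean(x) / math.sqrt(s2_shrunk / n)

# Three replicates that happen to have a very small sample variance.
gene = [0.10, 0.11, 0.12]
t_plain = t_statistic(gene)
t_shrunk = shrunken_t_statistic(gene, prior_var=1.0, prior_df=4.0)
```

With only three replicates, a sample variance that is small by chance inflates the ordinary t statistic, whereas the shrunken version stays moderate because its denominator is stabilized by the common variance.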

Results

We propose a Bayesian hierarchical model that incorporates the shrinkage concept by introducing a mixture structure on the variance components. The estimates from the Bayesian model are used in the optimal discovery procedure (ODP) proposed by Storey in 2007, which was shown to have optimal performance in multiple significance testing. We compared the performance of the Bayesian ODP with several competing test statistics.
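In Storey's formulation, the ODP statistic for a given test is the sum of the alternative densities of the truly non-null tests, evaluated at that test's data, divided by the sum of the null densities of the truly null tests. The Python sketch below shows this ratio under a normal model. It is not the authors' R code and does not implement the Bayesian estimation step: the plug-in values `mus`, `sigma2s` and the indicator `is_null` are hypothetical inputs assumed to be supplied by some estimation procedure.

```python
import math

def normal_loglik(x, mu, sigma2):
    """Log-likelihood of the replicates x under N(mu, sigma2)."""
    n = len(x)
    ss = sum((xi - mu) ** 2 for xi in x)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - ss / (2 * sigma2)

def odp_statistics(data, mus, sigma2s, is_null):
    """ODP-style statistic for every test (illustrative sketch).

    data    : list of replicate lists, one per test
    mus     : plug-in alternative means, one per test (assumed given)
    sigma2s : plug-in variances, one per test (assumed given)
    is_null : booleans marking tests treated as true nulls (assumed given)
    """
    stats = []
    for x in data:
        # Numerator: alternative densities of the non-null tests at x.
        num = sum(math.exp(normal_loglik(x, mu, s2))
                  for mu, s2, null in zip(mus, sigma2s, is_null) if not null)
        # Denominator: null (mean 0) densities of the null tests at x.
        den = sum(math.exp(normal_loglik(x, 0.0, s2))
                  for s2, null in zip(sigma2s, is_null) if null)
        stats.append(num / den)
    return stats

# Toy data: the second test has a clear shift away from zero.
data = [[0.1, -0.2, 0.0], [2.1, 1.9, 2.3], [-0.1, 0.2, 0.1]]
mus = [0.0, 2.0, 0.0]
sigma2s = [0.05, 0.05, 0.05]
is_null = [True, False, True]
scores = odp_statistics(data, mus, sigma2s, is_null)
```

Because every test's data are scored against densities from all tests, the statistic pools information across tests; larger values indicate stronger evidence against the null.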

Conclusion

We conducted simulation studies with 2 to 6 replicates per gene and also report test results from two real datasets. The Bayesian ODP outperforms the other methods in our study, including the original ODP. Its advantage becomes more pronounced when there are few replicates per test. The improvement over the original ODP arises because the Bayesian model borrows strength across genes in estimating unknown parameters. The proposed approach is computationally efficient owing to the conjugate structure of the Bayesian model. The R code to calculate the Bayesian ODP is provided (see Additional file 1).

Additional file 1. Bayesian ODP R code. This file contains the R code to calculate the posterior probability from the Bayesian model and the Bayesian ODP.

Format: TXT. Size: 5KB.

