A Latent Variable Approach for Meta-Analysis of Gene Expression Data from Multiple Microarray Experiments
1 Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
2 Departments of Pathology and Urology, University of Michigan, Ann Arbor, MI, USA
3 Department of Statistics and Huck Institute for Life Sciences, Penn State University, University Park, PA, USA
BMC Bioinformatics 2007, 8:364 doi:10.1186/1471-2105-8-364
Published: 27 September 2007
With the explosion in data generated using microarray technology by different investigators working on similar experiments, it is of interest to combine results across multiple studies.
In this article, we describe a general probabilistic framework, based on mixture models, for combining high-throughput genomic data from several related microarray experiments. A key feature of the model is the use of latent variables that represent quantities that can be combined across diverse platforms. We consider two methods for estimating an index termed the probability of expression (POE). The first, reported in previous work by the authors, uses Markov chain Monte Carlo (MCMC) techniques. The second is a faster procedure based on the expectation-maximization (EM) algorithm. The methods are illustrated with an application to a meta-analysis of datasets for metastatic cancer.
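To give a concrete sense of how an EM algorithm can produce a platform-independent, probability-scale index from raw expression values, the following is a minimal sketch for a single gene. It is not the authors' model: the abstract does not specify the mixture components, so the sketch assumes a simple two-component Gaussian mixture (baseline versus over-expressed) and reports each sample's posterior probability of belonging to the higher-mean component. The function names `em_poe` and `_responsibilities`, the initialization, and the variance floor are all hypothetical choices made for illustration.

```python
import numpy as np


def _responsibilities(x, weights, means, sds):
    """E-step: posterior component probabilities and the data log-likelihood."""
    z = (x[:, None] - means) / sds
    dens = weights * np.exp(-0.5 * z**2) / (sds * np.sqrt(2.0 * np.pi))
    total = dens.sum(axis=1, keepdims=True)
    return dens / total, np.log(total).sum()


def em_poe(x, n_iter=500, tol=1e-8):
    """Fit an assumed two-component Gaussian mixture to one gene by EM.

    Returns the per-sample posterior probability of the higher-mean
    component (a POE-like index) and the sorted component means.
    """
    x = np.asarray(x, dtype=float)
    # Illustrative starting values: equal weights, quartile means, pooled SD.
    weights = np.array([0.5, 0.5])
    means = np.percentile(x, [25.0, 75.0])
    sds = np.full(2, x.std())
    prev_ll = -np.inf
    for _ in range(n_iter):
        resp, ll = _responsibilities(x, weights, means, sds)  # E-step
        # M-step: responsibility-weighted updates of all parameters.
        nk = resp.sum(axis=0)
        weights = nk / x.size
        means = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk
        sds = np.sqrt(np.maximum(var, 1e-6))  # floor avoids a collapsing component
        if ll - prev_ll < tol:  # stop once the log-likelihood stops improving
            break
        prev_ll = ll
    # Order components so that index 1 is always the higher-mean one.
    order = np.argsort(means)
    weights, means, sds = weights[order], means[order], sds[order]
    resp, _ = _responsibilities(x, weights, means, sds)
    return resp[:, 1], means
```

Calling `em_poe` on one gene's expression values from one study returns probabilities between 0 and 1. Because these are on a common scale regardless of the platform's measurement units, values of this kind are the sort of latent quantity that could be compared or pooled across studies. A model that also distinguishes under-expression would need a third component.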