This article is part of the supplement: Selected Proceedings of Machine Learning in Systems Biology: MLSB 2007
A marginalized variational Bayesian approach to the analysis of array data
Department of Engineering Mathematics, University of Bristol, Bristol BS8 1TR, UK
BMC Proceedings 2008, 2(Suppl 4):S7. Published: 17 December 2008
Bayesian unsupervised learning methods have many applications in the analysis of biological data. For example, for the cancer expression array datasets presented in this study, they can be used to resolve possible disease subtypes and to indicate statistically significant dysregulated genes within these subtypes.
In this paper we outline a marginalized variational Bayesian inference method for unsupervised clustering. In this approach, latent process variables and model parameters are allowed to be dependent. This is achieved by marginalizing out the Dirichlet-distributed mixing variables and then performing inference in the reduced variable space. An iterative update procedure is proposed.
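To illustrate what marginalizing the Dirichlet mixing variables means in the simplest setting, the sketch below assumes a finite mixture with K components and a Dirichlet prior with parameters alpha on the mixing proportions. Integrating out the proportions gives the standard Dirichlet-multinomial probability of the cluster assignments, p(z | alpha) = Γ(Σ_k alpha_k) / Γ(Σ_k alpha_k + N) × Π_k Γ(alpha_k + n_k) / Γ(alpha_k), where n_k is the number of samples assigned to cluster k. The function name and the toy inputs are hypothetical, and this is not the update procedure of the paper, only the marginalization step it builds on.

```python
from collections import Counter
from math import lgamma, exp


def log_marginal_assignments(z, alpha):
    """Log p(z | alpha) with the Dirichlet mixing proportions integrated out.

    z     : list of integer cluster labels in range(len(alpha))
    alpha : list of positive Dirichlet prior parameters, one per cluster
    (Both names are illustrative assumptions, not taken from the paper.)
    """
    counts = Counter(z)          # n_k: number of samples in each cluster
    n = len(z)
    alpha_sum = sum(alpha)

    # Normalizing ratio Gamma(sum alpha) / Gamma(sum alpha + N), in log space.
    log_p = lgamma(alpha_sum) - lgamma(alpha_sum + n)

    # Per-cluster ratio Gamma(alpha_k + n_k) / Gamma(alpha_k), in log space.
    for k, a_k in enumerate(alpha):
        log_p += lgamma(a_k + counts.get(k, 0)) - lgamma(a_k)
    return log_p


# Toy usage: two samples in the same cluster under a uniform prior.
p_same = exp(log_marginal_assignments([0, 0], [1.0, 1.0]))
# Two samples in different clusters under the same prior.
p_diff = exp(log_marginal_assignments([0, 1], [1.0, 1.0]))
```

Under the uniform prior used here, the two probabilities work out analytically to 1/3 and 1/6, so assignments are no longer independent once the mixing proportions are integrated out. This coupling is what inference in the reduced variable space has to account for.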
We show, both theoretically and experimentally, that the proposed algorithm gives a much tighter free-energy lower bound than a standard variational Bayesian approach. The algorithm is computationally efficient, and its performance is demonstrated on two expression array datasets.