This article is part of the supplement: Selected Proceedings of Machine Learning in Systems Biology: MLSB 2007

Open Access Proceedings

A marginalized variational Bayesian approach to the analysis of array data

Yiming Ying*, Peng Li and Colin Campbell

Author Affiliations

Department of Engineering Mathematics, University of Bristol, Bristol BS8 1TR, UK

BMC Proceedings 2008, 2(Suppl 4):S7

Published: 17 December 2008



Bayesian unsupervised learning methods have many applications in the analysis of biological data. For example, for the cancer expression array datasets presented in this study, they can be used to resolve possible disease subtypes and to indicate statistically significant dysregulated genes within these subtypes.


In this paper we outline a marginalized variational Bayesian inference method for unsupervised clustering. In this approach, latent process variables and model parameters are allowed to be dependent. This is achieved by marginalizing out the Dirichlet mixing variables and then performing inference in the reduced variable space. An iterative update procedure is proposed for this inference.
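To illustrate what marginalizing Dirichlet mixing variables means in general, the following minimal sketch computes the probability of a set of cluster assignments after the mixing weights have been integrated out analytically (the standard Dirichlet-multinomial result). It is a generic illustration, not the authors' model or update procedure; the function name `log_marginal_assignments` and the arguments `counts` and `alpha` are hypothetical names introduced here.

```python
import math


def log_marginal_assignments(counts, alpha):
    """Log probability of a specific sequence of cluster assignments
    with the Dirichlet mixing weights integrated out.

    counts[k] is the number of items assigned to cluster k, and
    alpha[k] is the Dirichlet prior parameter for cluster k.
    Both names are assumptions made for this sketch.
    """
    n = sum(counts)
    alpha_total = sum(alpha)
    # Normalizing term: Gamma(sum alpha) / Gamma(n + sum alpha)
    result = math.lgamma(alpha_total) - math.lgamma(alpha_total + n)
    # Per-cluster terms: Gamma(alpha_k + n_k) / Gamma(alpha_k)
    for n_k, a_k in zip(counts, alpha):
        result += math.lgamma(a_k + n_k) - math.lgamma(a_k)
    return result


# Two clusters with a uniform prior; one item placed in each cluster.
p = math.exp(log_marginal_assignments([1, 1], [1.0, 1.0]))
print(round(p, 6))
```

Because the mixing weights no longer appear, inference only has to handle the assignments, which is the sense in which the variable space is reduced.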


We show, both theoretically and experimentally, that the proposed algorithm gives a much tighter free-energy lower bound than a standard variational Bayesian approach. The algorithm is computationally efficient, and its performance is demonstrated on two expression array datasets.
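The general reason a marginalized scheme cannot give a worse bound can be sketched as follows. The notation is assumed here for illustration and is not taken from the paper: $x$ denotes the data, $z$ the latent assignments, $\theta$ the Dirichlet mixing variables, and $q$ the variational distributions. Any other model parameters are omitted for brevity.

```latex
\begin{align*}
\mathcal{F}_{\mathrm{VB}}[q(\theta)\,q(z)]
  &= \mathbb{E}_{q(\theta)q(z)}\bigl[\log p(x, z, \theta) - \log q(\theta) - \log q(z)\bigr] \\
  &= \mathbb{E}_{q(z)}\bigl[\log p(x, z) - \log q(z)\bigr]
     - \mathbb{E}_{q(z)}\bigl[\mathrm{KL}\bigl(q(\theta)\,\|\,p(\theta \mid x, z)\bigr)\bigr] \\
  &= \mathcal{F}_{\mathrm{marg}}[q(z)]
     - \mathbb{E}_{q(z)}\bigl[\mathrm{KL}\bigl(q(\theta)\,\|\,p(\theta \mid x, z)\bigr)\bigr] \\
  &\le \mathcal{F}_{\mathrm{marg}}[q(z)] \;\le\; \log p(x).
\end{align*}
```

Since the KL term is non-negative, the marginalized free energy is at least as large as the fully factorized one for the same $q(z)$, with equality only when $q(\theta)$ matches the conditional posterior $p(\theta \mid x, z)$ for every $z$ with non-zero probability under $q(z)$.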