Including probe-level uncertainty in model-based gene expression clustering
1 College of Information Science and Technology, Nanjing University of Aeronautics and Astronautics, 29 Yudao Street, Nanjing 210016, China
2 Departments of Biological Chemistry and Medicine, Institute for Genomics and Bioinformatics, University of California, Irvine, CA 92697, USA
3 School of Computer Science, University of Manchester, Kilburn Building, Oxford Road, Manchester M13 9PL, UK
BMC Bioinformatics 2007, 8:98. doi:10.1186/1471-2105-8-98. Published: 21 March 2007
Clustering is an important analysis of microarray gene expression data because it groups genes with similar expression patterns and enables the exploration of unknown gene functions. Microarray experiments are subject to many sources of experimental and biological variation, so the resulting gene expression data are very noisy. Many heuristic and model-based approaches have been developed to cluster these noisy data. However, few of them take into account probe-level measurement error, which provides rich information about technical variability.
We augment a standard model-based clustering method to incorporate probe-level measurement error. Using probe-level measurements from a recently developed Affymetrix probe-level model, multi-mgMOS, we incorporate the probe-level measurement error directly into the standard Gaussian mixture model. The augmented model provides improved clustering performance on simulated datasets and on a real mouse time-course dataset.
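The idea of folding measurement error into a Gaussian mixture can be illustrated with a minimal sketch. It assumes, as one plausible reading rather than the exact formulation of the method, that each expression value comes with a known measurement-error variance that is added to the component variance of a diagonal-covariance mixture. The function name `responsibilities` and all argument names are hypothetical, and only the step that computes cluster membership probabilities is shown.

```python
import numpy as np

def responsibilities(x, v, weights, means, variances):
    """Cluster membership probabilities for a diagonal Gaussian mixture
    in which every measurement carries its own known error variance.

    Assumed shapes (illustrative, not taken from the paper):
      x         (n_genes, n_conditions)  expression estimates
      v         (n_genes, n_conditions)  measurement-error variances
      weights   (n_clusters,)            mixing proportions
      means     (n_clusters, n_conditions)
      variances (n_clusters, n_conditions) within-cluster variances
    """
    # Assumption: total variance = within-cluster variance + measurement error.
    total_var = variances[None, :, :] + v[:, None, :]
    diff = x[:, None, :] - means[None, :, :]

    # Log density of each gene under each component, summed over conditions.
    log_pdf = -0.5 * np.sum(
        np.log(2.0 * np.pi * total_var) + diff ** 2 / total_var, axis=2
    )
    log_joint = np.log(weights)[None, :] + log_pdf

    # Normalise in log space for numerical stability.
    log_joint -= log_joint.max(axis=1, keepdims=True)
    resp = np.exp(log_joint)
    return resp / resp.sum(axis=1, keepdims=True)
```

Under this assumption, a gene measured precisely is assigned firmly to the nearest cluster, whereas a gene with very large measurement error has membership probabilities close to the mixing proportions, so unreliable measurements have less influence on the clustering.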
Including probe-level measurement error improves the performance of model-based clustering of gene expression data and yields more biologically meaningful clustering results.