Open Access | Highly Accessed | Research article

Unsupervised assessment of microarray data quality using a Gaussian mixture model

Brian E Howard (1), Beate Sick (2) and Steffen Heber (1)

(1) Bioinformatics Research Center, North Carolina State University, Raleigh, NC, USA

(2) Institute of Data Analysis and Process Design, Zurich University of Applied Science, Winterthur, Switzerland


BMC Bioinformatics 2009, 10:191. doi:10.1186/1471-2105-10-191

Published: 22 June 2009

Abstract

Background

Quality assessment of microarray data is an important and often challenging aspect of gene expression analysis. This task frequently involves the examination of a variety of summary statistics and diagnostic plots. The interpretation of these diagnostics is often subjective, and generally requires careful expert scrutiny.

Results

We show how an unsupervised classification technique based on the Expectation-Maximization (EM) algorithm and the naïve Bayes model can be used to automate microarray quality assessment. The method is flexible and can easily be adapted to accommodate alternative quality statistics and platforms. We evaluate our approach using Affymetrix 3' gene expression and exon arrays, and compare its performance to that of a similar supervised approach.
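As an illustration of the general idea only, the following is a minimal sketch of EM for a mixture in which each class is a naïve Bayes model with Gaussian features (quality statistics treated as independent within a class). It is not the authors' implementation. The function names, the choice of two classes, the quantile-based initialisation, and the small variance floor `eps` are all assumptions made for this example.

```python
import numpy as np


def posteriors(X, priors, means, variances):
    """E-step: P(class | array) under a Gaussian naive Bayes model."""
    diff = X[:, None, :] - means[None, :, :]  # shape (n, k, d)
    # Independent features: per-feature Gaussian log-densities are summed.
    log_lik = -0.5 * (
        np.log(2.0 * np.pi * variances)[None, :, :]
        + diff ** 2 / variances[None, :, :]
    ).sum(axis=2)
    log_post = np.log(priors)[None, :] + log_lik
    # Subtract the row maximum before exponentiating for numerical stability.
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)


def fit_naive_bayes_mixture(X, n_classes=2, n_iter=200, eps=1e-6):
    """Fit class priors, per-feature means and variances by EM.

    X holds one row per array and one column per quality statistic.
    No class labels are used (hypothetical helper, for illustration).
    """
    X = np.asarray(X, dtype=float)
    # Assumed initialisation: spread class means over per-feature quantiles.
    q = (np.arange(n_classes) + 0.5) / n_classes
    means = np.quantile(X, q, axis=0)  # shape (k, d)
    variances = np.tile(X.var(axis=0) + eps, (n_classes, 1))
    priors = np.full(n_classes, 1.0 / n_classes)

    for _ in range(n_iter):
        resp = posteriors(X, priors, means, variances)  # E-step
        # M-step: responsibility-weighted parameter updates.
        nk = resp.sum(axis=0) + eps
        priors = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        diff = X[:, None, :] - means[None, :, :]
        variances = (resp[:, :, None] * diff ** 2).sum(axis=0) / nk[:, None] + eps

    return priors, means, variances, posteriors(X, priors, means, variances)
```

Calling `fit_naive_bayes_mixture(X)` on a matrix of quality statistics returns the fitted parameters together with a posterior class probability for every array. Deciding which fitted class corresponds to poor-quality arrays is a separate step that this sketch does not address.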

Conclusion

This research illustrates the efficacy of an unsupervised classification approach for automated microarray data quality assessment. Since our approach requires only unannotated training data, it is easy to customize and to keep up-to-date as technology evolves. In contrast to other "black box" classification systems, this method also allows for intuitive explanations.

