BMC Bioinformatics


Methodology article (Open Access)

Stability of gene contributions and identification of outliers in multivariate analysis of microarray data

Florent Baty1*, Daniel Jaeger2, Frank Preiswerk2, Martin M Schumacher3 and Martin H Brutsche1

Author Affiliations

1 Pulmonary Gene Research, University Hospital Basel, Petersgraben 4, Basel, Switzerland

2 Department of Computer Science, University of Basel, Klingelbergstrasse 50, Basel, Switzerland

3 Biomarker Development, Novartis AG, Klybeckstrasse 141, Basel, Switzerland


BMC Bioinformatics 2008, 9:289 doi:10.1186/1471-2105-9-289

Published: 20 June 2008

Abstract

Background

Multivariate ordination methods are powerful tools for exploring the complex data structures present in microarray data, and they offer several advantages over common gene-by-gene approaches. However, because they are exploratory in nature, they do not allow direct statistical testing of the stability of gene contributions.

Results

In this study, we developed a computationally efficient algorithm for i) assessing the significance of gene contributions and ii) identifying sample outliers in multivariate analysis of microarray data. The approach is based on resampling methods, including bootstrapping and jackknifing. It is implemented as a package of R functions that provides tools for both tasks.
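To make the resampling idea concrete, the following is a minimal sketch and not the authors' R package. It assumes principal component analysis as the ordination method, a genes x samples matrix, percentile bootstrap intervals for the gene loadings on the first axis, and a leave-one-out (jackknife) measure of how much each sample rotates that axis. All function names, the toy data, and the decision rules are hypothetical choices made for illustration.

```python
import numpy as np

def first_axis(X):
    """Unit vector of gene loadings on the first principal axis (X is genes x samples)."""
    Xc = X - X.mean(axis=1, keepdims=True)  # center each gene across samples
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, 0]

def align(v, ref):
    """Flip v if needed so it points the same way as ref (the axis sign is arbitrary)."""
    return v if np.dot(v, ref) >= 0 else -v

def bootstrap_gene_stability(X, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap over samples; a gene is called stable (an assumed
    rule) when the interval of its first-axis loading excludes zero."""
    rng = np.random.default_rng(seed)
    n_genes, n_samples = X.shape
    ref = first_axis(X)
    boot = np.empty((n_boot, n_genes))
    for b in range(n_boot):
        idx = rng.integers(0, n_samples, size=n_samples)  # samples drawn with replacement
        boot[b] = align(first_axis(X[:, idx]), ref)
    lo, hi = np.percentile(boot, [100 * alpha / 2, 100 * (1 - alpha / 2)], axis=0)
    stable = (lo > 0) | (hi < 0)
    return ref, lo, hi, stable

def jackknife_sample_influence(X):
    """Leave each sample out in turn; influence is 1 - |cos| of the angle between
    the full-data axis and the leave-one-out axis (0 means the axis is unchanged)."""
    ref = first_axis(X)
    n_samples = X.shape[1]
    influence = np.empty(n_samples)
    for j in range(n_samples):
        v = first_axis(np.delete(X, j, axis=1))
        influence[j] = 1.0 - abs(np.dot(v, ref))
    return influence

def make_toy_data(seed=1):
    """Hypothetical data: 50 genes x 12 samples, two groups of 6 samples."""
    rng = np.random.default_rng(seed)
    X = rng.normal(0.0, 0.5, size=(50, 12))
    X[:5, 6:] += 6.0      # genes 0-4 separate the two groups
    X[20:30, 11] += 6.0   # sample 11 carries an aberrant profile on genes 20-29
    return X

if __name__ == "__main__":
    X = make_toy_data()
    ref, lo, hi, stable = bootstrap_gene_stability(X)
    print("genes with stable loadings:", np.flatnonzero(stable))
    print("most influential sample:", int(jackknife_sample_influence(X).argmax()))
```

Resampling the samples rather than the genes keeps each expression profile intact, so the spread of a gene's loading across replicates reflects how much it depends on the particular set of arrays analysed. A sample whose removal rotates the axis far more than any other is a natural candidate for closer inspection as an outlier.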

Conclusion

The methodology was successfully applied to three published data sets with varying levels of signal intensity. Its relevance was compared with that of alternative methods. Overall, it proved to be particularly effective for evaluating the stability of microarray data.