Open Access Methodology article

New resampling method for evaluating stability of clusters

Irina M Gana Dresen1*, Tanja Boes1, Johannes Huesing2, Markus Neuhaeuser1,3 and Karl-Heinz Joeckel1

Author Affiliations

1 Institut für Medizinische Informatik, Biometrie und Epidemiologie, Universitaetsklinikum Essen, Germany

2 Koordinierungszentrum für Klinische Studien, Universitaetsklinikum Heidelberg, Germany

3 Fachbereich Mathematik und Technik, RheinAhrCampus Remagen, Germany

BMC Bioinformatics 2008, 9:42 doi:10.1186/1471-2105-9-42

Published: 24 January 2008

Abstract

Background

Hierarchical clustering is a widely applied tool in the analysis of microarray gene expression data. The assessment of cluster stability is a major challenge in clustering procedures. Statistical methods are required to distinguish between real and random clusters. Several methods for assessing cluster stability have been published, including resampling methods such as the bootstrap.

We propose a new resampling method based on continuous weights to assess the stability of clusters in hierarchical clustering. In bootstrapping, approximately one third of the original items are lost from each resample. Continuous weights avoid zero elements and instead allow non-integer diagonal elements, so the full dimensionality of the space is retained, i.e. each variable of the original data set is represented in the resampled data.
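The contrast between the two resampling schemes can be illustrated with a minimal sketch. It is not the authors' implementation: the Dirichlet draw used for the continuous weights, the function names, and the toy data matrix `X` are assumptions made only for illustration, and the abstract does not state which weight distribution the method uses. The sketch treats a bootstrap resample of the variables as a vector of integer weights (how often each variable was drawn), shows that roughly a fraction 1/e of the variables receive weight zero, and compares this with strictly positive non-integer weights.

```python
import numpy as np

def bootstrap_weights(p, rng):
    """Draw p variables with replacement; return integer counts and the drawn indices."""
    draws = rng.integers(0, p, size=p)
    counts = np.bincount(draws, minlength=p).astype(float)
    return counts, draws

def continuous_weights(p, rng):
    """Positive non-integer weights summing to p.

    ASSUMPTION: a flat Dirichlet draw is used here purely for illustration;
    it is not taken from the paper.
    """
    return p * rng.dirichlet(np.ones(p))

def weighted_sq_distances(X, w):
    """Pairwise weighted squared Euclidean distances between the rows of X.

    X has shape (observations, variables); w holds one weight per variable.
    """
    diff = X[:, None, :] - X[None, :, :]
    return np.einsum("ijk,k->ij", diff ** 2, w)

rng = np.random.default_rng(0)
n_obs, p = 10, 1000                    # hypothetical sizes: 10 arrays, 1000 genes
X = rng.normal(size=(n_obs, p))        # toy expression matrix, not real data

w_boot, draws = bootstrap_weights(p, rng)
w_cont = continuous_weights(p, rng)

# Fraction of variables that drop out of the resample (weight exactly zero).
frac_lost_boot = float(np.mean(w_boot == 0))
frac_lost_cont = float(np.mean(w_cont == 0))

# Integer weights reproduce the distances of the explicitly resampled matrix.
d_weighted = weighted_sq_distances(X, w_boot)
d_resampled = weighted_sq_distances(X[:, draws], np.ones(p))
same_distances = bool(np.allclose(d_weighted, d_resampled))

# Distances under continuous weights use every variable.
d_cont = weighted_sq_distances(X, w_cont)
```

Either distance matrix could then be passed to any hierarchical clustering routine. Repeating the draw many times and recording how often a cluster reappears would give a stability estimate in the spirit of the bootstrap.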

Results

A comparison of continuous weights and bootstrapping on real datasets and in simulation studies reveals the advantage of continuous weights, especially when the dataset has only a few observations, few genes are differentially expressed, and the fold change of the differentially expressed genes is low.

Conclusion

We recommend the use of continuous weights in small as well as large datasets because, according to our results, they perform at least as well as conventional bootstrapping and in some cases surpass it.