Open Access Software

DISCLOSE : DISsection of CLusters Obtained by SEries of transcriptome data using functional annotations and putative transcription factor binding sites

Evert-Jan Blom1, Sacha AFT van Hijum1,2,3, Klaas J Hofstede1, Remko Silvis1, Jos BTM Roerdink4 and Oscar P Kuipers1*

Author Affiliations

1 Molecular Genetics, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, The Netherlands

2 Interfacultary Institute for Genetics and Functional Genomics, Ernst-Moritz-Arndt-University, Friedrich-Ludwig-Jahnstraße, 15A 17487, Greifswald 17489, Germany

3 NIZO Food Research, PO Box 20, 6710 BA Ede, the Netherlands

4 Institute for Mathematics and Computing Science, University of Groningen, Nijenborgh 9, 9747 AG, Groningen, The Netherlands

BMC Bioinformatics 2008, 9:535  doi:10.1186/1471-2105-9-535

Published: 16 December 2008

Abstract

Background

A typical step in the analysis of gene expression data is the determination of clusters of genes that exhibit similar expression patterns. Researchers are confronted with the seemingly arbitrary choice between numerous algorithms to perform cluster analysis.

Results

We developed an exploratory application that benchmarks the results of clustering methods using functional annotations. In addition, the program integrates a de novo DNA motif discovery algorithm that identifies overrepresented DNA binding sites in the upstream DNA sequences of genes from the clusters; such sites are indicative of transcriptional control. We evaluated the performance of the program by comparing the original results of a time course experiment with the findings of our application.
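As an illustration of annotation-based benchmarking, the sketch below scores a single cluster by how strongly any one functional category is overrepresented in it. The abstract does not say which statistic DISCLOSE uses, so the hypergeometric test here is an assumption, and the function names and toy annotation data are hypothetical.

```python
from math import comb


def hypergeom_tail(k, cluster_size, annotated, total):
    """P(X >= k) when cluster_size genes are drawn without replacement
    from total genes, of which `annotated` carry the category."""
    upper = min(cluster_size, annotated)
    favourable = sum(
        comb(annotated, i) * comb(total - annotated, cluster_size - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(total, cluster_size)


def best_category_pvalue(cluster, annotations, total):
    """Smallest overrepresentation p-value over all categories (assumed score)."""
    cluster = set(cluster)
    return min(
        hypergeom_tail(len(cluster & genes), len(cluster), len(genes), total)
        for genes in annotations.values()
    )


# Hypothetical toy data: 10 genes in the genome, two functional categories.
annotations = {
    "sporulation": {"a", "b", "c"},
    "motility": {"d", "e"},
}
score = best_category_pvalue(["a", "b", "c"], annotations, total=10)
```

Under this assumed scoring, a clustering method whose clusters yield lower p-values groups functionally related genes more coherently. Comparing several categories or several clusters would also call for a multiple-testing correction, which this sketch omits.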

Conclusion

DISCLOSE assists researchers in the prokaryotic research community in systematically evaluating the results of applying a range of clustering algorithms to transcriptome data. Different performance measures allow researchers to determine quickly and comprehensively the best-suited clustering approach for a given dataset.