Measuring similarities between transcription factor binding sites

Szymon M Kielbasa1, Didier Gonze1,2 and Hanspeter Herzel1

1Institute for Theoretical Biology, Humboldt University, Invalidenstraße 43, D-10115 Berlin, Germany

2Unité de Chronobiologie Théorique, Université Libre de Bruxelles, CP 231, Campus Plaine, Bvd du Triomphe, B-1050 Bruxelles, Belgium

BMC Bioinformatics 2005, 6:237 doi:10.1186/1471-2105-6-237

Published: 28 September 2005

Abstract

Background

Collections of transcription factor binding profiles (Transfac, Jaspar) are essential for identifying regulatory elements in DNA sequences. Subsets of highly similar profiles complicate large-scale analysis of transcription factor binding sites.

Results

We propose to identify and group similar profiles using two independent similarity measures: χ² distances between position frequency matrices (PFMs) and correlation coefficients between position weight matrix (PWM) scores.
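
The two measures named above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the formulas, so the sketch assumes a standard χ² homogeneity statistic summed over columns, log-odds PWMs with a uniform background and a pseudocount of 1, equal-width matrices with no shifting or reverse-complement handling, and scores sampled on random sequences. All function names and the toy counts are hypothetical.

```python
import math
import random

BASES = "ACGT"  # each PFM column is a list of counts in this order


def chi2_column(a, b):
    """Chi-square homogeneity statistic for two columns of base counts."""
    na, nb = sum(a), sum(b)
    stat = 0.0
    for ai, bi in zip(a, b):
        pooled = (ai + bi) / (na + nb)  # pooled frequency of this base
        if pooled == 0:
            continue  # base absent from both columns contributes nothing
        ea, eb = na * pooled, nb * pooled  # expected counts under homogeneity
        stat += (ai - ea) ** 2 / ea + (bi - eb) ** 2 / eb
    return stat


def chi2_distance(pfm_a, pfm_b):
    """Assumed distance: sum of column statistics for equal-width PFMs."""
    return sum(chi2_column(a, b) for a, b in zip(pfm_a, pfm_b))


def pfm_to_pwm(pfm, pseudocount=1.0, background=0.25):
    """Log-odds PWM; pseudocount and uniform background are assumptions."""
    pwm = []
    for col in pfm:
        total = sum(col) + 4 * pseudocount
        pwm.append([math.log2((c + pseudocount) / total / background) for c in col])
    return pwm


def score(pwm, seq):
    """Additive PWM score of a sequence as long as the matrix is wide."""
    return sum(col[BASES.index(ch)] for col, ch in zip(pwm, seq))


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length score lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    vx = sum((xi - mx) ** 2 for xi in x)
    vy = sum((yi - my) ** 2 for yi in y)
    return cov / math.sqrt(vx * vy)


def score_correlation(pwm_a, pwm_b, n=2000, seed=0):
    """Correlate the scores of two PWMs over n random sequences."""
    rng = random.Random(seed)
    width = len(pwm_a)
    seqs = ["".join(rng.choice(BASES) for _ in range(width)) for _ in range(n)]
    return pearson([score(pwm_a, s) for s in seqs],
                   [score(pwm_b, s) for s in seqs])


# Toy counts invented for illustration (not taken from Jaspar or Transfac).
pfm_x = [[1, 17, 1, 1], [18, 0, 1, 1], [0, 19, 1, 0],
         [1, 0, 19, 0], [1, 1, 0, 18], [1, 1, 17, 1]]
pfm_y = [[5, 5, 5, 5], [12, 2, 3, 3], [2, 14, 2, 2],
         [2, 2, 14, 2], [3, 3, 2, 12], [5, 5, 5, 5]]

d_xy = chi2_distance(pfm_x, pfm_y)
r_xy = score_correlation(pfm_to_pwm(pfm_x), pfm_to_pwm(pfm_y))
```

Under these assumptions, two matrices with identical base frequencies have distance 0 regardless of how many sites they were built from, and a matrix correlates perfectly with itself. The first measure compares the counts directly, while the second compares how the matrices behave as sequence scorers.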

Conclusion

We show that these measures complement each other and allow Jaspar and Transfac matrices to be associated with one another. Clusters of highly similar matrices are identified and can be used to optimise the search for regulatory elements. Moreover, the application of the measures is illustrated by assigning E-box matrices derived from a SELEX experiment and from experimentally characterised binding sites of circadian clock genes to the Myc-Max cluster.

