Transcription factor site dependencies in human, mouse and rat genomes
1 Friedrich Miescher Institute for Biomedical Research, Novartis Research Foundation, Basel, Switzerland
2 Modeling and Simulation, Novartis Pharma AG, Basel, Switzerland
BMC Bioinformatics 2009, 10:339 doi:10.1186/1471-2105-10-339Published: 16 October 2009
It is known that transcription factors frequently act together to regulate gene expression in eukaryotes. In this paper we describe a computational analysis of transcription factor site dependencies in human, mouse and rat genomes.
Our approach for quantifying tendencies of transcription factor binding sites to co-occur is based on a binding site scoring function which incorporates dependencies between positions, the use of information about the structural class of each transcription factor (major/minor groove binder), and also considered the possible implications of varying GC content of the sequences. Significant tendencies (dependencies) have been detected by non-parametric statistical methodology (permutation tests). Evaluation of obtained results has been performed in several ways: reports from literature (many of the significant dependencies between transcription factors have previously been confirmed experimentally); dependencies between transcription factors are not biased due to similarities in their DNA-binding sites; the number of dependent transcription factors that belong to the same functional and structural class is significantly higher than would be expected by chance; supporting evidence from GO clustering of targeting genes. Based on dependencies between two transcription factor binding sites (second-order dependencies), it is possible to construct higher-order dependencies (networks). Moreover results about transcription factor binding sites dependencies can be used for prediction of groups of dependent transcription factors on a given promoter sequence. Our results, as well as a scanning tool for predicting groups of dependent transcription factors binding sites are available on the Internet.
We show that the computational analysis of transcription factor site dependencies is a valuable complement to experimental approaches for discovering transcription regulatory interactions and networks. Scanning promoter sequences with dependent groups of transcription factor binding sites improve the quality of transcription factor predictions.