TACO: a general-purpose tool for predicting cell-type–specific transcription factor dimers
1 Computational and Systems Biology, Genome Institute of Singapore, 60 Biopolis Street, Singapore 138672, Singapore
2 Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Banacha 2, 02-097 Warszawa, Poland
BMC Genomics 2014, 15:208 doi:10.1186/1471-2164-15-208
Published: 19 March 2014
Cooperative binding of transcription factor (TF) dimers to DNA is increasingly recognized as a major contributor to binding specificity. However, the set of known TF dimers is likely to be highly incomplete, because these dimers were discovered using ad hoc approaches or through computational analyses of limited datasets.
Here, we present TACO (Transcription factor Association from Complex Overrepresentation), a general-purpose standalone software tool that takes as input any genome-wide set of regulatory elements and predicts cell-type–specific TF dimers based on enrichment of motif complexes. TACO is the first tool that can accommodate motif complexes composed of overlapping motifs, a characteristic feature of many known TF dimers. Our method comprehensively outperforms existing tools when benchmarked on a reference set of 29 known dimers. We demonstrate the utility and consistency of TACO by applying it to 152 DNase-seq datasets and 94 ChIP-seq datasets.
Based on these results, we uncover a general principle governing the structure of TF-TF-DNA ternary complexes, namely that the flexibility of the complex is correlated with, and most likely a consequence of, inter-motif spacing.
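To give a sense of what "enrichment of motif complexes" can mean in practice, the following is a minimal sketch of an overrepresentation test. It is not TACO's actual statistic: it assumes a simple independence null model in which the two motifs occur independently across regions, and it ignores inter-motif spacing, orientation and motif overlap, all of which matter for real dimer prediction. The function names (`binomial_sf`, `complex_enrichment`) and the counts in the usage note are hypothetical.

```python
from math import exp, lgamma, log, log1p


def binomial_sf(k, n, p):
    """Upper tail P(X >= k) for X ~ Binomial(n, p), summed in log space."""
    if k <= 0:
        return 1.0
    if k > n or p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    log_p, log_q = log(p), log1p(-p)
    total = 0.0
    for i in range(k, n + 1):
        # log of C(n, i) * p^i * (1 - p)^(n - i)
        log_term = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                    + i * log_p + (n - i) * log_q)
        total += exp(log_term)
    return min(total, 1.0)


def complex_enrichment(n_regions, hits_a, hits_b, hits_both):
    """Fold enrichment and p-value for co-occurrence of two motifs.

    Assumed null model (an illustration only): motif A and motif B occur
    independently, so the expected fraction of regions containing both
    is (hits_a / n_regions) * (hits_b / n_regions).
    """
    p_expected = (hits_a / n_regions) * (hits_b / n_regions)
    expected = n_regions * p_expected
    fold = hits_both / expected if expected > 0 else float("inf")
    p_value = binomial_sf(hits_both, n_regions, p_expected)
    return fold, p_value
```

For example, with 1,000 regulatory regions of which 100 contain motif A, 200 contain motif B and 60 contain both, `complex_enrichment(1000, 100, 200, 60)` compares the 60 observed co-occurrences against the 20 expected under independence. A cell-type-specific analysis would repeat such a test per dataset and correct for multiple testing.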