Integrating bioinformatic resources to predict transcription factors interacting with cis-sequences conserved in co-regulated genes
- Equal contributors
1 INRA, Institut Jean-Pierre Bourgin, Saclay Plant Sciences, UMR1318, RD10, F-78026, Versailles, France
2 AgroParisTech, Institut Jean-Pierre Bourgin, Saclay Plant Sciences, UMR1318, RD10, F-78026, Versailles, France
3 Current address: Biochimie et Physiologie Moleculaire des Plantes, UMR 5004, INRA/CNRS/SupAgro-M/UM2, 34060 Montpellier Cedex 1, France
4 Estación Experimental de Aula Dei/CSIC, Av. Montañana 1.005, 50059 Zaragoza, Spain
5 Institut für Genetik, Technische Universität Braunschweig, Spielmannstr. 7, 38106 Braunschweig, Germany
6 Department of Biology, Bielefeld University, Universitaetsstrasse 25, 33615 Bielefeld, Germany
7 Fundación ARAID, calle María de Luna 11, 50018 Zaragoza, Spain
BMC Genomics 2014, 15:317 doi:10.1186/1471-2164-15-317
Published: 28 April 2014
Using motif detection programs, it is fairly straightforward to identify conserved cis-sequences in promoters of co-regulated genes. In contrast, identifying the transcription factors (TFs) that interact with these cis-sequences is considerably more laborious. To facilitate this, we explore the use of several bioinformatic and experimental approaches for TF identification. The workflow starts with the selection of co-regulated gene sets and leads first to the prediction and then to the experimental validation of TFs interacting with cis-sequences conserved in the promoters of these co-regulated genes.
Using the PathoPlant database, 32 up-regulated gene groups were identified from microarray data for drought-responsive gene expression in Arabidopsis thaliana. Applying the binding site estimation suite of tools (BEST) to the corresponding promoters revealed 179 conserved sequence motifs. Using the STAMP web server, 49 sequence motifs were classified into 7 motif families for which similarities with known cis-regulatory sequences were identified. All motifs were subjected to a footprintDB analysis to predict interacting DNA binding domains from plant TF families. Predictions were confirmed by using a yeast one-hybrid approach to select interacting TFs belonging to the predicted TF families. TF-DNA interactions were further experimentally validated in yeast and with a Physcomitrella patens transient expression system, leading to the discovery of several novel TF-DNA interactions.
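To illustrate the notion of a sequence motif conserved in the promoters of a co-regulated gene group, the following is a minimal Python sketch that scans a set of promoter sequences for an IUPAC consensus on both strands. It is not the BEST, STAMP or footprintDB procedure used in this work. The function names, the toy promoter sequences and the ABRE-like consensus ACGTGK are assumptions chosen purely for illustration.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Complement table covering the ambiguity codes as well
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA or IUPAC string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def motif_to_regex(motif):
    """Compile an IUPAC consensus into a regex that also finds overlapping hits."""
    pattern = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + pattern + "))")


def find_sites(motif, promoter):
    """Return sorted (position, strand) pairs for all hits of motif in promoter."""
    promoter = promoter.upper()
    sites = [(m.start(), "+") for m in motif_to_regex(motif).finditer(promoter)]
    rc = reverse_complement(motif)
    if rc != motif.upper():  # skip palindromic motifs to avoid double counting
        sites += [(m.start(), "-") for m in motif_to_regex(rc).finditer(promoter)]
    return sorted(sites)


def promoters_with_motif(motif, promoters):
    """Map each gene whose promoter contains the motif to its list of sites."""
    hits = {}
    for gene, seq in promoters.items():
        sites = find_sites(motif, seq)
        if sites:
            hits[gene] = sites
    return hits


# Hypothetical toy promoters; real input would be the upstream regions
# of the genes in one co-regulated group.
toy_promoters = {
    "geneA": "TTACGTGGCAAT",
    "geneB": "GGCCACGTAA",
    "geneC": "TTTTAAAATTTT",
}

hits = promoters_with_motif("ACGTGK", toy_promoters)
for gene, sites in hits.items():
    print(gene, sites)
```

The fraction of promoters in a group that carry the motif gives only a rough indication of conservation; dedicated motif discovery tools additionally assess statistical enrichment against a background model.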
The present work demonstrates the successful integration of several bioinformatic resources with experimental approaches to predict and validate TFs interacting with conserved sequence motifs in co-regulated genes.