Evolution of transcriptional regulation in closely related bacteria
- Equal contributors
1 Institute for Information Transmission Problems, RAS, Bolshoi Karetny per. 19, Moscow, 127994, Russia
2 Faculty of Bioengineering and Bioinformatics, Moscow State University, Vorobievy Gory 1-73, Moscow, 119992, Russia
3 V.N. Orekhovich Institute of Biomedical Chemistry, RAMS, Pogodinskaya St. 10, Moscow, 119121, Russia
BMC Evolutionary Biology 2012, 12:200. doi:10.1186/1471-2148-12-200. Published: 6 October 2012
The exponential growth in the number of fully sequenced genomes at varying taxonomic distances allows one to characterize transcriptional regulation by comparative-genomics analysis instead of time-consuming experimental methods. A transcriptional regulatory unit consists of a transcription factor, its binding site and a regulated gene. These units constitute a graph that contains so-called “network motifs”, subgraphs of a given structure. Here we consider genomes of closely related Enterobacteriales and estimate the fraction of conserved network motifs and sites, as well as the fraction of positions under selection in various types of non-coding regions.
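As a minimal sketch of the graph view described above, each regulatory unit can be stored as a directed edge from a transcription factor to a regulated gene, and a motif can be found by searching for a fixed subgraph. The example below enumerates feed-forward loops (edges X→Y, Y→Z and X→Z), one commonly studied motif. The function name and the toy gene names are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def find_feed_forward_loops(edges):
    """Return all (x, y, z) triples with edges x->y, y->z and x->z.

    `edges` is an iterable of (regulator, target) pairs. Autoregulatory
    edges (regulator == target) are ignored in this sketch.
    """
    targets = defaultdict(set)
    for regulator, gene in edges:
        if regulator != gene:
            targets[regulator].add(gene)

    loops = []
    for x in sorted(targets):
        for y in sorted(targets[x]):
            # y must itself be a regulator to act as the intermediate node
            for z in sorted(targets.get(y, ())):
                if z in targets[x]:
                    loops.append((x, y, z))
    return loops


# Hypothetical toy network, for illustration only.
edges = [
    ("globalTF", "localTF"),
    ("localTF", "geneA"),
    ("globalTF", "geneA"),
    ("otherTF", "geneB"),
]
ffls = find_feed_forward_loops(edges)
print(ffls)
```

The triple loop is adequate for networks of the size of a bacterial regulon set; a dedicated graph library would be a reasonable substitute for larger graphs.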
Using a newly developed technique, we found that the highest fraction of positions under selection, approximately 50%, was observed in synvergon spacers (between consecutive genes on the same strand), followed by ~45% in divergon spacers (common 5’-regions), and ~10% in convergon spacers (common 3’-regions). The fraction of selected positions in functional regions was higher: 60% in transcription factor-binding sites and ~45% in terminators and promoters. Small but significant differences were observed between Escherichia coli and Salmonella enterica. Overall, the fraction of intergenic positions under selection is similar to the one observed in eukaryotes.
The conservation of binding sites demonstrated some differences between types of regulatory units. In E. coli strains, interactions of the type “local transcription factor ➝ gene” turned out to be more conserved in feed-forward loops (FFLs) than in non-motif interactions. The coherent FFLs tend to be less conserved than the incoherent FFLs. A natural explanation is that the former imply functional redundancy.
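The coherent/incoherent distinction can be illustrated with a short sketch. It assumes the widely used convention that an FFL is coherent when the sign of the direct path X→Z equals the product of the signs along the indirect path X→Y→Z; the abstract does not state which definition the authors applied, and the function name is hypothetical.

```python
def classify_ffl(sign_xy, sign_yz, sign_xz):
    """Classify a feed-forward loop as 'coherent' or 'incoherent'.

    Each sign is +1 for activation or -1 for repression.
    Assumed convention: coherent if the indirect path X->Y->Z has the
    same net sign as the direct edge X->Z.
    """
    for sign in (sign_xy, sign_yz, sign_xz):
        if sign not in (1, -1):
            raise ValueError("signs must be +1 (activation) or -1 (repression)")
    indirect = sign_xy * sign_yz
    return "coherent" if indirect == sign_xz else "incoherent"


# All three interactions activating: both routes push Z the same way.
print(classify_ffl(1, 1, 1))
# Y represses Z while X activates Z directly: the routes oppose each other.
print(classify_ffl(1, -1, 1))
```

Under this convention, the two routes in a coherent FFL act on the target in the same direction, which is one way to read the functional redundancy mentioned above.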
A naïve hypothesis that FFLs would be highly conserved turned out to be not entirely true: the conservation of an FFL depends on its status in the transcriptional network and also on its usage. The fraction of positions under selection in intergenic regions of bacterial genomes is roughly similar to that of eukaryotes. Known regulatory sites explain 20±5% of selected positions.