Transcriptome dynamics-based operon prediction in prokaryotes
- Equal contributors
1 Department of Computer Science (DI), NeuRoNe Lab, University of Salerno, via ponte don Melillo 84084, Fisciano, (SA), Italy
2 Department of Pharmaceutical and Biomedical Sciences (FARMABIOMED), University of Salerno, via ponte don Melillo, Fisciano, (SA) 84084, Italy
3 Unit of Systems Toxicology, Finnish Institute of Occupational Health (FIOH), Topeliuksenkatu 41b, Helsinki 00250, Finland
4 Institute of Biotechnology, University of Helsinki, Helsinki, Finland
BMC Bioinformatics 2014, 15:145 doi:10.1186/1471-2105-15-145
Published: 16 May 2014
Inferring operon maps is crucial to understanding the regulatory networks of prokaryotic genomes. Recently, RNA-seq-based transcriptome studies revealed that in many bacterial species the operon structure varies as environmental conditions change. Therefore, new computational solutions that use both static and dynamic data are necessary to create condition-specific operon predictions.
In this work, we propose a novel classification method that integrates RNA-seq-based transcriptome profiles with genomic sequence features to accurately identify the operons that are expressed under a measured condition. The classifiers are trained on a small set of confirmed operons and then used to classify the remaining gene pairs of the organism studied. Finally, by linking consecutive gene pairs classified as operons, our computational approach produces condition-dependent operon maps. We evaluated our approach on various RNA-seq expression profiles of the bacteria Haemophilus somni, Porphyromonas gingivalis, Escherichia coli and Salmonella enterica. Our results demonstrate that, using features derived from both transcriptome dynamics and genome sequence characteristics, we can identify operon pairs with high accuracy. Moreover, the combination of DNA sequence and expression data results in more accurate predictions than either data source alone.
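The final linking step described above, in which consecutive gene pairs classified as operon pairs are chained into operons, can be illustrated with a minimal sketch. It is not the authors' implementation. The function name `link_pairs_into_operons`, the gene labels, and the input format are assumptions made for illustration: genes are taken to be listed in genomic order on one strand, and the classifier output is assumed to be one boolean per adjacent gene pair.

```python
from typing import List, Sequence


def link_pairs_into_operons(
    genes: Sequence[str], pair_is_operon: Sequence[bool]
) -> List[List[str]]:
    """Chain adjacent gene pairs labelled as operon pairs into operons.

    genes          -- gene identifiers in genomic order (assumed same strand)
    pair_is_operon -- pair_is_operon[i] is True when genes[i] and genes[i + 1]
                      were classified as belonging to the same operon
    Returns only multi-gene operons; genes with no linked neighbour are skipped.
    """
    if len(pair_is_operon) != max(len(genes) - 1, 0):
        raise ValueError("expected one label per adjacent gene pair")

    operons: List[List[str]] = []
    current: List[str] = [genes[0]] if genes else []

    for next_gene, linked in zip(genes[1:], pair_is_operon):
        if linked:
            # The pair is predicted co-transcribed: extend the running operon.
            current.append(next_gene)
        else:
            # The chain is broken: keep the run only if it has at least 2 genes.
            if len(current) > 1:
                operons.append(current)
            current = [next_gene]

    if len(current) > 1:
        operons.append(current)
    return operons


# Hypothetical labels for six consecutive genes under one condition.
example_genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
example_labels = [True, True, False, False, True]
example_map = link_pairs_into_operons(example_genes, example_labels)
```

Because the pair labels come from condition-specific expression data, running the same linking step on labels from a different condition may yield a different operon map for the same gene order.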
We present a computational strategy for the comprehensive analysis of condition-dependent operon maps in prokaryotes. Our method can be used to generate condition-specific operon maps for the many bacterial organisms for which high-resolution transcriptome data are available.