Open Access Highly Accessed Research article

Transcriptome dynamics-based operon prediction in prokaryotes

Vittorio Fortino1,2,3*, Olli-Pekka Smolander4, Petri Auvinen4, Roberto Tagliaferri1 and Dario Greco3

Author Affiliations

1 Department of Computer Science (DI), NeuRoNe Lab, University of Salerno, via ponte don Melillo, Fisciano (SA) 84084, Italy

2 Department of Pharmaceutical and Biomedical Sciences (FARMABIOMED), University of Salerno, via ponte don Melillo, Fisciano (SA) 84084, Italy

3 Unit of Systems Toxicology, Finnish Institute of Occupational Health (FIOH), Topeliuksenkatu 41b, Helsinki 00250, Finland

4 Institute of Biotechnology, University of Helsinki, Helsinki, Finland

BMC Bioinformatics 2014, 15:145  doi:10.1186/1471-2105-15-145

Published: 16 May 2014

Abstract

Background

Inferring operon maps is crucial to understanding the regulatory networks of prokaryotic genomes. Recently, RNA-seq-based transcriptome studies revealed that in many bacterial species the operon structure varies with changes in environmental conditions. Therefore, new computational solutions that use both static and dynamic data are necessary to create condition-specific operon predictions.

Results

In this work, we propose a novel classification method that integrates RNA-seq-based transcriptome profiles with genomic sequence features to accurately identify the operons that are expressed under a measured condition. The classifiers are trained on a small set of confirmed operons and then used to classify the remaining gene pairs of the organism studied. Finally, by linking consecutive gene pairs classified as operon pairs, our computational approach produces condition-dependent operon maps. We evaluated our approach on various RNA-seq expression profiles of the bacteria Haemophilus somni, Porphyromonas gingivalis, Escherichia coli and Salmonella enterica. Our results demonstrate that, using features derived from both transcriptome dynamics and genome sequence characteristics, we can identify operon pairs with high accuracy. Moreover, combining DNA sequence and expression data yields more accurate predictions than either data source alone.
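
The final linking step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name link_operon_pairs, the input layout (genes listed in genomic order on a single strand, with one boolean label per adjacent pair) and the choice to omit single-gene units are assumptions made here for illustration only. The classifier that produces the labels is outside the scope of the sketch.

```python
from typing import List, Sequence


def link_operon_pairs(genes: Sequence[str],
                      pair_is_operon: Sequence[bool]) -> List[List[str]]:
    """Chain adjacent gene pairs labelled as operon pairs into operons.

    genes: gene identifiers in genomic order (assumed: same strand, same replicon).
    pair_is_operon[i]: True if the pair (genes[i], genes[i + 1]) was classified
    as co-transcribed under the measured condition (hypothetical classifier output).
    """
    if len(pair_is_operon) != max(len(genes) - 1, 0):
        raise ValueError("expected exactly one label per adjacent gene pair")

    operons: List[List[str]] = []
    current: List[str] = []
    for i, gene in enumerate(genes):
        current.append(gene)
        is_last_gene = i == len(genes) - 1
        # A chain ends at the last gene or at a pair labelled as non-operon.
        if is_last_gene or not pair_is_operon[i]:
            if len(current) > 1:  # assumption: single-gene units are not reported
                operons.append(current)
            current = []
    return operons


# Hypothetical gene identifiers and labels, for illustration only.
example_genes = ["geneA", "geneB", "geneC", "geneD", "geneE", "geneF"]
example_labels = [True, True, False, False, True]
example_map = link_operon_pairs(example_genes, example_labels)
print(example_map)  # [['geneA', 'geneB', 'geneC'], ['geneE', 'geneF']]
```

Because the pair labels depend on the expression profile, running the same linking step on labels obtained under a different condition may yield a different operon map for the same genome.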

Conclusion

We present a computational strategy for the comprehensive analysis of condition-dependent operon maps in prokaryotes. Our method can be used to generate condition-specific operon maps for many bacterial organisms for which high-resolution transcriptome data are available.

Keywords:
Operons; Computational prediction; Condition-dependent operon maps; RNA-seq data analysis