Open Access Research article

Multiple-omic data analysis of Klebsiella pneumoniae MGH 78578 reveals its transcriptional architecture and regulatory features

Joo-Hyun Seo(1), Jay Sung-Joong Hong(1,3), Donghyuk Kim(1), Byung-Kwan Cho(1,4), Tzu-Wen Huang(1,2), Shih-Feng Tsai(2), Bernhard O Palsson(1) and Pep Charusanti(1)*

Author Affiliations

1 Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA

2 Institute of Molecular and Genomic Medicine, National Health Research Institutes, Miaoli, 350, Taiwan

3 Current address: Central Research Institute, Samsung Petrochemical, Samsung Advanced Institute of Technology, 6th floor, Building 2, Nongseo-dong, Giheung-gu, Yongin, Gyeonggi-do, 446-712, Republic of Korea

4 Current address: Department of Biological Sciences, Korea Advanced Institute of Science and Technology, 291 Daehak-ro, Yuseong-gu, Daejeon, 305-701, Korea


BMC Genomics 2012, 13:679  doi:10.1186/1471-2164-13-679

Published: 29 November 2012



The increasing number of infections caused by strains of Klebsiella pneumoniae that are resistant to multiple antibiotics has developed into a major medical problem worldwide. Next-generation sequencing technologies now permit rapid sequencing of many K. pneumoniae isolates, but sequence information alone does not reveal the structural and operational features of the genome.


Here we take a systems biology approach to annotate the K. pneumoniae MGH 78578 genome at the structural and operational levels. Through the acquisition and simultaneous analysis of multiple sample-matched –omics data sets from two growth conditions, we detected 2677, 1227, and 1066 binding sites for RNA polymerase, RpoD, and RpoS, respectively, 3660 RNA polymerase-guided transcript segments, and 3585 transcription start sites throughout the genome. Moreover, analysis of the transcription start site data identified 83 probable leaderless mRNAs, while analysis of unannotated transcripts suggested the presence of 119 putative open reading frames, 15 small RNAs, and 185 antisense transcripts that are not currently annotated.


These findings highlight the strengths of systems biology approaches for refining sequence-based annotations, and they provide new insight into the fundamental genome-level biology of this important human pathogen.

Keywords: Klebsiella pneumoniae; Infectious disease; Transcriptional architecture; Omics data; Systems biology