Open Access Research article

Comparative genomics of 274 Vibrio cholerae genomes reveals mobile functions structuring three niche dimensions

Bas E Dutilh [1,2,3,4], Cristiane C Thompson [5], Ana CP Vicente [5], Michel A Marin [5], Clarence Lee [7], Genivaldo GZ Silva [2,6], Robert Schmieder [2,6], Bruno GN Andrade [5], Luciane Chimetto [4], Daniel Cuevas [2,7], Daniel R Garza [2], Iruka N Okeke [8], Aaron Oladipo Aboderin [9], Jessica Spangler [7], Tristen Ross [7], Elizabeth A Dinsdale [1], Fabiano L Thompson [4], Timothy T Harkins [7] and Robert A Edwards [1,10,2,4,6]*

Author Affiliations

1 Department of Biology, San Diego State University, San Diego, CA, USA

2 Department of Computer Science, San Diego State University, San Diego, CA, USA

3 Centre for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Centre, Nijmegen, The Netherlands

4 Department of Marine Biology, Institute of Biology, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil

5 Laboratory of Molecular Genetics of Microorganisms, Oswaldo Cruz Institute, FIOCRUZ, Rio de Janeiro, Brazil

6 Computational Science Research Center, San Diego State University, San Diego, CA, USA

7 Advanced Applications Group, Life Technologies, Inc, Beverly, MA, USA

8 Department of Biology, Haverford College, Haverford, PA, USA

9 Department of Medical Microbiology & Parasitology, College of Health Sciences, Obafemi Awolowo University, Ile-Ife, Nigeria

10 Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL, USA

BMC Genomics 2014, 15:654  doi:10.1186/1471-2164-15-654

Published: 5 August 2014

Abstract

Background

Vibrio cholerae is a globally dispersed pathogen that has evolved with humans for centuries, but the species also includes non-pathogenic environmental strains. Here, we identify the genomic variability underlying this remarkable persistence across the three major niche dimensions: space, time, and habitat.

Results

Taking an innovative approach of genome-wide association applicable to microbial genomes (GWAS-M), we classify 274 complete V. cholerae genomes by niche, including 39 newly sequenced for this study with the Ion Torrent DNA-sequencing platform. Niche metadata were collected for each strain and analyzed together with comprehensive annotations of genetic and genomic attributes, including point mutations (single-nucleotide polymorphisms, SNPs), protein families, functions, and prophages.
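To make the idea of associating genomic attributes with niche metadata concrete, the following is a minimal sketch and not the authors' GWAS-M pipeline. It swaps in a much simpler technique, a per-feature two-sided Fisher's exact test on presence/absence data, where the keywords indicate the study itself used random forests. The feature names (`prophage_X`, `core_gene`, `transposase_Y`), the niche labels, and the eight toy genomes are all hypothetical and chosen only for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Rows: genomes in the target niche / all other genomes.
    Columns: feature present / feature absent.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability that x target-niche genomes carry the feature.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one.
    p = sum(q for q in (prob(x) for x in range(lo, hi + 1)) if q <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def associate(features, niches, target):
    """Rank features by their association with the niche label `target`."""
    results = []
    for name, presence in features.items():
        pairs = list(zip(presence, niches))
        a = sum(1 for p, n in pairs if p and n == target)
        b = sum(1 for p, n in pairs if not p and n == target)
        c = sum(1 for p, n in pairs if p and n != target)
        d = sum(1 for p, n in pairs if not p and n != target)
        results.append((name, fisher_exact_two_sided(a, b, c, d)))
    return sorted(results, key=lambda r: r[1])


# Hypothetical toy data: 8 genomes, 4 per habitat, 3 presence/absence features.
niches = ["clinical"] * 4 + ["environmental"] * 4
features = {
    "prophage_X": [1, 1, 1, 1, 0, 0, 0, 0],     # found only in clinical genomes
    "core_gene": [1, 1, 1, 1, 1, 1, 1, 1],      # found everywhere, uninformative
    "transposase_Y": [1, 0, 1, 0, 1, 0, 1, 0],  # evenly split across habitats
}

ranked = associate(features, niches, target="clinical")
for name, p_value in ranked:
    print(f"{name}\tp = {p_value:.4f}")
```

In a real analysis with thousands of features, the p-values would need correction for multiple testing, and a test that treats genomes as independent ignores their shared phylogeny. A classifier-based approach such as a random forest instead ranks features by how much they contribute to predicting the niche label.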

Conclusions

Our analysis revealed that genomic variations, in particular mobile functions including phages, prophages, transposable elements, and plasmids, underlie the metadata structuring in each of the three niche dimensions. This underscores the role of phages and mobile elements as the most rapidly evolving elements in bacterial genomes, creating local endemicity (space), leading to temporal divergence (time), and allowing the invasion of new habitats. Overall, we take a data-driven approach to comparative functional genomics that exploits high-volume genome sequencing and annotation, in conjunction with novel statistical and machine learning analyses, to identify connections between genotype and phenotype on a genome-wide scale.

Keywords:
Functional genomics; Mobile elements; Phages; Niche adaptation; Vibrio; Genome evolution; Genotype-phenotype association; Random forest