Open Access Research article

High depth, whole-genome sequencing of cholera isolates from Haiti and the Dominican Republic

Rachel Sealfon1,2*, Stephen Gire2,3, Crystal Ellis4,5, Stephen Calderwood4,5, Firdausi Qadri6, Lisa Hensley7, Manolis Kellis1,2, Edward T Ryan4,5,8, Regina C LaRocque4,5, Jason B Harris4,9 and Pardis C Sabeti2,3,8*

Author Affiliations

1 Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology (MIT), Cambridge, MA, USA

2 Broad Institute of MIT and Harvard, Cambridge, MA, USA

3 Center for Systems Biology, Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA

4 Division of Infectious Diseases, Massachusetts General Hospital, Boston, MA, USA

5 Department of Medicine, Harvard Medical School, Boston, MA, USA

6 International Centre for Diarrheal Disease Research, Dhaka, Bangladesh

7 Viral Therapeutics, United States Army Institute of Infectious Disease, Fort Detrick, MD, USA

8 Department of Immunology and Infectious Diseases, Harvard School of Public Health, Cambridge, MA, USA

9 Department of Pediatrics, Harvard Medical School, Boston, MA, USA


BMC Genomics 2012, 13:468  doi:10.1186/1471-2164-13-468

Published: 11 September 2012



Whole-genome sequencing is an important tool for understanding microbial evolution and identifying the emergence of functionally important variants over the course of epidemics. In October 2010, a severe cholera epidemic began in Haiti, with additional cases identified in the neighboring Dominican Republic. We used whole-genome approaches to sequence four Vibrio cholerae isolates from Haiti and the Dominican Republic and three additional V. cholerae isolates to a high depth of coverage (>2000x); four of the seven isolates were previously sequenced.


Using these sequence data, we examined the effect of depth of coverage and sequencing platform on genome assembly and identification of sequence variants. We found that 50x coverage is sufficient to construct a whole-genome assembly and to accurately call most variants from 100 base pair paired-end sequencing reads. Phylogenetic analysis of the newly sequenced isolates together with thirty-three previously sequenced V. cholerae isolates indicates that the Haitian and Dominican Republic isolates are closest to strains from South Asia. The Haitian and Dominican Republic isolates form a tight cluster, with only four variants unique to individual isolates. These variants are located in the CTX region, the SXT region, and the core genome. Of the 126 mutations that separate the Haiti-Dominican Republic cluster from the V. cholerae reference strain (N16961), 73 are non-synonymous changes, and a number of these changes cluster in specific genes and pathways.
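To make the coverage figures concrete, the following is a minimal sketch of the standard mean-depth estimate (total sequenced bases divided by genome length). It is not taken from the study's pipeline. The genome size of roughly 4.0 Mb for V. cholerae is an assumption not stated in this abstract, and the function names are hypothetical.

```python
import math


def mean_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Mean depth of coverage: total sequenced bases / genome length."""
    return num_reads * read_length / genome_size


def reads_for_coverage(target: float, read_length: int, genome_size: int) -> int:
    """Smallest number of reads that reaches the target mean depth."""
    return math.ceil(target * genome_size / read_length)


# Assumption: approximate V. cholerae genome size (both chromosomes), in bases.
GENOME_SIZE = 4_000_000
# From the abstract: 100 base pair paired-end reads.
READ_LENGTH = 100

# Reads (and read pairs) needed for the 50x depth reported as sufficient.
reads_50x = reads_for_coverage(50, READ_LENGTH, GENOME_SIZE)
pairs_50x = math.ceil(reads_50x / 2)

# Fraction of reads to keep when subsampling a 2000x data set down to 50x.
# The abstract reports >2000x, so this fraction is an upper bound.
downsample_fraction = 50 / 2000

print(reads_50x, pairs_50x, downsample_fraction)
```

Under these assumptions, 50x depth corresponds to about two million 100 bp reads, or about 2.5% of a data set sequenced to 2000x.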


Sequence variant analyses of V. cholerae isolates, including multiple isolates from the Haitian outbreak, identify coverage-specific and technology-specific effects on variant detection, and provide insight into genomic change and functional evolution during an epidemic.

Keywords: Whole-genome sequencing; Vibrio cholerae; Haitian cholera epidemic; Microbial evolution