Research article

A Multilocus Sequence Typing System (MLST) reveals a high level of diversity and a genetic component to Entamoeba histolytica virulence

Carol A Gilchrist1*, Ibne Karim M Ali1, Mamun Kabir2, Faisal Alam3, Sana Scherbakova4, Eric Ferlanti4, Gareth D Weedall5, Neil Hall5, Rashidul Haque2, William A Petri1 and Elisabet Caler4

Author Affiliations

1 Departments of Medicine, School of Medicine, University of Virginia, Charlottesville, VA, USA

2 International Centre for Diarrhoeal Diseases Research, Dhaka, Bangladesh

3 Rajshahi Medical College, Rajshahi, Bangladesh

4 J. Craig Venter Institute, Rockville, MD, USA

5 Institute of Integrative Biology, University of Liverpool, Liverpool, UK

BMC Microbiology 2012, 12:151  doi:10.1186/1471-2180-12-151

Published: 27 July 2012



The outcome of an Entamoeba histolytica infection is variable and can result in asymptomatic carriage or in immediate or latent disease (diarrhea, dysentery, or amebic liver abscess). An E. histolytica multilocus genotyping system based on tRNA gene-linked arrays has shown that genetic differences exist among parasites isolated from patients with different symptoms; however, the tRNA gene-linked arrays are highly variable and cannot be located in the current assembly of the E. histolytica reference genome (strain HM-1:IMSS).


To probe the population structure of E. histolytica and identify genetic markers associated with clinical outcome, we used multiplexed massively parallel sequencing to identify selected single nucleotide polymorphisms (SNPs) in E. histolytica-positive samples. Profile SNPs were selected that, compared to the reference strain HM-1:IMSS sequence, changed the encoded amino acid at the SNP position and were present in independent E. histolytica isolates from different geographical origins. The samples used in this study contained DNA isolated either from xenic strains of E. histolytica trophozoites established in culture or from E. histolytica-positive clinical specimens (stool and amebic liver abscess aspirates). A record of the SNPs present at 16 loci out of the original 21 candidate targets was obtained for 63 of the initial 84 samples (63% of asymptomatically colonized stool samples, 80% of diarrheal stool samples, 73% of xenic cultures and 84% of amebic liver abscess aspirates). In all 63 samples the sequences passed quality control metrics and had the greater than 8X coverage at all 16 SNPs that was required to identify variants confidently.
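The coverage criterion described above can be illustrated with a minimal sketch. It assumes per-sample read depths are already available as a mapping from locus name to depth. The function name, the locus and sample names, and the depth values are hypothetical and are not taken from the study, and only three loci are shown instead of 16 for brevity.

```python
from typing import Dict, List

# Threshold taken from the text: coverage must be greater than 8X.
MIN_COVERAGE = 8


def samples_passing_coverage(
    depths: Dict[str, Dict[str, int]],
    loci: List[str],
    min_coverage: int = MIN_COVERAGE,
) -> List[str]:
    """Return the samples whose depth exceeds min_coverage at every locus.

    A locus missing from a sample's record is treated as depth 0,
    so that sample is excluded.
    """
    return [
        sample
        for sample, per_locus in depths.items()
        if all(per_locus.get(locus, 0) > min_coverage for locus in loci)
    ]


# Hypothetical illustration data (not from the study).
example_loci = ["locus_1", "locus_2", "locus_3"]
example_depths = {
    "sample_A": {"locus_1": 12, "locus_2": 9, "locus_3": 30},
    "sample_B": {"locus_1": 15, "locus_2": 8, "locus_3": 22},  # exactly 8X fails
    "sample_C": {"locus_1": 40, "locus_3": 11},  # locus_2 missing
}

kept = samples_passing_coverage(example_depths, example_loci)
print(kept)
```

Requiring the threshold at every locus, rather than on average, means that a sample with a single poorly covered SNP is dropped. This matches the statement that the retained samples had sufficient coverage for all 16 SNPs.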


Our work is in agreement with previous findings of extensive diversity among E. histolytica isolates from the same geographic origin. In phylogenetic trees, only four of the 63 samples grouped, as two pairs, with greater than 50% confidence. Two SNPs in the cylicin-2 gene (EHI_080100/XM_001914351) were associated with disease (asymptomatic/diarrhea p = 0.0162 or dysentery/amebic liver abscess p = 0.0003). This study demonstrated that there are genetic differences between virulent and avirulent E. histolytica strains and that this approach has the potential to define genetic changes that influence infection outcomes.