Open Access Research article

Escherichia coli phylogenetic group determination and its application in the identification of the major animal source of fecal contamination

Camila Carlos1, Mathias M Pires2, Nancy C Stoppe1, Elayse M Hachich3, Maria IZ Sato3, Tânia AT Gomes4, Luiz A Amaral5 and Laura MM Ottoboni1*

Author Affiliations

1 Centro de Biologia Molecular e Engenharia Genética, Universidade Estadual de Campinas - UNICAMP, C. P. 6010, 13083-875 Campinas, S. P., Brasil

2 Programa de Pós Graduação em Ecologia, Instituto de Biologia, Universidade Estadual de Campinas - UNICAMP, 13083-970 Campinas, S. P., Brasil

3 Departamento de Análises Ambientais, Companhia Ambiental do Estado de São Paulo - CETESB, 05459-900 São Paulo, S. P., Brasil

4 Departamento de Microbiologia, Imunologia e Parasitologia, Universidade Federal de São Paulo - UNIFESP, 04023-062 São Paulo, S. P., Brasil

5 Faculdade de Ciências Agrárias e Veterinárias, Universidade Estadual Paulista - UNESP, 14884-900 Jaboticabal, S. P., Brasil


BMC Microbiology 2010, 10:161  doi:10.1186/1471-2180-10-161

Published: 1 June 2010



Escherichia coli strains are commonly found in the gut microflora of warm-blooded animals. These strains can be assigned to one of four main phylogenetic groups (A, B1, B2 and D), which can be further divided into seven subgroups (A₀, A₁, B1, B2₂, B2₃, D₁ and D₂) according to the combination of three genetic markers: the genes chuA and yjaA and the DNA fragment TspE4.C2. Several studies have shown that these phylo-groups differ in virulence factor content, ecological niche and life history. The aim of this work was therefore to analyze the distribution of these E. coli phylo-groups in 94 human, 13 chicken, 50 cow, 16 goat, 39 pig and 29 sheep strains, and to assess the potential of this analysis for investigating the source of fecal contamination.


The results indicated that the distribution of phylogenetic groups, subgroups and genetic markers is non-random among the hosts analyzed. Strains from group B1 were present in all hosts but were more prevalent in cow, goat and sheep samples. Subgroup B2₃ was found only in human samples. The diversity and similarity indexes indicated that the E. coli population structure of human samples resembles that of pig samples, and that the population structures of cow, goat and sheep samples resemble one another. Correspondence analysis using contingency tables of subgroup, group and genetic marker frequencies allowed the differences among animal samples to be visualized and the animal source of an external validation set to be identified. The classifier tools binary logistic regression and partial least squares discriminant analysis, applied to the genetic marker profiles of the strains, distinguished strains from herbivorous hosts from those from omnivorous hosts, with an average error rate of 17%.
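To make the idea of comparing population structures concrete, the sketch below computes one diversity index and one similarity index from per-host subgroup counts. The abstract does not name the indexes used, so the Shannon index and Pianka's overlap are chosen here purely for illustration; the function names and the counts in the usage section are hypothetical and are not data from the study.

```python
import math
from typing import Dict


def proportions(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert raw subgroup counts into relative frequencies."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must contain at least one strain")
    return {k: v / total for k, v in counts.items()}


def shannon_diversity(counts: Dict[str, int]) -> float:
    """Shannon index H = -sum(p * ln p) over subgroups with p > 0."""
    return -sum(p * math.log(p) for p in proportions(counts).values() if p > 0)


def pianka_overlap(a: Dict[str, int], b: Dict[str, int]) -> float:
    """Pianka's overlap: sum(p*q) / sqrt(sum(p^2) * sum(q^2)), in [0, 1]."""
    pa, pb = proportions(a), proportions(b)
    keys = set(pa) | set(pb)
    num = sum(pa.get(k, 0.0) * pb.get(k, 0.0) for k in keys)
    den = math.sqrt(sum(p * p for p in pa.values()) * sum(q * q for q in pb.values()))
    return num / den


if __name__ == "__main__":
    # Hypothetical counts for two hosts, for illustration only.
    host_x = {"A0": 4, "A1": 6, "B1": 10}
    host_y = {"A0": 2, "B1": 15, "D1": 3}
    print(shannon_diversity(host_x), shannon_diversity(host_y))
    print(pianka_overlap(host_x, host_y))
```

An overlap close to 1 means the two hosts carry the subgroups in similar proportions, and a value of 0 means they share no subgroup.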


To our knowledge, this is the first work to identify the major source of fecal contamination from a pool of strains rather than from a single strain. We conclude that analysis of the E. coli population structure can be useful as a supplementary bacterial source tracking tool.