Open Access Highly Accessed Research article

Identification of Enterobacter sakazakii from closely related species: The use of Artificial Neural Networks in the analysis of biochemical and 16S rDNA data

Carol Iversen1, Lee Lancashire1,2, Michael Waddington3, Stephen Forsythe1 and Graham Ball1,2*

Author Affiliations

1 The Nottingham Trent University, School of Biomedical and Natural Sciences, Clifton Campus, Clifton Lane, Nottingham, NG11 8NS, UK

2 Loreus Ltd., Erasmus Darwin Building, College of Science and Technology, Nottingham Trent University, Clifton Lane, Nottingham, NG11 8NS, UK

3 Accugenix, 223 Lake Drive, Newark, DE 19702, USA

BMC Microbiology 2006, 6:28  doi:10.1186/1471-2180-6-28

Published: 13 March 2006

Abstract

Background

Enterobacter sakazakii is an emergent pathogen associated with ingestion of infant formula, and accurate identification is important in both industrial and clinical settings. Bacterial species can be difficult to characterise accurately from complex biochemical datasets, and computer algorithms can potentially simplify the process.

Results

Artificial Neural Networks were applied to biochemical and 16S rDNA data derived from 282 strains of Enterobacteriaceae, including 189 E. sakazakii isolates, to identify key characteristics that could improve the identification of E. sakazakii. The resulting models achieved a predictive performance on blind (validation) data of 99.3% correct discrimination between E. sakazakii and closely related species for both phenotypic and genotypic data. Three main regions of the partial rDNA sequence were found to be key in discriminating the species. Comparison between E. sakazakii and other strains also constitutively positive for expression of the enzyme α-glucosidase resulted in a predictive performance of 98.7% for 16S rDNA sequence data and 100% for phenotypic data.

Conclusion

The computationally based methods developed here show a remarkable ability to reduce data dimensionality and complexity, eliminating noise from the system and thereby improving the speed and reliability of a potential strain identification system. Furthermore, the approaches described also provide valuable information regarding the population structure and distribution of individual species, thus providing the foundations for novel assays and diagnostic tests for rapid identification of pathogens.