Research article (Open Access)

The effect of training set on the classification of honey bee gut microbiota using the Naïve Bayesian Classifier

Irene LG Newton1* and Guus Roeselers2

Author Affiliations

1 Department of Biology, 1001 E 3rd Street, Bloomington, IN, 47405, USA

2 Microbiology & Systems Biology group, TNO, Utrechtseweg, Zeist, The Netherlands

BMC Microbiology 2012, 12:221  doi:10.1186/1471-2180-12-221

Published: 26 September 2012

Abstract

Background

Microbial ecologists now routinely use next-generation sequencing methods to assess microbial diversity in the environment. One tool used heavily by many groups is the Naïve Bayesian Classifier developed by the Ribosomal Database Project (RDP-NBC). However, the consistency and confidence of the classifications provided by the RDP-NBC depend on the training set used.

Results

We explored the stability of RDP-NBC classification of honey bee gut microbiota sequences using three publicly available ribosomal RNA sequence databases as training sets: ARB-SILVA, Greengenes and RDP. We found that including previously published, high-quality, full-length sequences from 16S rRNA clone libraries improved the precision of classification of novel bee-associated sequences. Specifically, when bee-specific 16S rRNA gene sequences were included, the RDP-NBC classified a larger fraction of sequences at higher confidence (based on bootstrap scores).
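To illustrate why the training set matters, the following is a minimal sketch of a word-based naïve Bayesian classifier with bootstrap confidence, written in the spirit of the RDP-NBC but not a reimplementation of it. Everything in it is an assumption made for illustration: the toy sequences, the taxon label `bee_specific_clade`, the 4-mer word size (the RDP-NBC uses 8-mers), and the rule that reports tied scores as unclassified.

```python
import math
import random

K = 4  # toy word size for short demo sequences; the RDP-NBC uses 8-mers


def words(seq, k=K):
    """Return the set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def train(training_set):
    """Build word counts from a list of (taxon, sequence) pairs."""
    model = {"n_total": len(training_set), "word_seq_count": {},
             "taxon_size": {}, "taxon_word_count": {}}
    for taxon, seq in training_set:
        model["taxon_size"][taxon] = model["taxon_size"].get(taxon, 0) + 1
        counts = model["taxon_word_count"].setdefault(taxon, {})
        for w in words(seq):
            model["word_seq_count"][w] = model["word_seq_count"].get(w, 0) + 1
            counts[w] = counts.get(w, 0) + 1
    return model


def log_score(model, taxon, ws):
    """Sum of log word probabilities for one taxon, with a smoothed prior."""
    n_total = model["n_total"]
    m_total = model["taxon_size"][taxon]
    counts = model["taxon_word_count"][taxon]
    total = 0.0
    for w in ws:
        prior = (model["word_seq_count"].get(w, 0) + 0.5) / (n_total + 1)
        total += math.log((counts.get(w, 0) + prior) / (m_total + 1))
    return total


def top_taxon(model, ws):
    """Best-scoring taxon, or None when the top score is tied."""
    scored = sorted(((log_score(model, t, ws), t)
                     for t in model["taxon_size"]), reverse=True)
    if len(scored) > 1 and scored[0][0] == scored[1][0]:
        return None
    return scored[0][1]


def classify(model, seq, trials=100, seed=0):
    """Return (taxon, bootstrap confidence) for a query sequence."""
    rng = random.Random(seed)
    ws = sorted(words(seq))
    best = top_taxon(model, ws)
    if best is None:
        return None, 0.0  # no training sequence informs this query
    subset_size = max(1, len(ws) // 8)  # resample one eighth of the words
    hits = 0
    for _ in range(trials):
        sample = [rng.choice(ws) for _ in range(subset_size)]
        hits += top_taxon(model, sample) == best
    return best, hits / trials


# Hypothetical toy data: two generic references and one bee-derived sequence.
GENERIC = [("Escherichia", "ACGTACGTACGTACGTACGT"),
           ("Lactobacillus", "TTGGCCAATTGGCCAATTGG")]
BEE_SEQ = "GATTACAGATTACAGATTACA"

before = classify(train(GENERIC), BEE_SEQ)
after = classify(train(GENERIC + [("bee_specific_clade", BEE_SEQ)]), BEE_SEQ)
print("generic training set:  ", before)
print("augmented training set:", after)
```

In this toy setting the query shares no words with the generic references, so it is left unclassified until a matching bee-derived sequence is added to the training set. Real data are less clear-cut: a novel sequence usually shares some words with several reference taxa, which yields an assignment with a low bootstrap score instead of no assignment at all.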

Conclusions

Results from the analysis of these bee-associated sequences have ramifications for other environments represented by few sequences in the public databases or few bacterial isolates. We conclude that for the exploration of relatively novel habitats, the inclusion of high-quality, full-length 16S rRNA gene sequences allows for a more confident taxonomic classification.

Keywords:
Honey bee; Gut; Microbiota; Naïve Bayesian classifier; Pyrosequencing; Taxonomy