Heat map showing the effect of training set on the classification of sequences from the honey bee gut. Unique sequences (4,480) were classified using the NBC trained on RDP, GG, or SILVA (A); on three custom databases including near-full-length honey bee-associated sequences (RDP + bees, GG + bees, SILVA + bees) (B); or on the near-full-length honey bee-associated sequences alone (C). Family-level taxonomic designations are shown. Classifications that occur across all three datasets are highlighted in bold lettering, and classifications unique to one training set are highlighted in red font. The average bootstrap score resulting from the classification is provided for each taxonomic assignment.
Newton and Roeselers. BMC Microbiology 2012, 12:221. doi:10.1186/1471-2180-12-221