Open Access Research article

NBC update: The addition of viral and fungal databases to the Naïve Bayes classification tool

Gail L Rosen1* and Tze Yee Lim2

Author Affiliations

1 Department of Electrical and Computer Engineering, Drexel University, Philadelphia, PA, USA

2 Department of Physics, Drexel University, Philadelphia, PA, USA

BMC Research Notes 2012, 5:81  doi:10.1186/1756-0500-5-81

Published: 31 January 2012

Abstract

Background

Classifying the fungal and viral content of a sample is an important component of analyzing microbial communities in environmental media. Therefore, a method to classify any fragment from these organisms' DNA should be implemented.

Results

We update the naïve Bayes classification (NBC) tool to classify reads originating from viral and fungal organisms. NBC classifies a fungal dataset similarly to the Basic Local Alignment Search Tool (BLAST) and the Ribosomal Database Project (RDP) classifier. We also show NBC's similarities to and differences from RDP on a fungal large subunit (LSU) ribosomal DNA dataset. For reads from viruses in the training database, strain-level classification accuracy is 98%, while for reads originating from sequences not in the database, order-level accuracy is 78%, where order refers to the taxonomic rank in the tree of life.
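To make the classification step concrete, the following is a minimal sketch of naïve Bayes read classification over k-mer counts. It is an illustration under stated assumptions, not the NBC tool's actual code: the function names (`kmers`, `train`, `classify`), the add-one smoothing, the uniform prior over genomes, and the small word length used in the usage note are all hypothetical choices made for readability.

```python
import math
from collections import Counter


def kmers(seq, k):
    """Return every overlapping k-mer in seq (upper-cased)."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def train(genomes, k):
    """Build one k-mer log-probability table per training genome.

    Assumption: add-one (Laplace) smoothing over the 4**k possible
    DNA words, so unseen k-mers get a small non-zero probability.
    """
    vocab_size = 4 ** k
    models = {}
    for name, seq in genomes.items():
        counts = Counter(kmers(seq, k))
        denom = sum(counts.values()) + vocab_size
        models[name] = {
            "log_probs": {w: math.log((c + 1) / denom) for w, c in counts.items()},
            "log_unseen": math.log(1 / denom),
        }
    return models


def classify(read, models, k):
    """Return (best_genome, log_score) for a read.

    Assumption: uniform prior over genomes, so the score is just the
    sum of per-k-mer log-likelihoods (the naive independence assumption).
    """
    read_kmers = kmers(read, k)
    best_name, best_score = None, -math.inf
    for name, model in models.items():
        score = sum(
            model["log_probs"].get(w, model["log_unseen"]) for w in read_kmers
        )
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score
```

For example, after `models = train({"genome_a": seq_a, "genome_b": seq_b}, k=3)`, a call to `classify(read, models, k=3)` returns the name of the training genome whose k-mer profile gives the read the highest log-likelihood. Because the read is scored from its k-mers alone, the same procedure applies to a fragment from any location in a genome.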

Conclusions

In addition to being competitive with other available classifiers, NBC has the potential to handle reads originating from any location in the genome. We recommend using the Bacteria/Archaea, Fungal, and Virus databases separately because of algorithmic biases towards long genomes. The tool is publicly available at: http://nbc.ece.drexel.edu