Estimation of bacterial diversity using next generation sequencing of 16S rDNA: a comparison of different workflows
Centro Nacional de Biotecnología (CSIC), c/Darwin 3, 28049 Madrid, Spain
BMC Bioinformatics 2011, 12:473 doi:10.1186/1471-2105-12-473Published: 14 December 2011
Next generation sequencing (NGS) enables a more comprehensive analysis of bacterial diversity from complex environmental samples. NGS data can be analysed using a variety of workflows. We test several simple and complex workflows, including frequently used as well as recently published tools, and report on their respective accuracy and efficiency under various conditions covering different sequence lengths, number of sequences and real world experimental data from rhizobacterial populations of glyphosate-tolerant maize treated or untreated with two different herbicides representative of differential diversity studies.
Alignment and distance calculations affect OTU estimations, and multiple sequence alignment exerts a major impact on the computational time needed. Generally speaking, most of the analyses produced consistent results that may be used to assess differential diversity changes, however, dataset characteristics dictate which workflow should be preferred in each case.
When estimating bacterial diversity, ESPRIT as well as the web-based workflow, RDP pyrosequencing pipeline, produced good results in all circumstances, however, its computational requirements can make method-combination workflows more attractive, depending on sequence variability, number and length.