Schematic for constructing metagenomic read trees and the simulation framework used to evaluate their accuracy. (a) Construction of a phylogeny whose leaves are labeled solely by metagenomic reads. A reference database of full-length sequences is used to build a profile model. Metagenomic reads are aligned to the full-length reference sequences via the profile model. The alignment is used to build a phylogeny, from which the reference sequences are then pruned to create a read tree. (b) Schematic of the simulation framework used to evaluate the accuracy of read trees. For each gene family, we used MetaPASSAGE to sample two sets of full-length sequences: a simulated reference database and a set of source sequences from which shotgun metagenomic reads were generated. We built read trees and then measured the accuracy of each read tree by comparing it to a source tree, whose leaves are labeled by the full-length gene sequences from which the reads were generated. Both branch-length errors and errors in topological relationships were assessed.
Riesenfeld and Pollard BMC Genomics 2013 14:419 doi:10.1186/1471-2164-14-419
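The pruning step in (a) and the topological comparison in (b) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: trees are rooted nested tuples with string leaf names, the helper names (`prune`, `relabel`, `clades`, `rooted_rf`) are hypothetical, and the clade-based Robinson-Foulds-style distance is just one common topological measure, not necessarily the one used in the paper. Branch lengths, and therefore branch-length errors, are left out.

```python
def prune(tree, keep):
    """Restrict a tree to the leaves in `keep`, collapsing nodes left with one child.

    Returns None if no leaf of `tree` is in `keep`.
    """
    if isinstance(tree, str):
        return tree if tree in keep else None
    kids = [p for p in (prune(child, keep) for child in tree) if p is not None]
    if not kids:
        return None
    if len(kids) == 1:
        return kids[0]  # collapse a unary node
    return tuple(kids)


def relabel(tree, mapping):
    """Rename every leaf via `mapping` (here: read name -> source sequence name)."""
    if isinstance(tree, str):
        return mapping[tree]
    return tuple(relabel(child, mapping) for child in tree)


def clades(tree):
    """Leaf sets of all internal nodes, excluding the root's full leaf set."""
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        leaf_set = frozenset().union(*(walk(child) for child in node))
        found.add(leaf_set)
        return leaf_set

    found.discard(walk(tree))
    return found


def rooted_rf(tree_a, tree_b):
    """Number of clades present in exactly one of the two rooted trees."""
    return len(clades(tree_a) ^ clades(tree_b))


# Hypothetical phylogeny built from an alignment of references and reads.
full_tree = ((("ref1", "read1"), "read2"), (("ref2", "read3"), ("read4", "ref3")))
reads = {"read1", "read2", "read3", "read4"}

# (a) Prune the reference sequences to obtain the read tree.
read_tree = prune(full_tree, reads)

# (b) Relabel reads by their source sequences and compare to a source tree.
read_to_source = {"read1": "src1", "read2": "src2", "read3": "src3", "read4": "src4"}
source_tree = (("src1", "src3"), ("src2", "src4"))
distance = rooted_rf(relabel(read_tree, read_to_source), source_tree)
print(read_tree, distance)
```

In this toy case the read tree groups the reads differently from the source tree, so none of the non-trivial clades are shared. A practical evaluation would use an established phylogenetics library and would also compare branch lengths.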