Table 3

Accuracy of distinguishing communities using Fast UniFrac on read trees
                 True positive rate (sensitivity)   False positive rate (1 - specificity)
                 FastTree   RAxML   pplacer         FastTree   RAxML   pplacer
Pop 1 vs 2
  Source Seqs    1.0        1.0     1.0             0.06       0.00    0.03
  400-bp Reads   0.8        0.7     0.8             0.06       0.00    0.00
  100-bp Reads   0.7        0.7     0.9             0.00       0.00    0.00
Pop 2 vs 3
  Source Seqs    1.0        1.0     1.0             0.03       0.00    0.03
  400-bp Reads   0.9        0.6     0.9             0.00       0.00    0.00
  100-bp Reads   0.8        0.9     0.9             0.06       0.00    0.00
Pop 1 vs 3
  Source Seqs    0.0        0.0     0.0             0.03       0.00    0.00
  400-bp Reads   0.3        0.0     0.1             0.06       0.00    0.00
  100-bp Reads   0.2        0.2     0.2             0.06       0.00    0.00

Weighted Fast UniFrac reliably distinguished the underlying community structure in two of the three comparisons while controlling false positive rates. The false positive rates, which are based on the null simulations for both communities in each pair, were consistently close to or below the targeted 5% level (at most 0.06). Weighted Fast UniFrac accurately distinguished samples from Pop 1 versus Pop 2 and from Pop 2 versus Pop 3, but had much less success distinguishing samples from Pop 1 versus Pop 3. Variation in performance was largely independent of mean read length and phylogenetic method.
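
As a rough illustration of how rates like those in the table can be tallied, the sketch below counts the fraction of tests whose p-value falls at or below the 5% threshold. It assumes each simulated comparison yields one p-value. The function name `rejection_rate` and both lists of p-values are hypothetical and are not taken from the study: applied to comparisons between different communities the fraction gives a true positive rate, and applied to null comparisons it gives a false positive rate.

```python
def rejection_rate(p_values, alpha=0.05):
    """Return the fraction of tests whose p-value is at or below alpha."""
    if not p_values:
        raise ValueError("need at least one p-value")
    return sum(p <= alpha for p in p_values) / len(p_values)


# Hypothetical p-values for illustration only (not data from the study).
# Comparisons between samples drawn from two different communities:
between_pvals = [0.001, 0.01, 0.03, 0.20, 0.002, 0.04, 0.008, 0.60, 0.02, 0.01]
# Null comparisons between samples drawn from the same community:
null_pvals = [0.40, 0.75, 0.03, 0.22, 0.91, 0.55, 0.13, 0.68, 0.37, 0.81]

tpr = rejection_rate(between_pvals)  # true positive rate (sensitivity)
fpr = rejection_rate(null_pvals)     # false positive rate (1 - specificity)

print(f"TPR = {tpr:.2f}, FPR = {fpr:.2f}")
```

With only a handful of simulations per cell, each rate can move only in coarse steps, so small differences between table entries should be read with that granularity in mind.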

Riesenfeld and Pollard BMC Genomics 2013 14:419   doi:10.1186/1471-2164-14-419
