Ad hoc Ωa choices for the Divergent (a) and (d), Artificial (b) and (e), and Titanium (c) and (f) data sets. Panels (a)-(c) are histograms of the Ωa threshold at which each cluster derived from a run of DADA with Ωa = Ωr = 10^−3 rejoins some other nearby cluster. Genuine genotype counts are shown in blue and false positive counts are shown in red. The first gaps in these histograms were used to pick Ωa thresholds for reclustering the data, and are indicated by vertical dashed lines. Panels (d)-(f) show the Ωa discrimination lines for the largest cluster in each data set (with 2294, 5479, and 1095 reads) for Ωa = 10^−3 and the associated ad hoc Ωa values.
Rosen et al. BMC Bioinformatics 2012 13:283 doi:10.1186/1471-2105-13-283
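
The "first gap" rule described in the caption can be illustrated with a minimal sketch. This is not the authors' code. The function name `first_gap_threshold`, the bin width of one decade, and the choice to scan downward from the starting value of 10^−3 are all assumptions made for illustration. The input is a hypothetical list of log10 rejoin thresholds, one per cluster.

```python
import math

def first_gap_threshold(log10_rejoin, bin_width=1.0, start=-3.0):
    """Return a log10 threshold inside the first empty histogram bin.

    Assumption: bins are laid out downward from `start` (log10 of the
    initial threshold), and the "first gap" is the first empty bin met
    after at least one occupied bin. Returns None if no gap exists.
    """
    # Keep only clusters that rejoin at or below the starting threshold.
    values = [v for v in log10_rejoin if v <= start]
    if not values:
        return None

    # Bin i covers the interval (start - (i + 1) * w, start - i * w].
    n_bins = max(math.ceil((start - min(values)) / bin_width), 1)
    counts = [0] * n_bins
    for v in values:
        i = min(int((start - v) / bin_width), n_bins - 1)
        counts[i] += 1

    # Scan from the starting threshold toward more stringent values.
    seen_occupied = False
    for i, c in enumerate(counts):
        if c > 0:
            seen_occupied = True
        elif seen_occupied:
            # Midpoint of the first empty bin.
            return start - (i + 0.5) * bin_width
    return None


# Made-up values, not data from the paper.
example = [-3.5, -4.2, -5.1, -20.0, -35.0]
cut = first_gap_threshold(example)
```

With these made-up values the first empty bin is the one between 10^−7 and 10^−6, so the sketch returns its midpoint in log10 units. In practice the result depends on the bin width, which the caption does not specify.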