## Figure 2.
Discrimination plots for a typical cluster in the Artificial data set with 4691 reads: (a) simulated errors drawn from the error model and (b) the real errors in the cluster. Sequences (diamonds) are characterized by their abundance and by the probability λ, per read, of having been produced. On the x-axis, we plot log λ scaled by the most common error probability, T_{A→G}, so that values can be interpreted as an effective Hamming distance. The dashed lines delineate the region (the lower left quadrant) where, for significance thresholds Ω_{a} and Ω_{r} provided by the user, DADA accepts that a sequence could have arisen via the error model. The vertical dashed line shows the λ below which (or the effective distance above which) the read p-value rejects sequences as being errors, and the curved dashed line shows the abundances above which the abundance p-value rejects sequences as being errors for each value of λ. Several sequences in the real data (red diamonds) would be rejected by the abundance p-value at the Ω_{a} = 0.01 significance level; we posit that early-round PCR effects are a suitable candidate to explain these departures from the error model.
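
The x-axis scaling can be illustrated with a minimal sketch. It assumes that "scaled by" means dividing log λ by log T_{A→G}, which is one natural reading of the caption rather than a formula confirmed by it. The function name `effective_distance` and the value chosen for T_{A→G} are hypothetical and serve only as illustration.

```python
import math


def effective_distance(lam: float, t_ag: float) -> float:
    """Express a per-read production probability as an effective Hamming distance.

    Assumption: the scaling is log(lam) / log(t_ag), i.e. the number of
    most-common (A->G) errors whose combined probability would equal lam.
    """
    if not 0.0 < lam <= 1.0:
        raise ValueError("lam must lie in (0, 1]")
    if not 0.0 < t_ag < 1.0:
        raise ValueError("t_ag must lie in (0, 1)")
    return math.log(lam) / math.log(t_ag)


# Hypothetical value for the most common error probability T_{A->G}.
t_ag = 1e-3

# A sequence whose lam equals two independent A->G errors sits at a
# distance of about 2 on the x-axis under this assumption.
two_error_distance = effective_distance(t_ag ** 2, t_ag)
print(two_error_distance)
```

Under this reading, a smaller λ maps to a larger effective distance, which matches the caption's statement that the read p-value rejects sequences below a given λ or, equivalently, above a given effective distance.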