Complementation in error occurrence. The expected frequency of multiple errors was calculated, assuming that each error is independent, using the formula p = C × SER^M, where SER = observed single error rate, M = number of mutated nt in the sequence, and C = total number of possible erroneous sequence combinations. C = N! / (M! × (N − M)!), where N = number of nucleotides in the sequence. The expected frequency of multiple mutations is plotted against the frequency observed in experimental samples, for data sets either not filtered on phred score (q = 0) or filtered at q = 30. Between 2 and 10 mutated nt are shown for q = 0 and between 2 and 4 for q = 30 (no events were observed with 4-10 mutations for q = 30 filtered data).
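As an illustration of the calculation described in the caption, the following minimal Python sketch evaluates p = C × SER^M for a range of M. The function name and the numeric values (SER = 0.001, N = 100) are hypothetical placeholders chosen for the example, not values taken from the study.

```python
from math import comb


def expected_multi_error_frequency(ser: float, n: int, m: int) -> float:
    """Expected frequency of sequences carrying exactly m errors.

    Implements p = C * SER**M with C = N! / (M! * (N - M)!),
    under the caption's assumption that errors are independent.
    """
    if not 0.0 <= ser <= 1.0:
        raise ValueError("ser must be a rate between 0 and 1")
    if not 0 <= m <= n:
        raise ValueError("m must satisfy 0 <= m <= n")
    # comb(n, m) is the number of ways to place m errors among n positions.
    return comb(n, m) * ser ** m


# Hypothetical inputs for illustration only (not from the paper).
SER = 0.001  # assumed single error rate
N = 100      # assumed sequence length in nt

for m in range(2, 11):
    p = expected_multi_error_frequency(SER, N, m)
    print(f"M = {m:2d}  expected frequency = {p:.3e}")
```

Note that the formula as given omits the (1 − SER)^(N − M) factor of the full binomial probability; that factor is close to 1 when N × SER is small, so the simplified form is a reasonable approximation in that regime.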
Nguyen et al. BMC Genomics 2011 12:106 doi:10.1186/1471-2164-12-106