Example of output graphics for an SGA sample that was enriched for APOBEC-mediated substitutions. Hamming distance (HD) frequency plots are shown with the best-fitting Poisson distribution (red line) on the left (panels A and C) and with the theoretical star-phylogeny frequencies (red line) on the right (panels B and D). The top panels show an alignment not corrected for enrichment for APOBEC motifs; the bottom panels show the same alignment after removal of the G positions that lie in an APOBEC context in the consensus. Before the correction, the Poisson does not fit the HD frequency distribution (goodness-of-fit, GOF, P < 0.0001), whereas after the correction the Poisson yields a good fit (GOF P = 0.865).
Giorgi et al. BMC Bioinformatics 2010 11:532 doi:10.1186/1471-2105-11-532
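The correction described in the caption can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the APOBEC context is the motif GRD (a consensus G followed by A or G, then by A, G, or T), which the caption itself does not specify, and it uses a simplified chi-square statistic with one bin per observed distance value plus a tail bin. All function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def consensus(alignment):
    """Majority-rule consensus of equal-length aligned sequences."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*alignment))


def apobec_context_columns(cons):
    """Indices of consensus G's followed by R (A/G) and then D (A/G/T).

    ASSUMPTION: the GRD motif; the caption does not define the context.
    """
    return [
        i for i in range(len(cons) - 2)
        if cons[i] == "G" and cons[i + 1] in "AG" and cons[i + 2] in "AGT"
    ]


def strip_columns(alignment, columns):
    """Remove the given column indices from every sequence."""
    drop = set(columns)
    return ["".join(b for i, b in enumerate(seq) if i not in drop)
            for seq in alignment]


def pairwise_hamming(alignment):
    """Hamming distance for every unordered pair of sequences."""
    return [sum(a != b for a, b in zip(s, t))
            for s, t in combinations(alignment, 2)]


def poisson_chi_square(distances):
    """Fit a Poisson by its mean and return (lambda, chi-square statistic).

    Simplified binning (an assumption): one bin per value 0..max observed,
    plus a tail bin for larger values, where the observed count is zero.
    """
    n = len(distances)
    lam = sum(distances) / n
    observed = Counter(distances)
    stat, covered = 0.0, 0.0
    for k in range(max(distances) + 1):
        p = math.exp(-lam) * lam ** k / math.factorial(k)
        covered += p
        expected = n * p
        stat += (observed.get(k, 0) - expected) ** 2 / expected
    # Tail bin: (0 - expected)^2 / expected reduces to the expected count.
    stat += max(0.0, n * (1.0 - covered))
    return lam, stat


# Toy alignment, invented for illustration only.
toy = ["ACGATTGCAC", "ACAATTGCAC", "ACGATTGCAC", "ACGATTGCAT"]
cols = apobec_context_columns(consensus(toy))
corrected = strip_columns(toy, cols)
lam_before, stat_before = poisson_chi_square(pairwise_hamming(toy))
lam_after, stat_after = poisson_chi_square(pairwise_hamming(corrected))
```

Turning the statistic into a P value requires the chi-square survival function (for example `scipy.stats.chi2.sf`), which is left out here to keep the sketch dependent on the standard library only.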