Window sizes, CIS calculation and resolution of overlapping CISs. A) Based on the total number of insertions being analysed for CISs, window sizes (10,000-301,000 bp) are calculated as the largest windows capable of yielding a significant CIS (Poisson distribution p-value < 0.05 following Bonferroni correction based on total window size), given that the number of insertions in a window can only take integer values. Only the first 3 window sizes for 20,139 insertions are shown here. B) For each window size, a p-value is calculated for every possible window, starting at each insertion throughout the genome, from the number of insertions the window contains. For each window size, the non-overlapping windows with the lowest p-value (most insertions) are then chosen, provided the p-value is below a user-defined threshold. C) To combine the different window sizes, the non-overlapping windows with the lowest p-value across all sizes are chosen and returned as CISs. In the case shown, the 24 kb window with 7 insertions had a lower p-value than the best 44 kb window and the best 10 kb window within the region.
Sarver et al. BMC Bioinformatics 2012 13:154 doi:10.1186/1471-2105-13-154
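The selection steps described in panels B and C can be sketched in Python as below. This is only an illustrative sketch and not the authors' implementation. The function names (`poisson_sf`, `candidate_windows`, `pick_non_overlapping`, `call_cis`) are hypothetical, the window sizes are taken as given rather than derived as in panel A, insertions are assumed to lie on a single chromosome, and the Bonferroni factor is assumed to be the genome size divided by the window size, which is one possible reading of "based on total window size".

```python
import bisect
import math


def poisson_sf(k, lam):
    """Upper tail P(X >= k) for X ~ Poisson(lam), summed directly for small p-values."""
    if k <= 0:
        return 1.0
    # First term of the tail, computed in log space to avoid overflow.
    term = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
    total = 0.0
    i = k
    while term > 0.0 and term > total * 1e-17:
        total += term
        i += 1
        term *= lam / i
    return min(1.0, total)


def candidate_windows(positions, window, genome_size):
    """One window per insertion start; returns (p_corrected, start, end, count) tuples."""
    positions = sorted(positions)
    lam = len(positions) * window / genome_size  # expected insertions per window
    n_tests = genome_size / window               # ASSUMED Bonferroni factor
    out = []
    for i, start in enumerate(positions):
        j = bisect.bisect_left(positions, start + window)
        count = j - i                            # insertions in [start, start + window)
        p = min(1.0, poisson_sf(count, lam) * n_tests)
        out.append((p, start, start + window, count))
    return out


def pick_non_overlapping(windows, threshold=0.05):
    """Greedily keep the lowest p-value windows that do not overlap any kept window."""
    chosen = []
    for p, start, end, count in sorted(windows):
        if p >= threshold:
            break
        if all(end <= s or start >= e for _, s, e, _ in chosen):
            chosen.append((p, start, end, count))
    return chosen


def call_cis(positions, window_sizes, genome_size, threshold=0.05):
    """Panel B per window size, then panel C across all window sizes."""
    per_size = []
    for w in window_sizes:
        per_size.extend(
            pick_non_overlapping(candidate_windows(positions, w, genome_size), threshold)
        )
    return pick_non_overlapping(per_size, threshold)


if __name__ == "__main__":
    # Toy data (invented for illustration): 7 clustered insertions plus sparse background.
    cluster = [1_000_000 + 3_000 * i for i in range(7)]
    background = [5_000_000 * (i + 1) for i in range(10)]
    for p, start, end, count in call_cis(
        cluster + background, [10_000, 24_000, 44_000], genome_size=100_000_000
    ):
        print(f"{start}-{end}: {count} insertions, corrected p = {p:.3g}")
```

In this toy example the 24 kb window covering all 7 clustered insertions is the one retained, because the overlapping 10 kb and 44 kb windows have higher corrected p-values, mirroring the case described in panel C.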