Figure 5.
Window sizes, CIS calculation and resolution of overlapping CISs. A) Based on the total number of insertions being analysed for CISs, window sizes (10,000-301,000 bp)
are calculated to define the largest window size which is capable of showing a significant
CIS (Poisson distribution p-value < 0.05 following Bonferroni correction based on
total window size) using total insertion numbers which can only exist as integer values.
Only the first 3 window sizes for 20,139 insertions are shown here. B) For each of the window sizes, the p-value is calculated for each possible window,
starting at every insertion throughout the genome, based on the total number of insertions.
Non-overlapping windows with the lowest p-value (most insertions) are then
chosen for each window size where the p-value is below a user-defined threshold. C) In order to combine the different window sizes, non-overlapping windows with the
lowest p-value are chosen and these are returned as CISs. In the case shown, the 24 kb
window with 7 insertions had a lower p-value than the best 44 kb window and the best
10 kb window within the region.
Sarver et al. BMC Bioinformatics 2012 13:154 doi:10.1186/1471-2105-13-154
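
The procedure in panels B and C can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a single chromosome, a Poisson upper-tail p-value with rate equal to total insertions divided by genome length, and a Bonferroni factor equal to genome length divided by window size (one reading of "based on total window size"). The function names, the 2.7 Gb genome length, and the example positions are hypothetical.

```python
import math
from bisect import bisect_left


def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), summed directly over the upper tail."""
    if k <= 0:
        return 1.0
    term = math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))
    total, i = 0.0, k
    while True:
        total += term
        i += 1
        term *= lam / i
        # Stop once terms are decreasing and no longer change the sum.
        if i > lam and term <= total * 1e-16:
            break
    return min(total, 1.0)


def windows_for_size(positions, w, rate_per_bp, n_tests):
    """Panel B: one candidate window [start, start + w) per insertion."""
    candidates = []
    for i, start in enumerate(positions):
        j = bisect_left(positions, start + w)
        count = j - i  # insertions inside the half-open window
        p = min(1.0, poisson_upper_tail(count, rate_per_bp * w) * n_tests)
        candidates.append((p, start, start + w, count, w))
    return candidates


def pick_non_overlapping(candidates, alpha):
    """Greedily keep the lowest p-value windows that do not overlap."""
    chosen = []
    for cand in sorted(candidates):
        p, start, end = cand[0], cand[1], cand[2]
        if p >= alpha:
            break
        if all(end <= c[1] or start >= c[2] for c in chosen):
            chosen.append(cand)
    return chosen


def call_cis(positions, window_sizes, total_insertions, genome_length, alpha=0.05):
    """Panels B and C: select per window size, then across window sizes."""
    positions = sorted(positions)
    rate_per_bp = total_insertions / genome_length
    pooled = []
    for w in window_sizes:
        n_tests = genome_length / w  # assumed Bonferroni factor
        per_size = windows_for_size(positions, w, rate_per_bp, n_tests)
        pooled.extend(pick_non_overlapping(per_size, alpha))
    return pick_non_overlapping(pooled, alpha)


# Hypothetical data: 7 clustered insertions plus 2 isolated ones.
example_positions = [1_000_000, 1_004_000, 1_008_000, 1_012_000,
                     1_016_000, 1_020_000, 1_023_000, 5_000_000, 9_000_000]
cis = call_cis(example_positions, [10_000, 24_000, 44_000],
               total_insertions=20139, genome_length=2_700_000_000)
for p, start, end, count, w in cis:
    print(f"{w // 1000} kb window {start}-{end}: {count} insertions")
```

With these made-up positions, the 24 kb window holding all 7 insertions is kept. The 44 kb window over the same region has a higher corrected p-value, and no 10 kb window passes the threshold, which mirrors the situation described in panel C.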