Insert counting methods for p-value calculation. The software counts the insertions within a given CIS in three different ways in order to remove potential artifacts from the final CIS list. In the figure, 10 transposon insertions from 6 different tumor libraries are shown. The count can be derived from 1) the total number of inserts, 2) the total number of libraries within the CIS, and 3) the total number of unique regions within a CIS that hold an insertion. The counts obtained by these 3 methods are then individually used to test the null hypothesis that no enrichment is present, using the Poisson distribution based on the window size, the genome size, and the total number of inserts present in the dataset being examined. We expect Bonferroni-corrected p-values to be less than 0.05 for each of these 3 counting methods in order to define an ideal CIS.
Sarver et al. BMC Bioinformatics 2012 13:154 doi:10.1186/1471-2105-13-154
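The test described in the caption can be illustrated with a minimal Python sketch. It is not the software's actual implementation. It assumes that the expected count per window is the dataset's total inserts multiplied by window size over genome size, that the same expectation is used for all three counts, and that the Bonferroni factor is the number of windows tested. All function names and numeric values below (genome size, window size, totals, the region count of 4, and the number of tests) are hypothetical and chosen only for illustration.

```python
import math

def expected_count(total_inserts, window_size, genome_size):
    """Poisson mean for one window under the null of no enrichment (assumed form)."""
    return total_inserts * window_size / genome_size

def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), computed as 1 - P(X <= k - 1).

    Adequate for the small means typical of a single window; exp(-lam)
    underflows for very large lam, which this sketch does not handle.
    """
    if k <= 0:
        return 1.0
    term = math.exp(-lam)      # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i        # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)

def bonferroni(p, n_tests):
    """Bonferroni-corrected p-value, capped at 1."""
    return min(1.0, p * n_tests)

def cis_passes(counts, lam, n_tests, alpha=0.05):
    """True only if every counting method gives a corrected p-value below alpha."""
    return all(
        bonferroni(poisson_upper_tail(k, lam), n_tests) < alpha
        for k in counts.values()
    )

# Hypothetical inputs: only the 10 inserts and 6 libraries come from the figure.
lam = expected_count(total_inserts=10_000, window_size=10_000,
                     genome_size=2_700_000_000)
counts = {"inserts": 10, "libraries": 6, "unique_regions": 4}
ideal = cis_passes(counts, lam, n_tests=1_000)
```

Requiring all three corrected p-values to fall below 0.05 means that a CIS driven by many inserts from a single library, or by many inserts stacked in a single region, fails on the library or region count even if it passes on the raw insert count.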