Table 2

Motif co-occurrence
| Comparison | Upstreams containing first motif only | Upstreams containing second motif only | Upstreams containing neither motif | Intersection (upstreams containing both motifs) | P-value | Expected intersection | Intersection / Expected intersection | Notes |
|---|---|---|---|---|---|---|---|---|
| 12-0 and 12-5 | 502 | 525 | 19794 | 250 | 8.2E-178 | 28 | 9.04 | Includes 12-0 |
| A-rich and TATAAA | 11261 | 1251 | 651 | 7908 | 1.4E-93 | 8332 | 0.95 | Anti-occurrence |
| 12-0 and 12-11 | 684 | 315 | 20004 | 68 | 1.9E-28 | 14 | 4.97 | Includes 12-0 |
| 12-11 and 12-5 | 328 | 720 | 19968 | 55 | 3.9E-18 | 14 | 3.90 | |
| 12-5 and AT-rich | 411 | 6859 | 13437 | 364 | 1.3E-13 | 266 | 1.37 | |
| 12-0 and TTTAGG | 664 | 1195 | 19124 | 88 | 3.4E-09 | 46 | 1.92 | Includes 12-0 |
| 12-0 and Trans-splice | 575 | 3174 | 17145 | 177 | 3.5E-08 | 120 | 1.48 | Includes 12-0 |
| 12-0 and AT-rich | 423 | 6894 | 13425 | 329 | 4.9E-08 | 258 | 1.28 | Includes 12-0 |
| 12-5 and Trans-splice | 596 | 3172 | 17124 | 179 | 1.0E-07 | 123 | 1.45 | |
| AT-rich and Trans-splice | 5951 | 2079 | 11769 | 1272 | 1.27E-06 | 1149 | 1.11 | |
| 12-0 and 12-18 | 716 | 380 | 19939 | 36 | 1.2E-06 | 15 | 2.42 | Includes 12-0 |
| TTTAGG and Trans-splice | 1017 | 3085 | 16703 | 266 | 2.2E-06 | 204 | 1.30 | |
| AT-rich and TTTAGG | 6707 | 767 | 13081 | 516 | 5.2E-06 | 440 | 1.17 | |
| A-rich and AT-rich | 12678 | 732 | 1170 | 6491 | 6.2E-05 | 6571 | 0.99 | Anti-occurrence |
| 12-5 and TTTAGG | 701 | 1209 | 19087 | 74 | 1.2E-04 | 47 | 1.57 | |
| 12-18 and 12-5 | 385 | 744 | 19911 | 31 | 1.9E-04 | 15 | 2.03 | |

Of all 22,428 upstream regions, 21,071 contained at least one instance of a motif or of the same-strand TATAAA sequence. For each pair of motifs (including TATAAA), these upstream regions were divided into four categories: those that contained the first motif but not the second, those that contained the second motif but not the first, those that contained neither motif, and those that contained both motifs (the intersection). We calculated the probability of this distribution with Fisher's exact test. We also calculated the expected intersection from the individual frequency of each motif, and then the ratio of the observed intersection to the expected intersection, to show whether each motif pair co-occurred more or less frequently than expected. Shown are the motif pairs whose co-occurrence p-values were below a Bonferroni-adjusted threshold of 0.0002 (0.01/45 comparisons). Motif pairs that include motif 12-0, and motif pairs that co-occurred in fewer upstream regions than expected (anti-occurrence), are indicated in the "Notes" column.
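The "Expected intersection" and ratio columns can be recomputed from the four counts in each row. The following is a minimal Python sketch, not the authors' code: the function names are hypothetical, and the choice of a one-sided tail is an assumption, since the caption does not say whether a one- or two-sided Fisher test was used. The p-values it returns may therefore differ from those in the table.

```python
from fractions import Fraction
from math import comb


def expected_intersection(first_only, second_only, neither, both):
    """Expected number of upstreams with both motifs if the motifs were independent."""
    total = first_only + second_only + neither + both
    with_first = first_only + both
    with_second = second_only + both
    return with_first * with_second / total


def fisher_one_sided(first_only, second_only, neither, both):
    """One-sided Fisher exact p-value in the direction of the observed deviation.

    Assumption: the tail choice is ours; the source does not state it.
    """
    total = first_only + second_only + neither + both
    with_first = first_only + both
    with_second = second_only + both
    lowest = max(0, with_first + with_second - total)
    highest = min(with_first, with_second)
    if both >= expected_intersection(first_only, second_only, neither, both):
        ks = range(both, highest + 1)   # co-occurrence: P(X >= both)
    else:
        ks = range(lowest, both + 1)    # anti-occurrence: P(X <= both)
    # Hypergeometric numerators, summed exactly with Python integers.
    numerator = sum(
        comb(with_first, k) * comb(total - with_first, with_second - k) for k in ks
    )
    return float(Fraction(numerator, comb(total, with_second)))


# Bonferroni-adjusted threshold quoted in the caption (0.01 over 45 comparisons).
THRESHOLD = 0.01 / 45

# Counts from the "12-0 and 12-5" row of the table.
row = (502, 525, 19794, 250)
expected = expected_intersection(*row)
ratio = row[3] / expected
p_value = fisher_one_sided(*row)

print(round(expected), round(ratio, 2))  # prints: 28 9.04
print(p_value < THRESHOLD)
```

Exact integer arithmetic is used instead of floating-point factorials so that very small tail probabilities do not underflow before the final division.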


Sleumer et al. BMC Genomics 2012 13:433   doi:10.1186/1471-2164-13-433
