Table 9

Words not detected in the 3'UTRs

Word        E_S        E
CTAGCAGG    5.98269    6.17391
ACTGCCAG    4.99319    5.1526
CGCCTGAT    4.97776    5.13667
GCGTCCGA    4.52742    4.67187
GGGGTGGC    4.5248     4.66917
ACTCCGCC    4.38831    4.5283
CCCGTTCC    4.25101    4.3866
ACACGCCG    4.21714    4.35165
CCCGCTCA    4.193      4.32673
CTGGGCGT    4.06873    4.19847
GACCTGCG    3.71851    3.83704
GCGCAGTA    3.68699    3.80451
GCACCCGA    3.6084     3.7234
GCACCCTC    3.59671    3.71134
CGCACCCA    3.54333    3.65625
CCGCCGTC    3.53385    3.64646
GGGTCGGC    3.52406    3.63636
GCACGCCT    3.35465    3.46154
GCGCAGCC    3.31181    3.41732
CGTCCGCT    3.28252    3.3871
CTGGCGCC    3.2624     3.36634
GGCGACCT    3.25626    3.36
ATACGCCC    3.18816    3.28972
AGCGCTCC    2.98494    3.08
TAGCGCGG    2.98494    3.08

Top 25 words that were expected to occur in the 3'UTRs but are absent from the sequences. Each word is identified by its nucleotide sequence and is listed with the expected number of sequences in which it occurs (E_S) and the expected total number of occurrences across the set of sequences (E). The words are sorted by E_S in descending order.
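The ordering described in the caption can be illustrated with a minimal Python sketch. It only reproduces the ranking by E_S for the first five rows of the table; it does not show how the authors computed E_S or E, which is not described here. The names `WordRow` and `rank_by_expected_sequences` are hypothetical and introduced purely for illustration.

```python
from typing import NamedTuple


class WordRow(NamedTuple):
    """One table row (hypothetical container, not from the paper)."""
    word: str
    e_s: float  # expected number of sequences containing the word
    e: float    # expected total occurrences across all sequences


# First five rows of the table, deliberately listed out of order.
rows = [
    WordRow("CGCCTGAT", 4.97776, 5.13667),
    WordRow("CTAGCAGG", 5.98269, 6.17391),
    WordRow("GGGGTGGC", 4.5248, 4.66917),
    WordRow("ACTGCCAG", 4.99319, 5.1526),
    WordRow("GCGTCCGA", 4.52742, 4.67187),
]


def rank_by_expected_sequences(rows):
    """Sort rows by E_S, largest first, as in the table."""
    return sorted(rows, key=lambda r: r.e_s, reverse=True)


ranked = rank_by_expected_sequences(rows)
for r in ranked:
    print(f"{r.word}\t{r.e_s}\t{r.e}")
```

In every row used here E is slightly larger than E_S, which is consistent with the caption: a word can occur more than once in a sequence, so total occurrences cannot be fewer than the number of sequences containing it.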

Lichtenberg et al. BMC Genomics 2009 10:463   doi:10.1186/1471-2164-10-463
