Table 3

Top 2 clusters for the bidirectional promoters. Word-based clusters built around the two most overrepresented words in the bidirectional promoters. Rank 1 refers to the word TCGCGCCA and Rank 2 to the word TCCCGGGA.

(a) Rank 1

| Word | S | ES | O | EO | S·ln(S/ES) | RevComp. | Position | Palindrome |
|---|---|---|---|---|---|---|---|---|
| TCGCGCCA | 4 | 0.918299 | 4 | 0.9375 | 5.88611 | TGGCGCGA | 12538 | No |
| TCGCCCCA | 3 | 0.805161 | 3 | 0.820513 | 3.94598 | TGGGGCGA | 2834 | No |
| TAGCGCCA | 1 | 0.263929 | 1 | 0.266667 | 1.33207 | TGGCGCTA | 4918 | No |
| TCGAGCCA | 1 | 0.469775 | 1 | 0.47619 | 0.755501 | TGGCTCGA | NA | No |
| TCGCGACA | 1 | 0.655751 | 1 | 0.666667 | 0.421975 | TGTCGCGA | NA | No |
| TCGGGCCA | 1 | 0.683955 | 1 | 0.695652 | 0.379863 | TGGCCCGA | NA | No |
| TTGCGCCA | 1 | 0.693903 | 2 | 0.705882 | 0.365423 | TGGCGCAA | NA | No |
| TCGCGGCA | 1 | 0.826074 | 1 | 0.842105 | 0.191071 | TGCCGCGA | NA | No |
| TCGCGTCA | 1 | 0.84063 | 1 | 0.857143 | 0.173604 | TGACGCGA | 4051 | No |
| TCGCGCCC | 1 | 1.51582 | 1 | 1.5625 | -0.41596 | GGGCGCGA | 13089 | No |
| CCGCGCCA | 2 | 2.5054 | 2 | 2.625 | -0.4506 | TGGCGCGG | NA | No |
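The score column (S times the natural logarithm of S/ES) can be recomputed from the S and ES columns. The sketch below does this for three Rank 1 rows. It assumes, as the column names suggest but the table does not state, that S is an observed count and ES the corresponding expected count; the function name `word_score` is hypothetical and chosen only for illustration.

```python
import math


def word_score(s: int, es: float) -> float:
    """Return S * ln(S / ES) for an observed count S and expected count ES."""
    if s <= 0 or es <= 0:
        raise ValueError("S and ES must both be positive")
    return s * math.log(s / es)


# (word, S, ES) copied from three rows of the Rank 1 cluster.
rows = [
    ("TCGCGCCA", 4, 0.918299),
    ("TCGCCCCA", 3, 0.805161),
    ("TCGCGCCC", 1, 1.51582),
]

for word, s, es in rows:
    print(f"{word}\t{word_score(s, es):.5f}")
```

Under that assumption the recomputed values should agree with the tabulated scores up to rounding. The score is positive when S exceeds ES and negative when ES exceeds S, which matches the sign change in the last two rows of the cluster.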


(b) Rank 2

| Word | S | ES | O | EO | S·ln(S/ES) | RevComp. | Position | Palindrome |
|---|---|---|---|---|---|---|---|---|
| TCCCGGGA | 8 | 3.97165 | 8 | 4.26667 | 5.60208 | TCCCGGGA | 2 | Yes |
| TCCAGGGA | 2 | 0.941495 | 2 | 0.961538 | 1.50687 | TCCCTGGA | NA | No |
| TCCCGAGA | 2 | 1.05556 | 2 | 1.08 | 1.27816 | TCTCGGGA | 13248 | No |
| TGCCGGGA | 1 | 0.514348 | 1 | 0.521739 | 0.664856 | TCCCGGCA | NA | No |
| TCCCGTGA | 1 | 0.702073 | 1 | 0.714286 | 0.353718 | TCACGGGA | NA | No |
| TCCCAGGA | 4 | 3.71413 | 5 | 3.97222 | 0.296597 | TCCTGGGA | 19059 | No |
| TCTCGGGA | 2 | 1.73986 | 2 | 1.8 | 0.278683 | TCCCGAGA | 3074 | No |
| ACCCGGGA | 1 | 0.785281 | 1 | 0.8 | 0.241714 | TCCCGGGT | 20941 | No |
| TCCCCGGA | 1 | 0.852649 | 1 | 0.869565 | 0.159407 | TCCGGGGA | NA | No |
| TCCCGCGA | 1 | 1.01424 | 1 | 1.03704 | -0.01414 | TCGCGGGA | NA | No |
| TCCCGGAA | 3 | 3.29619 | 3 | 3.5 | -0.28247 | TTCCGGGA | NA | No |
| TCCTGGGA | 1 | 1.32696 | 1 | 1.36364 | -0.28289 | TCCCAGGA | 13129 | No |
| TCCCGGGG | 3 | 3.34568 | 3 | 3.55556 | -0.32717 | CCCCGGGA | 21071 | No |
| TCCCGGGT | 1 | 2.38044 | 1 | 2.48889 | -0.86729 | ACCCGGGA | 13746 | No |
| CCCCGGGA | 1 | 2.78651 | 1 | 2.93333 | -1.02479 | TCCCGGGG | 19211 | No |
| GCCCGGGA | 1 | 3.73853 | 2 | 4 | -1.31869 | TCCCGGGC | 21163 | No |
| TCCCGGGC | 3 | 5.1829 | 4 | 5.68889 | -1.64025 | GCCCGGGA | 21138 | No |

Lichtenberg et al. BMC Genomics 2009 10(Suppl 1):S18   doi:10.1186/1471-2164-10-S1-S18
