Table 18

Co-occurrence in Distal Promoters

Word1     Word2     S     ES        S*ln(S/ES)
TAAAAAAT  ATTTTTTA  1855  898.8038  1344.087
AATATATT  TAAAAAAT  1759  902.7094  1173.429
AATATATT  ATTTTTTA  1692  882.8679  1100.631
TTATATAA  ATTTTTTA  1478  740.7429  1020.99
TTATATAA  TAAAAAAT  1464  757.3903  964.8477
AATATATT  TTATATAA  1447  743.9616  962.6287
AAAAATTG  TAAAAAAT  1301  747.7933  720.4442
CAATTTTT  TAAAAAAT  1279  745.3293  690.6698
AAAAATTG  ATTTTTTA  1237  731.3568  650.0966
ATTTTGTA  ATTTTTTA  1156  665.4975  638.3272
CAATTTTT  ATTTTTTA  1200  728.947   598.171
TAGAAAAT  TAAAAAAT  1024  586.114   571.3484
ATTTTGTA  TAAAAAAT  1108  680.4539  540.2074
CAATTTTT  AATATATT  1162  732.1145  536.7987
ATTTTTCA  ATTTTTTA  1078  666.4705  518.3745
AAAAATTG  AATATATT  1148  734.5348  512.627
CAATTTTT  TTATATAA  1003  614.2579  491.8069
TAGAAAAT  AATATATT  956   575.7221  484.8189
ATTTTCTA  ATTTTTTA  952   574.2477  481.2399
ATTTTCTA  TAAAAAAT  964   587.1534  477.9562
TAGAAAAT  ATTTTTTA  941   573.2313  466.4103
ATTTTTCA  TAAAAAAT  1058  681.4487  465.4297
TGAAAAAT  ATTTTTTA  1020  658.2655  446.7086
TGAAAAAT  TAAAAAAT  1033  673.0593  442.5259
AAAAATTG  TTATATAA  970   616.2886  439.9733

Overrepresented non-overlapping word-pairs detected in the distal promoters of Arabidopsis thaliana. Each word-pair is described by its two nucleotide sequences (Word1 and Word2), the number of sequences in which the pair occurs (S), the expected number of such sequences (ES), and a statistical score, S*ln(S/ES), that measures how strongly the pair is overrepresented in this sequence set.
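
The score column can be recomputed from S and ES alone. The following is a minimal Python sketch, not the authors' implementation: the function name `overrepresentation_score` and the `rows` list are illustrative assumptions, and the two sample rows are copied from the top of the table.

```python
import math


def overrepresentation_score(s: float, es: float) -> float:
    """Return S * ln(S / ES) for an observed count S and expected count ES.

    Hypothetical helper written for illustration; it only applies the
    formula given in the table caption.
    """
    if s <= 0 or es <= 0:
        raise ValueError("S and ES must both be positive")
    return s * math.log(s / es)


# First two rows of the table: (Word1, Word2, S, ES)
rows = [
    ("TAAAAAAT", "ATTTTTTA", 1855, 898.8038),
    ("AATATATT", "TAAAAAAT", 1759, 902.7094),
]

for word1, word2, s, es in rows:
    score = overrepresentation_score(s, es)
    print(f"{word1}  {word2}  {s}  {es}  {score:.3f}")
```

Because ES is listed with only four decimal places, recomputed scores may differ from the tabulated ones in the last digit. The score is positive when S exceeds ES and grows with both the ratio S/ES and the absolute count S, so pairs that are frequent as well as enriched rank highest.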

Lichtenberg et al. BMC Genomics 2009 10:463   doi:10.1186/1471-2164-10-463
