Table 17

Co-occurrence in Proximal Promoters

Word1     Word2     S     ES        S*ln(S/ES)
AAATTTTA  TAAAAAAT  996   489.8445  706.8206
ATTTTTTA  TAAAAAAT  869   395.77    683.4771
TAAATTTT  TAAAAAAT  970   501.8706  639.1852
AAAAATTA  TAAAAAAT  1040  565.2386  634.1171
TAAAATTT  TAAAAAAT  963   498.7952  633.5171
TAAAATTT  ATTTTTTA  892   458.4645  593.7003
AAATTTTA  ATTTTTTA  868   450.2375  569.7695
AAAAATTA  ATTTTTTA  947   519.5356  568.5445
AAAATTTA  TAAAAAAT  919   496.1801  566.4231
TAATTTTT  TAAAAAAT  965   539.2575  561.5671
AAAATTTA  ATTTTTTA  865   456.0608  553.6894
TAATTTTT  ATTTTTTA  907   495.6552  548.0656
AATATATT  TAAAAAAT  776   391.8276  530.2646
AAAATTTA  AAATTTTA  973   564.4665  529.8015
AAATTTTA  TAAAATTT  976   567.4415  529.3092
AAAAATTA  TAATTTTT  1125  707.8947  521.1483
AATATATT  ATTTTTTA  730   360.1459  515.7708
TAAATTTT  ATTTTTTA  845   461.2912  511.4845
AAAAATTA  TAAAATTT  1052  654.7789  498.8066
AAAATTTA  AAAAATTA  1044  651.346   492.5318
AAAATTTA  TAAAATTT  958   574.7807  489.4031
AAATTTTA  TAATTTTT  993   613.4724  478.2242
TAATTTTT  TAAAATTT  995   624.6821  463.1724
AAAATTTA  TAATTTTT  990   621.407   461.0615
TTATATAA  TAAAAAAT  645   316.3233  459.5531

Overrepresented non-overlapping word-pairs detected in the proximal promoters of Arabidopsis thaliana. Each word-pair is described by its two nucleotide sequences (Word1 and Word2), the number of sequences in which the pair occurs (S), the expected number of such sequences (ES), and a statistical score quantifying the overrepresentation of the word-pair in this sequence set (S*ln(S/ES)).
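The score column can be recomputed from S and ES. The following is a minimal sketch, assuming that "ln" denotes the natural logarithm and that both S and ES are positive; the function name `overrepresentation_score` is hypothetical and is not taken from the authors' software.

```python
import math


def overrepresentation_score(s: float, es: float) -> float:
    """Return S * ln(S / ES) for an observed count S and an expected count ES.

    Assumption: both values must be positive, since the logarithm is
    undefined otherwise. How the original pipeline treats such cases
    is not stated in the table.
    """
    if s <= 0 or es <= 0:
        raise ValueError("S and ES must both be positive")
    return s * math.log(s / es)


# First two rows of the table: (Word1, Word2, S, ES)
rows = [
    ("AAATTTTA", "TAAAAAAT", 996, 489.8445),
    ("ATTTTTTA", "TAAAAAAT", 869, 395.77),
]

for word1, word2, s, es in rows:
    score = overrepresentation_score(s, es)
    print(f"{word1} {word2} {score:.4f}")
```

For the listed rows, the recomputed values should agree with the tabulated scores up to rounding of ES. A pair seen more often than expected (S > ES) scores positive, and the factor S weights the log ratio by how many sequences support it.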

Lichtenberg et al. BMC Genomics 2009 10:463   doi:10.1186/1471-2164-10-463
