The word landscape of the non-coding segments of the Arabidopsis thaliana genome
* Corresponding author: Jens Lichtenberg lichtenj@ohio.edu
1 Bioinformatics Laboratory, School of Electrical Engineering and Computer Science, Ohio University, Athens, Ohio, USA
2 Department of Plant Cellular and Molecular Biology, Plant Biotechnology Center, The Ohio State University, Columbus, Ohio, USA
3 Department of Statistics, University of Idaho, Moscow, Idaho, USA
4 Department of Plant Biology, Southern Illinois University, Carbondale, Illinois, USA
5 Biomedical Engineering Program, Ohio University, Athens, Ohio, USA
6 Molecular and Cellular Biology Program, Ohio University, Athens, Ohio, USA
BMC Genomics 2009, 10:463 doi:10.1186/1471-2164-10-463
Published: 8 October 2009
Additional files
Additional file 1:
Words discovered in 3'UTRs. Entire set of words discovered in the 3'UTRs with occurrences, expected occurrences, scores, reverse complement information and p-value.
Format: CSV Size: 5.5MB
Additional file 2:
Words discovered in 5'UTRs. Entire set of words discovered in the 5'UTRs with occurrences, expected occurrences, scores, reverse complement information and p-value.
Format: CSV Size: 5.4MB
Additional file 3:
Words discovered in introns. Entire set of words discovered in the introns with occurrences, expected occurrences, scores, reverse complement information and p-value.
Format: CSV Size: 5.6MB
Additional file 4:
Words discovered in core promoters. Entire set of words discovered in the core promoters [-100;+1] with occurrences, expected occurrences, scores, reverse complement information and p-value.
Format: CSV Size: 5.4MB
Additional file 5:
Words discovered in proximal promoters. Entire set of words discovered in the proximal promoters [-1,000;-101] with occurrences, expected occurrences, scores, reverse complement information and p-value.
Format: CSV Size: 5.7MB
Additional file 6:
Words discovered in distal promoters. Entire set of words discovered in the distal promoters [-3,000;-1,001] with occurrences, expected occurrences, scores, reverse complement information and p-value.
Format: CSV Size: 5.8MB
Additional file 7:
Words discovered in entire genome. Entire set of words discovered in the complete genome with occurrences, expected occurrences, scores, reverse complement information and p-value.
Format: CSV Size: 4.1MB
Additional file 8:
Words missed in 3'UTRs. Entire set of words expected to occur but not discovered in the 3'UTRs with expected occurrences.
Format: CSV Size: 10KB
Additional file 9:
Words missed in 5'UTRs. Entire set of words expected to occur but not discovered in the 5'UTRs with expected occurrences.
Format: CSV Size: 13KB
Additional file 10:
Words missed in introns. Entire set of words expected to occur but not discovered in the introns with expected occurrences.
Format: CSV Size: 2KB
Additional file 11:
Words missed in core promoters. Entire set of words expected to occur but not discovered in the core promoters with expected occurrences.
Format: CSV Size: 10KB
Additional file 12:
Word-based clusters. Word-based clusters built around two overrepresented words of each non-coding segment of Arabidopsis thaliana, each represented by the word cluster and its associated sequence logo. Each word in a cluster is listed with its nucleotide sequence, its sequence count, its overall count and its SlnSES score.
Format: DOC Size: 376KB
Additional file 13:
Word co-occurrences in 3'UTRs. Entire set of co-occurring words (taken from the top 25 words) discovered in the 3'UTRs with occurrences, expected occurrences and scores.
Format: CSV Size: 38KB
Additional file 14:
Word co-occurrences in 5'UTRs. Entire set of co-occurring words (taken from the top 25 words) discovered in the 5'UTRs with occurrences, expected occurrences and scores.
Format: CSV Size: 42KB
Additional file 15:
Word co-occurrences in introns. Entire set of co-occurring words (taken from the top 25 words) discovered in the introns with occurrences, expected occurrences and scores.
Format: CSV Size: 31KB
Additional file 16:
Word co-occurrences in core promoters. Entire set of co-occurring words (taken from the top 25 words) discovered in the core promoters with occurrences, expected occurrences and scores.
Format: CSV Size: 43KB
Additional file 17:
Word co-occurrences in proximal promoters. Entire set of co-occurring words (taken from the top 25 words) discovered in the proximal promoters with occurrences, expected occurrences and scores.
Format: CSV Size: 64KB
Additional file 18:
Word co-occurrences in distal promoters. Entire set of co-occurring words (taken from the top 25 words) discovered in the distal promoters with occurrences, expected occurrences and scores.
Format: CSV Size: 67KB
Additional file 19:
NASC microarrays. Entire set of microarray experiments available in NASC that were used for the cellular functional analysis.
Format: XLS Size: 561KB
