Open Access Research article

Repeat-encoded poly-Q tracts show statistical commonalities across species

Kai Willadsen1, Minh Duc Cao1, Janet Wiles2, Sureshkumar Balasubramanian3 and Mikael Bodén1*

Author Affiliations

1 School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane QLD 4072, Australia

2 School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane QLD 4072, Australia

3 School of Biological Sciences, Monash University, Victoria 3800, Australia


BMC Genomics 2013, 14:76  doi:10.1186/1471-2164-14-76

Published: 2 February 2013

Additional files

Additional file 1: Figure S1:

Distribution of TNR lengths in multiple organisms. The distribution of repeat sequence lengths is generally similar across organisms. Repeat unit count is plotted on a logarithmic scale; frequency is plotted on a linear scale, measured as a percentage of the total number of repeat units identified.

Format: PDF. Size: 32 KB.


Additional file 2: Figure S2:

Frequency of protein-protein interaction counts for homo-AA proteins. Protein-protein interaction counts for proteins containing homo-amino-acid tracts in Saccharomyces cerevisiae, Arabidopsis thaliana, Caenorhabditis elegans, Drosophila melanogaster, Mus musculus and Homo sapiens, separated into those whose tracts are TNR-encoded and those whose tracts are variant-encoded. Whole-proteome data are provided for comparison.

Format: PDF. Size: 96 KB.


Additional file 3: Table S1:

Over- and under-represented amino acids in TNR-encoded homo-AA repeats by species. Probabilities that, for each amino acid, the observed split of its repeats between TNR-encoded and variant-encoded forms is consistent with a random distribution based on overall frequency, in Saccharomyces cerevisiae, Arabidopsis thaliana, Caenorhabditis elegans, Drosophila melanogaster, Mus musculus and Homo sapiens.

Format: PDF. Size: 46 KB.


Additional file 4: Table S2:

Length comparison of TNR- vs. variant-encoded homo-AA repeats by species. Mean lengths and standard errors of the given amino-acid repeat sequences in TNR-encoded and variant-encoded repeats in Saccharomyces cerevisiae, Arabidopsis thaliana, Caenorhabditis elegans, Drosophila melanogaster, Mus musculus and Homo sapiens. The given p-values represent the probability that the length distributions are equal; these results show that, for multiple amino acids, triplet-encoded tracts are longer than variant-encoded tracts.

Format: PDF. Size: 46 KB.
