Open Access Methodology article

Discovering putative prion sequences in complete proteomes using probabilistic representations of Q/N-rich domains

Vladimir Espinosa Angarica1,2,3, Salvador Ventura4,5* and Javier Sancho1,2,3*

Author Affiliations

1 Departamento de Bioquímica y Biología Molecular y Celular, Facultad de Ciencias, Universidad de Zaragoza, Pedro Cerbuna 12, Zaragoza 50009, Spain

2 Institute for Biocomputation and Physics of Complex Systems (BIFI). Universidad de Zaragoza, Mariano Esquillor, Edificio I + D, Zaragoza 50018, Spain

3 Joint Unit BIFI-IQFR (CSIC), Serrano 119, Madrid 28006, Spain

4 Institut de Biotecnologia i de Biomedicina, Universitat Autònoma de Barcelona, 08193 Bellaterra, Barcelona, Spain

5 Departament de Bioquimica i Biologia Molecular, Universitat Autònoma de Barcelona, 08193 Bellaterra, Barcelona, Spain

BMC Genomics 2013, 14:316  doi:10.1186/1471-2164-14-316

Published: 10 May 2013

Additional files

Additional file 1:

Prion-forming domain predictions in Archaea.

Format: PDF Size: 29KB

Additional file 2:

Prion-forming domain predictions in Bacteria.

Format: PDF Size: 197KB

Additional file 3:

Prion-forming domain predictions in Viruses.

Format: PDF Size: 36KB

Additional file 4:

Prion-forming domain predictions in Fungi.

Format: PDF Size: 242KB

Additional file 5:

Prion-forming domain predictions in Invertebrates.

Format: PDF Size: 875KB

Additional file 6:

Prion-forming domain predictions in Vertebrates.

Format: PDF Size: 40KB

Additional file 7:

Prion-forming domain predictions in Plants.

Format: PDF Size: 61KB

Additional file 8:

Prion-forming domain predictions in Rodents.

Format: PDF Size: 36KB

Additional file 9:

Prion-forming domain predictions in Mammals.

Format: PDF Size: 45KB

Additional file 10:

Prion-forming domain predictions in Human.

Format: PDF Size: 32KB

Additional file 11:

Significance of the over- or under-representation of PrD predictions according to Gene Ontology Molecular Function classifications. We tested the significance of the number of predictions found in all taxa according to the molecular function classes to which the proteins bearing putative PrDs belong. We compared the abundance of predictions in a given class with the expected frequency obtained by randomly selecting a set of the same size from the proteomes over 10^6 randomizations. For each taxon we show the z-score for a number of representative GO terms. GO term descriptions may be truncated in some cases to fit in the chart.

Format: PDF Size: 9.4MB

Additional file 12:

Significance of the over- or under-representation of PrD predictions according to Gene Ontology Biological Process classifications. We tested the significance of the number of predictions found in all taxa according to the biological process classes to which the proteins bearing putative PrDs belong. We compared the abundance of predictions in a given class with the expected frequency obtained by randomly selecting a set of the same size from the proteomes over 10^6 randomizations. For each taxon we show the z-score for a number of representative GO terms. GO term descriptions may be truncated in some cases to fit in the chart.

Format: PDF Size: 9MB

Additional file 13:

Significance of the over- or under-representation of PrD predictions according to Gene Ontology Cellular Component classifications. We tested the significance of the number of predictions found in all taxa according to the cellular component classes to which the proteins bearing putative PrDs belong. We compared the abundance of predictions in a given class with the expected frequency obtained by randomly selecting a set of the same size from the proteomes over 10^6 randomizations. For each taxon we show the z-score for a number of representative GO terms. GO term descriptions may be truncated in some cases to fit in the chart.

Format: PDF Size: 6.1MB
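
The randomization test described for Additional files 11 to 13 can be illustrated with a short sketch. This is not the authors' code: the function name, the boolean encoding of GO annotations, and the default number of randomizations (far below the 10^6 used in the paper, to keep the example fast) are all assumptions made for illustration. The idea is to draw random protein sets of the same size as the prediction set, count how many fall in a GO class each time, and express the observed count as a z-score against that null distribution.

```python
import random
import statistics


def enrichment_z_score(has_term, n_predicted, observed_count,
                       n_randomizations=2000, seed=0):
    """Z-score of an observed GO-class count against random protein sets.

    has_term: one boolean per protein in the proteome, True when the
        protein is annotated with the GO term of interest (assumed encoding).
    n_predicted: number of proteins bearing putative PrDs.
    observed_count: how many of those predicted proteins carry the term.
    """
    rng = random.Random(seed)
    null_counts = []
    for _ in range(n_randomizations):
        # Draw a random set of the same size as the prediction set.
        sample = rng.sample(has_term, n_predicted)
        null_counts.append(sum(sample))
    mean = statistics.fmean(null_counts)
    sd = statistics.pstdev(null_counts)
    if sd == 0:
        raise ValueError("null distribution has zero variance")
    # Positive values suggest over-representation, negative under-representation.
    return (observed_count - mean) / sd
```

A positive z-score suggests that the GO class is over-represented among the predictions and a negative one that it is under-represented; the threshold used to call significance is a separate choice not covered by this sketch.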

Additional file 14:

Sequences of the prion-forming domains and PrD cores predicted using a hidden Markov model (HMM). These proteins were predicted using the HMM reported by Alberti et al. [38] and were then studied experimentally to test their aggregation propensity and prionogenicity. The upper part of the table lists the 29 proteins and the corresponding prion domains (PrDs) that we used as the training set for obtaining the amino acid propensities in prion domains. The second part lists the 18 proteins that were negative in all four experimental tests and were therefore used as the negative dataset for estimating the predictive performance of our methodology.

Format: DOC Size: 84KB
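
As an illustration of how amino acid propensities might be derived from a training set such as the one in Additional file 14, the sketch below computes per-residue log-odds scores from the frequency of each amino acid in a set of PrD sequences relative to a background set. This is a generic formulation, not necessarily the one used in the paper: the function names, the pseudocount, and the choice of base-2 logarithms are assumptions.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def frequencies(sequences, pseudocount=1.0):
    """Relative frequency of each standard amino acid in a set of sequences.

    The pseudocount (an assumption of this sketch) avoids zero frequencies
    for residues that never occur in a small training set.
    """
    counts = Counter()
    for seq in sequences:
        counts.update(aa for aa in seq if aa in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}


def log_odds(prd_sequences, background_sequences):
    """Log2 ratio of PrD frequency to background frequency per residue."""
    fg = frequencies(prd_sequences)
    bg = frequencies(background_sequences)
    return {aa: math.log2(fg[aa] / bg[aa]) for aa in AMINO_ACIDS}
```

Residues enriched in the PrD set (Q and N would be expected for Q/N-rich domains) receive positive scores, and depleted residues receive negative ones.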

Additional file 15:

Perl script (prion_parse_proteome.pl) used to predict prionogenic domains in the complete proteomes of organisms. This ad hoc script comes with a man page (run [./prion_parse_proteome.pl --man] in a UNIX/Linux console) that explains its functionality, the parameters needed to run it in a Linux environment, and the required library dependencies. It is designed to read proteomes in Swiss-Prot format and to run in a multicore environment to speed up prediction on large protein sequence sets such as those distributed by UniProt.

Format: PL Size: 13KB