Open Access Methodology article

Learning virulent proteins from integrated query networks

Eithon Cadag1*, Peter Tarczy-Hornoch2 and Peter J Myler2,3

Author Affiliations

1 Ayasdi Inc, Palo Alto, CA, USA 94301

2 Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, USA 98195

3 Seattle Biomedical Research Institute, Seattle, WA, USA 98109

BMC Bioinformatics 2012, 13:321  doi:10.1186/1471-2105-13-321

Published: 2 December 2012

Additional files

Additional file 1:

faa — Specific virulence protein sequences. This FASTA file contains the 3700 protein sequences used for training and testing the specific virulence classifiers. Note that the FASTA sequence headers contain only the unique identifiers; labels are located in a separate file.

Format: FAA Size: 1.5MB

Open Data

Additional file 2:

txt – Specific virulence class labels. This tab-delimited file contains two columns. The first column is the unique sequence identifier, which matches a protein sequence in the specific virulence FASTA file; the second column gives the specific virulence label, per Table 2. Non-virulent proteins are assigned class ‘12’ in this file.

Format: TXT Size: 70KB

Open Data
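
Because the sequences (Additional file 1) and their labels (Additional file 2) are distributed separately, they have to be joined on the sequence identifier before use. The following is a minimal sketch of one way to do that in Python, assuming the layout described above: FASTA headers that carry only the identifier, a two-column tab-delimited label file, and class ‘12’ for non-virulent proteins. The function names, the in-memory sample records, and their identifiers and labels are hypothetical placeholders for illustration, not contents of the actual files.

```python
import csv
import io

# Class assigned to non-virulent proteins, per the label file description.
NON_VIRULENT_CLASS = "12"


def read_fasta(handle):
    """Return {identifier: sequence} from a FASTA stream.

    Assumes each header line holds only the unique identifier.
    """
    parts_by_id = {}
    current_id = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            current_id = fields[0] if fields else None
            if current_id is not None:
                parts_by_id[current_id] = []
        elif current_id is not None:
            parts_by_id[current_id].append(line)
    return {seq_id: "".join(parts) for seq_id, parts in parts_by_id.items()}


def read_labels(handle):
    """Return {identifier: class label} from a two-column tab-delimited stream."""
    labels = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) >= 2:
            labels[row[0]] = row[1]
    return labels


def join_records(sequences, labels):
    """Pair each labeled sequence with its class and a virulent/non-virulent flag."""
    return [
        (seq_id, labels[seq_id], seq, labels[seq_id] != NON_VIRULENT_CLASS)
        for seq_id, seq in sequences.items()
        if seq_id in labels
    ]


# Made-up sample data standing in for the two additional files.
fasta_text = ">seq1\nMKT\nAYI\n>seq2\nMLL\n"
labels_text = "seq1\t3\nseq2\t12\n"

sequences = read_fasta(io.StringIO(fasta_text))
labels = read_labels(io.StringIO(labels_text))
records = join_records(sequences, labels)
```

To use this with the downloaded files, the `io.StringIO` objects would be replaced by open file handles; sequences without a matching label are silently skipped in this sketch, which may or may not be the desired behavior.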

Additional file 3:

pdf — Statistical significance outcomes of method and source comparisons. A PDF containing tabular results of statistical significance testing from six five-fold cross-validations of integrated data against each other and baselines. Data from these tables was used to construct the comparison networks in Figure 3.

Format: PDF Size: 87KB

Open Data