This article is part of the supplement: SNP-SIG 2011: Identification and annotation of SNPs in the context of structure, function and disease

Open Access Proceedings

Prioritization of pathogenic mutations in the protein kinase superfamily

Jose MG Izarzugaza*, Angela del Pozo, Miguel Vazquez and Alfonso Valencia*

Author Affiliations

Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain

BMC Genomics 2012, 13(Suppl 4):S3 doi:10.1186/1471-2164-13-S4-S3

Published: 18 June 2012

Additional files

Additional file 1:

Supplementary figure 1.png. Grid optimization of the predictive power of the classifier (all groups): F-score. We exhaustively tested the two most critical parameters of the SVM’s radial basis kernel: soft-margin (C) and radius (γ). The average F-score across the entire set of k-folds was chosen as the scoring function for the optimization. The optimal values used for the analyses were C = 3 and γ = 6 · 10⁻⁴ when all groups in the kinase superfamily were considered.
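The optimization described in this caption can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes binary labels (1 = disease-associated, 0 = neutral), takes the F-score to be the balanced F1 measure, and relies on a hypothetical callback `fit_predict` that stands in for training the SVM on one fold and predicting its held-out part.

```python
from itertools import product
from statistics import mean


def f_score(y_true, y_pred):
    """Balanced F-score (F1) for binary labels; 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are both zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def grid_search(c_values, gamma_values, folds, fit_predict):
    """Exhaustively score every (C, gamma) pair and return the best one.

    fit_predict(C, gamma, fold) is a hypothetical callback that returns
    (y_true, y_pred) for the held-out part of the given fold.
    The result is a tuple (mean_f_score, C, gamma).
    """
    best = None
    for c, gamma in product(c_values, gamma_values):
        # Scoring function: average F-score over all folds.
        score = mean(f_score(*fit_predict(c, gamma, fold)) for fold in folds)
        if best is None or score > best[0]:
            best = (score, c, gamma)
    return best
```

Ties between grid points are resolved in favour of the pair visited first; a real analysis might prefer the smaller C or inspect the full score surface, as the supplementary figure does.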

Format: PNG Size: 5KB

Open Data

Additional file 2:

Supplementary figure 2.png. Grid optimization of the predictive power of the classifier (all groups): AUC. We exhaustively tested the two most critical parameters of the SVM’s radial basis kernel: soft-margin (C) and radius (γ). The average area under the curve (AUC) across the entire set of k-folds was chosen as the scoring function for the optimization. The optimal values correspond to C = 2 and γ = 6 · 10⁻⁴.
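For reference, the AUC used as the scoring function here can be computed directly from classifier scores as the probability that a randomly chosen disease-associated mutation is scored above a randomly chosen neutral one, with ties counting one half. The sketch below is illustrative only; the function name and the label encoding (1 = disease-associated, 0 = neutral) are assumptions, not taken from the article.

```python
def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    y_true holds binary labels (1 = positive class); scores holds the
    classifier's decision values, where higher means "more positive".
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    # Count positive/negative pairs ranked correctly; a tie counts as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A predictor that assigns scores at random has an expected AUC of 0.5 under this definition, and a perfect ranking gives 1.0. The pairwise loop is quadratic in the number of examples, which is acceptable for small datasets; a rank-based formulation would scale better.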

Format: PNG Size: 5KB

Additional file 3:

Supplementary figure 3.png. Grid optimization of the predictive power of the classifier (populated groups): F-score. Grid optimization of the predictive power of the classifier when only the groups with a reasonable number of reported disease-associated mutations are considered. We exhaustively tested soft-margin (C) and radius (γ). The average F-score across the entire set of k-folds was chosen as the scoring function for the optimization. The optimal values used during the analyses were C = 8 and γ = 10⁻⁴.

Format: PNG Size: 8KB

Additional file 4:

Supplementary tables.pdf. Supplementary Table 1: Ranking of the features according to their contribution to classification. Supplementary Table 2: Most representative GO terms to classify kinase genes as neutral. Supplementary Table 3: Most representative GO terms to classify kinase genes as disease-associated.

Format: PDF Size: 69KB

Additional file 5:

Supplementary figure 5.png. Benchmark of the classifiers with a common kinase dataset. Evaluation of the prediction capabilities of four genome-wide classifiers (SNPs&GO, MutationAssessor, SIFT and SNAP) in comparison to KinMut. All predictors were evaluated with the same kinase dataset. Predictions from SNPs&GO and MutationAssessor were obtained through their respective online servers, while SIFT and SNAP predictions were retrieved from SNPdbe [42]. The dashed line represents the theoretical random predictor.

Format: PNG Size: 263KB

Additional file 6:

Variants.disease.txt. Dataset of disease-associated mutations used to train and to evaluate the predictor.

Format: TXT Size: 12KB

Additional file 7:

Variants.neutral.txt. Dataset of neutral mutations used to train and to evaluate the predictor.

Format: TXT Size: 36KB