Open Access Highly Accessed Research article

Predicting PDZ domain mediated protein interactions from structure

Shirley Hui1,2, Xiang Xing1,2 and Gary D Bader1,2,3*

Author Affiliations

1 The Donnelly Centre, University of Toronto, Toronto, ON, Canada

2 Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada

3 Department of Computer Science, University of Toronto, Toronto, ON, Canada

BMC Bioinformatics 2013, 14:27 doi:10.1186/1471-2105-14-27

Published: 21 January 2013

Additional files

Additional file 1:

Supplementary Information.

Format: PDF Size: 1.8MB

Additional file 2:

Table S1. Training domain structure information. In total, 83 PDZ domains were used for training. Domain structures were obtained from the PDB or homology modelled through the Protein Model Portal. For NMR structures, only the first model was used. All homology models were generated by SWISS-MODEL and have greater than 50% sequence similarity to their template structure (average 90%). Model quality is estimated using template sequence ID (the percentage of residues that are identical between the target and template sequences) and QMEAN score (a scoring function that measures multiple geometrical aspects of protein structure, ranging from 0 to 1, with higher values indicating more reliable models).

Table S2. Blind test domain structure information. Blind testing was performed using interaction data from mouse, worm and fly protein microarray experiments. In total, 13 mouse orphan, 7 worm and 6 fly PDZ domains were used. Homology models were generated by SWISS-MODEL. All models have at least 40% sequence identity to their template structures. An NMR structure was available for one fly domain and the first model was used. The average template sequence similarity was 0.92, 0.61 and 0.61 for mouse, worm and fly domains, respectively. One mouse domain (CHAPSYN-110-1) was removed from the test set because its performance was consistently poor for both predictors. Model quality is estimated using template sequence ID and QMEAN score (see Table S1 caption).

Table S3. Human proteome scanning domain structure information. Proteome scanning was performed for 218 human PDZ domains, which have known interactions in iRefIndex. In total, 61 X-ray and nine NMR structures (only the first models used) were obtained from the PDB and 148 homology models were created (template sequence similarity minimum 22%, average 72%). Model quality is estimated using template sequence ID and QMEAN score (see Table S1 caption).

Table S4. Validation of structure-based predictions against known human PDZ domain-peptide interactions. Proteome scanning predictions for 45 human PDZ domains were validated against known PDZ domain-peptide interactions in PDZBase. Several statistics were calculated, including: # Positives, # TP (total number of true positives), # Predicted Structure (number of predictions predicted only by the structure-based predictor), # Predicted Sequence (number of predictions predicted only by the sequence-based predictor), # Predicted Both (number of predictions predicted by both), # TP Structure (number of true positives predicted by the structure-based predictor only), # TP Sequence (number of true positives predicted by the sequence-based predictor only), # TP Both (number of true positives predicted by both).

Table S5. Validation of structure-based predictions against known negative PDZ domain-peptide interactions for human.
a. Negatives involving peptides with PDZ binding motifs. Proteome scanning predictions for 74 human PDZ domains were validated against experimentally determined negative interactions involving peptides with PDZ binding motifs (found from the literature) for a total of 410 interactions.
b. Negatives involving peptides with non-binding PDZ motifs. Proteome scanning predictions for 24 human PDZ domains were validated against known negative interactions involving mutated peptides with non-binding PDZ motifs (found from the literature) for a total of 126 interactions.

Table S6. Validation of structure-based predictions against known experimentally determined PDZ domain-peptide interactions for worm. Proteome scanning was performed for six worm PDZ domains with interactions from protein microarray experiments. Several statistics were calculated, including the ones from Table S4 as well as the following: # Negatives, # FP Structure (number of false positives predicted by the structure-based predictor only), # FP Sequence (number of false positives predicted by the sequence-based predictor only), # FP Both (number of false positives predicted by both).

Table S7. Validation of structure-based predictions against known experimentally determined PDZ domain-peptide interactions for fly. Proteome scanning was performed for seven fly PDZ domains with interactions from protein microarray experiments. Several statistics were calculated (see Table S6 caption).

Table S8. Validation of structure-based predictions against known protein-protein interactions. Proteome scanning results for 221 human PDZ domains with both structure-based and sequence-based predictions were validated against known human PPIs in iRefIndex. A prediction is considered to be a true positive if the domain involved is found in a known PPI where one of the proteins contains the domain. See Table S4 caption for details about the calculated statistics.

Table S9. Structure-based predicted PDZ domain interactors according to functional theme. These tables contain domains, their structure-based predicted interactors and the enriched functional theme (i.e. clusters in the Enrichment Map).

Table S10. Sequence-based predicted PDZ domain interactors according to functional theme. These tables contain domains, their sequence-based predicted interactors and the enriched functional theme (i.e. clusters in the Enrichment Map).
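
The per-domain statistics named in the Table S4 and Table S6 captions can be illustrated with plain set operations. The following is a minimal sketch and not the authors' code: the function name `validation_counts`, its arguments and the toy peptide labels are hypothetical, and it assumes that "# TP" counts known positives recovered by either predictor.

```python
def validation_counts(structure_pred, sequence_pred, positives, negatives=()):
    """Count predictions and true/false positives per predictor for one domain.

    All arguments are collections of interactor identifiers (hypothetical input
    format). "Only" categories exclude predictions shared by both predictors.
    """
    structure_pred = set(structure_pred)
    sequence_pred = set(sequence_pred)
    positives = set(positives)
    negatives = set(negatives)

    # Partition the predictions into three disjoint groups.
    only_structure = structure_pred - sequence_pred
    only_sequence = sequence_pred - structure_pred
    both = structure_pred & sequence_pred

    return {
        "# Positives": len(positives),
        "# Negatives": len(negatives),
        # Assumption: total true positives = positives found by either predictor.
        "# TP": len((structure_pred | sequence_pred) & positives),
        "# Predicted Structure": len(only_structure),
        "# Predicted Sequence": len(only_sequence),
        "# Predicted Both": len(both),
        "# TP Structure": len(only_structure & positives),
        "# TP Sequence": len(only_sequence & positives),
        "# TP Both": len(both & positives),
        "# FP Structure": len(only_structure & negatives),
        "# FP Sequence": len(only_sequence & negatives),
        "# FP Both": len(both & negatives),
    }


# Toy example with made-up peptide labels.
counts = validation_counts(
    structure_pred={"pepA", "pepB", "pepC"},
    sequence_pred={"pepB", "pepC", "pepD"},
    positives={"pepA", "pepB", "pepE"},
    negatives={"pepC", "pepD"},
)
for name, value in counts.items():
    print(f"{name}: {value}")
```

Because the three prediction groups are disjoint, the true-positive counts for "Structure", "Sequence" and "Both" add up to "# TP" under the assumption stated above.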

Format: XLS Size: 760KB
