Table 2

Classification statistics
EPPIC (based on UniProt 2012_10)
                          # entries  Geometry      Entropy core-rim        Entropy core-surface    Combined
Dataset                   Bio  Xtal  Sens.  Spec.  Sens.       Spec.       Sens.       Spec.       Acc.   Sens.  Spec.  MCC
DC (optimization)         83   82    0.80   0.73   0.82(68)    0.66(64)    0.87(69)    0.76(67)    0.81   0.88   0.73   0.62
Ponstingl (benchmarking)  88   52    0.85   0.92   0.84(76)    0.66(29)    0.85(75)    0.79(29)    0.89   0.90   0.87   0.76
Bahadur (benchmarking)    121  185   0.88   0.88   0.82(103)   0.64(114)   0.86(104)   0.77(114)   0.86   0.89   0.84   0.72

PISA
Dataset                   Acc.   Sens.  Spec.  MCC
DC (optimization)         0.79   0.95   0.63   0.62
Ponstingl (benchmarking)  0.84   0.89   0.77   0.66
Bahadur (benchmarking)    0.77   0.89   0.69   0.57

Classification statistics for our own compiled datasets ("DC"), composed of DCxtal (crystal interfaces) and DCbio (biological interfaces), for the Ponstingl 2003 dataset of monomers (crystal interfaces) and dimers (biological interfaces), and for the Bahadur datasets (monomers and dimers). We first present the statistics for each of our indicators separately, followed by the statistics for the combined predictor. PISA statistics compiled by us are shown in a separate table. Statistics are given in terms of sensitivity (the rate of correct biological interface predictions) and specificity (the rate of correct crystal interface predictions). The statistics for the evolutionary methods are based only on the interfaces that could be predicted (those with enough homologs and enough core/rim/surface residues); the number of such interfaces is given in parentheses after the corresponding sensitivity or specificity. In addition to accuracy values, we present the Matthews correlation coefficient (MCC), which gives a better assessment of the predictions when the positive and negative sets are unbalanced (as is the case with the Ponstingl sets). All EPPIC evolutionary predictions are based on UniProt release 2012_10.
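
The four statistics named in the caption can be sketched from confusion-matrix counts as follows. This is a minimal illustration using the standard textbook definitions, with biological interfaces taken as the positive class and crystal interfaces as the negative class; the function name `classification_stats` and the counts in the usage example are hypothetical and are not taken from the paper.

```python
import math


def classification_stats(tp, fn, tn, fp):
    """Compute sensitivity, specificity, accuracy and MCC.

    Assumed convention (illustrative only):
      tp: biological interfaces predicted as biological
      fn: biological interfaces predicted as crystal
      tn: crystal interfaces predicted as crystal
      fp: crystal interfaces predicted as biological
    Both classes are assumed to be non-empty.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    # MCC uses all four cells, so it stays informative on unbalanced sets.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical, deliberately unbalanced counts (100 positives, 50 negatives).
stats = classification_stats(tp=90, fn=10, tn=40, fp=10)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

With these made-up counts the positive set is twice the size of the negative set, so accuracy is dominated by the positive class, while the MCC weighs both classes.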

Duarte et al. BMC Bioinformatics 2012 13:334   doi:10.1186/1471-2105-13-334
