Table 1 

Problem-specific Datasets.

| Problem | Source | Type | #C | #Seq | #Res | #CV | % |
|---|---|---|---|---|---|---|---|
| Disorder Prediction | DisPro [7] | Binary | 2 | 723 | 215612 | 10 | 30 |
| Protein-DNA Site | DISIS [6] | Binary | 2 | 693 | 127240 | 3 | 20 |
| Residue-wise Contact | SVM [15] | Regression | ∞ | 680 | 120421 | 15 | 40 |
| Local Structure | Profnet [35] | Multiclass | 16 | 1600 | 286238 | 3 | 40 |


#C, #Seq, #Res, #CV, and % denote the number of classes, the number of sequences, the number of residues, the number of cross-validation folds, and the maximum pairwise sequence identity (in percent) between the sequences, respectively. ∞ in the #C column denotes the regression problem.

Rangwala et al. BMC Bioinformatics 2009 10:439 doi:10.1186/1471-2105-10-439