Table 2 

Performance of a selection of drug-disease similarity scores.

| Scoring Method | Direct Connection Validation AUC | CTD Validation AUC | PREDICT Validation AUC |
|---|---|---|---|
| Corrected drug-disease p-value | 0.65 | 0.76 | 0.66 |
| Cosine distance tf-idf | 0.88 | 0.91 | **0.87** |
| Cosine distance of p-values | 0.64 | 0.70 | 0.52 |
| Cosine distance of term fractions | 0.78 | 0.83 | 0.80 |
| Sum of the log of combined p-values | 0.92 | **0.93** | 0.80 |
| Sum of the differences of log p-values | 0.89 | 0.86 | 0.58 |
| L2 of log p of intersecting terms | **0.95** | 0.92 | 0.66 |
| L2 of term fractions of intersecting terms only | 0.64 | 0.55 | 0.57 |
| L2 of log of p-values | 0.88 | 0.84 | 0.57 |
| L2 of p-values | 0.87 | 0.82 | 0.56 |
| L2 of term fractions P(s < S) | 0.85 | 0.90 | 0.78 |
| L2 of term frequency | 0.87 | 0.83 | 0.62 |
| Total number of terms | 0.90 | 0.87 | 0.62 |
| Number of Intersecting Terms | 0.91 | 0.91 | 0.63 |
| Number of Drug Terms | 0.80 | 0.83 | 0.58 |
| Number of Disease Terms | 0.84 | 0.83 | 0.60 |


Performance was validated using novel direct drug-disease co-occurrences from MEDLINE and novel drug-disease relationships from the CTD. Top scores for each validation set are presented in boldface type.
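
Each AUC value in the table summarizes how well a similarity score ranks validated drug-disease pairs above unvalidated ones. The following is a minimal sketch of that calculation, not the authors' code: the function name `auc` and the toy scores and labels are hypothetical, and it assumes that a higher score means a more likely relationship (distance-based scores would need to be negated first).

```python
def auc(scores, labels):
    """Area under the ROC curve, computed as the probability that a
    randomly chosen validated pair outscores a randomly chosen
    unvalidated pair (ties count as half a win)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one validated and one unvalidated pair")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: one similarity score per drug-disease pair,
# with label 1 if the pair appears in the validation set and 0 otherwise.
scores = [0.9, 0.8, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 0, 0]
toy_auc = auc(scores, labels)
print(round(toy_auc, 2))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the values in the table can be read.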

Cheung et al. BMC Medical Genomics 2013, 6(Suppl 2):S3. doi:10.1186/1755-8794-6-S2-S3