Table 1 

Explanation of the scoring functions evaluated. 

Scoring Method | Description
Cosine Distance of Term Frequency-Inverse Document Frequency |
Cosine Distance of p-values |
Cosine Distance of term fractions |
Sum of the log of combined p-values |
Sum of the differences of log p-values |
L_{2} of log p of overlapping terms only |
L_{2} of term fractions of overlapping terms only |
L_{2} of log of p-values |
L_{2} of p-values |
L_{2} of term fractions |
L_{2} of term frequency |
Term Coverage |
Term Overlap |
Number of Drug MeSH Terms |
Number of Disease MeSH Terms |

M refers to the set of all MeSH terms; C and D refer to the MeSH terms for the drug and disease profile, respectively. c(i), c_{f}(i), c_{p}(i) and c_{i}(i) refer to the frequency, term fraction, hypergeometric p-value and term frequency-inverse document frequency for the MeSH term i of the drug profile. d(i), d_{f}(i), d_{p}(i) and d_{i}(i) refer to the frequency, term fraction, hypergeometric p-value and term frequency-inverse document frequency for the MeSH term i of the disease profile.
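
As a rough illustration of how a few of the listed scores could be computed from two profiles, the sketch below assumes textbook definitions: cosine distance as one minus cosine similarity, L_{2} as Euclidean distance, term overlap as the size of the intersection of C and D, and term fraction as a term's frequency divided by the profile's total frequency. The function names and the example MeSH profiles are hypothetical, and the exact definitions used in the paper may differ.

```python
import math


def term_fractions(freq):
    """Convert raw term frequencies c(i) into fractions (assumed: count / total)."""
    total = sum(freq.values())
    return {term: count / total for term, count in freq.items()}


def cosine_distance(c, d):
    """Assumed definition: 1 - cosine similarity; terms missing from a profile count as 0."""
    dot = sum(c.get(t, 0.0) * d.get(t, 0.0) for t in set(c) | set(d))
    norm_c = math.sqrt(sum(v * v for v in c.values()))
    norm_d = math.sqrt(sum(v * v for v in d.values()))
    if norm_c == 0.0 or norm_d == 0.0:
        return 1.0  # convention chosen here for an empty profile
    return 1.0 - dot / (norm_c * norm_d)


def l2_distance(c, d, overlapping_only=False):
    """Euclidean distance over all terms, or only over terms present in both profiles."""
    terms = (set(c) & set(d)) if overlapping_only else (set(c) | set(d))
    return math.sqrt(sum((c.get(t, 0.0) - d.get(t, 0.0)) ** 2 for t in terms))


def term_overlap(c, d):
    """Assumed definition: number of MeSH terms shared by both profiles."""
    return len(set(c) & set(d))


# Hypothetical profiles: MeSH term -> frequency
drug = {"Neoplasms": 3, "Apoptosis": 1}
disease = {"Neoplasms": 1, "Inflammation": 1}

drug_f = term_fractions(drug)
disease_f = term_fractions(disease)

print("term overlap:", term_overlap(drug, disease))
print("cosine distance (term fractions):", cosine_distance(drug_f, disease_f))
print("L2 (term fractions):", l2_distance(drug_f, disease_f))
print("L2 (overlapping terms only):", l2_distance(drug_f, disease_f, overlapping_only=True))
```

Profiles are stored as sparse dictionaries, so a term absent from one profile contributes a value of zero to the all-terms scores and is skipped by the overlapping-terms variants.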

Cheung et al. BMC Medical Genomics 2013 6(Suppl 2):S3 doi:10.1186/1755-8794-6-S2-S3