Table 2

Description of the 20 disease expression data sets
| Data set | Samples | Features | Reference | Data set ID | Binary outcome |
|---|---|---|---|---|---|
| adenocarcinoma | 76 | 9868 | [32] | NA | most prevalent class vs others |
| brain | 42 | 5597 | [33] | NA | most prevalent class vs others |
| breast2 | 77 | 4869 | [34] | NA | most prevalent class vs others |
| breast3 | 95 | 4869 | [34] | NA | most prevalent class vs others |
| colon | 62 | 2000 | [35] | NA | most prevalent class vs others |
| leukemia | 38 | 3051 | [36] | NA | most prevalent class vs others |
| lymphoma | 62 | 4026 | [37] | NA | most prevalent class vs others |
| NCI60 | 61 | 5244 | [38] | NA | most prevalent class vs others |
| prostate | 102 | 6033 | [39] | NA | most prevalent class vs others |
| srbct | 63 | 2308 | [40] | NA | most prevalent class vs others |
| BrainTumor2 | 50 | 10367 | [41] | NA | Anaplastic oligodendrogliomas vs Glioblastomas |
| DLBCL | 77 | 5469 | [42] | NA | follicular lymphoma vs diffuse large B-cell lymphoma |
| lung1 | 58 | 10000 | [43] | GSE10245 | Adenocarcinoma vs Squamous cell carcinoma |
| lung2 | 46 | 10000 | [44] | GSE18842 | Adenocarcinoma vs Squamous cell carcinoma |
| lung3 | 71 | 10000 | [45] | GSE2109 | Adenocarcinoma vs Squamous cell carcinoma |
| psoriasis1 | 180 | 10000 | [46,47] | GSE13355 | lesional vs healthy skin |
| psoriasis2 | 82 | 10000 | [48] | GSE14905 | lesional vs healthy skin |
| MSstage | 26 | 10000 | [49] | E-MTAB-69 | relapsing vs remitting RRMS |
| MSdiagnosis1 | 27 | 10000 | [50] | GSE21942 | RRMS vs healthy control |
| MSdiagnosis2 | 44 | 10000 | [49] | E-MTAB-69 | RRMS vs healthy control |

Sample size, number of features, original reference, data set ID, and binary outcome for each of the 20 disease-related gene expression data sets.
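
For the data sets whose binary outcome is listed as "most prevalent class vs others", the dichotomization could be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the function name `binarize_most_prevalent`, the 1/0 coding, and the example labels are hypothetical, and the table does not say how ties between equally frequent classes were handled (here the choice is left to whatever `Counter.most_common` returns).

```python
from collections import Counter


def binarize_most_prevalent(labels):
    """Return 1 for samples in the most frequent class and 0 for all others."""
    if not labels:
        raise ValueError("labels must not be empty")
    # most_common(1) returns a one-element list [(label, count)].
    # Tie-breaking is left to Counter and is an assumption of this sketch.
    majority_label, _ = Counter(labels).most_common(1)[0]
    return [1 if label == majority_label else 0 for label in labels]


# Hypothetical class labels, not taken from any data set in the table.
example_labels = ["A", "B", "A", "C", "A", "B"]
binary_outcome = binarize_most_prevalent(example_labels)
print(binary_outcome)
```

In this toy example class "A" is the most prevalent, so its samples are coded 1 and the samples of classes "B" and "C" are pooled and coded 0.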

Song et al. BMC Bioinformatics 2013 14:5   doi:10.1186/1471-2105-14-5