Table 5

Overlap between top genes and gene sets for different classifiers

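| Classifier | Rank | MSigDB set | p-value | Matches | Set size |
|---|---|---|---|---|---|
| CC | 1 | GNF2_MKI67 | < 1.00 × 10^-40 | 31 | 47 |
| CC | 2 | GNF2_TTK | < 1.00 × 10^-40 | 29 | 57 |
| CC | 3 | GNF2_CCNA2 | < 1.00 × 10^-40 | 48 | 99 |
| CC | 4 | GNF2_HMMR | < 1.00 × 10^-40 | 42 | 78 |
| CC | 5 | GNF2_SMC2L1 | < 1.00 × 10^-40 | 26 | 51 |
| CC | 6 | GNF2_CDC20 | < 1.00 × 10^-40 | 46 | 91 |
| CC | 7 | GNF2_ESPL1 | < 1.00 × 10^-40 | 27 | 58 |
| CC | 8 | GNF2_H2AFX | < 1.00 × 10^-40 | 24 | 54 |
| CC | 9 | GNF2_RRM2 | < 1.00 × 10^-40 | 32 | 68 |
| CC | 10 | chr1q11 | 2.32 × 10^-6 | 2 | 4 |
| SVM | 1 | chr7q12 | 6.23 × 10^-4 | 1 | 1 |
| SVM | 2 | chr3q11 | 1.00 | 0 | 8 |
| SVM | 3 | chrxq | 1.00 | 0 | 2 |
| SVM | 4 | BYSTRYKH_RUNX1_TARGETS_GLOCUS | 8.06 × 10^-3 | 1 | 13 |
| SVM | 5 | TESTIS_EXPRESSED_GENES | 7.28 × 10^-7 | 4 | 107 |
| SVM | 6 | chr22q | 1.00 | 0 | 6 |
| SVM | 7 | REGULATION_OF_G_PROTEIN_COUPLED_RECEPTOR_PROTEIN_SIGNALING_PATHWAY | 4.28 × 10^-4 | 2 | 48 |
| SVM | 8 | chr11p14 | 1.00 | 0 | 20 |
| SVM | 9 | TERCPATHWAY | 1.00 | 0 | 15 |
| SVM | 10 | chr1q41 | 2.02 × 10^-4 | 2 | 33 |
| LR | 1 | chr5q11 | 1.00 | 0 | 8 |
| LR | 2 | chr22q | 1.00 | 0 | 6 |
| LR | 3 | TERCPATHWAY | 1.00 | 0 | 15 |
| LR | 4 | chrxq | 1.00 | 0 | 2 |
| LR | 5 | BYSTRYKH_RUNX1_TARGETS_GLOCUS | 8.06 × 10^-3 | 1 | 13 |
| LR | 6 | HSA00130_UBIQUINONE_BIOSYNTHESIS | 1.00 | 0 | 8 |
| LR | 7 | chr20p | 1.00 | 0 | 2 |
| LR | 8 | chr1q41 | 1.29 × 10^-6 | 3 | 33 |
| LR | 9 | chr3q12 | 1.00 | 0 | 23 |
| LR | 10 | BETA_TUBULIN_BINDING | 1.00 | 0 | 12 |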


Top 10 gene sets for each classifier, ranked using the set centroid statistic, with the p-value for the number of top genes belonging to each set (one-sided Fisher's exact test). "Matches" is the number of top genes found in the set. CC is centroid classifier, SVM is support vector machine, LR is logistic regression.
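As an illustration of the one-sided Fisher's exact test named in the caption, the following is a minimal sketch that computes the upper-tail hypergeometric probability of seeing at least the observed number of matches. The function name `overlap_p_value` and the values for the total number of genes and the number of top genes are hypothetical; this table does not report them, so the printed values are not expected to reproduce the p-values above.

```python
from math import comb


def overlap_p_value(n_genes, n_top, set_size, matches):
    """One-sided Fisher's exact test for the overlap of two gene lists.

    Returns P(overlap >= matches) when `set_size` genes are drawn at random
    from `n_genes` genes, of which `n_top` are "top" genes.
    """
    upper = min(n_top, set_size)  # largest overlap that is possible at all
    # Count the draws with i top genes, for every i from `matches` upward.
    favourable = sum(
        comb(n_top, i) * comb(n_genes - n_top, set_size - i)
        for i in range(max(matches, 0), upper + 1)
    )
    # Exact integer counts; a single division at the end.
    return favourable / comb(n_genes, set_size)


# Hypothetical background sizes, chosen only for illustration.
N_GENES = 20000  # assumed total number of genes
N_TOP = 100      # assumed number of top genes

for set_size, matches in [(8, 0), (4, 2), (33, 3)]:
    p = overlap_p_value(N_GENES, N_TOP, set_size, matches)
    print(f"set size {set_size:3d}, matches {matches}: p = {p:.3g}")
```

A set with zero matches always has p = 1 under this test, since every possible overlap is at least zero. This is consistent with the rows in the table that report 0 matches and a p-value of 1.00.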

Abraham et al. BMC Bioinformatics 2010 11:277   doi:10.1186/1471-2105-11-277
