Table 1

Comparison of R-SVM and SVM-RFE on Data-G (with gene outliers)

Level^a  ReduceSV^b  P(sv-diff)^c  ReduceTest^d  P(test-diff)^e  ImproveRec^f  P(rec-diff)^g
800      4.01%       1.81E-42      -7.70%        4.72E-03        -3.90%        1.71E-39
600      5.77%       1.74E-49      -2.50%        4.64E-01        -1.70%        5.21E-15
500      6.83%       2.75E-51      -4.00%        1.62E-01        -0.30%        0.079189
400      8.35%       3.26E-60      2.80%         3.48E-01        1.10%         4.48E-06
300      9.33%       3.83E-58      7.40%         3.65E-02        3.70%         1.77E-31
200      8.22%       1.28E-48      19.20%        6.36E-09        6.30%         5.79E-44
150      8.55%       1.51E-53      19.50%        1.16E-08        7.10%         9.76E-46
100      4.97%       6.20E-22      11.90%        1.83E-04        6.00%         6.43E-40
90       5.84%       1.66E-27      13.70%        4.20E-06        4.60%         1.07E-30
80       5.17%       8.20E-29      12.40%        4.14E-06        4.50%         7.12E-29
70       4.14%       1.46E-27      8.50%         4.77E-04        3.80%         1.05E-24
60       3.10%       1.23E-20      10.20%        3.14E-05        3.40%         4.99E-24
50       2.27%       2.01E-15      10.20%        4.11E-06        2.90%         2.37E-21


a Level: The number of features selected in each recursive step. With all 1000 features there is no difference between R-SVM and SVM-RFE, because no feature selection has taken place.

b ReduceSV: Relative reduction in the mean number of support vectors (SVs) used by R-SVM compared with SVM-RFE, calculated as (average #SV_{SVM-RFE} - average #SV_{R-SVM}) / (average #SV_{SVM-RFE}).

c P(sv-diff): The p-value of the observed difference in numbers of SVs, by paired t-test.

d ReduceTest: Relative reduction in the mean test error rate of SVM models built with R-SVM-selected features compared with models built with SVM-RFE-selected features, calculated as (average TestError_{SVM-RFE} - average TestError_{R-SVM}) / (average TestError_{SVM-RFE}).

e P(test-diff): The p-value of the observed difference in test error rates, by paired t-test.

f ImproveRec: Relative improvement in the proportion of informative genes recovered by R-SVM over SVM-RFE, calculated as (average #REC_{R-SVM} - average #REC_{SVM-RFE}) / (average #REC_{SVM-RFE}), where #REC is the number of true informative genes recovered by the method named in the subscript.

g P(rec-diff): The p-value of the observed difference in the proportion of recovered true informative genes, by paired t-test.
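
The relative measures and the paired comparison defined in the footnotes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the per-run numbers below are hypothetical and are not taken from the paper. The sketch computes the paired t statistic only; turning it into a p-value requires the Student t distribution with n - 1 degrees of freedom, which a statistics library such as SciPy (`scipy.stats.ttest_rel`) would supply.

```python
from math import sqrt
from statistics import mean, stdev


def relative_reduction(baseline, candidate):
    """(mean(baseline) - mean(candidate)) / mean(baseline), as in ReduceSV and ReduceTest."""
    base_mean = mean(baseline)
    return (base_mean - mean(candidate)) / base_mean


def relative_improvement(baseline, candidate):
    """(mean(candidate) - mean(baseline)) / mean(baseline), as in ImproveRec."""
    base_mean = mean(baseline)
    return (mean(candidate) - base_mean) / base_mean


def paired_t_statistic(x, y):
    """t statistic of a paired t-test on the per-run differences x[i] - y[i]."""
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(x, y)]
    # Mean difference divided by its standard error (sample standard deviation / sqrt(n)).
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Hypothetical support-vector counts over five paired runs (illustrative only).
sv_svm_rfe = [50, 52, 48, 51, 49]
sv_r_svm = [46, 49, 45, 47, 46]

reduce_sv = relative_reduction(sv_svm_rfe, sv_r_svm)
t_sv = paired_t_statistic(sv_svm_rfe, sv_r_svm)
print(f"ReduceSV = {reduce_sv:.2%}, paired t = {t_sv:.2f}")
```

A positive ReduceSV or ReduceTest means R-SVM used fewer support vectors or made fewer test errors than SVM-RFE, and a positive ImproveRec means it recovered more informative genes, which matches the sign convention of the table.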

Zhang et al. BMC Bioinformatics 2006, 7:197. doi:10.1186/1471-2105-7-197
