Table 2

Prediction performance of SVM models built from combinations of three kinds of regulatory features: over-represented hexamer nucleotides (OR), nucleotide composition (NC), and DNA stability (DS). Performance was evaluated by cross-validation using a window from -200 to +100 relative to the TSS (+1).

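| Training set | Window size | Features | Precision | Sensitivity | Specificity | Accuracy |
|---|---|---|---|---|---|---|
| All (6,452) | -200 to +100 | OR+NC | 77% | 71% | 79% | 75% |
| All (6,452) | -200 to +100 | OR+DS | 76% | 69% | 78% | 74% |
| All (6,452) | -200 to +100 | NC+DS | 75% | 74% | 76% | 75% |
| All (6,452) | -200 to +100 | OR+NC+DS | 79% | 76% | 79% | 78% |
| With CpG (4,898) | -200 to +100 | OR+NC | 79% | 81% | 79% | 80% |
| With CpG (4,898) | -200 to +100 | OR+DS | 77% | 80% | 76% | 78% |
| With CpG (4,898) | -200 to +100 | NC+DS | 77% | 82% | 75% | 78% |
| With CpG (4,898) | -200 to +100 | OR+NC+DS | 80% | 84% | 79% | 82% |
| Without CpG (1,554) | -200 to +100 | OR+NC | 68% | 70% | 67% | 68% |
| Without CpG (1,554) | -200 to +100 | OR+DS | 68% | 71% | 66% | 68% |
| Without CpG (1,554) | -200 to +100 | NC+DS | 66% | 67% | 66% | 66% |
| Without CpG (1,554) | -200 to +100 | OR+NC+DS | 69% | 69% | 71% | 70% |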


The number of training sequences used to construct each SVM model is shown in parentheses in the "Training set" column.
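The four measures reported in the table can be illustrated with a minimal sketch. It assumes the standard confusion-matrix definitions of precision, sensitivity, specificity, and accuracy; the table itself does not spell out its formulas, so this is an assumption. The names `ConfusionCounts` and `classification_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing predictions against known labels (hypothetical helper)."""
    tp: int  # promoters correctly predicted as promoters
    fp: int  # non-promoters wrongly predicted as promoters
    tn: int  # non-promoters correctly predicted as non-promoters
    fn: int  # promoters wrongly predicted as non-promoters


def classification_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return the four measures using the standard definitions (assumed here)."""
    return {
        "precision": c.tp / (c.tp + c.fp),
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "accuracy": (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn),
    }


# Illustrative counts only: 100 positive and 100 negative sequences.
example = ConfusionCounts(tp=80, fp=25, tn=75, fn=20)
scores = classification_metrics(example)

for name, value in scores.items():
    print(f"{name}: {value:.1%}")
```

When the positive and negative sets are the same size, as in this example, accuracy under these definitions is simply the mean of sensitivity and specificity.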

Lee et al. BMC Genomics 2012, 13(Suppl 1):S3. doi:10.1186/1471-2164-13-S1-S3
