Table 4

Relative contribution of each feature to classification as disease gene. An estimate of the relative contribution of each sequence feature to the final score used by the alternating decision tree to classify genes as being involved in disease. The percentages are based on each feature's average absolute contribution to the cumulative absolute score of each disease gene in the training set.

Feature                      % Contribution to final score

Signal peptide               23%
Mouse homolog % identity     21%
Length of 3' UTR             12%
Number of exons              7%
Rat homolog % identity       7%
Worm homolog % identity      6%
GC                           6%
CDS length                   5%
Gene length                  4%
Mouse homolog Ka             3%
Paralog % identity           2%
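
The caption's description of how the percentages were derived can be sketched in code. This is a minimal illustration of one plausible reading, not the authors' implementation: it assumes each gene's final score is a sum of signed per-feature contributions, takes each feature's absolute contribution as a fraction of the gene's cumulative absolute score, and averages those fractions over the disease genes. The function name, feature keys, and toy values are all hypothetical.

```python
from typing import Dict, List


def relative_contributions(per_gene_scores: List[Dict[str, float]]) -> Dict[str, float]:
    """Average share (in percent) of each feature in the cumulative absolute score.

    per_gene_scores holds one dict per disease gene, mapping a feature name to the
    signed amount that feature added to the gene's final score (assumed input format).
    """
    share_totals: Dict[str, float] = {}
    genes_counted = 0
    for scores in per_gene_scores:
        # Cumulative absolute score for this gene: sum of |contribution| over features.
        cumulative = sum(abs(value) for value in scores.values())
        if cumulative == 0:
            continue  # no non-zero contributions, so no shares can be computed
        genes_counted += 1
        for feature, value in scores.items():
            share_totals[feature] = share_totals.get(feature, 0.0) + abs(value) / cumulative
    if genes_counted == 0:
        return {}
    # Average the per-gene shares and express them as percentages.
    return {feature: 100.0 * total / genes_counted for feature, total in share_totals.items()}


# Toy input with made-up contributions for two hypothetical disease genes.
toy_genes = [
    {"signal_peptide": 0.6, "utr3_length": -0.2, "exon_count": 0.2},
    {"signal_peptide": -0.1, "utr3_length": 0.3, "exon_count": 0.1},
]
result = relative_contributions(toy_genes)
for feature, percent in sorted(result.items(), key=lambda item: -item[1]):
    print(f"{feature}: {percent:.0f}%")
```

Taking absolute values means that a feature pushing the score strongly in either direction counts as influential, which is why a feature can rank highly here whether it argues for or against disease involvement.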

Adie et al. BMC Bioinformatics 2005 6:55   doi:10.1186/1471-2105-6-55