Table 6. Genes most frequently selected in each data set.

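| Data set | Feature selection | Gene index (j) | Gene description | Fraction of splits |
|---|---|---|---|---|
| Colon cancer | Greedy | 249 | desmin | 0.17 |
| Colon cancer | Greedy | 493 | myosin heavy chain | 0.15 |
| Colon cancer | Greedy | 1772 | collagen alpha 2 | 0.12 |
| Colon cancer | Wrapper | 249 | desmin | 0.13 |
| Colon cancer | Wrapper | 1772 | collagen alpha 2 | 0.08 |
| Colon cancer | Wrapper | 1582 | p cadherin | 0.06 |
| Leukemia 1 | Greedy | 4847 | zyxin | 0.46 |
| Leukemia 1 | Greedy | 6855 | TCF3 transcription factor 3 | 0.17 |
| Leukemia 1 | Greedy | 1834 | CD33 antigen | 0.17 |
| Leukemia 1 | Wrapper | 4847 | zyxin | 0.22 |
| Leukemia 1 | Wrapper | 3252 | glutathione S-transferase | 0.16 |
| Leukemia 1 | Wrapper | 6855 | TCF3 transcription factor 3 | 0.11 |
| Medulloblastoma | Greedy | 5585 | drebrin E | 0.05 |
| Medulloblastoma | Greedy | 4174 | COL6A2 collagen type VI alpha 2 | 0.04 |
| Medulloblastoma | Greedy | 3185 | pancreatic beta cell growth factor | 0.04 |
| Medulloblastoma | Wrapper | 2426 | prostaglandin D2 synthase | 0.04 |
| Medulloblastoma | Wrapper | 4710 | acylphosphatase isozyme | 0.03 |
| Medulloblastoma | Wrapper | 3185 | pancreatic beta cell growth factor | 0.03 |
| Prostate cancer | Greedy | 6185 | serine protease hepsin | 0.70 |
| Prostate cancer | Greedy | 8965 | mitochondrial matrix protein P1 | 0.09 |
| Prostate cancer | Greedy | 10494 | mRNA, nel-related protein P1 | 0.07 |
| Prostate cancer | Wrapper | 6185 | serine protease hepsin | 0.31 |
| Prostate cancer | Wrapper | 8965 | mitochondrial matrix protein P1 | 0.09 |
| Prostate cancer | Wrapper | 4365 | T-cell receptor Ti gamma chain | 0.07 |
| Leukemia 2 | Greedy | 8828 | HLA class II alpha chain-like | 0.59 |
| Leukemia 2 | Greedy | 9101 | MHC class II lymphocyte antigen | 0.23 |
| Leukemia 2 | Greedy | 2610 | mRNA for oct-binding factor | 0.13 |
| Leukemia 2 | Wrapper | 8828 | HLA class II alpha chain-like | 0.41 |
| Leukemia 2 | Wrapper | 9101 | MHC class II lymphocyte antigen | 0.20 |
| Leukemia 2 | Wrapper | 2610 | mRNA for oct-binding factor | 0.17 |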

Genes listed in bold occur most frequently and are discussed in the text.

Baker BMC Bioinformatics 2010 11:452   doi:10.1186/1471-2105-11-452