Table 4. Number k discussion

Data set         k     document   passage   passage2
Genomics 2007    1     0.3012     0.0918    0.1436
Genomics 2007    5     0.3349     0.1400    0.1588
Genomics 2007    10    0.3438     0.1422    0.1635
Genomics 2007    20    0.3438     0.1422    0.1635
Genomics 2007    100   0.3438     0.1422    0.1635
Genomics 2006    1     0.3974     0.1401    -
Genomics 2006    5     0.4049     0.1445    -
Genomics 2006    10    0.4087     0.1467    -
Genomics 2006    20    0.4083     0.1466    -
Genomics 2006    100   0.4083     0.1466    -
Genomics 2005    1     0.3012     -         -
Genomics 2005    5     0.3116     -         -
Genomics 2005    10    0.3123     -         -
Genomics 2005    20    0.3123     -         -
Genomics 2005    100   0.3123     -         -
Genomics 2004    1     0.3470     -         -
Genomics 2004    5     0.3555     -         -
Genomics 2004    10    0.3584     -         -
Genomics 2004    20    0.3584     -         -
Genomics 2004    100   0.3584     -         -
HARD 2004        1     0.2015     0.2005    -
HARD 2004        5     0.2223     0.2197    -
HARD 2004        10    0.2250     0.2208    -
HARD 2004        20    0.2248     0.2208    -
HARD 2004        100   0.2248     0.2208    -

The number k is the parameter of the recursive re-ranking algorithm: (1) based on the empirical study, k = 10 is chosen as a locally optimal depth for the final experiments; (2) k is the number of top-ranked term associations, as weighted by the factor analysis based model, that are used; (3) the recursive re-ranking algorithm re-ranks the baselines according to these k terms; (4) the more of these k terms a result contains, the higher its ranking score; (5) five values of k are tested: 1, 5, 10, 20 and 100; (6) one original baseline is taken from each of our five data sets, namely Genomics 2007, Genomics 2006, Genomics 2005, Genomics 2004 and HARD 2004; (7) performance improves as k grows from 1 to 10, and changes little or not at all once k exceeds 10.
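
The role of k described in the caption can be illustrated with a minimal sketch. It is not the authors' implementation: the function name rerank_with_top_k, the additive scoring rule, and the toy weights are all assumptions made for illustration, and the sketch performs a single re-ranking pass rather than the recursive procedure used in the paper. It only shows how restricting the boost to the k highest-weighted terms changes the ordering of a baseline.

```python
from typing import Dict, List, Set, Tuple


def rerank_with_top_k(
    baseline: List[Tuple[str, float]],
    result_terms: Dict[str, Set[str]],
    term_weights: Dict[str, float],
    k: int = 10,
) -> List[Tuple[str, float]]:
    """Re-rank baseline results using only the k highest-weighted terms.

    Assumption: a result's new score is its baseline score plus the sum
    of the weights of the top-k terms it contains. The source does not
    specify the actual scoring formula.
    """
    # Keep the k terms with the largest association weights.
    top_terms = dict(
        sorted(term_weights.items(), key=lambda kv: kv[1], reverse=True)[:k]
    )
    rescored = []
    for result_id, score in baseline:
        terms = result_terms.get(result_id, set())
        # Results containing more of the top-k terms receive a larger boost.
        boost = sum(w for t, w in top_terms.items() if t in terms)
        rescored.append((result_id, score + boost))
    # Highest score first.
    rescored.sort(key=lambda pair: pair[1], reverse=True)
    return rescored


# Toy data (hypothetical, not taken from the paper).
baseline = [("d1", 0.9), ("d2", 0.8), ("d3", 0.7)]
result_terms = {
    "d1": {"gene"},
    "d2": {"gene", "protein"},
    "d3": {"protein", "pathway", "kinase"},
}
term_weights = {"protein": 0.3, "pathway": 0.2, "kinase": 0.1, "gene": 0.05}

for k in (1, 3, 100):
    order = [rid for rid, _ in rerank_with_top_k(baseline, result_terms, term_weights, k)]
    print(k, order)
```

In this toy example the ordering changes between k = 1 and k = 3 but is the same for k = 3 and k = 100. This mirrors the saturation seen in the table: once the most heavily weighted terms are included, the remaining low-weight terms no longer alter the ranking.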

Hu et al. BMC Bioinformatics 2012 13(Suppl 9):S2   doi:10.1186/1471-2105-13-S9-S2
