Table 2

Different criteria for filtering clusters for function prediction

               Filter 1   Filter 1 & 2   Filter 1 & 3   Filter 1 & 4   Filter 1 & 5

# of groups    196        74             53             185            11
# of terms     345        159            102            338            16
# of genes     3213       711            409            2895           320
Precision      67.91%     62.52%         60.52%         67.73%         64.70%
Recall         22.98%     26.16%         19.78%         23.80%         11.21%


To push the values for precision and recall towards the precision ceiling, we sought filter criteria for selecting appropriate gene groups a priori. To this end, we defined the following filter criteria for our 1,000 'phenoclusters':

Filter 1: removes groups with fewer than 3 genes or with no GO term associated with at least 50% of the genes.

Filter 2: removes groups with a GO-similarity score < 0.4.

Filter 3: removes groups with a PPI-connectedness < 33%.

Filter 4: removes all clusters that are not single-species clusters.

Filter 5: removes all single-species clusters.
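
The filter criteria above can be sketched as simple keep/discard predicates. This is a minimal illustration only, not the authors' implementation: the `Cluster` record, its field names, and the helper `apply_filters` are hypothetical, and it assumes that GO-term coverage, GO-similarity, and PPI-connectedness have already been computed per cluster. It also reads Filter 1 as removing a group if either of its two conditions holds.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List


@dataclass(frozen=True)
class Cluster:
    """Hypothetical record for one phenocluster (field names are illustrative)."""
    genes: FrozenSet[str]
    max_go_term_coverage: float  # largest fraction of genes sharing one GO term
    go_similarity: float         # precomputed GO-similarity score
    ppi_connectedness: float     # precomputed, as a fraction in [0, 1]
    species: FrozenSet[str]      # species contributing genes to the cluster


# Each predicate returns True when the cluster is KEPT by that filter.
def filter1(c: Cluster) -> bool:
    # Removes groups with fewer than 3 genes, or with no GO term
    # associated with at least 50% of the genes.
    return len(c.genes) >= 3 and c.max_go_term_coverage >= 0.5


def filter2(c: Cluster) -> bool:
    # Removes groups with a GO-similarity score < 0.4.
    return c.go_similarity >= 0.4


def filter3(c: Cluster) -> bool:
    # Removes groups with a PPI-connectedness < 33%.
    return c.ppi_connectedness >= 0.33


def filter4(c: Cluster) -> bool:
    # Removes all clusters that are not single-species.
    return len(c.species) == 1


def filter5(c: Cluster) -> bool:
    # Removes all single-species clusters.
    return len(c.species) > 1


def apply_filters(clusters: List[Cluster],
                  *keeps: Callable[[Cluster], bool]) -> List[Cluster]:
    """Keep only clusters that pass every given filter."""
    return [c for c in clusters if all(keep(c) for keep in keeps)]


# Toy data, invented purely for illustration.
toy_clusters = [
    Cluster(frozenset({"a", "b", "c"}), 0.67, 0.5, 0.4, frozenset({"mouse"})),
    Cluster(frozenset({"d", "e"}), 1.0, 0.9, 0.9, frozenset({"mouse"})),
    Cluster(frozenset({"f", "g", "h", "i"}), 0.5, 0.2, 0.1,
            frozenset({"mouse", "yeast"})),
]

# Combination corresponding to the "Filter 1 & 2" column of the table.
kept_1_and_2 = apply_filters(toy_clusters, filter1, filter2)
```

Combining predicates this way mirrors the table columns: each column applies Filter 1 together with one additional filter.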

Groth et al. BMC Bioinformatics 2008, 9:136. doi:10.1186/1471-2105-9-136