Service de Conformation des Macromolécules Biologiques et de Bioinformatique. Université Libre de Bruxelles, CP 263, Campus Plaine, Bd. du Triomphe, B-1050 Bruxelles, Belgium

Abstract

Background

Protein interactions are crucial components of all cellular processes. Recently, high-throughput methods have been developed to obtain a global description of the interactome (the whole network of protein interactions for a given organism). In 2002, the yeast interactome was estimated to contain up to 80,000 potential interactions. This estimate is based on the integration of data sets obtained by various methods (mass spectrometry, two-hybrid methods, genetic studies). High-throughput methods are known, however, to yield a non-negligible rate of false positives, and to miss a fraction of existing interactions.

The interactome can be represented as a graph where nodes correspond to proteins and edges to pairwise interactions. In recent years, clustering methods have been developed and applied in order to extract relevant modules from such graphs. These algorithms require the specification of parameters that may drastically affect the results. In this paper we present a comparative assessment of four algorithms: Markov Clustering (MCL), Restricted Neighborhood Search Clustering (RNSC), Super Paramagnetic Clustering (SPC), and Molecular Complex Detection (MCODE).

Results

A test graph was built on the basis of 220 complexes annotated in the MIPS database. To evaluate the robustness to false positives and false negatives, we derived 41 altered graphs by randomly removing edges from or adding edges to the test graph in various proportions.

Each clustering algorithm was applied to these graphs with various parameter settings, and the clusters were compared with the annotated complexes.

We analyzed the sensitivity of the algorithms to the parameters and determined their optimal parameter values.

We also evaluated their robustness to alterations of the test graph.

We then applied the four algorithms to six graphs obtained from high-throughput experiments and compared the resulting clusters with the annotated complexes.

Conclusion

This analysis shows that MCL is remarkably robust to graph alterations. In the tests of robustness, RNSC is more sensitive to edge deletion but less sensitive to the use of suboptimal parameter values. The other two algorithms are clearly weaker under most conditions.

The analysis of high-throughput data supports the superiority of MCL for the extraction of complexes from interaction networks.

Background

Protein-protein interactions (PPI) play major roles in the cell: transient protein interactions are often involved in post-translational control of protein activity; enzymatic complexes ensure substrate channeling which drastically increases fluxes through metabolic pathways; large protein complexes play essential roles in basal cellular mechanisms such as DNA packaging (histones), transcription (RNA polymerase), replication (DNA polymerase), translation (ribosome), protein degradation (proteasome) ...

Various methods have been used to detect PPI. Co-immunoprecipitation, co-sedimentation, and two-hybrid systems have traditionally been used to characterize interactions at the level of a single protein complex. More recently, high-throughput methods have been developed for large-scale detection of pairwise interactions (two-hybrid systems, the split-ubiquitin method).

In 2002, von Mering and co-workers estimated that the yeast interactome contains up to 80,000 potential interactions, on the basis of an integration of the data sets obtained by these various methods.

The network of interactions between proteins is generally represented as an interaction graph, where nodes represent proteins and edges represent pairwise interactions. Graph theory approaches have been applied to describe the topological properties of the network: distribution of node degree (number of incoming and outgoing edges per node), network diameter (average of the shortest distance between pairs of nodes), clustering coefficient (proportion of the potential edges between the neighbors of a node that are effectively observed in the graph). These analyses have led to the observation of some apparently recurrent properties of biological networks: power-law degree distribution, small world, high clustering coefficients, and modularity.
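Node degree and clustering coefficient, as defined above for an undirected interaction graph, are straightforward to compute. The sketch below is illustrative only (the toy graph and protein names are hypothetical, not data from this study):

```python
from itertools import combinations

def clustering_coefficient(adj, node):
    """Fraction of the possible edges between a node's neighbors that are
    actually present in the graph."""
    neigh = sorted(adj[node])
    if len(neigh) < 2:
        return 0.0
    possible = len(neigh) * (len(neigh) - 1) / 2
    present = sum(1 for a, b in combinations(neigh, 2) if b in adj[a])
    return present / possible

# Toy undirected graph as an adjacency dict (protein names are hypothetical).
adj = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
degree = {n: len(adj[n]) for n in adj}
cc = {n: clustering_coefficient(adj, n) for n in adj}
```

Here "C" has neighbors A, B and D, of which only the pair (A, B) is connected, so its clustering coefficient is 1/3.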

Beyond these descriptive statistics, an important challenge for modern biology is to understand the relationship between the organization of a network and its function. In particular, it is essential to extract functional modules such as protein complexes.

To achieve this goal, several clustering methods have been applied to the protein interactome graph in order to detect highly connected subgraphs.

In this paper we present a systematic quantitative evaluation of the capability of four clustering methods for inferring protein complexes from a network of pairwise protein interactions. The four methods tested here are Markov Clustering (MCL), Restricted Neighborhood Search Clustering (RNSC), Super Paramagnetic Clustering (SPC), and Molecular Complex Detection (MCODE).

Results and discussion

Algorithms

The four algorithms tested here rely on distinct approaches for extracting clusters from the graph (Table 1).

Supplementary information about the algorithms


Main features of the graph clustering approaches presented in this study.

| | Restricted Neighborhood Search Clustering (RNSC) | Markov Clustering (MCL) | Molecular Complex Detection (MCODE) | Super-paramagnetic clustering (SPC) |
|---|---|---|---|---|
| **Type** | Local search cost based | Flow simulation | Local neighbourhood density search | Hierarchical |
| **Allows multiple assignments** | No | No | Yes | No |
| **Allows unassigned nodes** | No | No | Yes | No |
| **Edge-weighted graphs supported** | No | Yes | No | Yes |
| **First application** | Protein complex prediction | Protein family detection | Protein complex detection | |
| **Other applications** | / | Identification of ortholog groups, protein complexes, peer-to-peer node clustering, image retrieval, Word Sense Discrimination, molecular pathway discovery, structural domains, ... | / | Image clustering, microarray data clustering, protein complexes detection, protein structure classification, identification of ortholog groups, ... |
| **Availability** | Upon request | | | Upon request |
| **Developer** | King AD | Van Dongen S | Bader GD and Hogue CWV | Blatt M, Wiseman S, Domany E |
| **References** | [21] | [35] | [19] | [18] |

The Markov Cluster algorithm (MCL) simulates a flow on the graph: a stochastic matrix derived from the adjacency matrix is iteratively transformed by alternating an expansion step (matrix multiplication, which spreads the flow over the graph) and an inflation step (element-wise exponentiation followed by re-normalization, which strengthens the strongest currents). The granularity of the resulting clusters is controlled by the inflation parameter.

The second algorithm, Restricted Neighborhood Search Clustering (RNSC), is a cost-based local search method: starting from an initial random partition of the nodes, it iteratively moves nodes between clusters so as to minimize a cost function that favors densely intra-connected, sparsely inter-connected clusters.

The third algorithm, Super Paramagnetic Clustering (SPC), is a hierarchical method inspired by the physics of an inhomogeneous ferromagnetic model: a spin is associated with each node, and groups of spins that fluctuate coherently at a given temperature are identified as clusters. The temperature controls the granularity of the clustering.

The fourth method, Molecular Complex Detection (MCODE), first weights each node according to the density of its local neighborhood, then grows clusters outwards from high-weight seed nodes; optional post-processing steps ("haircut" and "fluff") trim or extend the resulting complexes.
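MCL's flow-simulation principle (expansion by matrix squaring, inflation by element-wise exponentiation and column re-normalization) can be illustrated with the following toy re-implementation. This is a minimal sketch for illustration, not the reference MCL program, and it omits the optimizations of the original algorithm:

```python
import numpy as np

def mcl(adjacency, inflation=2.0, n_iter=50):
    """Toy Markov Clustering: iterate expansion and inflation on a
    column-stochastic matrix, then read clusters off the attractor rows."""
    M = adjacency.astype(float) + np.eye(adjacency.shape[0])  # add self-loops
    M /= M.sum(axis=0)                                        # column-stochastic
    for _ in range(n_iter):
        M = M @ M                 # expansion: flow spreads along paths
        M = M ** inflation        # inflation: strong currents are reinforced
        M /= M.sum(axis=0)        # re-normalize columns
    # At convergence, the non-zero support of each remaining row is a cluster.
    return {tuple(np.nonzero(row > 1e-4)[0]) for row in M if row.max() > 1e-4}

# Two triangles joined by a single bridge edge (nodes 0-2 and 3-5).
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1
clusters = mcl(A)
```

On this toy graph the two triangles are recovered as separate clusters; lowering the inflation toward 1 yields coarser clusterings, which is the parameter effect analyzed below.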

Interaction graphs

From the collection of protein complexes annotated in the MIPS database, we built a test graph in which every pair of proteins belonging to the same complex is connected by an edge (high-throughput results were excluded; see Methods).

Graphical representation of interaction networks

**Graphical representation of interaction networks**. **(A)** Test graph built from the complexes annotated in the MIPS database (high-throughput data were excluded). **(B)** Altered graph A_{100,40}, with 100% of random edge addition (red) and 40% of random edge removal.

In order to evaluate the robustness of the algorithms to missing and false interactions, we generated 41 altered graphs A_{add,del}, where add is the percentage of randomly added edges and del the percentage of randomly removed edges.

Figure 1B shows the altered graph A_{100,40}, with 100% edge addition and 40% edge removal. Another problem of evaluation is that a certain proportion of interacting proteins can be assigned to the same cluster by chance. In order to estimate the random expectation of correct grouping, we built a random graph with the same numbers of nodes and edges as the test graph.

We also built 41 altered random graphs, by applying the same proportions of edge addition and removal to this random graph.

To each of these 84 graphs (test, altered test, random, altered random), we applied the four algorithms described above, with varying parameter values. As a second way to estimate the random expectation, each clustering result was also randomized so as to obtain a set of permuted clusters (see Methods).

Parameter optimization

The quality of a clustering result was evaluated by comparing each cluster with each annotated complex. The sensitivity and the positive predictive value (PPV) were computed for each complex-cluster pair, as detailed in Methods.

To estimate the overall correspondence between a clustering result (a set of clusters) and the collection of annotated complexes, we computed the weighted means of all complex-wise sensitivity values (clustering-wise sensitivity) and of all cluster-wise PPV values (clustering-wise PPV), and combined them into a single accuracy score (see Methods).

Each algorithm has one or more parameters that influence properties such as number of clusters, cluster size, and cluster density (number of intra-cluster edges). For each algorithm we measured the impact of the main parameters on the accuracy of the clustering, and selected the optimal parameter values.

Let us illustrate in more detail the procedure of parameter selection with the inflation parameter of the MCL algorithm. With the original test graph, interestingly, the effect of this parameter is barely detectable (Figure 2A).

Impact of the inflation parameter on MCL clustering results

**Impact of the inflation parameter on MCL clustering results**. **(A)** Impact of the inflation parameter on the clustering-wise sensitivity, PPV and accuracy for the original test graph. **(B)** Number of complexes predicted as a function of the inflation factor for the original test graph. **(C)** Same as (A), for the altered graph A_{100,40}. **(D)** Number of complexes predicted as a function of the inflation factor for A_{100,40}.

The crucial impact of the inflation parameter becomes obvious when MCL is applied to highly altered graphs. For example, for the altered graph A_{100,40} (Figure 2C and 2D), the accuracy strongly depends on the inflation value, and a clear optimum can be identified.

We performed the same analysis and selected the optimal parameter values for each one of the 42 graphs (test and altered), as summarized in Table 2.

Optimal values for MCL inflation parameter for the test and altered graphs

| % removal \ % addition | 0 | 5 | 10 | 20 | 40 | 80 | 100 |
|---|---|---|---|---|---|---|---|
| 0 | 3.4 | 3.1 | 2.7 | 2.4 | 2 | 1.8 | 1.8 |
| 5 | 5.7 | 4 | 2.6 | 2 | 1.9 | 1.8 | 1.8 |
| 10 | 2.35 | 2.2 | 2.2 | 2.3 | 1.8 | 1.8 | 1.8 |
| 20 | 1.7 | 2.2 | 2.1 | 2 | 1.8 | 1.7 | 1.8 |
| 40 | 1.8 | 1.8 | 1.8 | 1.9 | 1.7 | 1.7 | 1.7 |
| 80 | 1.3 | 1.4 | 1.5 | 1.5 | 1.5 | 1.6 | 1.6 |

Note that in the case of the inflation parameter, the most frequent value (1.8) is especially well suited for graphs with a high level of alteration, such as those resulting from high-throughput data. In addition, for the less altered graphs, the accuracy is generally more robust to fluctuations of the inflation (the extreme case being the unaltered test graph shown in Figure 2A).

For the RNSC algorithm, we tested the impact of 7 parameters on the quality of the clustering. This represents a total of 2,916 combinations of parameter values. Figure 3 shows the accuracy obtained with each of these combinations on an altered graph (A_{100,40}). Each dot corresponds to one particular combination of parameter values. This figure shows that the RNSC algorithm is remarkably robust to the choice of parameter values: all the results are grouped in a cloud, with an almost constant accuracy.

Impact of the RNSC parameters on the clustering of an altered graph A_{100,40}

**Impact of the RNSC parameters on the clustering of an altered graph A_{100,40}**. Each dot represents the clustering-wise accuracy obtained with one combination of parameter values.

The same analysis was carried out for each parameter of each algorithm. The complete tables of optimal values for the 42 graphs, using both accuracy and separation (see next section), are available as supplementary material [see Additional files].

Optimal accuracy parameter values


Optimal separation parameter values. These files and supplementary figures are also available online.


Optimal parameters

| **Algorithm** | **Parameter** | **Optimized for accuracy** | **Optimized for separation** |
|---|---|---|---|
| **MCL** | Inflation | 1.8 | 1.8 |
| **MCODE** | Depth | 100 | 5 |
| | Node score percentage | 0 | 0 |
| | Haircut | TRUE | TRUE |
| | Fluff | FALSE | FALSE |
| | Percentage for complex fluffing | 0.2 | 0.9 |
| **RNSC** | Diversification frequency | 50 | 50 |
| | Shuffling diversification length | 9 | 3 |
| | Tabu length | 50 | 50 |
| | Tabu list tolerance | 1 | 1 |
| | Number of experiments | 3 | 3 |
| | Naive stopping tolerance | 1 | 15 |
| | Scaled stopping tolerance | 15 | 15 |
| **SPC** | Number of nearest neighbours | 15 | 10 |
| | Temperature | 0.132 | 0.116 |

Robustness analysis

In this analysis, we chose fixed parameter values for each algorithm (Table 3), and measured the quality of the clustering results as a function of the rates of edge addition and removal.

Figure 4 shows the accuracy (left panels) and separation (right panels) obtained by the four algorithms on graphs with increasing levels of alteration.

Robustness of the algorithms to random edge addition and removal

**Robustness of the algorithms to random edge addition and removal**. Each curve represents the value of the accuracy (left panels) or separation (right panels). **(A-B)** Edge addition to the test graph. **(C-D)** Edge removal from the test graph. **(E-F)** Edge removal from an altered graph with 100% of randomly added edges. **(G-H)** Edge addition to an altered test graph with 40% of randomly removed edges.

To estimate the random expectation, we performed for each clustering result a permutation test, by shuffling the proteins between clusters. The number of clusters and their respective sizes thus remained unchanged. The geometric accuracy of the permuted clusters is displayed with dotted lines in Figure 4. Surprisingly, for SPC the accuracy obtained with permuted clusters remains close to that of the real clustering, for instance on the altered graph A_{100,0}. This illustrates the importance of the permutation test: the test makes it possible to estimate the performance of an algorithm in terms of gains relative to the random expectation. We inspected the clustering result in more detail in order to understand why the program can yield high accuracy values even when clusters are permuted. This effect comes from the fact that, under the chosen conditions, SPC yields a huge cluster of 567 proteins, plus a multitude of very small clusters of 1 or 2 proteins. The effect of the huge cluster is to artificially increase the clustering-wise sensitivity, and thereby the accuracy, irrespective of the actual correspondence between clusters and complexes.

In order to circumvent this problem, we defined an additional statistic, which we call the separation.

The separation of a complex-cluster pair is the product of the relative intersections computed with respect to the complex on the one hand and to the cluster on the other hand (see Methods and Table 4).

Schematic illustration of a contingency table, and the derived statistics

**Counts**

| Counts | cluster 1 | cluster 2 | cluster 3 | cluster 4 | cluster 5 | sum | complex size |
|---|---|---|---|---|---|---|---|
| complex 1 | 7 | 0 | 0 | 0 | 0 | 7 | 7 |
| complex 2 | 0 | 6 | 8 | 0 | 0 | 14 | 14 |
| complex 3 | 0 | 0 | 0 | 14 | 3 | 17 | 20 |
| complex 4 | 0 | 0 | 0 | 4 | 5 | 9 | 8 |
| **sum** | 7 | 6 | 8 | 18 | 8 | 47 | |
| **cluster size** | 7 | 6 | 8 | 16 | 8 | | |

**Positive Predictive Value (PPV)**

| PPV | cluster 1 | cluster 2 | cluster 3 | cluster 4 | cluster 5 |
|---|---|---|---|---|---|
| complex 1 | 1 | 0 | 0 | 0 | 0 |
| complex 2 | 0 | 1 | 1 | 0 | 0 |
| complex 3 | 0 | 0 | 0 | 0.78 | 0.38 |
| complex 4 | 0 | 0 | 0 | 0.22 | 0.62 |
| **cluster-wise PPV** | 1 | 1 | 1 | 0.78 | 0.62 |

**Sensitivity**

| Sensitivity | cluster 1 | cluster 2 | cluster 3 | cluster 4 | cluster 5 | complex-wise Sn |
|---|---|---|---|---|---|---|
| complex 1 | 1 | 0 | 0 | 0 | 0 | 1 |
| complex 2 | 0 | 0.43 | 0.57 | 0 | 0 | 0.57 |
| complex 3 | 0 | 0 | 0 | 0.70 | 0.15 | 0.70 |
| complex 4 | 0 | 0 | 0 | 0.50 | 0.62 | 0.62 |

**Frequency per row (F_row)**

| F_row | cluster 1 | cluster 2 | cluster 3 | cluster 4 | cluster 5 |
|---|---|---|---|---|---|
| complex 1 | 1 | 0 | 0 | 0 | 0 |
| complex 2 | 0 | 0.43 | 0.57 | 0 | 0 |
| complex 3 | 0 | 0 | 0 | 0.82 | 0.18 |
| complex 4 | 0 | 0 | 0 | 0.44 | 0.56 |

**Separation**

| Separation | cluster 1 | cluster 2 | cluster 3 | cluster 4 | cluster 5 | complex-wise separation |
|---|---|---|---|---|---|---|
| complex 1 | 1 | 0 | 0 | 0 | 0 | 1 |
| complex 2 | 0 | 0.43 | 0.57 | 0 | 0 | 1 |
| complex 3 | 0 | 0 | 0 | 0.64 | 0.07 | 0.71 |
| complex 4 | 0 | 0 | 0 | 0.10 | 0.35 | 0.45 |
| **cluster-wise separation** | 1 | 0.43 | 0.57 | 0.74 | 0.41 | |

Summary statistics:

- Clustering-wise sensitivity: 0.69
- Clustering-wise PPV: 0.85
- Accuracy: 0.77
- Average cluster-wise separation: 0.63
- Average complex-wise separation: 0.79
- Clustering-wise separation: 0.70

Similarly, we defined a complex-wise separation for each complex and a cluster-wise separation for each cluster, obtained by summing the separation values over the corresponding row or column of the table (see Methods).

The interest of this statistic is illustrated by the behaviour of MCODE which, under some parameter settings, returns a large number of mutually overlapping clusters (Table 5).

Mutually overlapping clusters obtained under some parameter conditions with MCODE

| cluster \ cluster | 1 | 2 | 3 | 4 | 5 | ... | 49 | 50 | 51 | 52 | ... | 102 | 103 | ... | 607 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 81 | 80 | 79 | 78 | 77 | ... | 47 | 46 | 0 | 0 | ... | 0 | 32 | ... | 0 |
| 2 | 80 | 80 | 79 | 78 | 77 | ... | 47 | 46 | 0 | 0 | ... | 0 | 32 | ... | 0 |
| 3 | 79 | 79 | 79 | 78 | 77 | ... | 47 | 46 | 0 | 0 | ... | 0 | 32 | ... | 0 |
| 4 | 78 | 78 | 78 | 78 | 77 | ... | 47 | 46 | 0 | 0 | ... | 0 | 32 | ... | 0 |
| 5 | 77 | 77 | 77 | 77 | 77 | ... | 47 | 46 | 0 | 0 | ... | 0 | 32 | ... | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 49 | 47 | 47 | 47 | 47 | 47 | ... | 47 | 46 | 0 | 0 | ... | 0 | 32 | ... | 0 |
| 50 | 46 | 46 | 46 | 46 | 46 | ... | 46 | 46 | 0 | 0 | ... | 0 | 32 | ... | 0 |
| 51 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 46 | 0 | ... | 0 | 0 | ... | 0 |
| 52 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 46 | ... | 32 | 0 | ... | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 102 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 32 | ... | 32 | 0 | ... | 0 |
| 103 | 32 | 32 | 32 | 32 | 32 | ... | 32 | 32 | 0 | 0 | ... | 0 | 32 | ... | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 607 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | ... | 0 | 0 | ... | 3 |

Cluster-wise separation penalizes this effect by using the marginal sums rather than the cluster size. Thus, if a method generates many redundant clusters, each one intersecting with a given complex, the marginal sum will increase drastically, and Sep_{cl} will be reduced accordingly. Note that the result of Table 5 was obtained with a particular, suboptimal parameter setting.

Figure 4A and 4B show the effect of random edge addition to the test graph on these scores.

In Figure 4C and 4D, increasing proportions of edges were randomly removed from the test graph.

In order to obtain a realistic estimate of algorithm robustness, we thus need to combine edge addition and removal. Figure 4E to 4H show the scores obtained when removing edges from an altered graph with 100% of added edges (E-F), and when adding edges to an altered graph with 40% of removed edges (G-H).

Figures 4E to 4H confirm the remarkable robustness of MCL, and to a lesser extent RNSC, to combined graph alterations.

Analysis of data sets obtained in high-throughput experiments

In the previous sections our evaluations were based on artificial graphs obtained by adding and removing various proportions of edges to a reference network (the MIPS complexes). The next step was to evaluate the capability of these algorithms to extract relevant information from high-throughput data sets. To this end, we downloaded from the GRID database six data sets of protein interactions obtained by high-throughput experiments (Table 6).

Main features of the six large-scale data sets and clustering performances of the algorithms when applied to them

**Uetz** (926 nodes, 865 edges, mean degree 1.175, mean clustering coefficient 0.018)

| | MCL | MCODE | RNSC | SPC |
|---|---|---|---|---|
| Number of clusters | 288 | 10 | 48 | 234 |
| Mean nb prot/cluster | 3.22 | 11.2 | 1.91 | 3.96 |
| Median nb prot/cluster | 3 | 4.5 | 2 | 2 |
| Largest cluster size | 16 | 53 | 6 | 276 |
| Sn (real / permuted) | 57.3% / 38.6% | 84.3% / 74.5% | 49.4% / 36.5% | 65.5% / 43.3% |
| PPV (real / permuted) | 53.8% / 45.9% | 25.5% / 21.6% | 59.6% / 54.4% | 38.0% / 38.9% |
| Acc (real / permuted) | 55.6% / 42.3% | 54.9% / 48.0% | 54.5% / 45.5% | 51.8% / 41.1% |
| Sep_co (real / permuted) | 23.0% / 20.6% | 48.9% / 62.5% | 15.5% / 14.8% | 19.1% / 21.2% |
| Sep_cl (real / permuted) | 30.1% / 26.9% | 2.2% / 2.8% | 34.3% / 32.7% | 20.3% / 22.6% |
| Sep (real / permuted) | 26.3% / 23.5% | 10.4% / 13.3% | 23.1% / 22.0% | 19.7% / 21.9% |

**Ito** (2937 nodes, 4038 edges, mean degree 2.682, mean clustering coefficient 0.019)

| | MCL | MCODE | RNSC | SPC |
|---|---|---|---|---|
| Number of clusters | 630 | 9 | 1746 | 410 |
| Mean nb prot/cluster | 4.66 | 97.8 | 1.68 | 7.16 |
| Median nb prot/cluster | 3 | 11 | 2 | 2 |
| Largest cluster size | 157 | 485 | 4 | 1928 |
| Sn (real / permuted) | 34.9% / 26.0% | 66.9% / 68.0% | 31.4% / 24.0% | 73.2% / 64.6% |
| PPV (real / permuted) | 42.7% / 38.5% | 8.2% / 5.8% | 63.6% / 61.8% | 24.3% / 23.8% |
| Acc (real / permuted) | 38.8% / 32.2% | 37.5% / 36.9% | 47.5% / 42.9% | 48.8% / 44.2% |
| Sep_co (real / permuted) | 12.7% / 11.8% | 41.6% / 33.0% | 7.1% / 7.0% | 11.3% / 11.0% |
| Sep_cl (real / permuted) | 36.2% / 33.9% | 1.7% / 1.3% | 56.7% / 55.9% | 20.1% / 20.4% |
| Sep (real / permuted) | 21.4% / 20% | 8.4% / 6.7% | 20.1% / 19.8% | 15.4% / 15.0% |

**Ho** (1564 nodes, 3600 edges, mean degree 4.6, mean clustering coefficient 0.029)

| | MCL | MCODE | RNSC | SPC |
|---|---|---|---|---|
| Number of clusters | 314 | 13 | 957 | 63 |
| Mean nb prot/cluster | 4.98 | 49.5 | 1.63 | 24.8 |
| Median nb prot/cluster | 3 | 13 | 1 | 3 |
| Largest cluster size | 34 | 432 | 8 | 1383 |
| Sn (real / permuted) | 50.6% / 28.2% | 81.2% / 76.5% | 37.0% / 27.4% | 90.1% / 92.1% |
| PPV (real / permuted) | 47.1% / 35.6% | 12.9% / 8.5% | 61.5% / 57.1% | 10.4% / 8.2% |
| Acc (real / permuted) | 48.9% / 31.9% | 47.1% / 42.5% | 49.3% / 42.2% | 50.2% / 50.2% |
| Sep_co (real / permuted) | 22.6% / 19% | 44.7% / 37.2% | 11% / 10.5% | 19.3% / 13.8% |
| Sep_cl (real / permuted) | 32.3% / 27.1% | 2.6% / 2.2% | 48% / 45.6% | 5.5% / 4.0% |
| Sep (real / permuted) | 27.0% / 22.7% | 10.9% / 9.0% | 23% / 21.9% | 10.3% / 7.4% |

**Gavin** (1352 nodes, 3210 edges, mean degree 4.7, mean clustering coefficient 0.148)

| | MCL | MCODE | RNSC | SPC |
|---|---|---|---|---|
| Number of clusters | 212 | 27 | 709 | 87 |
| Mean nb prot/cluster | 6.38 | 32.5 | 1.91 | 15.5 |
| Median nb prot/cluster | 4 | 7 | 1 | 2 |
| Largest cluster size | 54 | 414 | 16 | 1074 |
| Sn (real / permuted) | 74.1% / 24.2% | 67.0% / 51.1% | 52.1% / 20.8% | 91.8% / 81.4% |
| PPV (real / permuted) | 57.0% / 23.9% | 20.4% / 9.4% | 62.0% / 46.0% | 18.1% / 10.7% |
| Acc (real / permuted) | 65.6% / 24.0% | 43.7% / 30.3% | 57.1% / 33.4% | 54.9% / 46.0% |
| Sep_co (real / permuted) | 39.4% / 17.6% | 44.5% / 16.1% | 14.5% / 11.3% | 34.4% / 15.7% |
| Sep_cl (real / permuted) | 38.0% / 17.0% | 5.5% / 2.0% | 46.9% / 36.5% | 13.6% / 6.2% |
| Sep (real / permuted) | 38.7% / 17.3% | 15.6% / 5.6% | 26.1% / 20.3% | 21.6% / 9.8% |

**Gavin** (1430 nodes, 6531 edges, mean degree 9.1, mean clustering coefficient 0.348)

| | MCL | MCODE | RNSC | SPC |
|---|---|---|---|---|
| Number of clusters | 189 | 39 | 487 | 136 |
| Mean nb prot/cluster | 7.57 | 40.3 | 2.94 | 10.5 |
| Median nb prot/cluster | 4 | 9 | 2 | 3 |
| Largest cluster size | 90 | 697 | 35 | 620 |
| Sn (real / permuted) | 75.7% / 23.7% | 58.3% / 43.2% | 60.8% / 20.9% | 79.8% / 48.4% |
| PPV (real / permuted) | 54.3% / 21.0% | 20.6% / 8.0% | 63.3% / 37.3% | 37.0% / 16.5% |
| Acc (real / permuted) | 65.0% / 22.4% | 39.5% / 25.6% | 62.1% / 29.1% | 58.4% / 32.4% |
| Sep_co (real / permuted) | 38.1% / 15.5% | 44.7% / 15.3% | 20.1% / 12.9% | 34.9% / 14.9% |
| Sep_cl (real / permuted) | 32.7% / 13.3% | 7.9% / 2.7% | 44.5% / 28.6% | 21.6% / 9.2% |
| Sep (real / permuted) | 35.3% / 14.4% | 18.8% / 6.4% | 29.9% / 19.2% | 27.4% / 11.7% |

**Krogan** (2675 nodes, 7088 edges, mean degree 5.296, mean clustering coefficient 0.146)

| | MCL | MCODE | RNSC | SPC |
|---|---|---|---|---|
| Number of clusters | 813 | 70 | 1405 | 114 |
| Mean nb prot/cluster | 4.93 | 28.3 | 2.1 | 10.3 |
| Median nb prot/cluster | 3 | 5.5 | 2 | 3 |
| Largest cluster size | 50 | 387 | 21 | 1724 |
| Sn (real / permuted) | 62.8% / 19.8% | 56.3% / 30.9% | 53.1% / 19.1% | 82.6% / 64.0% |
| PPV (real / permuted) | 56.2% / 33.5% | 21.9% / 9.7% | 63.3% / 51.1% | 25.4% / 17.2% |
| Acc (real / permuted) | 59.5% / 26.7% | 39.1% / 20.3% | 58.2% / 35.1% | 54.0% / 40.6% |
| Sep_co (real / permuted) | 20.0% / 12.1% | 33.2% / 13.6% | 10.3% / 8.7% | 20.3% / 11.9% |
| Sep_cl (real / permuted) | 49.5% / 29.9% | 8.8% / 3.6% | 59.6% / 50.3% | 24.0% / 14.1% |
| Sep (real / permuted) | 31.5% / 19.0% | 17.0% / 7.0% | 24.7% / 21.6% | 20.9% / 12.9% |

We then ran the four clustering algorithms on these graphs, with the optimal parameters determined in the first part of this study. The clusters obtained from these high-throughput networks were compared with the complexes annotated in the MIPS database by computing the same statistics as described above (Table 6).

Application of clustering on high-throughput data sets

**Application of clustering on high-throughput data sets**. **(A)** Cluster-wise separation. **(B)** Complex-wise separation. **(C)** Clustering-wise separation.

Some precautions should be taken before interpreting these results. In particular, it is not trivial to interpret the "positive predictive value", as our reference set is the MIPS collection, filtered to discard any high-throughput result. This collection should by no means be considered exhaustive, since the complexes detected by previous studies represent only a fraction of all existing complexes. High-throughput methods are thus expected to yield many complexes that have not previously been characterized by other methods. Thus, interactions detected by high-throughput methods that are not annotated in MIPS cannot be considered "false positives". The same holds true for cluster-wise separation. Thus, the PPV and cluster-wise separation values obtained on high-throughput data should be regarded as lower-bound estimates.

An important criterion for this analysis is the contrast between the scores reached with the real clustering results and the random expectation estimated with permuted clusters. A look at this contrast already reveals some general characteristics of the data sets. Whatever the clustering method used, the contrast between real and permuted clusters is weak for the Uetz, Ito and Ho data sets, and markedly stronger for the Gavin and Krogan data sets, suggesting that the latter contain more exploitable information about protein complexes.

Compared to the other algorithms, MCODE yields a lower number of clusters, with a higher number of proteins per cluster. It generally yields a moderate sensitivity, a low PPV and a very low cluster-wise separation, reflecting the mutual overlap of the clusters it returns.

SPC is characterized by a high sensitivity and a low PPV: on most of these data sets it groups a large fraction of the proteins into one huge cluster (e.g. 1,928 of the 2,937 proteins of the Ito data set), complemented by many small clusters.

For all the data sets, RNSC yields a large number of mini-clusters (the average number of proteins per cluster is typically 2, the median is 1 or 2), plus a few clusters of reasonable size (up to 35 proteins per cluster). It shows a relatively high cluster-wise separation value (Figure 5A), but a low complex-wise separation, since most annotated complexes are split over several of these mini-clusters.

Finally, MCL clearly outperforms the other algorithms in terms of general performance (Table 6): on every data set it reaches the highest clustering-wise separation, combining reasonable sensitivity and PPV values.

Conclusion

We have evaluated the capability of four graph-based clustering algorithms to extract protein complexes from networks of protein-protein interactions. This evaluation has led us to elaborate a testing procedure for the selection of optimal parameters and the analysis of robustness to noise. We have defined new matching statistics, the geometric accuracy and the separation, to evaluate the correspondence between a clustering result and a collection of annotated complexes.

To study the ability of the tested algorithms to extract protein complexes from an interaction network, we built a test graph from the complexes annotated in the MIPS database.

In a first step we assessed the impact of the parameters of each algorithm, and determined the optimal values for extracting complexes from an interaction network. This analysis shows that under most conditions, RNSC and MCL outperform MCODE and SPC. RNSC is remarkably robust to variations in the choice of parameters, whereas the other algorithms require appropriate tuning in order to yield relevant results. Secondly we assessed the robustness of these programs to noise and to missing information in the data, by randomly adding and removing edges from the test graph. This analysis clearly revealed differences between the algorithms, highlighting the robustness of MCL, and to a lesser extent RNSC, to graph alterations.

We then applied the same four algorithms to interaction networks obtained from six high-throughput studies. This analysis revealed that whatever the algorithm used, some data sets provide insufficient information for extracting the correct protein complexes. An analysis of the more informative data sets confirmed the general superiority of MCL over the three other algorithms tested here.

An important limitation of the present evaluation is that it was performed by naive users. Any algorithm is likely to work better in the hands of its own developer than in those of external users. As we did not participate in the development of any of the tested algorithms, our evaluation may underestimate the capabilities of some of the algorithms tested here. An advantage of such an external evaluation, however, is that the evaluators are not biased by better knowledge of one particular algorithm. Consequently, our evaluation might be biased in favour of algorithms which are more user-friendly, or easier to configure. It thus reflects a compromise between algorithm user-friendliness and efficiency.

Another limitation is that all of our analyses were performed on unweighted graphs, because our reference graph (the MIPS complexes) does not contain any information that would enable us to assign reliability values (weights) to the edges. It should be mentioned that MCL and SPC can deal with weighted graphs and are likely to give better performances if the weights reflect the reliability of the links between proteins.

Methods

Test graphs

Annotated protein complexes

In order to test the ability of each algorithm to extract complexes from a network of binary interactions, we built a graph representing a large collection of experimentally characterized complexes. We collected from the MIPS database the set of protein complexes annotated for the yeast Saccharomyces cerevisiae, discarding complexes derived from high-throughput experiments, and connected every pair of proteins belonging to the same complex. The resulting test graph represents 220 complexes.
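The conversion of annotated complexes into a graph of binary interactions (every pair of co-complexed proteins connected by an edge) can be sketched as follows; the complex contents shown are hypothetical, not actual MIPS annotations:

```python
from itertools import combinations

def complexes_to_graph(complexes):
    """Turn a collection of annotated complexes into an interaction graph:
    every pair of proteins co-occurring in a complex becomes an edge."""
    edges = set()
    for members in complexes:
        # Connect all pairs of distinct proteins within the complex.
        for a, b in combinations(sorted(set(members)), 2):
            edges.add((a, b))
    return edges

# Toy example with hypothetical protein names.
complexes = [("A", "B", "C"), ("C", "D")]
edges = complexes_to_graph(complexes)
```

Note that proteins shared between complexes (like "C" here) naturally connect the corresponding cliques in the resulting graph.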

Altered graphs

A series of 41 altered graphs was derived from the test graph described above by combining various proportions of random edge deletions (0%, 5%, 10%, 20%, 40%, 80%) and additions (0%, 5%, 10%, 20%, 40%, 80%, 100%). We refer to altered graphs as A_{add,del}, where add and del are the percentages of added and removed edges, respectively.
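A possible implementation of this alteration procedure is sketched below. The sampling details (edges stored as unordered pairs, removed edges never re-added) are our assumptions for illustration, not a specification from the paper:

```python
import random

def alter_graph(nodes, edges, add_pct, del_pct, seed=0):
    """Derive an altered graph A_{add,del}: randomly remove del_pct% of the
    edges, then add add_pct% (of the original edge count) edges that were
    absent from the original graph."""
    rng = random.Random(seed)
    original = {tuple(sorted(e)) for e in edges}
    n_del = round(len(original) * del_pct / 100)
    n_add = round(len(original) * add_pct / 100)
    kept = set(rng.sample(sorted(original), len(original) - n_del))
    altered = set(kept)
    nodes = sorted(nodes)
    while len(altered) < len(kept) + n_add:
        a, b = rng.sample(nodes, 2)
        e = (min(a, b), max(a, b))
        if e not in original:  # never re-introduce an original edge by chance
            altered.add(e)
    return altered
```

For example, applying add_pct=100 and del_pct=40 to a 10-edge graph yields 6 surviving original edges plus 10 random new ones.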

Random expectation

The random expectation of clustering results was estimated in two ways: with randomized graphs and permuted clusters.

Randomized graphs

A random graph was built with the same numbers of nodes and edges as the test graph, but with the edges distributed at random between the nodes. From this random graph we derived 41 altered random graphs R_{add,del}, where add and del denote the percentages of added and removed edges, as for the altered test graphs.

Permuted clusters

A set of permuted clusters can be obtained from a clustering result by shuffling the associations between proteins and clusters. This randomization procedure preserves cluster sizes. We applied it to each clustering result obtained with the test graph and the altered graphs.
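The shuffling procedure described above can be sketched as follows (cluster contents are hypothetical):

```python
import random

def permute_clusters(clusters, seed=0):
    """Randomly reassign proteins to clusters while preserving the number
    of clusters and their respective sizes (the permutation test)."""
    rng = random.Random(seed)
    # Pool all proteins, shuffle them, then refill the original cluster sizes.
    proteins = [p for cluster in clusters for p in cluster]
    rng.shuffle(proteins)
    permuted, start = [], 0
    for cluster in clusters:
        permuted.append(proteins[start:start + len(cluster)])
        start += len(cluster)
    return permuted

clusters = [["a", "b"], ["c"], ["d", "e", "f"]]
perm = permute_clusters(clusters, seed=42)
```

Because cluster sizes are preserved, any score obtained on the permuted clusters estimates the contribution of cluster-size effects alone, independently of the actual grouping.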

Matching statistics

Each clustering result was compared with the annotated complexes by building a contingency table, as schematically exemplified in Table 4. Row i of this table corresponds to the i^{th} annotated complex, and column j to the j^{th} cluster. The value of a cell T_{i,j} indicates the number of proteins found in common between complex i and cluster j.

Sensitivity, positive predictive value (PPV) and accuracy

Sensitivity

Considering the annotated complexes as our reference classification, we define the sensitivity Sn_{i,j} of a complex-cluster pair as the fraction of the proteins of complex i that are found in cluster j:

Sn_{i,j} = T_{i,j} / N_{i}

In this formula, N_{i} is the number of proteins belonging to complex i.

To characterize the general sensitivity of a clustering result, we compute a clustering-wise sensitivity, defined as the average of the complex-wise sensitivities (Sn_{i} = max_{j} Sn_{i,j}), each complex being weighted by its size N_{i}.

Positive predictive value

The positive predictive value is the proportion of members of cluster j that belong to complex i, relative to the marginal sum of the column: PPV_{i,j} = T_{i,j} / T_{.j}

In this formula, T_{.j} is the marginal sum of column j of the contingency table.

To characterize the general PPV of a clustering result as a whole, we compute a clustering-wise PPV, defined as the average of the cluster-wise PPV values (PPV_{j} = max_{i} PPV_{i,j}), each cluster being weighted by its marginal sum T_{.j}.

Accuracy

The accuracy is defined as the geometric mean of the clustering-wise sensitivity and the clustering-wise PPV: Acc = (Sn × PPV)^{1/2}.

The advantage of taking the geometric rather than the arithmetic mean is that it yields a low score when either the Sn or the PPV value is low. High accuracy therefore requires high values of both statistics. With an arithmetic mean, by contrast, a trivial clustering that groups all proteins into a single cluster (maximal Sn, poor PPV), or that, on the contrary, assigns each protein to a single-element cluster (maximal PPV, poor Sn), could still obtain Acc_{arithm} > 0.5.

Separation

The contingency table indicates the absolute frequency of intersections between complexes and clusters. From these values, we derive relative frequencies with respect to the marginal sums, either per row (F_{row,i,j} = T_{i,j}/T_{i.}, where T_{i.} is the marginal sum of row i) or per column (F_{col,i,j} = T_{i,j}/T_{.j}).

Note that the frequency per column is identical to the PPV defined above. The frequency per row, on the contrary, can differ from the sensitivity for some algorithms: if an algorithm assigns proteins to multiple clusters or leaves some proteins unassigned, the row marginal sum T_{i.} differs from the complex size N_{i} (Table 4).

We define the separation of complex i and cluster j as the product of these two relative frequencies: Sep_{i,j} = F_{row,i,j} × F_{col,i,j}.

The separation ranges between 0 and 1. The maximal value Sep_{i,j} = 1 indicates a perfect and exclusive correspondence between complex i and cluster j.

The complex-wise separation of complex i is obtained by summing the separation values over all clusters (row i of the table): Sep^{co}_{i} = Σ_{j} Sep_{i,j}.

Reciprocally, we calculate a cluster-wise separation for each cluster j, by summing over all complexes (column j of the table): Sep^{cl}_{j} = Σ_{i} Sep_{i,j}.

To estimate a clustering result as a whole, clustering-wise Sep_{co} and Sep_{cl} values are computed as the averages of the complex-wise and the cluster-wise separations, respectively.

We then compute the clustering-wise separation as the geometric mean of Sep_{co} and Sep_{cl}: Sep = (Sep_{co} × Sep_{cl})^{1/2}.
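The full chain of matching statistics can be checked against the worked example of Table 4. The sketch below follows the definitions of this section and reproduces the reported values (clustering-wise Sn 0.69, PPV 0.85, accuracy 0.77, average complex-wise and cluster-wise separations 0.79 and 0.63, clustering-wise separation 0.70):

```python
import numpy as np

# Contingency table T of the worked example: rows = complexes, columns = clusters.
T = np.array([[7, 0, 0, 0, 0],
              [0, 6, 8, 0, 0],
              [0, 0, 0, 14, 3],
              [0, 0, 0, 4, 5]], dtype=float)
N = np.array([7.0, 14.0, 20.0, 8.0])  # complex sizes N_i

row_sums = T.sum(axis=1)  # marginal sum per complex (T_i.)
col_sums = T.sum(axis=0)  # marginal sum per cluster (T_.j)

# Sensitivity and PPV per complex-cluster pair, then clustering-wise values
# as weighted means of the best match per complex / per cluster.
Sn_ij = T / N[:, None]
PPV_ij = T / col_sums
Sn = (N * Sn_ij.max(axis=1)).sum() / N.sum()
PPV = (col_sums * PPV_ij.max(axis=0)).sum() / col_sums.sum()
Acc = np.sqrt(Sn * PPV)  # geometric accuracy

# Separation: product of row-wise and column-wise relative frequencies.
Sep_ij = (T / row_sums[:, None]) * (T / col_sums)
Sep_co = Sep_ij.sum(axis=1).mean()  # average complex-wise separation
Sep_cl = Sep_ij.sum(axis=0).mean()  # average cluster-wise separation
Sep = np.sqrt(Sep_co * Sep_cl)      # clustering-wise separation
```

Note how complex 3 illustrates the difference between Sn (14/20, relative to the complex size) and F_row (14/17, relative to the row marginal sum).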

Computation

Clustering was performed on a PC cluster of 40 nodes. Statistical treatments were done and figures made with the freeware statistical package R.

Graphs of protein interactions were manipulated using the Java classes developed by the aMAZE group.

Authors' contributions

SB collected the data, built the graphs, tested the algorithms, and ran the evaluation procedures. JvH conceived of the study and participated in its design, coordination, and in defining the matching statistics. Both authors were equally involved in writing the manuscript.

Acknowledgements

Sylvain Brohée is the recipient of a PhD grant from the Fonds pour la Formation à la Recherche dans l'Industrie et dans l'Agriculture (FRIA). The PC cluster used for the intensive computations was funded by the Belgian Fonds de la Recherche Fondamentale Collective and the Fonds d'Encouragement à la Recherche. We thank Raphaël Leplae and Marc Lensink for their valuable work to install and maintain this cluster. We thank Professor Bruno André for co-supervising this project.