Table 1

Clustering algorithm summary statistics.


| Data set / statistic | NNN (g = 5, n = 25) | CAST (t = 0.8) | CLICK (h = μT) | QTC (d = 0.5, n = 5) | SAMBA |
|---|---|---|---|---|---|
| Brem 2005, 6162 genes, 131 conditions | | | | | |
| Genes | 1527 | 3410 | 6162 | 6137 | 2284 |
| Clusters | 54 | 800 | 82 | 127 | 113 |
| Mean Size | 28.4 | 4.26 | 75.1 | 48.3 | 102 |
| Size Dev. | 49.2 | 16.91 | 161 | 93.3 | 70.3 |
| Gasch 2000, 6115 genes, 173 conditions | | | | | |
| Genes | 1142 | 4079 | 6115 | 6092 | 3120 |
| Clusters | 38 | 666 | 9 | 69 | 128 |
| Mean Size | 30.1 | 6.12 | 679 | 88.3 | 130 |
| Size Dev. | 62.5 | 35.58 | 787 | 220 | 101 |
| Haugen 2004, 6256 genes, 7 conditions | | | | | |
| Genes | 64 | 6251 | 6256 | 6236 | 280 |
| Clusters | 11 | 45 | 16 | 56 | 5 |
| Mean Size | 5.82 | 138.9 | 391 | 11.4 | 88.4 |
| Size Dev. | 1.19 | 347.3 | 474 | 258 | 36.5 |
| Hughes 2000, 6153 genes, 300 conditions | | | | | |
| Genes | 1996 | 2579 | 6153 | 6121 | 3375 |
| Clusters | 29 | 519 | 75 | 177 | 325 |
| Mean Size | 68.9 | 4.97 | 82.0 | 34.6 | 45.9 |
| Size Dev. | 245.4 | 11.95 | 107 | 57.8 | 44.1 |
| Primig 2000, 6005 genes, 24 conditions | | | | | |
| Genes | 2247 | 5820 | 6005 | 5970 | 778 |
| Clusters | 27 | 687 | 46 | 110 | 25 |
| Mean Size | 83.2 | 8.47 | 131 | 54.3 | 139 |
| Size Dev. | 390 | 19.26 | 187 | 80.4 | 96.3 |
| Spellman 1998, 5701 genes, 25 conditions | | | | | |
| Genes | 2050 | 5535 | 5701 | 5669 | 777 |
| Clusters | 28 | 616 | 47 | 100 | 32 |
| Mean Size | 73.3 | 8.99 | 121 | 56.7 | 69.0 |
| Size Dev. | 324 | 30.14 | 206 | 114 | 37.3 |
| Concatenated Data, 6160 genes, 660 conditions | | | | | |
| Genes | 694 | 6155 | 6160 | - | 4892 |
| Clusters | 29 | 7 | 5 | - | 609 |
| Mean Size | 23.9 | 879.3 | 1232 | - | 63.7 |
| Size Dev. | 34.7 | 2140 | 1768 | - | 82.0 |
| Uniformly Distributed Random Data, 6000 genes, 10 conditions | | | | | |
| Genes | 0 (± 0) | 5988 (± 0.89) | 3600 (± 3286) | 5964 (± 28.8) | 0 (± 0) |
| Clusters | 0 (± 0) | 216.2 (± 2.95) | 9.8 (± 9.81) | 109 (± 4.72) | 0 (± 0) |
| Mean Size | 0 (± 0) | 27.7 (± 0.38) | 190 (± 175) | 53.0 (± 1.39) | 0 (± 0) |
| Size Dev. | 0 (± 0) | 21.86 (± 0.25) | 48.8 (± 45.7) | 35.2 (± 0.791) | 0 (± 0) |
| Normally Distributed Random Data, 6000 genes, 10 conditions | | | | | |
| Genes | 0 (± 0) | 5986 (± 3.58) | 6000 (± 0) | 5975 (± 4.77) | 0 (± 0) |
| Clusters | 0 (± 0) | 231.6 (± 3.29) | 28.8 (± 11.9) | 124 (± 1.30) | 0 (± 0) |
| Mean Size | 0 (± 0) | 25.85 (± 0.36) | 235 (± 82.6) | 48.3 (± 0.482) | 0 (± 0) |
| Size Dev. | 0 (± 0) | 18.14 (± 0.15) | 64.8 (± 46.3) | 30.9 (± 0.374) | 0 (± 0) |
| Brem 2005, 6162 genes, 131 conditions, randomly permuted | | | | | |
| Genes | 101.4 (± 28.85) | 0 (± 0) | 6162 (± 0) | 5837 (± 260.6) | 1061 (± 35.87) |
| Clusters | 16.2 (± 3.96) | 0 (± 0) | 36.2 (± 28.99) | 428 (± 33.88) | 156 (± 4.85) |
| Mean Size | 6.23 (± 0.78) | 0 (± 0) | 680.7 (± 864.7) | 13.67 (± 0.46) | 32.46 (± 1.35) |
| Size Dev. | 1.64 (± 0.79) | 0 (± 0) | 884.5 (± 1179) | 2.36 (± 0.52) | 18.03 (± 1.13) |
| Gasch 2000, 6115 genes, 173 conditions, randomly permuted | | | | | |
| Genes | 19.4 (± 6.66) | 0 (± 0) | 4586 (± 3058) | 5507 (± 47.19) | 1382 (± 15.27) |
| Clusters | 3.6 (± 1.34) | 0 (± 0) | 20.75 (± 33.71) | 411.2 (± 5.12) | 219.8 (± 15.27) |
| Mean Size | 5.47 (± 1.04) | 0 (± 0) | 701 (± 941.3) | 13.39 (± 0.058) | 18.38 (± 0.35) |
| Size Dev. | 0.66 (± 1.48) | 0 (± 0) | 950.8 (± 1197) | 1.7 (± 0.03) | 9.25 (± 0.38) |
| Hughes 2000, 6153 genes, 300 conditions, randomly permuted | | | | | |
| Genes | 20.2 (± 8.61) | 572.8 (± 12.74) | 4922 (± 2752) | 4815 (± 76.96) | 1808 (± 56.32) |
| Clusters | 3.6 (± 1.82) | 224 (± 8.22) | 13 (± 10.84) | 407.2 (± 7.56) | 390.8 (± 5.67) |
| Mean Size | 6.13 (± 1.64) | 2.56 (± 0.044) | 592.5 (± 826.8) | 11.83 (± 0.038) | 11.09 (± 0.39) |
| Size Dev. | 0.53 (± 0.71) | 0.82 (± 0.046) | 101.7 (± 200.2) | 1.15 (± 0.024) | 5.59 (± 0.5) |

Summary statistics for the Nearest Neighbor Networks (NNN) clusters formed, using default parameters (g = 5, n = 25), from the data sets employed in this study, from their concatenation, and from two synthetic random data sets. Results from other clustering algorithms with appropriate output formats (CAST, CLICK, QTC, and SAMBA) are included for comparison, also using the default parameter settings provided by the algorithms' implementations. Values for random data are shown with standard deviations over five different seeds.
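
As an illustration of how the four statistics in each block of the table could be derived from a clustering, the following is a minimal sketch, not the authors' code. It assumes that "Genes" counts the distinct genes assigned to at least one cluster, that "Size Dev." is the sample standard deviation of cluster sizes, and that values for random data are a mean and standard deviation across seeds; the function names and the toy gene identifiers are hypothetical.

```python
from statistics import mean, stdev


def cluster_summary(clusters):
    """Summarize one clustering, given as a list of sets of gene identifiers.

    Assumption: "Size Dev." is the sample standard deviation of cluster sizes.
    """
    sizes = [len(c) for c in clusters]
    genes = set().union(*clusters)  # distinct genes placed in any cluster
    return {
        "genes": len(genes),
        "clusters": len(clusters),
        "mean_size": mean(sizes) if sizes else 0.0,
        "size_dev": stdev(sizes) if len(sizes) > 1 else 0.0,
    }


def across_seeds(summaries):
    """Combine per-seed summaries into (mean, standard deviation) pairs."""
    combined = {}
    for key in summaries[0]:
        values = [s[key] for s in summaries]
        spread = stdev(values) if len(values) > 1 else 0.0
        combined[key] = (mean(values), spread)
    return combined


# Toy clustering with hypothetical gene names; "g3" belongs to two clusters,
# as can happen with algorithms that allow overlapping clusters.
example = [{"g1", "g2", "g3"}, {"g3", "g4"}, {"g5"}]
summary = cluster_summary(example)
print(summary)
```

Because overlapping clusters are allowed in this sketch, the mean cluster size need not equal the number of clustered genes divided by the number of clusters.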

Huttenhower et al. BMC Bioinformatics 2007 8:250   doi:10.1186/1471-2105-8-250