H. sapiens simulated data. Relative frequencies of the estimated number of clusters obtained when a variety of clustering algorithms (BHC-C, BHC-SE, SplineCluster with linear and cubic splines, MCLUST and SSClust) were applied to simulated data sets (due to slow running times, we only used 100 of the 1000 simulated data sets to obtain the SSClust results). For each clustering algorithm, we draw lines between relative frequency values to aid interpretability. Each simulated data set was generated from the 6 Gaussian processes obtained from the BHC-SE clustering of the H. sapiens data set, and has the same number of genes, timepoints and per cluster noise levels. Note that, for SSClust, we specified the maximum permissible number of clusters to be 12.
Cooke et al. BMC Bioinformatics 2011 12:399 doi:10.1186/1471-2105-12-399