Results from example 2 – Module discovery benchmark. This figure shows the nucleotide-level performance of ModuleSearcher on two of the datasets from the module discovery benchmark. Because ModuleSearcher is based on a non-deterministic algorithm, we ran it 10 times on each dataset. The bars show the average scores, with standard deviations. The “baseline” scores reflect the performance when no pre-processing was performed to filter candidate binding sites; the other scores are for different filtering criteria and combinations thereof. For example, “C10_I10” means that the sites were filtered according to both the “Conservation10” and “Interacting10” criteria, and the other combined labels follow the same pattern. a) The “Sp1-Ets” dataset was one of the hardest in the original benchmark, but filtering sites based on either conservation or the presence of potential interacting sites nearby significantly improves the performance of ModuleSearcher on this dataset. b) For the “liver” dataset we also filtered out binding sites for motifs that were not known to be expressed in liver (“Tissue”) and combined this criterion with different requirements on conservation level (“C10_T” etc.).
Klepper and Drabløs BMC Bioinformatics 2013 14:9 doi:10.1186/1471-2105-14-9
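The per-dataset summary described in the caption (an average over 10 runs, with a standard deviation) can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `summarize_runs` and the score values are hypothetical placeholders, and the use of the sample standard deviation is an assumption, since the caption does not say which estimator was used.

```python
from statistics import mean, stdev


def summarize_runs(scores):
    """Return (mean, sample standard deviation) for scores from repeated runs."""
    if len(scores) < 2:
        raise ValueError("need at least two runs to estimate a standard deviation")
    # stdev() is the sample estimator (n - 1 in the denominator); this choice
    # is an assumption, the caption does not specify it.
    return mean(scores), stdev(scores)


# Placeholder values for illustration only; these are NOT results from the paper.
example_scores = [0.2, 0.4, 0.4, 0.6, 0.4, 0.2, 0.6, 0.4, 0.4, 0.4]

avg, sd = summarize_runs(example_scores)
print(f"mean={avg:.3f} sd={sd:.3f}")
```

The same function would be applied once per filtering criterion (“baseline”, “C10_I10”, and so on) to obtain one bar and one error bar each.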