Motif discovery algorithms used in the performance comparison. Nuisance parameters are those that cannot be set precisely without knowledge of the true binding sites (such as motif length, number of occurrences, and orientation). For MotifSampler and wConsensus, the lower end of the range gives the number of required parameters, while the upper end gives the total number of parameters, including "power user" parameters that the program authors stress should typically be left at their defaults. Motif model abbreviations: cons = consensus; PWM = position weight matrix; mis = consensus with a predefined number of allowed non-position-specific mismatches.
| Program | # Nuisance Parameters | Motif Model | Search Strategy | Citation |
|---|---|---|---|---|
| Oligo analysis (RSAT) | 3 | cons | Exhaustive enumeration of short and bipartite oligos. Clusters overlapping motifs. Uses a binomial approximation to the hypergeometric score, similar to the overrepresentation objective function. | [14, 33, 34] |
| Yeast Motif Finder (YMF) | 2 | cons | Exhaustive enumeration of short and bipartite oligos. Alphabet is {ACGTYR}. Uses the Normal approximation to the hypergeometric function, similar to the overrepresentation objective function. | [35] |
| AlignAce (AA) | 2 | PWM | Gibbs sampling to optimize a Maximum a Posteriori (MAP) score. | [36] |
| MotifSampler (MS) | 3–5 | PWM | Gibbs sampling with higher order Markov model. | [37] |
| BioProspector (Biopros) | 7 | PWM | Gibbs sampling with higher order Markov model. Designed for long and bipartite motifs common in prokaryotes. | [16, 38] |
| MEME | 4 | PWM | Expectation Maximization over a modified information content. | [39] |
| Improbizer (Imp) | 8 | PWM | Expectation Maximization. Uses 2nd order Markov model and optionally accounts for positional restrictions using a Gaussian model. | [40] |
| MITRA | 1 | mis | Tree-based search for long bipartite motifs with many mismatches. Uses a hypergeometric score similar to the overrepresentation objective function. | [41] |
| wConsensus (wCons) | 1–13 | PWM | Greedy enumeration to maximize information content. Infers motif length. | [42] |
| Weeder | 4 | mis | Bounded enumeration using a suffix tree. Tries all motif lengths from 6–12. | [43] |
Chakravarty et al. BMC Bioinformatics 2007 8:249 doi:10.1186/1471-2105-8-249 |
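Several programs in the table (Oligo analysis, YMF, MITRA) score candidate motifs with a hypergeometric statistic or an approximation to it. As a rough illustration only, and not the code used by any of these programs, the Python sketch below computes a hypergeometric upper-tail probability together with its binomial and Normal approximations. The function names, the continuity correction, and the example counts are assumptions made for this sketch; none of them come from the cited tools.

```python
import math


def hypergeom_tail(k, N, K, n):
    """P(X >= k): n sequences drawn without replacement from N,
    of which K contain the candidate motif."""
    total = math.comb(N, n)
    upper = min(K, n)
    hits = sum(math.comb(K, i) * math.comb(N - K, n - i)
               for i in range(k, upper + 1))
    return hits / total


def binomial_tail(k, n, p):
    """Binomial approximation: treat the n draws as independent,
    each containing the motif with probability p = K / N."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k, n + 1))


def normal_tail(k, n, p):
    """Normal approximation to the binomial tail, using a
    continuity correction of 0.5 (an assumption of this sketch)."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    z = (k - 0.5 - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))


# Hypothetical counts: 6000 background sequences, 300 of which contain
# the oligo; 50 input sequences, 10 of which contain it.
N, K, n, k = 6000, 300, 50, 10
p = K / N

print("hypergeometric:", hypergeom_tail(k, N, K, n))
print("binomial      :", binomial_tail(k, n, p))
print("normal        :", normal_tail(k, n, p))
```

With an input set that is small relative to the background, the binomial tail should track the exact hypergeometric tail fairly closely. The Normal approximation is cheaper to evaluate but tends to be less reliable far out in the tail, where overrepresented motifs are scored.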