Table 2

Motif discovery algorithms used in the performance comparison. Nuisance parameters are those that cannot be set precisely without knowledge of the true binding sites (such as motif length, number of occurrences and orientation). For MotifSampler and wConsensus, the lower bound of the range gives the number of required parameters, while the upper bound gives the total number of parameters, including "power user" parameters that the program authors stress should typically be left at their defaults. Motif model abbreviations: cons = consensus; PWM = position weight matrix; mis = consensus with a predefined number of allowed non-position-specific mismatches.

| Program | # Nuisance Parameters | Motif Model | Search Strategy | Citation |
|---|---|---|---|---|
| Oligo analysis (RSAT) | 3 | cons | Exhaustive enumeration of short and bipartite oligos. Clusters overlapping motifs. Uses a binomial approximation to the hypergeometric score, similar to the overrepresentation objective function. | [14, 33, 34] |
| Yeast Motif Finder (YMF) | 2 | cons | Exhaustive enumeration of short and bipartite oligos. Alphabet is {ACGTYR}. Uses the Normal approximation to the hypergeometric function, similar to the overrepresentation objective function. | [35] |
| AlignAce (AA) | 2 | PWM | Gibbs sampling to optimize a Maximum a Posteriori (MAP) score. | [36] |
| MotifSampler (MS) | 3–5 | PWM | Gibbs sampling with higher order Markov model. | [37] |
| BioProspector (Biopros) | 7 | PWM | Gibbs sampling with higher order Markov model. Designed for long and bipartite motifs common in prokaryotes. | [16, 38] |
| MEME | 4 | PWM | Expectation Maximization over a modified information content. | [39] |
| Improbizer (Imp) | 8 | PWM | Expectation Maximization. Uses 2nd order Markov model and optionally accounts for positional restrictions using a Gaussian model. | [40] |
| MITRA | 1 | mis | Tree-based search for long bipartite motifs with many mismatches. Uses a hypergeometric score similar to the overrepresentation objective function. | [41] |
| wConsensus (wCons) | 1–13 | PWM | Greedy enumeration to maximize information content. Infers motif length. | [42] |
| Weeder | 4 | mis | Bounded enumeration using a suffix tree. Tries all motif lengths from 6–12. | [43] |
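
Several entries in the table refer to a hypergeometric score and to a binomial approximation of it. As a rough illustration only, the sketch below compares the two upper-tail probabilities for a motif found in k of n selected sequences, when K of N background sequences contain it. The function names, the use of a tail probability as the score, and the example numbers are assumptions made for this sketch; it is not the scoring code of RSAT, YMF or MITRA, and it may differ from the overrepresentation objective function used in the comparison.

```python
from math import comb


def hypergeom_tail(k, n, K, N):
    """P(X >= k) when n sequences are drawn without replacement
    from N sequences, K of which contain the motif."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def binom_tail(k, n, p):
    """P(X >= k) for n independent draws, each containing the motif
    with probability p (used here as an approximation with p = K / N)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # Hypothetical numbers: 6000 background sequences, 300 contain the oligo,
    # 50 sequences in the input set, 10 of which contain the oligo.
    N, K, n, k = 6000, 300, 50, 10
    print("hypergeometric tail:", hypergeom_tail(k, n, K, N))
    print("binomial tail:      ", binom_tail(k, n, K / N))
```

When the input set is small relative to the background (n much smaller than N), drawing with or without replacement makes little difference, which is why the binomial tail is usually a close and cheaper stand-in for the hypergeometric one.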

Chakravarty et al. BMC Bioinformatics 2007 8:249   doi:10.1186/1471-2105-8-249