|
Figure 2.
The Topologies of p53 Single-site and Cluster-site Models.
(a) A Profile Hidden Markov Model (PHMM) contains three hidden states for each position
in a sequence motif of length n: a match state (green squares), an insertion state (orange diamonds), and a delete
state (gray circles). The arrows represent allowed transitions between states and
have associated probabilities. The match and insertion states also have associated
nucleotide emission probabilities. The first and last insertion states (I-0 and I-n)
and their associated transitions (in red) are shown for completeness; they are
not present in the p53 models, where they are replaced by FIM and FEM modules.
(b) The topology of the Finite Emission Module (FEM) of length N can model any distribution of spacer lengths between 1 and N. In
the p53 models, the model and background probabilities within the FEMs are
identical and uniform, so spacer lengths between 1 and N incur no cost; these modules are referred to as "no-cost FEMs".
(c) The topology of the Free Insertion Module (FIM) can model an
exponentially decaying distribution of spacer lengths. However, when the model
and background probabilities are set to the same uniform distribution, the FIM can model a sequence
of any length at no cost to the overall score (hence the word "Free").
(d) The main components of the p53 single-site model are the left and right half-site
PHMMs, which potentially contain corresponding positions between them. These two half-site
models are separated by a no-cost FEM that limits the length of any intervening
spacer sequence to at most 20 bp. The half-site models are also flanked by two FIMs that allow
the Viterbi algorithm to find the best-matching motifs anywhere in the candidate sequences.
(e) The topology of the p53 cluster-site model consists of a single PHMM that models a
general half-site, and two back-transitions that allow an unbounded number
of half-sites to be modeled within the cluster-site. The back-transition through the no-cost FEM-14
module limits the spacer sequence between the half-sites to lengths ≤ 14 bp.
Riley et al. BMC Bioinformatics 2009 10:111 doi:10.1186/1471-2105-10-111 |
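
The "no-cost" idea in panels (b) and (c) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the base-2 logarithm, and the example probabilities are assumptions chosen for illustration. It shows that an emission log-odds score is zero when the model and background distributions are the same uniform distribution, and that a single self-looping insertion state yields a geometrically (exponentially) decaying length distribution.

```python
import math

# Hypothetical uniform nucleotide distribution, used for both model and background.
UNIFORM = {base: 0.25 for base in "ACGT"}


def emission_log_odds(seq, model_probs, background_probs):
    """Sum of per-base log2(model / background) emission scores for a sequence."""
    return sum(math.log2(model_probs[b] / background_probs[b]) for b in seq)


def self_loop_length_distribution(p_stay, max_len):
    """P(length = k), k = 0..max_len, for one state with self-loop probability p_stay.

    Each extra emitted base multiplies the probability by p_stay, which gives
    the exponentially decaying spacer-length distribution.
    """
    return [(1.0 - p_stay) * p_stay ** k for k in range(max_len + 1)]


# With identical uniform model and background probabilities, every emitted
# base contributes log2(1) = 0, so the spacer adds nothing to the score.
spacer_score = emission_log_odds("ACGTTGCA", UNIFORM, UNIFORM)

# A skewed (assumed) model distribution, by contrast, gives a nonzero score.
skewed = {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1}
skewed_score = emission_log_odds("AAAA", skewed, UNIFORM)

# Length distribution of a self-looping state with an assumed p_stay of 0.5.
lengths = self_loop_length_distribution(0.5, 3)
```

The sketch covers emission scores only. How transition probabilities inside the FIM and FEM modules are scored is not stated in the caption, so it is left out here.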