<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1748-7188-2-13</ui>
   <ji>1748-7188</ji>
   <fm>
      <dochead>Research</dochead>
      <bibl>
         <title>
            <p>Exact p-value calculation for heterotypic clusters of regulatory motifs and its application in computational annotation of <it>cis</it>-regulatory modules</p>
         </title>
         <aug>
            <au id="A1" ca="yes">
               <snm>Boeva</snm>
               <fnm>Valentina</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
               <email>valeyo@yandex.ru</email>
            </au>
            <au id="A2">
               <snm>Cl&#233;ment</snm>
               <fnm>Julien</fnm>
               <insr iid="I3"/>
               <email>Julien.Clement@info.unicaen.fr</email>
            </au>
            <au id="A3">
               <snm>R&#233;gnier</snm>
               <fnm>Mireille</fnm>
               <insr iid="I2"/>
               <email>Mireille.Regnier@inria.fr</email>
            </au>
            <au id="A4">
               <snm>Roytberg</snm>
               <mi>A</mi>
               <fnm>Mikhail</fnm>
               <insr iid="I4"/>
               <insr iid="I5"/>
               <email>mroytberg@impb.psn.ru</email>
            </au>
            <au id="A5">
               <snm>Makeev</snm>
               <mi>J</mi>
               <fnm>Vsevolod</fnm>
               <insr iid="I1"/>
               <insr iid="I6"/>
               <email>makeev@genetika.ru</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Institute of Genetics and Selection of Industrial Microorganisms, GosNIIGenetika, 117545 Moscow, Russia</p>
            </ins>
            <ins id="I2">
               <p>MIGEC, INRIA Rocquencourt, 78153 Le Chesnay, France</p>
            </ins>
            <ins id="I3">
               <p>GREYC, CNRS UMR 6072, Laboratoire d'informatique, 14032 Caen, France</p>
            </ins>
            <ins id="I4">
               <p>Institute of Mathematical Problems of Biology, Russian Academy of Sciences, Puschino, Moscow Region, Russia</p>
            </ins>
            <ins id="I5">
               <p>Puschino State University, Puschino, Moscow Region, Russia</p>
            </ins>
            <ins id="I6">
               <p>Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia</p>
            </ins>
         </insg>
         <source>Algorithms for Molecular Biology</source>
         <issn>1748-7188</issn>
         <pubdate>2007</pubdate>
         <volume>2</volume>
         <issue>1</issue>
         <fpage>13</fpage>
         <url>http://www.almob.org/content/2/1/13</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">17927813</pubid>
               <pubid idtype="doi">10.1186/1748-7188-2-13</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>13</day>
               <month>7</month>
               <year>2007</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>10</day>
               <month>10</month>
               <year>2007</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>10</day>
               <month>10</month>
               <year>2007</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2007</year>
         <collab>Boeva et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p><it>cis</it>-Regulatory modules (CRMs) of eukaryotic genes often contain multiple binding sites for transcription factors. The phenomenon that binding sites form clusters in CRMs is exploited in many algorithms to locate CRMs in a genome. This gives rise to the problem of calculating the statistical significance of the event that multiple sites, recognized by different factors, would be found simultaneously in a text of a fixed length. The main difficulty comes from overlapping occurrences of motifs. So far, no tools have been developed allowing the computation of <it>p</it>-values for simultaneous occurrences of different motifs which can overlap.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>We developed and implemented an algorithm that computes the <it>p</it>-value of the event that <it>s </it>different motifs occur at least <it>k</it><sub>1</sub>, ..., <it>k</it><sub><it>s </it></sub>times, respectively, with overlaps allowed, in a random text. Motifs can be represented with most of the popular motif models, provided that indels are not allowed. Zero- or first-order Markov chains can be adopted as the model for the random text. The tool was tested on a set of <it>cis</it>-regulatory modules involved in <it>D. melanogaster </it>early development for which transcription factor binding sites have been annotated. The test correctly identified transcription factors that bind to DNA cooperatively or competitively.</p>
            </sec>
            <sec>
               <st>
                  <p>Method</p>
               </st>
               <p>The algorithm, which computes the exact probability of simultaneous motif occurrences, is inspired by the Aho-Corasick automaton and employs a prefix tree together with a transition function. It runs in <it>O</it>(<it>n</it>|&#931;|(<it>m</it>|<inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>| + <it>K</it>|&#931;|<sup><it>K</it></sup>) &#8719;<sub><it>i </it></sub><it>k</it><sub><it>i</it></sub>) time, where <it>n </it>is the length of the text, |&#931;| is the alphabet size, <it>m </it>is the maximal motif length, |<inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>| is the total number of words in the motifs, <it>K </it>is the order of the Markov model, and <it>k</it><sub><it>i </it></sub>is the number of occurrences of the <it>i</it>th motif.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>The primary objective of the program is to assess the likelihood that a given DNA segment is a CRM regulated by a known set of regulatory factors. The program can also be used to select an appropriate threshold for PWM scanning and to assess the similarity of different motifs.</p>
            </sec>
            <sec>
               <st>
                  <p>Availability</p>
               </st>
               <p>Project web page, stand-alone version and documentation can be found at <url>http://bioinform.genetika.ru/AhoPro/</url></p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>During the past few years, a number of computational tools have been designed <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr></abbrgrp> for locating potential <it>transcription factor binding sites </it>(TFBSs) in nucleotide sequences, e.g., in compilations of sequences upstream of putative co-regulated genes. In parallel, experimental approaches were developed <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> that allowed the identification of binding motifs for many different transcription factors. Experimental <abbrgrp><abbr bid="B5">5</abbr></abbrgrp> and bioinformatic <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> studies demonstrated that sequences of regulatory DNA that bind transcription factors can exhibit many different types of architecture. In eukaryotes, TFBSs found in DNA sequences often form rather dense clusters, as was demonstrated by both experimental <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B7">7</abbr></abbrgrp> and computational <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr></abbrgrp> methods. Such clusters can contain sites binding the same factor or several different factors <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. The <it>cis</it>-regulatory module (CRM) then contains, respectively, homotypic or heterotypic clusters of motifs specifically recognized by binding proteins <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>.</p>
         <p>The particular arrangement of motifs in a homotypic or heterotypic cluster is not random, and it is commonly accepted that the motif arrangement within a CRM is important for its functionality <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr><abbr bid="B20">20</abbr></abbrgrp>. Bioinformatics studies indicate that antagonistic factors often bind to overlapping sites <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>, whereas synergistic factors are often positioned within a fixed distance <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>, often close to a multiple of 10.2 bp, the DNA double-helix pitch <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>.</p>
         <p>Non-random arrangements of TFBSs within regulatory segments of DNA sequences are exploited in several TFBS identification tools, and it was observed that cooperativity-based discrimination of TFBSs surpasses the performance of models for individual TFBSs <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>.</p>
         <p>On observing a cluster of TFBSs in some genome segment one can calculate the probability of observing similar site arrangements in a random sequence. This idea of evaluating the statistical significance of heterotypic clusters of sites was implemented in many programs including ClusterDraw <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>, ModuleSearcher <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>, MCAST <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>, eCIS-ANALYST <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>, Cister <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>, Cluster-Buster <abbrgrp><abbr bid="B28">28</abbr></abbrgrp> and TargetExplorer <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>. At the moment, such programs use empirical procedures like motif counting in biological and simulated sequences to assess the significance of observed site clustering. But it is highly desirable to have a good statistical measure of site clustering, and we believe that the best measure is the <it>p</it>-value of obtaining the observed cluster by chance in a random sequence of a Markov or Bernoulli (common name for Markov chain of order 0) type. In the case of heterotypic clusters one needs to take into account possible overlapping occurrences of different motifs, a problem that was considered difficult until now <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. In the case of homotypic clusters, an approximate statistical scoring function was constructed <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B31">31</abbr></abbrgrp>; this approach has been implemented in algorithms like FLYENHANCER <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>, SCORE <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>, and CLUSTER <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>. However, this approximation performs poorly for highly overlapping TFBSs. 
Site overlaps cannot be ignored if the motifs are fuzzy (highly degenerate), which is often the case for so-called "shadow sites" <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. In heterotypic clusters, competing factors can bind to overlapping motifs even when these motifs are very well defined.</p>
         <sec>
            <st>
               <p>Representation of protein binding motifs in nucleotide sequences</p>
            </st>
            <p>Experimental methods for studying protein binding to DNA usually identify some DNA segment, or word in the DNA text, as a probable binding target. A protein can bind to several similar DNA words <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>, the whole collection of which can be called a motif. The simplest motif representation is an enumeration of the sequences that can be bound by a transcription factor (TF) <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>. Sometimes, information about binding sites can be obtained from SELEX <abbrgrp><abbr bid="B36">36</abbr><abbr bid="B37">37</abbr></abbrgrp> or Protein Binding Microarray (PBM) experiments <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>. However, such experiments may not give an exhaustive list of binding site sequences, so one needs to expand the list of putative binding sites using an appropriate criterion, which raises the problem of generalizing from several known examples.</p>
            <p>For instance, several words aligned with mismatches can be generalized to an IUPAC string (such as RSTGACTNMNW for AP-1 binding sites <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>) by disregarding correlated substitutions in different motif positions <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>. Another example of generalization is the set of words that deviate from a consensus word by fewer than a given number of mismatches.</p>
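            <p>As an illustration of these two generalizations, the following minimal Python sketch expands an IUPAC string into the explicit list of words it matches and enumerates the words within a mismatch bound of a consensus. The function names are hypothetical and are not part of the authors' software; the mismatch bound is taken here as "at most", which is an assumption about the convention.</p>
            <p>
```python
from itertools import product

# Standard IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def expand_iupac(pattern):
    """Return the sorted list of DNA words matched by an IUPAC string."""
    choices = [IUPAC[symbol] for symbol in pattern]
    return sorted("".join(word) for word in product(*choices))


def mismatch_neighbourhood(consensus, max_mismatches):
    """Return all words differing from the consensus in at most max_mismatches positions."""
    words = []
    for candidate in product("ACGT", repeat=len(consensus)):
        distance = sum(1 for a, b in zip(candidate, consensus) if a != b)
        if max_mismatches >= distance:
            words.append("".join(candidate))
    return words


# The AP-1 string quoted in the text expands to 2*2*4*2*4*2 = 256 words.
ap1_words = expand_iupac("RSTGACTNMNW")
print(len(ap1_words))
```
            </p>
            <p>Both functions enumerate words exhaustively, so they are only practical for short motifs; they are meant to show what "a motif as a set of words" looks like, not how a production tool would build it.</p>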
            <p>The most popular way to represent binding sites is a Position Weight Matrix (PWM), which is also called a position-specific weight matrix (PSWM) or position-specific scoring matrix (PSSM) <abbrgrp><abbr bid="B41">41</abbr></abbrgrp>. For a motif of length <it>D </it>over an alphabet &#931; with |&#931;| symbols, a PWM is a |&#931;| &#215; <it>D </it>matrix: each row corresponds to a symbol of the alphabet &#931;, and each column to a position in the motif. For DNA texts, one has &#931; = {<it>A</it>, <it>C</it>, <it>G</it>, <it>T</it>}. The PWM score of a substring of length <it>D </it>is defined as <inline-formula><m:math name="1748-7188-2-13-i2" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mstyle displaystyle="true"><m:msubsup><m:mo>&#8721;</m:mo><m:mrow><m:mi>i</m:mi><m:mo>=</m:mo><m:mn>1</m:mn></m:mrow><m:mi>D</m:mi></m:msubsup><m:mrow><m:msub><m:mi>m</m:mi><m:mrow><m:mi>&#969;</m:mi><m:mo stretchy="false">(</m:mo><m:mi>i</m:mi><m:mo stretchy="false">)</m:mo><m:mo>,</m:mo><m:mi>i</m:mi></m:mrow></m:msub></m:mrow></m:mstyle></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaadaaeWaqaaiabd2gaTnaaBaaaleaaiiGacqWFjpWDcqGGOaakcqWGPbqAcqGGPaqkcqGGSaalcqWGPbqAaeqaaaqaaiabdMgaPjabg2da9iabigdaXaqaaiabdYeambqdcqGHris5aaaa@3BC0@</m:annotation></m:semantics></m:math></inline-formula>, where <it>i </it>represents a position in the <it>D</it>-substring, <it>&#969;</it>(<it>i</it>) the symbol at position <it>i </it>in the substring, and <it>m</it><sub><it>&#945;, i </it></sub>the score in row <it>&#945;</it>, column <it>i </it>of the matrix. Thus, given a cutoff value, one obtains the list of words of length <it>D </it>that score higher than this cutoff; these words represent possible DNA binding sites for the protein.</p>
            <p>Any of the three motif representations above can be converted to a list of words. The same is true for many other representations of motifs. In this study, we consider only the motifs that can be represented as a set of words.</p>
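            <p>The conversion of a PWM with a cutoff into a list of words can be sketched as follows. This is a minimal illustration by exhaustive enumeration, assuming a matrix stored as one row of scores per symbol; the names <it>pwm_score</it>, <it>words_above_cutoff</it> and the toy matrix are hypothetical and do not come from the authors' program.</p>
            <p>
```python
from itertools import product

ALPHABET = "ACGT"


def pwm_score(pwm, word):
    """Sum the matrix entries selected by the word, one per motif position."""
    return sum(pwm[symbol][i] for i, symbol in enumerate(word))


def words_above_cutoff(pwm, cutoff):
    """Enumerate every word of the motif length whose score exceeds the cutoff."""
    length = len(pwm["A"])
    hits = []
    for letters in product(ALPHABET, repeat=length):
        word = "".join(letters)
        if pwm_score(pwm, word) > cutoff:
            hits.append(word)
    return hits


# Hypothetical 3-column matrix: rows are symbols, columns are motif positions.
TOY_PWM = {
    "A": [2, 0, 0],
    "C": [0, 2, 0],
    "G": [0, 0, 2],
    "T": [1, 0, 1],
}

print(words_above_cutoff(TOY_PWM, 4))  # ['ACG', 'ACT', 'TCG']
```
            </p>
            <p>With the toy matrix, ACG scores 6 and ACT and TCG score 5, so a cutoff of 4 yields a motif of three words. The enumeration visits all |&#931;|<sup><it>D</it></sup> words and is therefore suitable only for short motifs.</p>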
         </sec>
         <sec>
            <st>
               <p>P-value for clusters of motif occurrences, problem formulation</p>
            </st>
            <p>The objective of this work is to develop a statistical criterion to assess clustering of TFBS. Intuitively, a TFBS cluster is a DNA segment simultaneously containing "too many" TFBSs for given factor proteins; such a segment can often operate as a CRM regulated by these TFs. From a formal point of view, the problem we address here is as follows. Let <it>s </it>sets of words <inline-formula><m:math name="1748-7188-2-13-i3" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>&#8459;</m:mi><m:mn>1</m:mn></m:msub><m:mo>,</m:mo><m:mn>...</m:mn><m:mo>,</m:mo><m:msub><m:mi>&#8459;</m:mi><m:mi>s</m:mi></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsdaWgaaWcbaGaeGymaedabeaakiabcYcaSiabc6caUiabc6caUiabc6caUiabcYcaSiab=TqiinaaBaaaleaacqWGZbWCaeqaaaaa@3F88@</m:annotation></m:semantics></m:math></inline-formula> be given. Typically, each set <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula><sub><it>i </it></sub>is associated with a TF motif. Given an <it>s</it>-tuple of integers (<it>k</it><sub>1</sub>, ..., <it>k</it><sub><it>s</it></sub>), we compute the corresponding <it>p</it>-value, that is, the probability of finding at least <it>k</it><sub><it>i </it></sub>occurrences of words from each set <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula><sub><it>i </it></sub>in a random text of size <it>n</it>. We assume that the texts where motifs are searched are randomly generated by a Bernoulli process or a Markov model of order <it>K</it>. If (<it>k</it><sub>1</sub>, ..., <it>k</it><sub><it>s</it></sub>) occurrences of motifs <inline-formula><m:math name="1748-7188-2-13-i3" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>&#8459;</m:mi><m:mn>1</m:mn></m:msub><m:mo>,</m:mo><m:mn>...</m:mn><m:mo>,</m:mo><m:msub><m:mi>&#8459;</m:mi><m:mi>s</m:mi></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsdaWgaaWcbaGaeGymaedabeaakiabcYcaSiabc6caUiabc6caUiabc6caUiabcYcaSiab=TqiinaaBaaaleaacqWGZbWCaeqaaaaa@3F88@</m:annotation></m:semantics></m:math></inline-formula> are found in a DNA segment, the <it>p</it>-value can be used to infer if such numbers of occurrences could be found by chance.</p>
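            <p>To make the problem statement concrete, the sketch below computes this <it>p</it>-value by brute force for a Bernoulli text: it enumerates every text of length <it>n</it>, counts possibly overlapping occurrences of each motif, and sums the probabilities of the texts that meet all thresholds. It is only a reference definition for very small <it>n</it>, not the algorithm developed in this work, and all names in it are hypothetical.</p>
            <p>
```python
from itertools import product


def count_occurrences(text, words):
    """Count occurrences of any word from the set, overlaps included."""
    return sum(
        1
        for start in range(len(text))
        for word in words
        if text.startswith(word, start)
    )


def brute_force_pvalue(motifs, thresholds, n, probs):
    """Sum the probabilities of all length-n texts that meet every threshold."""
    total = 0.0
    for letters in product(sorted(probs), repeat=n):
        text = "".join(letters)
        if all(count_occurrences(text, words) >= k
               for words, k in zip(motifs, thresholds)):
            # Bernoulli model: symbols are independent, so probabilities multiply.
            weight = 1.0
            for symbol in text:
                weight *= probs[symbol]
            total += weight
    return total


uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
# Two single-word motifs that can overlap: AC and CG, each required once.
p = brute_force_pvalue([{"AC"}, {"CG"}], [1, 1], 3, uniform)
print(p)  # 0.015625
```
            </p>
            <p>In this toy case the only text of length 3 containing both AC and CG is ACG, where the two occurrences overlap, so the <it>p</it>-value is 1/64. The number of enumerated texts grows as |&#931;|<sup><it>n</it></sup>, which is why an efficient exact algorithm is needed for realistic text lengths.</p>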
         </sec>
      </sec>
      <sec>
         <st>
            <p>Related work</p>
         </st>
         <p>Most previous work addresses counting problems for one set of several words <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>. In contrast, in this paper we deal with a separate counting for several sets of several words <inline-formula><m:math name="1748-7188-2-13-i3" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>&#8459;</m:mi><m:mn>1</m:mn></m:msub><m:mo>,</m:mo><m:mn>...</m:mn><m:mo>,</m:mo><m:msub><m:mi>&#8459;</m:mi><m:mi>s</m:mi></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsdaWgaaWcbaGaeGymaedabeaakiabcYcaSiabc6caUiabc6caUiabc6caUiabcYcaSiab=TqiinaaBaaaleaacqWGZbWCaeqaaaaa@3F88@</m:annotation></m:semantics></m:math></inline-formula>, each set <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula><sub><it>j </it></sub>represents one TFBS motif.</p>
         <p>All methods of solving the problem of <it>p</it>-value calculations for multiple occurrences of words from a set <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula> study some basic languages. Let <it>L</it><sub><it>n </it></sub>(<inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>; <it>k</it>) be the set of texts of length <it>n </it>containing at least <it>k </it>occurrences of <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>. The desired <it>p</it>-value would therefore be the probability <b>P </b>(<it>L</it><sub><it>n </it></sub>(<inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>; <it>k</it>)). Let <inline-formula><m:math name="1748-7188-2-13-i4" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#8475;</m:mi><m:mi>&#8459;</m:mi><m:mi>k</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFBeIudaqhaaWcbaGae83cHGeabaGaem4AaSgaaaaa@3A01@</m:annotation></m:semantics></m:math></inline-formula> be the set of texts of all lengths that contain exactly <it>k </it>words of <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>, the last one occurring as a suffix <abbrgrp><abbr bid="B42">42</abbr></abbrgrp>. For any H<sub><it>j </it></sub>in <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>, let <inline-formula><m:math name="1748-7188-2-13-i5" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#8475;</m:mi><m:mrow><m:msub><m:mtext>H</m:mtext><m:mi>j</m:mi></m:msub></m:mrow><m:mi>k</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFBeIudaqhaaWcbaGaeeisaG0aaSbaaWqaaiabdQgaQbqabaaaleaacqWGRbWAaaaaaa@3BB4@</m:annotation></m:semantics></m:math></inline-formula> be the subset of <inline-formula><m:math name="1748-7188-2-13-i4" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#8475;</m:mi><m:mi>&#8459;</m:mi><m:mi>k</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFBeIudaqhaaWcbaGae83cHGeabaGaem4AaSgaaaaa@3A01@</m:annotation></m:semantics></m:math></inline-formula> where H<sub><it>j </it></sub>is a suffix. One observes that a text contains at least <it>k </it>occurrences if and only if it admits a prefix in <inline-formula><m:math name="1748-7188-2-13-i6" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#8475;</m:mi><m:mi>&#8459;</m:mi><m:mi>k</m:mi></m:msubsup><m:mo>=</m:mo><m:msub><m:mo>&#8746;</m:mo><m:mrow><m:msub><m:mtext>H</m:mtext><m:mi>j</m:mi></m:msub><m:mo>&#8712;</m:mo><m:mi>&#8459;</m:mi></m:mrow></m:msub><m:msubsup><m:mi>&#8475;</m:mi><m:mrow><m:msub><m:mtext>H</m:mtext><m:mi>j</m:mi></m:msub></m:mrow><m:mi>k</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFBeIudaqhaaWcbaGae83cHGeabaGaem4AaSgaaOGaeyypa0JaeSOkIu1aaSbaaSqaaiabbIeainaaBaaameaacqWGQbGAaeqaaSGaeyicI4Sae83cHGeabeaakiab=TrisnaaDaaaleaacqqGibasdaWgaaadbaGaemOAaOgabeaaaSqaaiabdUgaRbaaaaa@46ED@</m:annotation></m:semantics></m:math></inline-formula>. One defines <inline-formula><m:math name="1748-7188-2-13-i7" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>r</m:mi><m:mi>j</m:mi><m:mi>k</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGYbGCdaqhaaWcbaGaemOAaOgabaGaem4AaSgaaaaa@3102@</m:annotation></m:semantics></m:math></inline-formula> (<it>p</it>) as the probability that a text of size <it>p </it>belongs to the set <inline-formula><m:math name="1748-7188-2-13-i5" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#8475;</m:mi><m:mrow><m:msub><m:mtext>H</m:mtext><m:mi>j</m:mi></m:msub></m:mrow><m:mi>k</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFBeIudaqhaaWcbaGaeeisaG0aaSbaaWqaaiabdQgaQbqabaaaleaacqWGRbWAaaaaaa@3BB4@</m:annotation></m:semantics></m:math></inline-formula>. If no word in <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula> is a subword of another word in <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>, the probability <b>P </b>(<it>L</it><sub><it>n </it></sub>(<inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>; <it>k</it>)) to find at least <it>k </it>occurrences of words from <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula> in a random text of length <it>n </it>satisfies</p>
         <p>
            <display-formula>
               <m:math name="1748-7188-2-13-i8" xmlns:m="http://www.w3.org/1998/Math/MathML">
                  <m:semantics>
                     <m:mrow>
                        <m:mi>P</m:mi>
                        <m:mo stretchy="false">(</m:mo>
                        <m:msub>
                           <m:mi>L</m:mi>
                           <m:mi>n</m:mi>
                        </m:msub>
                        <m:mo stretchy="false">(</m:mo>
                        <m:mi>&#8459;</m:mi>
                        <m:mo>;</m:mo>
                        <m:mi>k</m:mi>
                        <m:mo stretchy="false">)</m:mo>
                        <m:mo stretchy="false">)</m:mo>
                        <m:mo>=</m:mo>
                        <m:mstyle displaystyle="true">
                           <m:munder>
                              <m:mo>&#8721;</m:mo>
                              <m:mrow>
                                 <m:mi>p</m:mi>
                                 <m:mo>&#8804;</m:mo>
                                 <m:mi>n</m:mi>
                              </m:mrow>
                           </m:munder>
                           <m:mrow>
                              <m:mstyle displaystyle="true">
                                 <m:munder>
                                    <m:mo>&#8721;</m:mo>
                                    <m:mrow>
                                       <m:msub>
                                          <m:mtext>H</m:mtext>
                                          <m:mi>j</m:mi>
                                       </m:msub>
                                       <m:mo>&#8712;</m:mo>
                                       <m:mi>&#8459;</m:mi>
                                    </m:mrow>
                                 </m:munder>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>r</m:mi>
                                       <m:mi>j</m:mi>
                                       <m:mi>k</m:mi>
                                    </m:msubsup>
                                    <m:mo stretchy="false">(</m:mo>
                                    <m:mi>p</m:mi>
                                    <m:mo stretchy="false">)</m:mo>
                                 </m:mrow>
                              </m:mstyle>
                           </m:mrow>
                        </m:mstyle>
                     </m:mrow>
                     <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaaieqacqWFqbaucqGGOaakcqWGmbatdaWgaaWcbaGaemOBa4gabeaakiabcIcaOmrtHrhAL1wy0L2yHvtyaeHbnfgDOvwBHrxAJfwnaGabaiab+TqiijabcUda7iabdUgaRjabcMcaPiabcMcaPiabg2da9maaqafabaWaaabuaeaacqWGYbGCdaqhaaWcbaGaemOAaOgabaGaem4AaSgaaOGaeiikaGIaemiCaaNaeiykaKcaleaacqqGibasdaWgaaadbaGaemOAaOgabeaaliabgIGiolab+Tqiibqab0GaeyyeIuoaaSqaaiabdchaWjabgsMiJkabd6gaUbqab0GaeyyeIuoaaaa@577F@</m:annotation>
                  </m:semantics>
               </m:math>
            </display-formula>
         </p>
         <p>Therefore, one tries to compute the sequence of (<inline-formula><m:math name="1748-7188-2-13-i7" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>r</m:mi><m:mi>j</m:mi><m:mi>k</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGYbGCdaqhaaWcbaGaemOAaOgabaGaem4AaSgaaaaa@3102@</m:annotation></m:semantics></m:math></inline-formula> (<it>p</it>)) values.</p>
         <sec>
            <st>
               <p>Linear induction</p>
            </st>
            <p>In the first class of methods <abbrgrp><abbr bid="B43">43</abbr><abbr bid="B44">44</abbr><abbr bid="B45">45</abbr><abbr bid="B46">46</abbr></abbrgrp>, one computes, implicitly or explicitly, probabilities <b>P </b>(<it>L</it><sub><it>n </it></sub>(<inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>; <it>k</it>)) up to a given text length <it>n</it>. Such methods are intrinsically linear in <it>n</it>. In <abbrgrp><abbr bid="B43">43</abbr><abbr bid="B44">44</abbr><abbr bid="B45">45</abbr><abbr bid="B46">46</abbr></abbrgrp> one relies on a recurrence relation on <inline-formula><m:math name="1748-7188-2-13-i7" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>r</m:mi><m:mi>j</m:mi><m:mi>k</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGYbGCdaqhaaWcbaGaemOAaOgabaGaem4AaSgaaaaa@3102@</m:annotation></m:semantics></m:math></inline-formula> (<it>n</it>) that extends the one originally given in <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. Typically, one step will cost <it>O </it>(|<inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>|<it>m</it>), where <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula> is a set of words of length <it>m </it>and |<inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>| is its cardinality. Time complexity is <it>O </it>(<it>n</it>|<inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>|<it>m</it>) and, relying on a combinatorial property, <abbrgrp><abbr bid="B44">44</abbr></abbrgrp> achieves optimal space complexity <it>O </it>(|<inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>| log |<inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>|<it>m</it>). However, the authors of <abbrgrp><abbr bid="B44">44</abbr></abbrgrp> do not consider occurrences of several motifs and restrict themselves to the Bernoulli model. The authors of <abbrgrp><abbr bid="B43">43</abbr></abbrgrp> consider the Markov model, but still use a single motif for TFBS.</p>
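            <p>To illustrate why such computations are linear in <it>n</it>, the following minimal Python sketch advances a table of probabilities by one text position per step. It is not the recurrence used in the methods cited above: it assumes a Bernoulli model, words of equal length <it>m</it>, and a naive state made of the last <it>m </it>- 1 letters together with the number of occurrences seen so far (capped at <it>k</it>), so the cost of one step grows exponentially with <it>m</it>. The function name prob_at_least_k and its parameters are hypothetical.</p>

```python
from collections import defaultdict


def prob_at_least_k(words, k, n, letter_probs):
    """Probability that a Bernoulli text of length n contains at least k
    possibly overlapping occurrences of the given equal-length words."""
    word_set = set(words)
    m = len(next(iter(word_set)))
    if any(len(word) != m for word in word_set):
        raise ValueError("this sketch assumes words of equal length")
    # State: (last m-1 letters read, occurrences seen so far, capped at k).
    table = {("", 0): 1.0}
    for _ in range(n):  # one induction step per text position
        new_table = defaultdict(float)
        for (suffix, count), prob in table.items():
            for letter, letter_prob in letter_probs.items():
                window = suffix + letter
                # Only a full window of length m can be equal to a word.
                hit = int(window in word_set)
                new_count = min(k, count + hit)
                new_suffix = window[max(0, len(window) - (m - 1)):]
                new_table[(new_suffix, new_count)] += prob * letter_prob
        table = new_table
    return sum(prob for (_, count), prob in table.items() if count == k)
```

            <p>For a uniform four-letter alphabet, the word AA and a text of length 3, the call prob_at_least_k({"AA"}, 1, 3, uniform) sums the probabilities of the 7 texts out of 64 that contain AA. The cited methods reach the same kind of result with a much smaller state, which is what yields their stated per-step cost.</p>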
         </sec>
         <sec>
            <st>
               <p>Algebraic Formulae</p>
            </st>
            <p>In a second class of methods <abbrgrp><abbr bid="B47">47</abbr><abbr bid="B48">48</abbr><abbr bid="B49">49</abbr><abbr bid="B50">50</abbr><abbr bid="B51">51</abbr><abbr bid="B52">52</abbr></abbrgrp>, a preprocessing computes <it>generating functions</it></p>
            <p>
               <display-formula>
                  <m:math name="1748-7188-2-13-i9" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:msubsup>
                              <m:mi>r</m:mi>
                              <m:mi>j</m:mi>
                              <m:mi>k</m:mi>
                           </m:msubsup>
                           <m:mo stretchy="false">(</m:mo>
                           <m:mi>z</m:mi>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo>=</m:mo>
                           <m:mstyle displaystyle="true">
                              <m:munder>
                                 <m:mo>&#8721;</m:mo>
                                 <m:mi>n</m:mi>
                              </m:munder>
                              <m:mrow>
                                 <m:msubsup>
                                    <m:mi>r</m:mi>
                                    <m:mi>j</m:mi>
                                    <m:mi>k</m:mi>
                                 </m:msubsup>
                                 <m:mo stretchy="false">(</m:mo>
                                 <m:mi>n</m:mi>
                                 <m:mo stretchy="false">)</m:mo>
                                 <m:msup>
                                    <m:mi>z</m:mi>
                                    <m:mi>n</m:mi>
                                 </m:msup>
                              </m:mrow>
                           </m:mstyle>
                           <m:mo>.</m:mo>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGYbGCdaqhaaWcbaGaemOAaOgabaGaem4AaSgaaOGaeiikaGIaemOEaONaeiykaKIaeyypa0ZaaabuaeaacqWGYbGCdaqhaaWcbaGaemOAaOgabaGaem4AaSgaaOGaeiikaGIaemOBa4MaeiykaKIaemOEaO3aaWbaaSqabeaacqWGUbGBaaaabaGaemOBa4gabeqdcqGHris5aOGaeiOla4caaa@4432@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>In a second step, probabilities <b>P </b>(<it>L</it><sub><it>n </it></sub>(<inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>; <it>k</it>)) are either extracted from the generating function or approximated.</p>
            <p>In <abbrgrp><abbr bid="B49">49</abbr><abbr bid="B53">53</abbr></abbrgrp>, <inline-formula><m:math name="1748-7188-2-13-i7" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>r</m:mi><m:mi>j</m:mi><m:mi>k</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGYbGCdaqhaaWcbaGaemOAaOgabaGaem4AaSgaaaaa@3102@</m:annotation></m:semantics></m:math></inline-formula> (<it>z</it>) are the solutions of a system of equations. To derive these equations, the authors build an automaton that recognizes these languages <inline-formula><m:math name="1748-7188-2-13-i5" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#8475;</m:mi><m:mrow><m:msub><m:mtext>H</m:mtext><m:mi>j</m:mi></m:msub></m:mrow><m:mi>k</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFBeIudaqhaaWcbaGaeeisaG0aaSbaaWqaaiabdQgaQbqabaaaleaacqWGRbWAaaaaaa@3BB4@</m:annotation></m:semantics></m:math></inline-formula> (one can prove that they are regular).</p>
            <p>A language approach <abbrgrp><abbr bid="B50">50</abbr></abbrgrp> or an induction <abbrgrp><abbr bid="B48">48</abbr></abbrgrp> leads to a formal expression that depends on the word overlaps. The main drawback is that these methods need to compute the determinant of a polynomial matrix of very large dimension, e.g. <it>O </it>(|<inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>|). This <it>O </it>(|<inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>|<sup>2</sup>) <it>symbolic computation </it>may be more expensive than the extraction step or the linear computation above, which involve <it>arithmetic operations </it>on real numbers.</p>
            <p>When the preprocessing step is achievable, the extraction step reduces to solving a linear recurrence of degree <it>m</it>|<inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>|; therefore, its complexity is <it>O </it>(<it>m</it>|<inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>|<it>n</it>) and a classical optimization yields <it>O </it>(<it>m</it>|<inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>| log <it>n</it>). Numerically stable implementations exist, such as the REGEXPCOUNT <abbrgrp><abbr bid="B54">54</abbr></abbrgrp> and EXCEP <abbrgrp><abbr bid="B55">55</abbr></abbrgrp> programs, which rely on the Fast Fourier Transform.</p>
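            <p>One standard way to evaluate a single term of a linear recurrence in a number of steps logarithmic in <it>n </it>is binary powering of its companion matrix. The Python sketch below illustrates this idea only; it is not necessarily the optimization referred to above, nor the method of the cited programs. In this simple form, a recurrence of degree <it>d </it>costs <it>O </it>(<it>d</it><sup>3 </sup>log <it>n</it>) arithmetic operations, which is weaker than the bound quoted above. The names mat_mul and nth_term are hypothetical, and exact rational arithmetic is an assumption made for readability.</p>

```python
from fractions import Fraction


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    size = len(a)
    return [
        [sum(a[i][t] * b[t][j] for t in range(size)) for j in range(size)]
        for i in range(size)
    ]


def nth_term(coeffs, initial, n):
    """Return u(n) for u(i) = coeffs[0]*u(i-1) + ... + coeffs[d-1]*u(i-d),
    where initial = [u(0), ..., u(d-1)]."""
    d = len(coeffs)
    if 0 > n:
        raise ValueError("n must be non-negative")
    if n in range(d):
        return Fraction(initial[n])
    # Companion matrix: maps (u(i-1), ..., u(i-d)) to (u(i), ..., u(i-d+1)).
    step = [[Fraction(c) for c in coeffs]]
    for row in range(1, d):
        step.append([Fraction(int(col == row - 1)) for col in range(d)])
    # Binary powering: result becomes step raised to the power n - d + 1.
    result = [[Fraction(int(i == j)) for j in range(d)] for i in range(d)]
    exponent = n - d + 1
    while exponent:
        if exponent % 2:
            result = mat_mul(result, step)
        step = mat_mul(step, step)
        exponent //= 2
    state = [Fraction(v) for v in reversed(initial)]  # (u(d-1), ..., u(0))
    return sum(result[0][j] * state[j] for j in range(d))
```

            <p>As a toy usage example, the probability <it>q </it>(<it>n</it>) that a uniform random text on a two-letter alphabet {A, B} contains no occurrence of AA satisfies <it>q </it>(<it>n</it>) = <it>q </it>(<it>n </it>- 1)/2 + <it>q </it>(<it>n </it>- 2)/4 with <it>q </it>(0) = <it>q </it>(1) = 1, so that nth_term([Fraction(1, 2), Fraction(1, 4)], [1, 1], 3) returns 5/8.</p>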
            <p>Finally, approximations are available, the computation of which is constant with respect to <it>n</it>, but not to <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>. One approach is the compound Poisson approximation <abbrgrp><abbr bid="B56">56</abbr></abbrgrp>, but this approximation is not precise enough <abbrgrp><abbr bid="B57">57</abbr></abbrgrp>. Asymptotic results can also be derived from the algebraic formulae above <abbrgrp><abbr bid="B44">44</abbr><abbr bid="B58">58</abbr></abbrgrp>, not needing an explicit expression for <inline-formula><m:math name="1748-7188-2-13-i7" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>r</m:mi><m:mi>j</m:mi><m:mi>k</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGYbGCdaqhaaWcbaGaemOAaOgabaGaem4AaSgaaaaa@3102@</m:annotation></m:semantics></m:math></inline-formula> (<it>z</it>), and therefore avoiding the expensive determinant computation. Time complexity, typically, is the one for computing all possible overlaps, that is approximately <it>O </it>(|<inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>|<sup>2</sup>). This yields extremely precise results when the expected number of occurrences, <it>nP </it>(H), is very small <abbrgrp><abbr bid="B59">59</abbr></abbrgrp> or close to 1 <abbrgrp><abbr bid="B51">51</abbr></abbrgrp> (the most frequently studied case). The case <it>nP </it>(H) ~ 2 is treated in <abbrgrp><abbr bid="B60">60</abbr></abbrgrp>. Nevertheless, the extension to larger values of <it>k </it>or to multioccurrences and multisets is still open.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <p>Here we describe our approach in detail.</p>
         <p>A motif assigned to a TF is a finite set of words <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula> = (H<sub>1</sub>, ..., H<sub>r</sub>) where each word represents one putative TF binding site in DNA. Note that the words in a motif can, in general, have different lengths. However, no word from <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula> can contain another word from <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula> as a substring. We consider, as an occurrence of motif <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula> in text <it>T</it>, any occurrence of any word <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula><sub><it>j </it></sub>&#8712; <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula> in <it>T</it>. Below all texts and words in motifs are sequences on a given alphabet &#931;.</p>
         <p>Let (<inline-formula><m:math name="1748-7188-2-13-i3" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>&#8459;</m:mi><m:mn>1</m:mn></m:msub><m:mo>,</m:mo><m:mn>...</m:mn><m:mo>,</m:mo><m:msub><m:mi>&#8459;</m:mi><m:mi>s</m:mi></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsdaWgaaWcbaGaeGymaedabeaakiabcYcaSiabc6caUiabc6caUiabc6caUiabcYcaSiab=TqiinaaBaaaleaacqWGZbWCaeqaaaaa@3F88@</m:annotation></m:semantics></m:math></inline-formula>) be <it>s </it>different motifs. Our objective is to calculate the probability (<it>p</it>-value) that motifs (<inline-formula><m:math name="1748-7188-2-13-i3" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>&#8459;</m:mi><m:mn>1</m:mn></m:msub><m:mo>,</m:mo><m:mn>...</m:mn><m:mo>,</m:mo><m:msub><m:mi>&#8459;</m:mi><m:mi>s</m:mi></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsdaWgaaWcbaGaeGymaedabeaakiabcYcaSiabc6caUiabc6caUiabc6caUiabcYcaSiab=TqiinaaBaaaleaacqWGZbWCaeqaaaaa@3F88@</m:annotation></m:semantics></m:math></inline-formula>) have respectively at least (<it>k</it><sub>1</sub>, ..., <it>k</it><sub><it>s</it></sub>) possibly overlapping occurrences in a random text <it>T</it><sub><it>n</it></sub>.</p>
         <p>To be more precise, there is a probability distribution defined on the set &#931;<sup><it>n </it></sup>of all texts of length <it>n </it>in the alphabet &#931;; the most widely used models are random Bernoulli trials and a Markov model of order <it>K</it>. Denote as <it>L</it><sub><it>n </it></sub>(<inline-formula><m:math name="1748-7188-2-13-i3" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>&#8459;</m:mi><m:mn>1</m:mn></m:msub><m:mo>,</m:mo><m:mn>...</m:mn><m:mo>,</m:mo><m:msub><m:mi>&#8459;</m:mi><m:mi>s</m:mi></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsdaWgaaWcbaGaeGymaedabeaakiabcYcaSiabc6caUiabc6caUiabc6caUiabcYcaSiab=TqiinaaBaaaleaacqWGZbWCaeqaaaaa@3F88@</m:annotation></m:semantics></m:math></inline-formula>; <it>k</it><sub>1</sub>, ..., <it>k</it><sub><it>s</it></sub>) the set of all texts of length <it>n </it>containing at least <it>k</it><sub><it>i </it></sub>possibly overlapping occurrences of each motif <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula><sub><it>i</it></sub>; <it>i </it>= 1, ..., <it>s</it>. Then the desired <it>p</it>-value is the probability <b>P </b>(<it>L</it><sub><it>n </it></sub>(<inline-formula><m:math name="1748-7188-2-13-i3" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>&#8459;</m:mi><m:mn>1</m:mn></m:msub><m:mo>,</m:mo><m:mn>...</m:mn><m:mo>,</m:mo><m:msub><m:mi>&#8459;</m:mi><m:mi>s</m:mi></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsdaWgaaWcbaGaeGymaedabeaakiabcYcaSiabc6caUiabc6caUiabc6caUiabcYcaSiab=TqiinaaBaaaleaacqWGZbWCaeqaaaaa@3F88@</m:annotation></m:semantics></m:math></inline-formula>; <it>k</it><sub>1</sub>, ..., <it>k</it><sub><it>s</it></sub>)) of the set <it>L</it><sub><it>n </it></sub>(<inline-formula><m:math name="1748-7188-2-13-i3" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>&#8459;</m:mi><m:mn>1</m:mn></m:msub><m:mo>,</m:mo><m:mn>...</m:mn><m:mo>,</m:mo><m:msub><m:mi>&#8459;</m:mi><m:mi>s</m:mi></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsdaWgaaWcbaGaeGymaedabeaakiabcYcaSiabc6caUiabc6caUiabc6caUiabcYcaSiab=TqiinaaBaaaleaacqWGZbWCaeqaaaaa@3F88@</m:annotation></m:semantics></m:math></inline-formula>; <it>k</it><sub>1</sub>, ..., <it>k</it><sub><it>s</it></sub>) with respect to the given probability distribution on &#931;<sup><it>n</it></sup>.</p>
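         <p>For very short texts, this definition can be checked directly by enumerating all of &#931;<sup><it>n</it></sup>. The following Python sketch does so under the assumption of a Bernoulli model; it is meant only as a reference against which a faster method can be tested, not as the algorithm proposed here, and its cost is exponential in <it>n</it>. The function names count_occurrences and brute_force_p_value, and the representation of a motif as a set of strings, are hypothetical.</p>

```python
from itertools import product


def count_occurrences(text, motif):
    """Count possibly overlapping occurrences in text of any word of the motif."""
    return sum(
        1
        for word in motif
        for start in range(len(text) - len(word) + 1)
        if text.startswith(word, start)
    )


def brute_force_p_value(motifs, thresholds, n, letter_probs):
    """Sum the Bernoulli probabilities of all texts of length n that contain
    at least thresholds[i] occurrences of motifs[i], for every i."""
    alphabet = sorted(letter_probs)
    total = 0.0
    for letters in product(alphabet, repeat=n):
        text = "".join(letters)
        if all(count_occurrences(text, h) >= k for h, k in zip(motifs, thresholds)):
            prob = 1.0
            for letter in letters:
                prob *= letter_probs[letter]
            total += prob
    return total
```

         <p>With a uniform distribution on the four-letter DNA alphabet, two single-word motifs {A} and {C}, thresholds (1, 1) and <it>n </it>= 2, only the texts AC and CA qualify, so the sketch returns 2/16.</p>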
         <p>Our approach to the calculation of this <it>p</it>-value is similar to that published in <abbrgrp><abbr bid="B61">61</abbr></abbrgrp>, which was used there to calculate seed sensitivity in local alignment search. The approach exploits the fact that the algorithm of Aho and Corasick <abbrgrp><abbr bid="B62">62</abbr></abbrgrp> can be modified to efficiently determine whether a given text belongs to the set <it>L</it><sub><it>n </it></sub>(<inline-formula><m:math name="1748-7188-2-13-i3" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>&#8459;</m:mi><m:mn>1</m:mn></m:msub><m:mo>,</m:mo><m:mn>...</m:mn><m:mo>,</m:mo><m:msub><m:mi>&#8459;</m:mi><m:mi>s</m:mi></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsdaWgaaWcbaGaeGymaedabeaakiabcYcaSiabc6caUiabc6caUiabc6caUiabcYcaSiab=TqiinaaBaaaleaacqWGZbWCaeqaaaaa@3F88@</m:annotation></m:semantics></m:math></inline-formula>; <it>k</it><sub>1</sub>, ..., <it>k</it><sub><it>s</it></sub>) or not. The ideas published in <abbrgrp><abbr bid="B61">61</abbr></abbrgrp> and <abbrgrp><abbr bid="B62">62</abbr></abbrgrp> can be adapted to compute the probability <b>P </b>(<it>L</it><sub><it>n </it></sub>(<inline-formula><m:math name="1748-7188-2-13-i3" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>&#8459;</m:mi><m:mn>1</m:mn></m:msub><m:mo>,</m:mo><m:mn>...</m:mn><m:mo>,</m:mo><m:msub><m:mi>&#8459;</m:mi><m:mi>s</m:mi></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsdaWgaaWcbaGaeGymaedabeaakiabcYcaSiabc6caUiabc6caUiabc6caUiabcYcaSiab=TqiinaaBaaaleaacqWGZbWCaeqaaaaa@3F88@</m:annotation></m:semantics></m:math></inline-formula>; <it>k</it><sub>1</sub>, ..., <it>k</it><sub><it>s</it></sub>)) that the random text <it>T</it><sub><it>n </it></sub>&#8712; &#931;<sup><it>n </it></sup>belongs to the set <it>L</it><sub><it>n </it></sub>(<inline-formula><m:math name="1748-7188-2-13-i3" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>&#8459;</m:mi><m:mn>1</m:mn></m:msub><m:mo>,</m:mo><m:mn>...</m:mn><m:mo>,</m:mo><m:msub><m:mi>&#8459;</m:mi><m:mi>s</m:mi></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsdaWgaaWcbaGaeGymaedabeaakiabcYcaSiabc6caUiabc6caUiabc6caUiabcYcaSiab=TqiinaaBaaaleaacqWGZbWCaeqaaaaa@3F88@</m:annotation></m:semantics></m:math></inline-formula>; <it>k</it><sub>1</sub>, ..., <it>k</it><sub><it>s</it></sub>).</p>
         <p>We start from the simplest case of one motif <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula> for which we calculate the probability <b>P </b>(<it>L</it><sub><it>n </it></sub>(<inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>; 1)) that text <it>T</it><sub><it>n </it></sub>contains at least one occurrence of the motif with respect to a Bernoulli probability distribution. More complicated cases (arbitrary number of occurrences; arbitrary number of motifs; Markov distribution) will be discussed in the following sections.</p>
         <sec>
            <st>
               <p>Construction of Aho-Corasick traversal</p>
            </st>
            <p>Aho and Corasick <abbrgrp><abbr bid="B62">62</abbr></abbrgrp> proposed an algorithm that determines whether a given text <it>T </it>contains an occurrence of a word from a given set <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>. The basic data structure is a prefix tree which is a variant of the classical trie <inline-formula><m:math name="1748-7188-2-13-i10" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mi mathvariant="script">T</m:mi><m:mo stretchy="false">(</m:mo><m:mi>&#8459;</m:mi><m:mo stretchy="false">)</m:mo></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFtepvcqGGOaakcqWFlecscqGGPaqkaaa@3AF1@</m:annotation></m:semantics></m:math></inline-formula><abbrgrp><abbr bid="B42">42</abbr></abbrgrp> that may be built on the set of words <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>. Let <inline-formula><m:math name="1748-7188-2-13-i11" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>Q</m:mi><m:mi>&#8459;</m:mi></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGrbqudaWgaaWcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae83cHGeabeaaaaa@38B9@</m:annotation></m:semantics></m:math></inline-formula> denote the set of prefixes of these words. In the following, we identify a word <it>q </it>&#8712; <inline-formula><m:math name="1748-7188-2-13-i11" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>Q</m:mi><m:mi>&#8459;</m:mi></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGrbqudaWgaaWcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae83cHGeabeaaaaa@38B9@</m:annotation></m:semantics></m:math></inline-formula> with node <it>Node </it>(<it>q</it>) at the end of the branch labeled by <it>q</it>. In particular, the root is identified with the empty string <it>&#949;</it>. The length of a prefix is the depth of <it>Node </it>(<it>q</it>).</p>
            <p>The classic Aho-Corasick algorithm is a tree traversal determined by a <it>transition function </it><inline-formula><m:math name="1748-7188-2-13-i12" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mi>&#948;</m:mi><m:mo>:</m:mo><m:msub><m:mi>Q</m:mi><m:mi>&#8459;</m:mi></m:msub><m:mo>&#215;</m:mo><m:mi>&#931;</m:mi><m:mo>&#8594;</m:mo><m:msub><m:mi>Q</m:mi><m:mi>&#8459;</m:mi></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaaiiGacqWF0oazcqGG6aGocqWGrbqudaWgaaWcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae43cHGeabeaakiabgEna0kabfo6atjabgkziUkabdgfarnaaBaaaleaacqGFlecsaeqaaaaa@4341@</m:annotation></m:semantics></m:math></inline-formula> defined as follows. For any pair (<it>p, a</it>) in <inline-formula><m:math name="1748-7188-2-13-i11" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>Q</m:mi><m:mi>&#8459;</m:mi></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGrbqudaWgaaWcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae83cHGeabeaaaaa@38B9@</m:annotation></m:semantics></m:math></inline-formula> &#215; &#931;, <it>&#948; </it>(<it>p, a</it>) is the longest suffix of the concatenation <it>pa </it>that belongs to <inline-formula><m:math name="1748-7188-2-13-i11" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>Q</m:mi><m:mi>&#8459;</m:mi></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGrbqudaWgaaWcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae83cHGeabeaaaaa@38B9@</m:annotation></m:semantics></m:math></inline-formula>. Note that <it>&#948; </it>(<it>p, a</it>) = <it>pa </it>if and only if <it>pa </it>&#8712; <inline-formula><m:math name="1748-7188-2-13-i11" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>Q</m:mi><m:mi>&#8459;</m:mi></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGrbqudaWgaaWcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae83cHGeabeaaaaa@38B9@</m:annotation></m:semantics></m:math></inline-formula>.</p>
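            <p>The definition of <it>&#948; </it>can be checked directly on a small motif. The following is a minimal sketch, not the authors' implementation: it applies the definition by brute force to the set {AAA, AAC, ACA, ACC, CCT} used in Table <tblr tid="T1">1</tblr>. The helper names <it>prefixes</it> and <it>delta</it>, and the numbering of nodes by depth and then alphabetically, are assumptions made for illustration; this numbering appears to coincide with the one used in Table <tblr tid="T1">1</tblr>.</p>

```python
ALPHABET = "ACGT"

def prefixes(words):
    """Return the set of all prefixes of the words, including the empty string."""
    return {w[:k] for w in words for k in range(len(w) + 1)}

def delta(p, a, q_set):
    """Return the longest suffix of the concatenation pa that belongs to q_set."""
    pa = p + a
    for start in range(len(pa) + 1):      # try the longest suffix first
        if pa[start:] in q_set:
            return pa[start:]
    raise ValueError("q_set must contain the empty string")

H = ["AAA", "AAC", "ACA", "ACC", "CCT"]
Q = prefixes(H)

# Assumed numbering: shallower nodes first, ties broken alphabetically.
order = sorted(Q, key=lambda q: (len(q), q))
node_id = {q: i for i, q in enumerate(order)}

# table[i] lists delta(node i, a) for a = A, C, G, T, as node numbers.
table = [[node_id[delta(q, a, Q)] for a in ALPHABET] for q in order]
for i, row in enumerate(table):
    print(i, row)
```

            <p>This direct approach rescans every suffix and is only meant to make the definition concrete; it is not how the transition function would be built for large motifs.</p>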
            <p>Given a text <it>T </it>read from left to right, let <it>T </it>[<it>i</it>] denote the letter of <it>T </it>at position <it>i</it>. Let <it>q</it><sub><it>i </it></sub>be the longest suffix of the text <it>T</it>[1] &#8943; <it>T </it>[<it>i</it>] that belongs to <inline-formula><m:math name="1748-7188-2-13-i11" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>Q</m:mi><m:mi>&#8459;</m:mi></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGrbqudaWgaaWcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae83cHGeabeaaaaa@38B9@</m:annotation></m:semantics></m:math></inline-formula>. The sequence of nodes visited during the traversal is defined by the words <it>q</it><sub><it>i</it></sub>, which satisfy the inductive relation</p>
            <p>
               <display-formula>&#8704;<it>i </it>&#8805; 0, <it>q</it><sub><it>i</it>+1 </sub>= <it>&#948; </it>(<it>q</it><sub><it>i</it></sub>, <it>T </it>[<it>i </it>+ 1]),</display-formula>
            </p>
            <p>with the initial condition <it>q</it><sub>0 </sub>= <it>&#949;</it>.</p>
            <p><b>Example: </b>Let <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula> be the set {AAA, AAC, ACA, ACC, CCT}. The corresponding tree <inline-formula><m:math name="1748-7188-2-13-i10" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mi mathvariant="script">T</m:mi><m:mo stretchy="false">(</m:mo><m:mi>&#8459;</m:mi><m:mo stretchy="false">)</m:mo></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFtepvcqGGOaakcqWFlecscqGGPaqkaaa@3AF1@</m:annotation></m:semantics></m:math></inline-formula> is depicted in Figure <figr fid="F1">1</figr>. Values of <it>&#948; </it>function are given in Table <tblr tid="T1">1</tblr>. Aho-Corasick traversal of tree <inline-formula><m:math name="1748-7188-2-13-i10" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mi mathvariant="script">T</m:mi><m:mo stretchy="false">(</m:mo><m:mi>&#8459;</m:mi><m:mo stretchy="false">)</m:mo></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFtepvcqGGOaakcqWFlecscqGGPaqkaaa@3AF1@</m:annotation></m:semantics></m:math></inline-formula> according to text <it>T </it>= 'ATGCCAACCTT' produces the following sequence of nodes {<it>q</it><sub><it>i</it></sub>}<sub><it>i </it>&#8805; 1 </sub>in <inline-formula><m:math name="1748-7188-2-13-i11" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>Q</m:mi><m:mi>&#8459;</m:mi></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGrbqudaWgaaWcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae83cHGeabeaaaaa@38B9@</m:annotation></m:semantics></m:math></inline-formula> (the numbers of corresponding nodes in Figure <figr fid="F1">1</figr> are shown in square brackets): A[1], <it>&#949;</it>[0], <it>&#949;</it>[0], C[2], CC[5], A[1], AA[3], AAC[7], ACC[9], CCT[10], <it>&#949;</it>[0].</p>
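            <p>The traversal in this example can be replayed mechanically. The sketch below is illustrative only: the transition values are copied from Table <tblr tid="T1">1</tblr>, while the names <it>DELTA</it>, <it>COLUMN</it> and <it>traverse</it> are hypothetical and introduced here for the purpose of the example.</p>

```python
# Transition values copied from Table 1: one row per node 0..10,
# columns in the order A, C, G, T.
DELTA = [
    [1, 2, 0, 0],    # node 0 (root, empty string)
    [3, 4, 0, 0],    # node 1
    [1, 5, 0, 0],    # node 2
    [6, 7, 0, 0],    # node 3
    [8, 9, 0, 0],    # node 4
    [1, 5, 0, 10],   # node 5
    [6, 7, 0, 0],    # node 6
    [8, 9, 0, 0],    # node 7
    [3, 4, 0, 0],    # node 8
    [1, 5, 0, 10],   # node 9
    [1, 2, 0, 0],    # node 10
]
COLUMN = {"A": 0, "C": 1, "G": 2, "T": 3}

def traverse(text, start=0):
    """Return the node numbers q_1, ..., q_n visited while reading text."""
    q = start                      # initial condition: q_0 is the root
    visited = []
    for letter in text:
        q = DELTA[q][COLUMN[letter]]   # q_{i+1} = delta(q_i, T[i+1])
        visited.append(q)
    return visited

path = traverse("ATGCCAACCTT")
print(path)   # [1, 0, 0, 2, 5, 1, 3, 7, 9, 10, 0]
```

            <p>The printed node numbers agree with the bracketed numbers listed in the example above.</p>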
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Values of <it>&#948; </it>function for the set <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula> = {AAA, AAC, ACA, ACC, CCT}.</p>
               </caption>
               <tblbdy cols="5">
                  <r>
                     <c ca="left">
                        <p><it>q</it>\<it>&#945;</it></p>
                     </c>
                     <c ca="center">
                        <p>A</p>
                     </c>
                     <c ca="center">
                        <p>C</p>
                     </c>
                     <c ca="center">
                        <p>G</p>
                     </c>
                     <c ca="center">
                        <p>T</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>8</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>Values of <it>&#948; </it>(<it>q</it>, <it>&#945;</it>) function for <it>q </it>&#8712; <it>Q </it>and <it>&#945; </it>= <it>A</it>, <it>C</it>, <it>G</it>, <it>T </it>constructed for the set <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula> = {AAA, AAC, ACA, ACC, CCT}.</p>
               </tblfn>
            </tbl>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Tree <inline-formula><m:math name="1748-7188-2-13-i10" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mi mathvariant="script">T</m:mi><m:mo stretchy="false">(</m:mo><m:mi>&#8459;</m:mi><m:mo stretchy="false">)</m:mo></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFtepvcqGGOaakcqWFlecscqGGPaqkaaa@3AF1@</m:annotation></m:semantics></m:math></inline-formula> for the set <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula> = {AAA, AAC, ACA, ACC, CCT} with dashed links for the <it>&#948; </it>function</p>
               </caption>
               <text>
                  <p><b>Tree <inline-formula><m:math name="1748-7188-2-13-i10" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mi mathvariant="script">T</m:mi><m:mo stretchy="false">(</m:mo><m:mi>&#8459;</m:mi><m:mo stretchy="false">)</m:mo></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFtepvcqGGOaakcqWFlecscqGGPaqkaaa@3AF1@</m:annotation></m:semantics></m:math></inline-formula> for the set <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula> = {AAA, AAC, ACA, ACC, CCT} with dashed links for the <it>&#948; </it>function</b>. Tree <inline-formula><m:math name="1748-7188-2-13-i10" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mi mathvariant="script">T</m:mi><m:mo stretchy="false">(</m:mo><m:mi>&#8459;</m:mi><m:mo stretchy="false">)</m:mo></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFtepvcqGGOaakcqWFlecscqGGPaqkaaa@3AF1@</m:annotation></m:semantics></m:math></inline-formula> for the set <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula> = {AAA, AAC, ACA, ACC, CCT}. Dashed colored links represent <it>&#948; </it>function for internal node (5) &#8211; in red, and for marked node (7) corresponding to the word AAC &#8712; <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula> &#8211; in purple.</p>
               </text>
               <graphic file="1748-7188-2-13-1"/>
            </fig>
            <p><inline-formula><m:math name="1748-7188-2-13-i10" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mi mathvariant="script">T</m:mi><m:mo stretchy="false">(</m:mo><m:mi>&#8459;</m:mi><m:mo stretchy="false">)</m:mo></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFtepvcqGGOaakcqWFlecscqGGPaqkaaa@3AF1@</m:annotation></m:semantics></m:math></inline-formula> and the transition function <it>&#948; </it>can be constructed efficiently with the algorithm proposed by Aho and Corasick <abbrgrp><abbr bid="B62">62</abbr></abbrgrp>. Both the time and the space required by the algorithm are proportional to the sum of the lengths of all words in <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>.</p>
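            <p>For orientation, the usual textbook way of obtaining the full transition table is a breadth-first pass over the trie that reuses the transitions of shallower nodes. The sketch below follows that standard scheme under stated assumptions: it is not taken from <abbrgrp><abbr bid="B62">62</abbr></abbrgrp> or from the authors' code, the names <it>build_automaton</it>, <it>label</it>, <it>fail</it> and <it>delta</it> are hypothetical, and nodes are numbered in order of insertion rather than as in Table <tblr tid="T1">1</tblr>.</p>

```python
from collections import deque

def build_automaton(words, alphabet="ACGT"):
    """Return (label, delta): node labels and the complete transition table."""
    # Step 1: insert the words into a trie; node 0 is the root (empty prefix).
    children = [{}]      # children[v][a] = child of node v along letter a
    label = [""]         # label[v] = prefix spelled from the root to node v
    for word in words:
        v = 0
        for a in word:
            if a not in children[v]:
                children[v][a] = len(children)
                children.append({})
                label.append(label[v] + a)
            v = children[v][a]

    # Step 2: breadth-first pass. fail[v] is the node of the longest proper
    # suffix of label[v] that is itself a prefix. It is shallower than v, so
    # its transitions are already filled in when v is dequeued.
    fail = [0] * len(children)
    delta = [{} for _ in children]
    queue = deque([0])
    while queue:
        v = queue.popleft()
        for a in alphabet:
            fallback = delta[fail[v]][a] if v != 0 else 0
            if a in children[v]:
                u = children[v][a]
                delta[v][a] = u          # trie edge: delta(p, a) = pa
                fail[u] = fallback
                queue.append(u)
            else:
                delta[v][a] = fallback   # no trie edge: reuse the suffix node
    return label, delta

label, delta = build_automaton(["AAA", "AAC", "ACA", "ACC", "CCT"])
print(label[delta[label.index("AAC")]["C"]])   # ACC
```

            <p>Each node is dequeued once and each of its transitions is filled in constant time, so for a fixed alphabet the work in this sketch grows linearly with the number of trie nodes, which is at most the sum of the word lengths plus one.</p>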
            <p>The combination of tree <inline-formula><m:math name="1748-7188-2-13-i10" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mi mathvariant="script">T</m:mi><m:mo stretchy="false">(</m:mo><m:mi>&#8459;</m:mi><m:mo stretchy="false">)</m:mo></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFtepvcqGGOaakcqWFlecscqGGPaqkaaa@3AF1@</m:annotation></m:semantics></m:math></inline-formula> and the transition function <it>&#948; </it>makes it possible to solve numerous pattern-matching problems: finding the first occurrence of a word from a given set, finding all occurrences, counting words, <it>etc</it>.</p>
         </sec>
         <sec>
            <st>
               <p>Bernoulli text model. Probability to find at least one occurrence of a single motif</p>
            </st>
            <p>In this section we consider the simplest case: computing the <it>p</it>-value for a single motif in a text <it>T</it><sub><it>n </it></sub>of length <it>n</it>, assuming that <it>T</it><sub><it>n </it></sub>is generated by independent Bernoulli random trials over the alphabet &#931;. The algorithm computes the probabilities <b>P </b>(<it>L</it><sub><it>n </it></sub>(<inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>; 1)) by induction on <it>n</it>.</p>
            <p>To describe the algorithm we divide the set &#931;<sup><it>i </it></sup>of all texts <it>T</it><sub><it>i </it></sub>of length <it>i </it>into classes that do and do not contain occurrences of <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>.</p>
            <p><b>Definition 1 </b><it>A text T<sub><it>i </it></sub>belongs to class C</it><sub><it>i </it></sub>(0; <it>q</it>) <it>iff</it></p>
            <p><it>1. Length of T</it><sub><it>i </it></sub><it>is i</it>,</p>
            <p><it>2. T</it><sub><it>i </it></sub><it>does not contain words from </it><inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>,</p>
            <p><it>3. A traversal AC </it>(<inline-formula><m:math name="1748-7188-2-13-i10" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mi mathvariant="script">T</m:mi><m:mo stretchy="false">(</m:mo><m:mi>&#8459;</m:mi><m:mo stretchy="false">)</m:mo></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFtepvcqGGOaakcqWFlecscqGGPaqkaaa@3AF1@</m:annotation></m:semantics></m:math></inline-formula>, <it>T</it><sub><it>i</it></sub>) <it>ends at node q</it>.</p>
            <p><it>A text T</it><sub><it>i </it></sub><it>belongs to class G</it><sub><it>i </it></sub>(1) <it>iff</it></p>
            <p><it>(i) Length of T</it><sub><it>i </it></sub><it>is i</it>,</p>
            <p><it>(ii) T</it><sub><it>i </it></sub><it>does contain at least one occurrence of a word from </it><inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>.</p>
            <p>For a given number <it>i </it>larger than <it>m</it>, the classes <it>C</it><sub><it>i </it></sub>(0; <it>q</it>), where <it>q </it>is in <inline-formula><m:math name="1748-7188-2-13-i13" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>Q</m:mi><m:mi>&#8459;</m:mi></m:msub><m:mo>\</m:mo><m:mi>&#8459;</m:mi></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGrbqudaWgaaWcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae83cHGeabeaakiabcYfaCjab=Tqiibaa@3AFC@</m:annotation></m:semantics></m:math></inline-formula> and the class <it>G</it><sub><it>i </it></sub>(1) form a partition of the set &#931;<sup><it>i </it></sup>of all texts of length <it>i</it>, i.e., any text of length <it>i </it>belongs either to a class <it>C</it><sub><it>i </it></sub>(0; <it>q</it>) for exactly one <it>q </it>in <inline-formula><m:math name="1748-7188-2-13-i13" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>Q</m:mi><m:mi>&#8459;</m:mi></m:msub><m:mo>\</m:mo><m:mi>&#8459;</m:mi></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGrbqudaWgaaWcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae83cHGeabeaakiabcYfaCjab=Tqiibaa@3AFC@</m:annotation></m:semantics></m:math></inline-formula>, or to a class <it>G</it><sub><it>i </it></sub>(1). Indeed, condition 3. means that the largest suffix of <it>T</it><sub><it>i </it></sub>in <inline-formula><m:math name="1748-7188-2-13-i11" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>Q</m:mi><m:mi>&#8459;</m:mi></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGrbqudaWgaaWcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae83cHGeabeaaaaa@38B9@</m:annotation></m:semantics></m:math></inline-formula> is <it>q</it>. It follows from condition 2 that classes <it>C</it><sub><it>i </it></sub>(0; <it>q</it>) are empty if <it>q </it>is in <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>. A text <it>T</it><sub><it>i </it></sub>of length <it>i </it>is in <it>G</it><sub><it>i </it></sub>(1) if and only if a node of <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula> was visited during the traversal.</p>
            <p>Let <b>P </b>(<it>C</it><sub><it>n </it></sub>(0; <it>q</it>)) and <b>P </b>(<it>G</it><sub><it>n </it></sub>(1)) denote probabilities that a text <it>T</it><sub><it>n </it></sub>belongs to class <it>C</it><sub><it>n </it></sub>(0; <it>q</it>) and <it>G</it><sub><it>n </it></sub>(1), respectively. Then, <it>L</it><sub><it>n </it></sub>(<inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>; 1) = <it>G</it><sub><it>n </it></sub>(1); therefore the desired <it>p</it>-value <b>P </b>(<it>L</it><sub><it>n </it></sub>(<inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>; 1)) is equal to <b>P </b>(<it>G</it><sub><it>n </it></sub>(1)).</p>
            <p>The algorithm calculates probabilities <b>P </b>(<it>C</it><sub><it>i </it></sub>(0; <it>q</it>)) and <b>P </b>(<it>G</it><sub><it>i </it></sub>(1)) by induction on the length <it>i</it>. For <it>i </it>= 0, the initial values are: <b>P </b>(<it>C</it><sub>0 </sub>(0; <it>&#949;</it>)) = 1; <b>P </b>(<it>C</it><sub>0 </sub>(0; <it>q</it>)) = 0, for any <it>q </it>&#8800; <it>&#949;</it>; <b>P </b>(<it>G</it><sub>0 </sub>(1)) = 0.</p>
            <p>The values of <b>P </b>(<it>C</it><sub><it>i</it>+1 </sub>(0; <it>q</it>)) and <b>P </b>(<it>G</it><sub><it>i</it>+1 </sub>(1)) are calculated using values of <b>P </b>(<it>C</it><sub><it>i </it></sub>(0; <it>q</it>)) and <b>P </b>(<it>G</it><sub><it>i </it></sub>(1)). Therefore, the needed space is proportional to the size of <inline-formula><m:math name="1748-7188-2-13-i11" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>Q</m:mi><m:mi>&#8459;</m:mi></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGrbqudaWgaaWcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae83cHGeabeaaaaa@38B9@</m:annotation></m:semantics></m:math></inline-formula> (see section <it>Extensions and complexity </it>below).</p>
            <p>Calculation of values <b>P </b>(<it>C</it><sub><it>i</it>+1 </sub>(0; <it>q</it>)) and <b>P </b>(<it>G</it><sub><it>i</it>+1 </sub>(1)) is based on the following observations. Let <it>U </it>be a set of texts of the same length over the alphabet &#931;, <b>P </b>(<it>U</it>) the probability of <it>U </it>in the Bernoulli model and <it>a </it>a character in &#931;. Let <it>U</it>&#183;<it>a </it>be the set of all concatenations of a text from <it>U </it>with <it>a</it>, i.e., <it>U</it>&#183;<it>a </it>= {<it>xa</it>|<it>x </it>&#8712; <it>U</it>}. Since characters are generated independently in the Bernoulli model,</p>
            <p>
               <display-formula id="M1"><b>P </b>(<it>U</it>&#183;<it>a</it>) = <b>P </b>(<it>U</it>) <b>P </b>(<it>a</it>).</display-formula>
            </p>
            <p>Then the following relations hold for any <it>i </it>&#8712; {0, ..., <it>n </it>- 1} and any <it>a </it>&#8712; &#931;:</p>
            <p>(i) if the text <it>T</it><sub><it>i </it></sub>contains a word from <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula> then all its concatenations with characters from &#931; would contain a word from <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>; i.e.,</p>
            <p>
               <display-formula id="M2"><it>G</it><sub><it>i </it></sub>(1)&#183;<it>a </it>&#8834; <it>G</it><sub><it>i</it>+1 </sub>(1).</display-formula>
            </p>
            <p>(ii) if the text <it>T</it><sub><it>i </it></sub>does not contain a word from <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula> and belongs to <it>C</it><sub><it>i </it></sub>(0; <it>q</it>), i.e., ends with <it>q </it>&#8712; <inline-formula><m:math name="1748-7188-2-13-i13" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>Q</m:mi><m:mi>&#8459;</m:mi></m:msub><m:mo>\</m:mo><m:mi>&#8459;</m:mi></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGrbqudaWgaaWcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae83cHGeabeaakiabcYfaCjab=Tqiibaa@3AFC@</m:annotation></m:semantics></m:math></inline-formula>, then its concatenation <it>T</it><sub><it>i</it></sub>&#183;<it>a </it>belongs to the class determined by the result of the Aho-Corasick transition function <it>&#948; </it>(<it>q, a</it>); i.e.,</p>
            <p>
               <display-formula id="M3">if <it>&#948; </it>(<it>q</it>, <it>a</it>) &#8713; <m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math>,&#160;&#160;&#160;then <it>C</it><sub><it>i </it></sub>(0; <it>q</it>)&#183;<it>a </it>&#8834; <it>C</it><sub><it>i</it>+1 </sub>(0; <it>&#948; </it>(<it>q, a</it>))</display-formula>
            </p>
            <p>
               <display-formula id="M4">otherwise&#160;&#160;&#160;<it>C</it><sub><it>i </it></sub>(0; <it>q</it>)&#183;<it>a </it>&#8834; <it>G</it><sub><it>i</it>+1 </sub>(1).</display-formula>
            </p>
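            <p>The case distinction in (ii) can be tabulated for a small example. The sketch below is illustrative and makes several assumptions that are not part of the text: states are represented by the prefix strings themselves, <it>&#948; </it>is built naively as the longest suffix of <it>qa </it>that is a prefix of some word, all words have the same length, and the helper names are hypothetical.</p>

```python
def build_delta(words, alphabet):
    """Naive transition function: states are all prefixes of the words and
    delta[q, a] is the longest suffix of q + a that is itself a state."""
    states = {""} | {w[:k] for w in words for k in range(len(w) + 1)}
    delta = {}
    for q in states:
        for a in alphabet:
            s = q + a
            while s not in states:   # the empty string is always a state
                s = s[1:]
            delta[q, a] = s
    return states, delta


def split_extensions(words, alphabet):
    """Sort every pair (q, a), with q a non-word state, by the class of C_i(0; q).a.

    to_G lists the pairs whose extension creates an occurrence;
    to_C maps each non-word state q2 to the pairs with delta(q, a) = q2.
    """
    motif = set(words)
    states, delta = build_delta(motif, alphabet)
    to_G, to_C = [], {}
    for q in sorted(states - motif):
        for a in alphabet:
            target = delta[q, a]
            if target in motif:
                to_G.append((q, a))
            else:
                to_C.setdefault(target, []).append((q, a))
    return to_G, to_C


to_G, to_C = split_extensions(["ACG", "CGT"], "ACGT")
print(to_G)
print(to_C["AC"], to_C["CG"])
```

            <p>Each pair (<it>q</it>, <it>a</it>) lands in exactly one of the two groups, which is what makes the extended classes disjoint.</p>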
            <p>Remembering that classes <it>C</it><sub><it>i </it></sub>(0; <it>q</it>) for different <it>q </it>and <it>G</it><sub><it>i </it></sub>(1) form a partition of &#931;<sup><it>i</it></sup>, we obtain the following relation for the texts containing words from <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>:</p>
            <p>
               <display-formula id="M5">
                  <m:math name="1748-7188-2-13-i14" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:msub>
                              <m:mi>G</m:mi>
                              <m:mrow>
                                 <m:mi>i</m:mi>
                                 <m:mo>+</m:mo>
                                 <m:mn>1</m:mn>
                              </m:mrow>
                           </m:msub>
                           <m:mo stretchy="false">(</m:mo>
                           <m:mn>1</m:mn>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo>=</m:mo>
                           <m:mo>{</m:mo>
                           <m:mstyle displaystyle="true">
                              <m:munder>
                                 <m:mo>&#8746;</m:mo>
                                 <m:mrow>
                                    <m:mi>a</m:mi>
                                    <m:mo>&#8712;</m:mo>
                                    <m:mi>&#931;</m:mi>
                                 </m:mrow>
                              </m:munder>
                              <m:mrow>
                                 <m:msub>
                                    <m:mi>G</m:mi>
                                    <m:mi>i</m:mi>
                                 </m:msub>
                                 <m:mo stretchy="false">(</m:mo>
                                 <m:mn>1</m:mn>
                                 <m:mo stretchy="false">)</m:mo>
                                 <m:mo>&#8901;</m:mo>
                                 <m:mi>a</m:mi>
                                 <m:mo>}</m:mo>
                                 <m:mo>&#8746;</m:mo>
                                 <m:mo>{</m:mo>
                                 <m:mstyle displaystyle="true">
                                    <m:munder>
                                       <m:mo>&#8746;</m:mo>
                                       <m:mrow>
                                          <m:mo stretchy="false">(</m:mo>
                                          <m:mi>q</m:mi>
                                          <m:mo>,</m:mo>
                                          <m:mi>a</m:mi>
                                          <m:mo stretchy="false">)</m:mo>
                                          <m:mo>;</m:mo>
                                          <m:mi>&#948;</m:mi>
                                          <m:mo stretchy="false">(</m:mo>
                                          <m:mi>q</m:mi>
                                          <m:mo>,</m:mo>
                                          <m:mi>a</m:mi>
                                          <m:mo stretchy="false">)</m:mo>
                                          <m:mo>&#8712;</m:mo>
                                          <m:mi>&#8459;</m:mi>
                                       </m:mrow>
                                    </m:munder>
                                    <m:mrow>
                                       <m:msub>
                                          <m:mi>C</m:mi>
                                          <m:mi>i</m:mi>
                                       </m:msub>
                                       <m:mo stretchy="false">(</m:mo>
                                       <m:mn>0</m:mn>
                                       <m:mo>;</m:mo>
                                       <m:mi>q</m:mi>
                                       <m:mo stretchy="false">)</m:mo>
                                       <m:mo>&#8901;</m:mo>
                                       <m:mi>a</m:mi>
                                       <m:mo>}</m:mo>
                                    </m:mrow>
                                 </m:mstyle>
                                 <m:mo>.</m:mo>
                              </m:mrow>
                           </m:mstyle>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGhbWrdaWgaaWcbaGaemyAaKMaey4kaSIaeGymaedabeaakiabcIcaOiabigdaXiabcMcaPiabg2da9iabcUha7naatafabaGaem4raC0aaSbaaSqaaiabdMgaPbqabaGccqGGOaakcqaIXaqmcqGGPaqkcqGHflY1cqWGHbqycqGG9bqFcqGHQicYcqGG7bWEdaWeqbqaaiabdoeadnaaBaaaleaacqWGPbqAaeqaaOGaeiikaGIaeGimaaJaei4oaSJaemyCaeNaeiykaKIaeyyXICTaemyyaeMaeiyFa0haleaacqGGOaakcqWGXbqCcqGGSaalcqWGHbqycqGGPaqkcqGG7aWoiiGacqWF0oazcqGGOaakcqWGXbqCcqGGSaalcqWGHbqycqGGPaqkcqGHiiIZt0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqGFlecsaeqaniablQIivbGccqGGUaGlaSqaaiabdggaHjabgIGiolabfo6atbqab0GaeSOkIufaaaa@72A7@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>Similarly, classes of texts that do not contain words from <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula> satisfy</p>
            <p>
               <display-formula id="M6">
                  <m:math name="1748-7188-2-13-i15" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:mtable>
                              <m:mtr>
                                 <m:mtd>
                                    <m:mrow>
                                       <m:mo>&#8704;</m:mo>
                                       <m:msup>
                                          <m:mi>q</m:mi>
                                          <m:mo>&#8242;</m:mo>
                                       </m:msup>
                                       <m:mo>&#8712;</m:mo>
                                       <m:msub>
                                          <m:mi>Q</m:mi>
                                          <m:mi>&#8459;</m:mi>
                                       </m:msub>
                                       <m:mo>\</m:mo>
                                       <m:mi>&#8459;</m:mi>
                                       <m:mo>:</m:mo>
                                    </m:mrow>
                                 </m:mtd>
                                 <m:mtd>
                                    <m:mrow>
                                       <m:msub>
                                          <m:mi>C</m:mi>
                                          <m:mrow>
                                             <m:mi>i</m:mi>
                                             <m:mo>+</m:mo>
                                             <m:mn>1</m:mn>
                                          </m:mrow>
                                       </m:msub>
                                       <m:mo stretchy="false">(</m:mo>
                                       <m:mn>0</m:mn>
                                       <m:mo>;</m:mo>
                                       <m:msup>
                                          <m:mi>q</m:mi>
                                          <m:mo>&#8242;</m:mo>
                                       </m:msup>
                                       <m:mo stretchy="false">)</m:mo>
                                       <m:mo>=</m:mo>
                                       <m:mstyle displaystyle="true">
                                          <m:munder>
                                             <m:mo>&#8746;</m:mo>
                                             <m:mrow>
                                                <m:mo stretchy="false">(</m:mo>
                                                <m:mi>q</m:mi>
                                                <m:mo>,</m:mo>
                                                <m:mi>a</m:mi>
                                                <m:mo stretchy="false">)</m:mo>
                                                <m:mo>;</m:mo>
                                                <m:mi>&#948;</m:mi>
                                                <m:mo stretchy="false">(</m:mo>
                                                <m:mi>q</m:mi>
                                                <m:mo>,</m:mo>
                                                <m:mi>a</m:mi>
                                                <m:mo stretchy="false">)</m:mo>
                                                <m:mo>=</m:mo>
                                                <m:msup>
                                                   <m:mi>q</m:mi>
                                                   <m:mo>&#8242;</m:mo>
                                                </m:msup>
                                             </m:mrow>
                                          </m:munder>
                                          <m:mrow>
                                             <m:msub>
                                                <m:mi>C</m:mi>
                                                <m:mi>i</m:mi>
                                             </m:msub>
                                             <m:mo stretchy="false">(</m:mo>
                                             <m:mn>0</m:mn>
                                             <m:mo>;</m:mo>
                                             <m:mi>q</m:mi>
                                             <m:mo stretchy="false">)</m:mo>
                                             <m:mo>&#8901;</m:mo>
                                             <m:mi>a</m:mi>
                                          </m:mrow>
                                       </m:mstyle>
                                       <m:mo>.</m:mo>
                                    </m:mrow>
                                 </m:mtd>
                              </m:mtr>
                           </m:mtable>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaafaqabeqacaaabaGaeyiaIiIafmyCaeNbauaacqGHiiIZcqWGrbqudaWgaaWcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae83cHGeabeaakiabcYfaCjab=TqiijabcQda6aqaaiabdoeadnaaBaaaleaacqWGPbqAcqGHRaWkcqaIXaqmaeqaaOGaeiikaGIaeGimaaJaei4oaSJafmyCaeNbauaacqGGPaqkcqGH9aqpdaWeqbqaaiabdoeadnaaBaaaleaacqWGPbqAaeqaaOGaeiikaGIaeGimaaJaei4oaSJaemyCaeNaeiykaKIaeyyXICTaemyyaegaleaacqGGOaakcqWGXbqCcqGGSaalcqWGHbqycqGGPaqkcqGG7aWoiiGacqGF0oazcqGGOaakcqWGXbqCcqGGSaalcqWGHbqycqGGPaqkcqGH9aqpcuWGXbqCgaqbaaqab0GaeSOkIufakiabc6caUaaaaaa@67F3@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
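            <p>Combined with (1), relations (5) and (6) give a direct dynamic-programming computation of <b>P </b>(<it>G</it><sub><it>n </it></sub>(1)). The following Python sketch is one possible reading of that computation, not the authors' code. It assumes a motif given as non-empty words of equal length, character probabilities supplied as a dictionary, states represented by prefix strings, and a naively built <it>&#948;</it>; all function names are hypothetical. A brute-force enumeration is included only as a sanity check on short texts.</p>

```python
from itertools import product
from math import isclose


def build_delta(words, alphabet):
    """States are all prefixes of the words; delta[q, a] is the longest
    suffix of q + a that is itself a state."""
    states = {""} | {w[:k] for w in words for k in range(len(w) + 1)}
    delta = {}
    for q in states:
        for a in alphabet:
            s = q + a
            while s not in states:
                s = s[1:]
            delta[q, a] = s
    return states, delta


def p_at_least_one(words, p, n):
    """P(G_n(1)) for a Bernoulli text of length n with character probabilities p."""
    motif = set(words)
    states, delta = build_delta(motif, list(p))
    C = {q: 0.0 for q in states if q not in motif}
    C[""] = 1.0                          # base case i = 0: only the empty text
    G = 0.0
    for _ in range(n):
        new_C = dict.fromkeys(C, 0.0)
        new_G = G                        # texts already in G_i(1) stay in G_{i+1}(1)
        for q, mass in C.items():
            for a, pa in p.items():
                target = delta[q, a]
                if target in motif:      # second union of relation (5)
                    new_G += mass * pa
                else:                    # relation (6)
                    new_C[target] += mass * pa
        C, G = new_C, new_G
    return G


def brute_force(words, p, n):
    """Same probability by enumerating all texts of length n (small n only)."""
    total = 0.0
    for chars in product(p, repeat=n):
        text = "".join(chars)
        if any(w in text for w in words):
            prob = 1.0
            for ch in text:
                prob *= p[ch]
            total += prob
    return total


p = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}
words = ["ACG", "CGT"]
print(p_at_least_one(words, p, 6), brute_force(words, p, 6))
```

            <p>Only the values for the current length are kept between iterations, so the memory used by this sketch grows with the number of states and not with <it>n</it>.</p>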
            <p>Classes <it>C</it><sub><it>i </it></sub>(0; <it>q</it>) for different <it>q </it>in <inline-formula><m:math name="1748-7188-2-13-i13" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>Q</m:mi><m:mi>&#8459;</m:mi></m:msub><m:mo>\</m:mo><m:mi>&#8459;</m:mi></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGrbqudaWgaaWcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae83cHGeabeaakiabcYfaCjab=Tqiibaa@3AFC@</m:annotation></m:semantics></m:math></inline-formula> and <it>G</it><sub><it>i </it></sub>(1) form a partition of &#931;<sup><it>i</it></sup>; classes <it>C</it><sub><it>i </it></sub>(0; <it>q</it>) are empty if <it>q </it>is in <inline-formula><m:math name="1748-7188-2-13-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>&#8459;</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaat0uy0HwzTfgDPnwy1egaryqtHrhAL1wy0L2yHvdaiqaacqWFlecsaaa@3762@</m:annotation></m:semantics></m:math></inline-formula>. Relations (5) and (6) with the help of (1) yield the recursive expressions for probabilities <b>P </b>(<it>C</it><sub><it>i</it>+1 </sub>(0; <it>q</it>)) and <b>P </b>(<it>G</it><sub><it>i</it>+1 </sub>(1)) in the Bernoulli case:</p>
            <p>
               <display-formula id="M7">
                  <m:math name="1748-7188-2-13-i16" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:mi>P</m:mi>
                           <m:mo stretchy="false">(</m:mo>
                           <m:msub>
                              <m:mi>G</m:mi>
                              <m:mrow>
                                 <m:mi>i</m:mi>
                                 <m:mo>+</m:mo>
                                 <m:mn>1</m:mn>
                              </m:mrow>
                           </m:msub>
                           <m:mo stretchy="false">(</m:mo>
                           <m:mn>1</m:mn>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo>=</m:mo>
                           <m:mi>P</m:mi>
                           <m:mo stretchy="false">(</m:mo>
                           <m:msub>
                              <m:mi>G</m:mi>
                              <m:mi>i</m:mi>
                           </m:msub>
                           <m:mo stretchy="false">(</m:mo>
                           <m:mn>1</m:mn>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo>+</m:mo>
                           <m:mstyle displaystyle="true">
                              <m:munder>
                                 <m:mo>&#8721;</m:mo>
                                 <m:mrow>
                                    <m:mo stretchy="false">(</m:mo>
                                    <m:mi>q</m:mi>
                                    <m:mo>,</m:mo>
                                    <m:mi>a</m:mi>
                                    <m:mo stretchy="false">)</m:mo>
                                    <m:mo>:</m:mo>
                                    <m:mi>&#948;</m:mi>
                                    <m:mo stretchy="false">(</m:mo>
                                    <m:mi>q</m:mi>
                                    <m:mo>,</m:mo>
                                    <m:mi>a</m:mi>
                                    <m:mo stretchy="false">)</m:mo>
                                    <m:mo>&#8712;</m:mo>
                                    <m:mi>&#8459;</m:mi>
                                 </m:mrow>
                              </m:munder>
                              <m:mrow>
                                 <m:mi>P</m:mi>
                                 <m:mo stretchy="false">(</m:mo>
                                 <m:msub>
                                    <m:mi>C</m:mi>
                                    <m:mi>i</m:mi>
                                 </m:msub>
                                 <m:mo stretchy="false">(</m:mo>
                                 <m:mn>0</m:mn>
                                 <m:mo>;</m:mo>
                                 <m:mi>q</m:mi>
                                 <m:mo stretchy="false">)</m:mo>
                                 <m:mo stretchy="false">)</m:mo>
                                 <m:mo>&#8901;</m:mo>
                                 <m:mi>p</m:mi>
                                 <m:mo stretchy="false">(</m:mo>
                                 <m:mi>a</m:mi>
                                 <m:mo stretchy="false">)</m:mo>
                              </m:mrow>
                           </m:mstyle>
                           <m:mo>,</m:mo>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaaieqacqWFqbaucqGGOaakcqWGhbWrdaWgaaWcbaGaemyAaKMaey4kaSIaeGymaedabeaakiabcIcaOiabigdaXiabcMcaPiabcMcaPiabg2da9iab=bfaqjabcIcaOiabdEeahnaaBaaaleaacqWGPbqAaeqaaOGaeiikaGIaeGymaeJaeiykaKIaeiykaKIaey4kaSYaaabuaeaacqWFqbaucqGGOaakcqWGdbWqdaWgaaWcbaGaemyAaKgabeaakiabcIcaOiabicdaWiabcUda7iabdghaXjabcMcaPiabcMcaPiabgwSixlabdchaWjabcIcaOiabdggaHjabcMcaPaWcbaGaeiikaGIaemyCaeNaeiilaWIaemyyaeMaeiykaKIaeiOoaOdcciGae4hTdqMaeiikaGIaemyCaeNaeiilaWIaemyyaeMaeiykaKIaeyicI48enfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae03cHGeabeqdcqGHris5aOGaeiilaWcaaa@6E5E@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>
               <display-formula id="M8">
                  <m:math 