<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1748-7188-3-3</ui>
   <ji>1748-7188</ji>
   <fm>
      <dochead>Research</dochead>
      <bibl>
         <title>
            <p>A scoring matrix approach to detecting miRNA target sites</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Moxon</snm>
               <fnm>Simon</fnm>
               <insr iid="I1"/>
               <email>simonm@cmp.uea.ac.uk</email>
            </au>
            <au id="A2">
               <snm>Moulton</snm>
               <fnm>Vincent</fnm>
               <insr iid="I1"/>
               <email>vincent.moulton@cmp.uea.ac.uk</email>
            </au>
            <au id="A3" ca="yes">
               <snm>Kim</snm>
               <mi>T</mi>
               <fnm>Jan</fnm>
               <insr iid="I1"/>
               <email>jtk@cmp.uea.ac.uk</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>School of Computing Sciences, University of East Anglia, Norwich, NR4 7TJ, UK</p>
            </ins>
         </insg>
         <source>Algorithms for Molecular Biology</source>
         <issn>1748-7188</issn>
         <pubdate>2008</pubdate>
         <volume>3</volume>
         <issue>1</issue>
         <fpage>3</fpage>
         <url>http://www.almob.org/content/3/1/3</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">18377655</pubid>
               <pubid idtype="doi">10.1186/1748-7188-3-3</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>05</day>
               <month>10</month>
               <year>2007</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>31</day>
               <month>3</month>
               <year>2008</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>31</day>
               <month>3</month>
               <year>2008</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2008</year>
         <collab>Moxon et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Experimental identification of microRNA (miRNA) targets is a difficult and time-consuming process. As a consequence, several computational prediction methods have been devised to predict targets for follow-up experimental validation. Current computational target prediction methods use only the miRNA sequence as input. With an increasing number of experimentally validated targets becoming available, utilising this additional information in the search for further targets may help to improve the specificity of computational methods for target site prediction.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>We introduce a generic target prediction method, the Stacking Binding Matrix (<it>SBM</it>), which uses both information about the miRNA and experimentally validated target sequences in the search for candidate target sequences. We demonstrate the utility of our method by applying it to both animal and plant data sets and compare it with miRanda, a commonly used target prediction method.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>We show that <it>SBM </it>can be applied to target prediction in both plants and animals and performs well in terms of sensitivity and specificity. Open source code implementing the <it>SBM </it>method, together with documentation and examples, is freely available for download from the address in the Availability and Requirements section.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>MicroRNAs (miRNAs) are small non-coding RNAs of around 21 nt in length, which are currently receiving a great deal of attention <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr></abbrgrp>. They are derived from a precursor RNA hairpin structure by RNase III-like enzymes <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>, and are incorporated into the RNA-induced silencing complex (RISC). Via this complex, the miRNA guides either the cleavage or translational repression of messenger RNAs (mRNAs) by binding to a region of the mRNA known as the target site <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. In this way, miRNAs regulate a variety of cellular and molecular functions <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr></abbrgrp>, playing important roles in, for example, organism growth and development <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr></abbrgrp>.</p>
         <p>New miRNAs are being discovered at an increasingly rapid rate <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. Since they play an important role in eukaryotic gene regulation, determining their function is of utmost importance. Accordingly, several computational methods have been developed for miRNA target prediction &#8211; see e.g. <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>. These methods usually rely on finding target sequences based on a single miRNA input, and employ nucleotide complementarity and minimum free energy (MFE) calculations to identify candidate miRNA/target duplexes. Although these methods have been successfully used in target prediction, e.g. <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>, their specificity can be limited, i.e. they may produce many false positives <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>.</p>
         <p>Various methods have been proposed to improve the specificity of miRNA target prediction methods. For example, comparative genomics has been used to focus on sites that are conserved between species <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. Here we concentrate on an alternative approach, the Stacking Binding Matrix (<it>SBM</it>), in which we can incorporate all of the known targets for a given miRNA (in general a miRNA may target several sites) into a search for additional targets. The number of experimentally validated miRNA targets is steadily growing, and as this number increases so too should the usefulness of the <it>SBM </it>method.</p>
         <p>Our approach is an adaptation of the binding matrix (BM) technique for transcription factor binding site classification <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>, a method that was designed to systematically maximise specificity in searches for transcription factor binding sites. In contrast to computation of the BM, which uses single nucleotide information and results in a 4 &#215; <it>l </it>matrix for scoring words of length <it>l</it>, the <it>SBM </it>is a 16 &#215; (<it>l </it>- 1) matrix based on dinucleotides (i.e. consecutive pairs of nucleotides). In this way, it is possible to incorporate the fundamental principle of RNA stacking energies <abbrgrp><abbr bid="B17">17</abbr></abbrgrp> which is commonly used in miRNA detection.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <p>In brief, the <it>SBM </it>is computed from a multiple sequence alignment consisting of the reverse complement of the miRNA in question together with any known target sequences. The resulting matrix (or set of matrices in case the alignment contains gaps) is then used to scan and score a set of potential target sequences. Sequences having a score exceeding a user-defined threshold are returned as potential targets.</p>
         <sec>
            <st>
               <p>Scoring matrices and the Binding Matrix</p>
            </st>
            <p>A <it>scoring matrix </it>for nucleotide words of length <it>l </it>is an {<it>A</it>, <it>C</it>, <it>G</it>, <it>U</it>} &#215; <it>l </it>matrix <it>M </it>= (<it>m</it><sub><it>bk</it></sub>). Given a word <it>w </it>= <it>w</it>[1] <it>w</it>[2] ... <it>w</it>[<it>l</it>] in the alphabet {<it>A</it>, <it>C</it>, <it>G</it>, <it>U</it>} its score <it>S</it>(<it>w</it>) is the sum of the matrix elements "selected" by the symbols in the word, that is,</p>
            <p>
               <display-formula>
                  <m:math name="1748-7188-3-3-i1" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:mi>S</m:mi>
                           <m:mo stretchy="false">(</m:mo>
                           <m:mi>w</m:mi>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo>=</m:mo>
                           <m:mstyle displaystyle="true">
                              <m:munderover>
                                 <m:mo>&#8721;</m:mo>
                                 <m:mrow>
                                    <m:mi>k</m:mi>
                                    <m:mo>=</m:mo>
                                    <m:mn>1</m:mn>
                                 </m:mrow>
                                 <m:mi>l</m:mi>
                              </m:munderover>
                              <m:mrow>
                                 <m:msub>
                                    <m:mi>m</m:mi>
                                    <m:mrow>
                                       <m:mi>w</m:mi>
                                       <m:mo stretchy="false">[</m:mo>
                                       <m:mi>k</m:mi>
                                       <m:mo stretchy="false">]</m:mo>
                                       <m:mo>,</m:mo>
                                       <m:mi>k</m:mi>
                                    </m:mrow>
                                 </m:msub>
                                 <m:mo>.</m:mo>
                              </m:mrow>
                           </m:mstyle>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaem4uamLaeiikaGIaem4DaCNaeiykaKIaeyypa0ZaaabCaeaacqWGTbqBdaWgaaWcbaGaem4DaCNaei4waSLaem4AaSMaeiyxa0LaeiilaWIaem4AaSgabeaakiabc6caUaWcbaGaem4AaSMaeyypa0JaeGymaedabaGaemiBaWganiabggHiLdaaaa@428B@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>Given a threshold <it>S</it><sub>min</sub>, a word <it>w </it>is classified as a <it>binding word </it>if <it>S</it>(<it>w</it>) &#8805; <it>S</it><sub>min </sub>and otherwise it is classified as a non-binding word. Generally, the threshold can be used to adjust sensitivity and specificity of classification: assuming a positive correlation between density of true positives and score, lowering the threshold increases sensitivity and decreases specificity. Also, notice that for any <it>&#955; </it>> 0, scoring a word with the matrix <it>&#955; M </it>and using the threshold <it>&#955; S</it><sub>min </sub>results in the same classification. A matrix classifier is called <it>consistent </it>with a set <it>B </it>= {<it>b</it><sub>1</sub>, ..., <it>b</it><sub><it>N</it></sub>} of known binding words if it classifies them all correctly <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>, i.e. if <it>S</it>(<it>b</it>) &#8805; <it>S</it><sub>min </sub>for all <it>b </it>&#8712; <it>B</it>. There are various ways of constructing a scoring matrix from a set of binding words <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. The <it>Binding Matrix (BM) </it>is defined to be the consistent matrix for which the number of words classified as binding words is minimal. A method for computing the BM and a discussion of its properties are given in <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>.</p>
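            <p>The scoring and classification rule above can be illustrated with a minimal Python sketch. This is not the released implementation: the function names and the toy matrix values are assumptions chosen for illustration only, and the matrix is not a Binding Matrix obtained by optimisation.</p>
            <p><![CDATA[
```python
# Illustrative sketch only: names and toy values are hypothetical,
# not taken from the released SBM code.

def score_word(matrix, word):
    """Return S(w), the sum of matrix[w[k]][k] over all positions k."""
    if any(len(row) != len(word) for row in matrix.values()):
        raise ValueError("matrix width must equal word length")
    return sum(matrix[base][k] for k, base in enumerate(word))


def is_binding_word(matrix, word, s_min):
    """Classify a word as binding when S(w) is at least s_min."""
    return score_word(matrix, word) >= s_min


def is_consistent(matrix, s_min, binding_words):
    """A classifier is consistent if every known binding word is accepted."""
    return all(is_binding_word(matrix, b, s_min) for b in binding_words)


def scale(matrix, lam):
    """Multiply every matrix element by the positive factor lam."""
    return {base: [lam * v for v in row] for base, row in matrix.items()}


# Toy {A, C, G, U} x 3 matrix; the values are invented for illustration.
toy_matrix = {
    "A": [0.5, 0.0, 0.0],
    "C": [0.0, 0.5, 0.0],
    "G": [0.0, 0.0, 0.25],
    "U": [0.0, 0.25, 0.0],
}

known_binding_words = ["ACG", "AUG"]
threshold = 1.0
```
]]></p>
            <p>With these toy values, <it>ACG </it>and <it>AUG </it>reach the threshold of 1 whereas <it>CCC </it>does not, and multiplying both the matrix and the threshold by the same positive factor leaves every classification unchanged.</p>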
         </sec>
         <sec>
            <st>
               <p>Incorporating stacking into binding matrix computations</p>
            </st>
            <p>A key feature in RNA structure prediction is the incorporation of stacking energies <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. To capture information from both nucleotide complementarity and base pair stacking energies, in the computation of the <it>SBM </it>we score <it>dinucleotides</it>. Formally, for nucleotide words of length <it>l</it>, the <it>SBM </it>is an {<it>A</it>, <it>C</it>, <it>G</it>, <it>U</it>}<sup>2 </sup>&#215; (<it>l </it>- 1) matrix. It is computed by first converting each word <it>w </it>into the sequence <it>w</it>[1]<it>w</it>[2], <it>w</it>[2]<it>w</it>[3], ..., <it>w</it>[<it>l </it>- 1]<it>w</it>[<it>l</it>] of dinucleotides in {<it>A</it>, <it>C</it>, <it>G</it>, <it>U</it>}<sup>2 </sup>and then optimising as with the BM. For performance reasons, to compute the <it>SBM </it>we use the optimisation approach described in <abbrgrp><abbr bid="B20">20</abbr></abbrgrp> rather than the quadratic programming technique used in <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. All <it>SBM</it>s are scaled so that a threshold of 1 corresponds to the most specific consistent classifier.</p>
            <p>Note that in contrast to transcription factors, where only binding site sequences (binding words) are available, the reverse complement of the miRNA sequence itself provides information about the accepted target site sequences. Thus we include the reverse complement of the miRNA within the alignment of the known target sites.</p>
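            <p>As a hedged illustration of the dinucleotide encoding and of the reverse complement step, the following Python sketch converts a word into its overlapping dinucleotides and scores it against a 16 &#215; (<it>l </it>- 1) matrix. All names are hypothetical, and the helper <it>toy_sbm_from_word </it>merely builds an indicator matrix for a single word; it does not perform the optimisation used to compute a real <it>SBM</it>.</p>
            <p><![CDATA[
```python
# Illustrative sketch only: names are hypothetical and the toy matrix is
# an indicator matrix, not an optimised SBM.
from itertools import product

NUCLEOTIDES = "ACGU"
# The 16 row labels of an SBM: all ordered pairs of nucleotides.
DINUCLEOTIDES = ["".join(pair) for pair in product(NUCLEOTIDES, repeat=2)]
COMPLEMENT = {"A": "U", "C": "G", "G": "C", "U": "A"}


def reverse_complement(rna):
    """Reverse complement of an RNA sequence (Watson-Crick pairs only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def dinucleotides(word):
    """w[1]w[2], w[2]w[3], ..., w[l-1]w[l] as a list of l - 1 strings."""
    return [word[k:k + 2] for k in range(len(word) - 1)]


def score_word_stacking(sbm, word):
    """Sum the matrix elements selected by the dinucleotides of the word."""
    return sum(sbm[d][k] for k, d in enumerate(dinucleotides(word)))


def toy_sbm_from_word(word):
    """Hypothetical helper: weight 1/(l-1) where the word's own
    dinucleotide occurs, 0 elsewhere, so the word itself scores 1."""
    steps = dinucleotides(word)
    weight = 1.0 / len(steps)
    sbm = {d: [0.0] * len(steps) for d in DINUCLEOTIDES}
    for k, d in enumerate(steps):
        sbm[d][k] = weight
    return sbm


toy_sbm = toy_sbm_from_word("ACGUA")
```
]]></p>
            <p>In this toy setting a word of length 5 yields four dinucleotide columns, so the reference word scores 1 and a word differing in its last nucleotide scores 0.75.</p>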
         </sec>
         <sec>
            <st>
               <p>Incorporating gaps</p>
            </st>
            <p>The complementarity of a miRNA binding to a target site is usually imperfect and commonly involves bulges (see Figure <figr fid="F1">1</figr>), which results in gapped alignments.</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Alignment of the <it>Drosophila melanogaster </it>let-7 miRNA to a cognate target site in the 3' UTR of the <it>ab </it>gene adapted from [21, Fig. 1]</p>
               </caption>
               <text>
                  <p>
                     <b>Alignment of the <it>Drosophila melanogaster </it>let-7 miRNA to a cognate target site in the 3' UTR of the <it>ab </it>gene adapted from [21, Fig. 1].</b>
                  </p>
               </text>
               <graphic file="1748-7188-3-3-1"/>
            </fig>
            <p>However, in common with other scoring matrix-based classification approaches, the <it>SBM </it>cannot accommodate gaps directly. To address this, we employ a <it>set </it>of <it>SBM</it>s rather than a single <it>SBM</it>.</p>
            <p>For <it>N </it>= {<it>A</it>, <it>C</it>, <it>G</it>, <it>U</it>}, let <it>A </it>= {<it>S</it><sub>1</sub>, <it>S</it><sub>2</sub>, ..., <it>S</it><sub><it>n</it></sub>} denote an alignment consisting of (possibly) gapped sequences over <it>N </it>of length <it>l</it>. Denote the gap character by -, and let <it>s</it><sub><it>i</it>,<it>j </it></sub>be the <it>j</it>-th symbol of <it>S</it><sub><it>i</it></sub>. Suppose that <it>D </it>&#8838; {1, 2, ..., <it>l</it>}. Given a sequence <it>S</it><sub><it>i </it></sub>&#8712; <it>A</it>, let <inline-formula><m:math name="1748-7188-3-3-i2" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>S</m:mi><m:mi>i</m:mi><m:mi>D</m:mi></m:msubsup><m:mo>=</m:mo><m:msub><m:mi>s</m:mi><m:mrow><m:mi>i</m:mi><m:mo>,</m:mo><m:msub><m:mi>j</m:mi><m:mn>1</m:mn></m:msub></m:mrow></m:msub><m:msub><m:mi>s</m:mi><m:mrow><m:mi>i</m:mi><m:mo>,</m:mo><m:msub><m:mi>j</m:mi><m:mn>2</m:mn></m:msub></m:mrow></m:msub><m:mn>...</m:mn><m:msub><m:mi>s</m:mi><m:mrow><m:mi>i</m:mi><m:mo>,</m:mo><m:msub><m:mi>j</m:mi><m:mrow><m:mi>l</m:mi><m:mo>&#8722;</m:mo><m:mrow><m:mo>|</m:mo><m:mi>D</m:mi><m:mo>|</m:mo></m:mrow></m:mrow></m:msub></m:mrow></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaem4uam1aa0baaSqaaiabdMgaPbqaaiabdseaebaakiabg2da9iabdohaZnaaBaaaleaacqWGPbqAcqGGSaalcqWGQbGAdaWgaaadbaGaeGymaedabeaaaSqabaGccqWGZbWCdaWgaaWcbaGaemyAaKMaeiilaWIaemOAaO2aaSbaaWqaaiabikdaYaqabaaaleqaaOGaeiOla4IaeiOla4IaeiOla4Iaem4Cam3aaSbaaSqaaiabdMgaPjabcYcaSiabdQgaQnaaBaaameaacqWGSbaBcqGHsisldaabdaqaaiabdseaebGaay5bSlaawIa7aaqabaaaleqaaaaa@4C11@</m:annotation></m:semantics></m:math></inline-formula> denote the subsequence of <it>S</it><sub><it>i </it></sub>with <it>j</it><sub><it>k </it></sub>&lt;<it>j</it><sub><it>k</it>+1 </sub>and <it>j</it><sub><it>k </it></sub>&#8712; {1, 2, ..., <it>l</it>} - <it>D</it>, and define the <it>subsequence alignment </it>of <it>A </it>corresponding to <it>D </it>to be <inline-formula><m:math name="1748-7188-3-3-i3" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msup><m:mi>A</m:mi><m:mi>D</m:mi></m:msup><m:mo>=</m:mo><m:mo>{</m:mo><m:msubsup><m:mi>S</m:mi><m:mn>1</m:mn><m:mi>D</m:mi></m:msubsup><m:mo>,</m:mo><m:msubsup><m:mi>S</m:mi><m:mn>2</m:mn><m:mi>D</m:mi></m:msubsup><m:mo>,</m:mo><m:mn>...</m:mn><m:mo>,</m:mo><m:msubsup><m:mi>S</m:mi><m:mi>n</m:mi><m:mi>D</m:mi></m:msubsup><m:mo>}</m:mo></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemyqae0aaWbaaSqabeaacqWGebaraaGccqGH9aqpcqGG7bWEcqWGtbWudaqhaaWcbaGaeGymaedabaGaemiraqeaaOGaeiilaWIaem4uam1aa0baaSqaaiabikdaYaqaaiabdseaebaakiabcYcaSiabc6caUiabc6caUiabc6caUiabcYcaSiabdofatnaaDaaaleaacqWGUbGBaeaacqWGebaraaGccqGG9bqFaaa@4226@</m:annotation></m:semantics></m:math></inline-formula> (i.e. the alignment obtained from <it>A </it>by removing the columns indexed by elements of <it>D</it>). The <it>gap pattern of a sequence S</it><sub><it>i </it></sub>&#8712; <it>A</it>, denoted <it>G</it>(<it>S</it><sub><it>i</it></sub>), is the set <it>G</it>(<it>S</it><sub><it>i</it></sub>) = {<it>j </it>: <it>s</it><sub><it>i</it>,<it>j </it></sub>= -}. In particular, for each <it>S</it><sub><it>i </it></sub>&#8712; <it>A </it>the ungapped sequence corresponding to <it>S</it><sub><it>i </it></sub>equals <inline-formula><m:math name="1748-7188-3-3-i4" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>S</m:mi><m:mi>i</m:mi><m:mrow><m:mi>G</m:mi><m:mo stretchy="false">(</m:mo><m:msub><m:mi>S</m:mi><m:mi>i</m:mi></m:msub><m:mo stretchy="false">)</m:mo></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaem4uam1aa0baaSqaaiabdMgaPbqaaiabdEeahjabcIcaOiabdofatnaaBaaameaacqWGPbqAaeqaaSGaeiykaKcaaaaa@3417@</m:annotation></m:semantics></m:math></inline-formula>. Correspondingly, the <it>gap pattern of A </it>is defined as <it>G</it>(<it>A</it>) = &#8746;<sub><it>i </it></sub><it>G</it>(<it>S</it><sub><it>i</it></sub>), i.e. the set of indices of those columns in <it>A </it>that contain at least one gap.</p>
            <p>Now, let <inline-formula><m:math name="1748-7188-3-3-i5" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi mathvariant="script">D</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae83aXteaaa@374F@</m:annotation></m:semantics></m:math></inline-formula> be a subset of 2<sup><it>G</it>(<it>A</it>) </sup>(in practice we take either <inline-formula><m:math name="1748-7188-3-3-i6" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi mathvariant="script">D</m:mi><m:mrow><m:mtext>all</m:mtext></m:mrow></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae83aXt0aaSbaaSqaaiabbggaHjabbYgaSjabbYgaSbqabaaaaa@3B82@</m:annotation></m:semantics></m:math></inline-formula> = 2<sup><it>G</it>(<it>A</it>) </sup>or <inline-formula><m:math name="1748-7188-3-3-i7" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi mathvariant="script">D</m:mi><m:mrow><m:mtext>observed</m:mtext></m:mrow></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae83aXt0aaSbaaSqaaiabb+gaVjabbkgaIjabbohaZjabbwgaLjabbkhaYjabbAha2jabbwgaLjabbsgaKbqabaaaaa@4267@</m:annotation></m:semantics></m:math></inline-formula> = {<it>G</it>(<it>S</it>) : <it>S </it>&#8712; <it>A</it>}). For each of the alignments <inline-formula><m:math name="1748-7188-3-3-i8" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mi mathvariant="script">A</m:mi><m:mo stretchy="false">(</m:mo><m:mi mathvariant="script">D</m:mi><m:mo stretchy="false">)</m:mo><m:mo>=</m:mo><m:mo>{</m:mo><m:msup><m:mi>A</m:mi><m:mi>D</m:mi></m:msup><m:mo>:</m:mo><m:mi>D</m:mi><m:mo>&#8712;</m:mo><m:mi mathvariant="script">D</m:mi><m:mo>}</m:mo></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae8haXhKaeiikaGIae83aXtKaeiykaKIaeyypa0Jaei4EaSNaemyqae0aaWbaaSqabeaacqWGebaraaGccqGG6aGocqWGebarcqGHiiIZcqWFdeprcqGG9bqFaaa@465F@</m:annotation></m:semantics></m:math></inline-formula> we calculate a <it>SBM</it>. In case an alignment <it>A' </it>in <inline-formula><m:math name="1748-7188-3-3-i9" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi mathvariant="script">A</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae8haXheaaa@3749@</m:annotation></m:semantics></m:math></inline-formula> contains some gaps, each sequence <it>S </it>in <it>A' </it>that contains gaps is replaced by the set of all sequences obtained by replacing the gaps in <it>S </it>with all possible nucleotide symbol combinations (or the set of nucleotides actually observed at the gap containing position).</p>
            <p>Once the set of <it>SBM</it>s has been computed for each alignment in <inline-formula><m:math name="1748-7188-3-3-i10" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mi mathvariant="script">A</m:mi><m:mo stretchy="false">(</m:mo><m:mi mathvariant="script">D</m:mi><m:mo stretchy="false">)</m:mo></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae8haXhKaeiikaGIae83aXtKaeiykaKcaaa@3AB8@</m:annotation></m:semantics></m:math></inline-formula>, query sequences are then scanned with each of the matrices, and the final score at a given base in a query sequence is taken to be the largest of the scores attained by the individual <it>SBM</it>s. As usual, a target site is predicted in case the final score exceeds a user-defined threshold. This extension to gapped alignments allows the detection of target sites with varying lengths whilst preserving specificity and consistency, both of which are key features of the original BM approach. Note that consistency is ensured since, for each sequence <it>S</it><sub><it>i </it></sub>&#8712; <it>A</it>, we have <it>G</it>(<it>S</it><sub><it>i</it></sub>) &#8712; <inline-formula><m:math name="1748-7188-3-3-i5" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi mathvariant="script">D</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae83aXteaaa@374F@</m:annotation></m:semantics></m:math></inline-formula> as one alignment in <inline-formula><m:math name="1748-7188-3-3-i10" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mi mathvariant="script">A</m:mi><m:mo stretchy="false">(</m:mo><m:mi mathvariant="script">D</m:mi><m:mo stretchy="false">)</m:mo></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae8haXhKaeiikaGIae83aXtKaeiykaKcaaa@3AB8@</m:annotation></m:semantics></m:math></inline-formula> must contain <inline-formula><m:math name="1748-7188-3-3-i4" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>S</m:mi><m:mi>i</m:mi><m:mrow><m:mi>G</m:mi><m:mo stretchy="false">(</m:mo><m:msub><m:mi>S</m:mi><m:mi>i</m:mi></m:msub><m:mo stretchy="false">)</m:mo></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaem4uam1aa0baaSqaaiabdMgaPbqaaiabdEeahjabcIcaOiabdofatnaaBaaameaacqWGPbqAaeqaaSGaeiykaKcaaaaa@3417@</m:annotation></m:semantics></m:math></inline-formula>. Computing <it>SBM</it>s based on <inline-formula><m:math name="1748-7188-3-3-i7" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi mathvariant="script">D</m:mi><m:mrow><m:mtext>observed</m:mtext></m:mrow></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae83aXt0aaSbaaSqaaiabb+gaVjabbkgaIjabbohaZjabbwgaLjabbkhaYjabbAha2jabbwgaLjabbsgaKbqabaaaaa@4267@</m:annotation></m:semantics></m:math></inline-formula> makes most use of the gap information contained in the alignment. As an alternative, computing a (larger) <it>SBM </it>set based on <inline-formula><m:math name="1748-7188-3-3-i6" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi mathvariant="script">D</m:mi><m:mrow><m:mtext>all</m:mtext></m:mrow></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae83aXt0aaSbaaSqaaiabbggaHjabbYgaSjabbYgaSbqabaaaaa@3B82@</m:annotation></m:semantics></m:math></inline-formula> may allow detection of target sites that are recognised by a pairing structure different from those formed by the target sites known so far, which may be used to improve sensitivity.</p>
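            <p>The gap pattern definitions above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the released code: the function names, the toy alignment and the representation of each matrix as a (length, scoring function) pair are all hypothetical.</p>
            <p><![CDATA[
```python
# Illustrative sketch only: names, toy data and the (length, function)
# representation of a matrix are assumptions, not the released SBM code.
from itertools import combinations

GAP = "-"


def gap_pattern(seq):
    """G(S): the 1-based column indices at which seq has a gap."""
    return frozenset(j for j, s in enumerate(seq, start=1) if s == GAP)


def alignment_gap_pattern(alignment):
    """G(A): indices of columns that contain at least one gap."""
    return frozenset().union(*(gap_pattern(s) for s in alignment))


def d_observed(alignment):
    """The gap patterns that actually occur in the alignment."""
    return {gap_pattern(s) for s in alignment}


def d_all(alignment):
    """All subsets of G(A), i.e. the power set 2^G(A)."""
    cols = sorted(alignment_gap_pattern(alignment))
    return {frozenset(c)
            for r in range(len(cols) + 1)
            for c in combinations(cols, r)}


def remove_columns(seq, dropped):
    """S^D: seq with the columns indexed by dropped removed."""
    return "".join(s for j, s in enumerate(seq, start=1) if j not in dropped)


def subsequence_alignment(alignment, dropped):
    """A^D: the alignment with the columns in dropped removed."""
    return [remove_columns(s, dropped) for s in alignment]


def final_score(scorers, query, start):
    """Largest score at 0-based position start over all matrices that fit;
    scorers is a list of (word_length, score_function) pairs."""
    scores = [fn(query[start:start + length])
              for length, fn in scorers
              if start + length <= len(query)]
    return max(scores) if scores else None


toy_alignment = ["ACG-UA", "ACGGUA", "A-GGUA"]
```
]]></p>
            <p>For the toy alignment, <it>G</it>(<it>A</it>) = {2, 4}, so the observed patterns give three subsequence alignments and the full power set gives four. Removing from each sequence exactly its own gap columns yields an ungapped sequence, which is the property underlying the consistency argument.</p>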
         </sec>
         <sec>
            <st>
               <p>Computational complexity</p>
            </st>
            <p>The number of alignments in the set <inline-formula><m:math name="1748-7188-3-3-i10" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mi mathvariant="script">A</m:mi><m:mo stretchy="false">(</m:mo><m:mi mathvariant="script">D</m:mi><m:mo stretchy="false">)</m:mo></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae8haXhKaeiikaGIae83aXtKaeiykaKcaaa@3AB8@</m:annotation></m:semantics></m:math></inline-formula> used in the calculation of <it>SBM </it>set is of order 2<sup>|<it>G</it>(<it>A</it>)|</sup>, and so grows exponentially with the number of columns in <it>A </it>containing gaps. Hence, our approach will not scale to long alignments containing many gaps. Even so, in practice we have found the approach to be applicable to miRNA target prediction, since usually |<it>G</it>(<it>A</it>)| &#8804; 6 (as miRNAs are about 21 nt in length), resulting in at most 2<sup>6 </sup>= 64 alignments in <inline-formula><m:math name="1748-7188-3-3-i10" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mi mathvariant="script">A</m:mi><m:mo stretchy="false">(</m:mo><m:mi mathvariant="script">D</m:mi><m:mo stretchy="false">)</m:mo></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae8haXhKaeiikaGIae83aXtKaeiykaKcaaa@3AB8@</m:annotation></m:semantics></m:math></inline-formula>. Obviously, choosing <inline-formula><m:math name="1748-7188-3-3-i11" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mi mathvariant="script">D</m:mi><m:mo>=</m:mo><m:msub><m:mi mathvariant="script">D</m:mi><m:mrow><m:mtext>observed</m:mtext></m:mrow></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae83aXtKaeyypa0Jae83aXt0aaSbaaSqaaiabb+gaVjabbkgaIjabbohaZjabbwgaLjabbkhaYjabbAha2jabbwgaLjabbsgaKbqabaaaaa@452A@</m:annotation></m:semantics></m:math></inline-formula> rather than <inline-formula><m:math name="1748-7188-3-3-i12" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mi mathvariant="script">D</m:mi><m:mo>=</m:mo><m:msub><m:mi mathvariant="script">D</m:mi><m:mrow><m:mtext>all</m:mtext></m:mrow></m:msub></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae83aXtKaeyypa0Jae83aXt0aaSbaaSqaaiabbggaHjabbYgaSjabbYgaSbqabaaaaa@3E45@</m:annotation></m:semantics></m:math></inline-formula> can considerably reduce |<inline-formula><m:math name="1748-7188-3-3-i5" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi mathvariant="script">D</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaWenfgDOvwBHrxAJfwnHbqeg0uy0HwzTfgDPnwy1aaceaGae83aXteaaa@374F@</m:annotation></m:semantics></m:math></inline-formula>|, particularly if gaps occur in only a few distinct patterns. Likewise, the number of alignments obtained after the gap filling procedure is performed also grows exponentially, although the approach is still feasible for miRNA targets, again due to their short length.</p>
         </sec>
         <sec>
            <st>
               <p>Implementation</p>
            </st>
            <p>We have implemented our method in Python <abbrgrp><abbr bid="B22">22</abbr></abbrgrp> and R <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>. The code, together with documentation and examples, is freely available for download (see Availability and Requirements).</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <p>To demonstrate the utility of the <it>SBM </it>method, we present an application to the problem of miRNA target detection for nematode worm (<it>Caenorhabditis elegans</it>), fruit fly (<it>Drosophila melanogaster</it>), mouse (<it>Mus musculus</it>), human (<it>Homo sapiens</it>) and thale cress (<it>Arabidopsis thaliana</it>). We also present a leave one out analysis, and a comparison with miRanda <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>, a commonly used miRNA target prediction algorithm.</p>
         <sec>
            <st>
               <p>Data</p>
            </st>
            <p>We extracted <it>C. elegans</it>, <it>D. melanogaster</it>, <it>M. musculus </it>and <it>H. sapiens </it>miRNA entries from the miRBase database, release 9.1 <abbrgrp><abbr bid="B9">9</abbr></abbrgrp> that had more than one unique, experimentally validated target in the TarBase database <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>. The reverse complement of each miRNA was then aligned with its validated target regions using the ClustalW alignment package <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. If local alignment algorithms are used, terminal gaps carry much less significance than internal gaps. Therefore, alignments were trimmed by removing columns containing terminal gaps at the 5' or 3' end.</p>
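            <p>The trimming step described above can be illustrated with a minimal sketch. It assumes the alignment is held as a list of equal-length strings with "-" as the gap character; the function name and the toy rows are hypothetical and are not taken from the published code. Columns are dropped from either end for as long as any row still has a terminal gap there, and internal gaps are left untouched.</p>

```python
def trim_terminal_gap_columns(alignment, gap="-"):
    """Drop leading and trailing columns in which any row has a terminal gap.

    `alignment` is a list of equal-length strings, one aligned row each.
    Internal gaps are kept; only the 5' and 3' overhangs are removed.
    """
    if not alignment:
        return []
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all rows must have the same length")

    start, end = 0, width
    for row in alignment:
        stripped_left = row.lstrip(gap)
        if not stripped_left:
            raise ValueError("row consists only of gaps")
        first = width - len(stripped_left)   # index of the first residue
        last = len(row.rstrip(gap))          # one past the last residue
        start = max(start, first)
        end = min(end, last)
    return [row[start:end] for row in alignment]


# Hypothetical toy alignment (not data from the paper).
rows = [
    "--ACGU-ACGUA",
    "UUACGUGACG--",
    "-UACGU-ACGU-",
]
trimmed = trim_terminal_gap_columns(rows)
for row in trimmed:
    print(row)
```

            <p>In this toy case the two leading and two trailing columns are removed, while the internal gap in the first and third rows is retained.</p>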
            <p><it>SBM </it>sets were computed for these alignments as described in the Methods section. The <it>SBM </it>sets were used to search for potential new targets in the UTR sequence sets obtained from BioMart <abbrgrp><abbr bid="B26">26</abbr></abbrgrp> (see Table <tblr tid="T1">1</tblr> for details). To test the applicability of the method to plant target prediction, we took a selection of <it>A. thaliana </it>miRNAs from miRBase together with validated target regions from the <it>Arabidopsis </it>Small RNA Project Database (ASRP) <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>, aligned these sequences with ClustalW, and computed <it>SBM</it>s.</p>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Summary of UTR datasets</p>
               </caption>
               <tblbdy cols="4">
                  <r>
                     <c ca="center">
                        <p>
                           <b>Organism</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>No. Sequences</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Sequence type</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>No. Nucleotides</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>C. elegans</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>12,172</p>
                     </c>
                     <c ca="center">
                        <p>UTR</p>
                     </c>
                     <c ca="center">
                        <p>2,724,326</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>D. melanogaster</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>11,277</p>
                     </c>
                     <c ca="center">
                        <p>UTR</p>
                     </c>
                     <c ca="center">
                        <p>4,612,168</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>M. musculus</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>20,271</p>
                     </c>
                     <c ca="center">
                        <p>UTR</p>
                     </c>
                     <c ca="center">
                        <p>20,009,781</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>H. sapiens</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>27,685</p>
                     </c>
                     <c ca="center">
                        <p>UTR</p>
                     </c>
                     <c ca="center">
                        <p>30,673,888</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>A. thaliana</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>31,527</p>
                     </c>
                     <c ca="center">
                        <p>cDNA</p>
                     </c>
                     <c ca="center">
                        <p>46,447,255</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>"No. Sequences" gives the total number of unique sequences in the dataset; "Sequence type" gives the sequence type used (UTR or cDNA); "No. Nucleotides" gives the total number of nucleotides in the dataset.</p>
               </tblfn>
            </tbl>
         </sec>
         <sec>
            <st>
               <p>Summary of <it>SBM </it>Scan</p>
            </st>
            <p>On the animal data sets, we determined for each of the <it>SBM </it>sets the number of predicted targets obtained by scanning the UTR data set, using a score threshold of 1 [see Additional file <supplr sid="S3">3</supplr>]. As in <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>, we used the number of predicted targets obtained with a consistent classifier as an indicator of the classifier's specificity.</p>
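            <p>A threshold scan of this kind can be sketched as a sliding window over each UTR. The sketch below is only an illustration under stated assumptions: it uses a fixed window length, and the scoring function is a hypothetical stand-in (fraction of positions identical to one reference site), not the <it>SBM </it>score defined in the Methods section. All names and sequences are invented for the example.</p>

```python
def scan_utrs(utrs, window, score, threshold=1.0):
    """Slide a fixed-length window over each UTR.

    Returns (hits, n_windows), where hits lists every window whose score
    reaches the threshold and n_windows counts all windows examined.
    """
    hits = []
    n_windows = 0
    for name, seq in utrs.items():
        # range() is empty when the sequence is shorter than the window.
        for start in range(len(seq) - window + 1):
            n_windows += 1
            site = seq[start:start + window]
            s = score(site)
            if s >= threshold:
                hits.append((name, start, site, s))
    return hits, n_windows


def make_identity_scorer(reference):
    """Hypothetical placeholder scorer, NOT the SBM score of the paper."""
    def score(site):
        matches = sum(1 for a, b in zip(site, reference) if a == b)
        return matches / len(reference)
    return score


# Hypothetical toy data.
utrs = {"utr1": "AAACGUAA", "utr2": "ACG"}
hits, n_windows = scan_utrs(utrs, window=4, score=make_identity_scorer("ACGU"))
print(hits, n_windows)
```

            <p>Lowering the threshold argument admits more windows as hits, which is the sensitivity and specificity trade-off examined in the leave one out analysis.</p>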
            <suppl id="S3">
               <title>
                  <p>Additional file 3</p>
               </title>
               <text>
                  <p>Raw result files. Raw <it>SBM </it>result files, alignments of miRNA targets used in the analysis and a list of overlapping targets predicted by both <it>SBM </it>and miRanda.</p>
               </text>
               <file name="1748-7188-3-3-S3.zip">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>Plant miRNA targets usually occur in the protein coding region of genes and therefore we searched the gene sequence set TAIR6_cdna_20051108 obtained from The <it>Arabidopsis </it>Information Resource (TAIR) <abbrgrp><abbr bid="B28">28</abbr></abbrgrp> again using a threshold of 1. A summary of these results can be seen in Table <tblr tid="T2">2</tblr> with full datasets available in the supplementary materials [see Additional file <supplr sid="S1">1</supplr>].</p>
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p><it>SBM </it>scan summary obtained using a score threshold of 1</p>
               </caption>
               <tblbdy cols="5">
                  <r>
                     <c ca="center">
                        <p>
                           <b>Organism</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>miRNA</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Validated targets</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Recovered targets</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Potential novel targets</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>C. elegans</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>cel-miR-273</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>C. elegans</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>cel-let-7</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>15</p>
                     </c>
                     <c ca="center">
                        <p>15</p>
                     </c>
                     <c ca="center">
                        <p>1708</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>C. elegans</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>cel-miR-84</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>123</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>D. melanogaster</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>dme-miR-11</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>D. melanogaster</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>dme-miR-2</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>D. melanogaster</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>dme-miR-4</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                     <c ca="center">
                        <p>23</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>D. melanogaster</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>dme-miR-7</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>15</p>
                     </c>
                     <c ca="center">
                        <p>15</p>
                     </c>
                     <c ca="center">
                        <p>28</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>M. musculus</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>mmu-miR-124</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>M. musculus</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>mmu-miR-206</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>H. sapiens</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>hsa-miR-1</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>H. sapiens</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>hsa-miR-122</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>A. thaliana</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>ath-miR-163</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>A. thaliana</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>ath-miR-172</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>A. thaliana</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>ath-miR-390</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>A. thaliana</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>ath-miR-398</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>A. thaliana</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>ath-miR-408</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>"miRNA" gives the miRBase miRNA identifier; "Validated targets" gives the number of unique validated targets present in the starting alignment; "Recovered targets" gives the number of validated targets in the input alignment that were recovered; "Potential novel targets" gives the number of candidate target sequences (other than the validated targets) predicted by the <it>SBM </it>method.</p>
               </tblfn>
            </tbl>
            <suppl id="S1">
               <title>
                  <p>Additional file 1</p>
               </title>
               <text>
                  <p>tableS1 &#8211; <it>SBM </it>comparison with miRanda. Full results of the <it>SBM </it>comparison with miRanda for each of the four miRNAs tested.</p>
               </text>
               <file name="1748-7188-3-3-S1.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>In accordance with the definition of the <it>SBM </it>method, in Table <tblr tid="T2">2</tblr> we see that all validated targets present in the input alignment are recovered in the scan output using a threshold of 1. In many cases no additional candidate targets are predicted using this stringent threshold, especially when there are few sequences provided in the input <it>SBM </it>set. Larger sets of validated targets tended to result in the prediction of more new candidate target sites, as illustrated in Table <tblr tid="T2">2</tblr> by the cases of <it>dme-miR-4</it>, <it>dme-miR-7</it>, <it>cel-let-7 </it>and <it>cel-miR-84</it>. This reflects the consistency criterion built into the binding matrix definition; a larger input set of sequences generally tended to reduce the stringency of the classifier.</p>
            <p><it>cel-let-7 </it>returned 1708 predicted targets at threshold 1, which appears relatively high compared with the other results, but given the size of the searched database (2,724,326 nt) it is a small proportion of all possible target regions. A possible reason for the large number of predicted targets is that the input sequence set used to build the <it>SBM </it>set was misaligned by ClustalW. The validated targets used to create the alignment showed a greater degree of heterogeneity than those in other alignments. Another possible explanation is that <it>cel-let-7 </it>is known to have several paralogs (<it>cel-miR-84</it>, <it>cel-miR-48 </it>and <it>cel-miR-241</it>) <abbrgrp><abbr bid="B29">29</abbr></abbrgrp> and therefore its targets are likely to overlap with those of other members of this miRNA family. It has also been suggested that some miRNAs may target thousands of different genes <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>, making it possible that many of the predicted targets are in fact true positives.</p>
         </sec>
         <sec>
            <st>
               <p>Leave one out analysis</p>
            </st>
            <p>While the <it>SBM </it>method used with <it>S</it><sub>min </sub>= 1 recovers all targets that are present in the input alignment, unknown targets that receive a score below 1 are likely to exist. It is possible to detect such sequences using the <it>SBM </it>method by lowering the threshold. This increases the classifier's sensitivity at the expense of reducing its specificity. To assess this effect quantitatively, we conducted a leave one out analysis. In particular, we constructed leave one out alignments by deleting one target site sequence from an input alignment. Then, for each alignment in which the target sequence <it>w </it>was left out, we computed an <it>SBM </it>set and determined the score <it>S</it>(<it>w</it>) of the target site that was left out. If <it>S</it>(<it>w</it>) &lt; 1, the threshold needs to be adjusted to <it>S</it><sub>min </sub>= <it>S</it>(<it>w</it>) in order to detect <it>w</it>. We therefore scanned the respective UTR set with <it>S</it><sub>min </sub>= min {1, <it>S</it>(<it>w</it>)} and determined the number of predicted targets.</p>
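            <p>The leave one out procedure can be sketched as follows. This is an illustration under stated assumptions, not the published implementation: the scorer is a hypothetical stand-in that returns the fraction of positions whose residue was seen at that position in the training rows (so every training row scores 1, mimicking the consistency property), the sites are assumed to be ungapped and of equal length, and candidates are counted with a scorer built from the full alignment. The miRNA reverse complement is always kept in the training set.</p>

```python
def build_toy_scorer(training_rows):
    """Hypothetical stand-in for an SBM set, NOT the scoring scheme of the paper.

    A site scores the fraction of positions whose residue occurs at the same
    position in at least one training row, so training rows score 1.
    """
    seen = [set(column) for column in zip(*training_rows)]

    def score(site):
        hits = sum(1 for residue, allowed in zip(site, seen) if residue in allowed)
        return hits / len(seen)

    return score


def leave_one_out(mirna_rc, sites, candidates, build_scorer=build_toy_scorer):
    """For each left-out site w return (w, S(w), S_min, n_predicted).

    S(w) comes from the scorer built without w; S_min = min(1, S(w));
    n_predicted counts candidates scoring at least S_min under the scorer
    built from the full alignment.
    """
    full_score = build_scorer([mirna_rc] + sites)
    results = []
    for i, left_out in enumerate(sites):
        training = [mirna_rc] + sites[:i] + sites[i + 1:]
        s_w = build_scorer(training)(left_out)
        s_min = min(1.0, s_w)
        n_predicted = sum(1 for c in candidates if full_score(c) >= s_min)
        results.append((left_out, s_w, s_min, n_predicted))
    return results


# Hypothetical toy data.
results = leave_one_out(
    mirna_rc="ACGU",
    sites=["ACGA", "UCGU", "ACCU"],
    candidates=["ACGU", "UCCA", "GCGU", "GGGU"],
)
for row in results:
    print(row)
```

            <p>In this toy case each left-out site scores 0.75, so the threshold drops from 1 to 0.75 and three of the four candidate windows are then reported.</p>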
            <p>An input alignment of <it>n </it>sequences allows construction of <it>n </it>- 1 leave one out alignments (we did not leave out the reverse complement of the miRNA), so data sets containing more experimentally validated target sites clearly result in more meaningful leave one out analyses. We therefore chose the four miRNAs that had the greatest number of known experimentally validated targets: <it>D. melanogaster miR-7 </it>and <it>C. elegans let-7</it>, which both targeted 15 unique UTR regions, as well as <it>D. melanogaster miR-4 </it>(8 unique targets) and <it>C. elegans miR-84 </it>(7 unique targets). In total, 2,484,850 UTR regions were scanned in the <it>C. elegans </it>set compared with 4,409,641 regions in the <it>D. melanogaster </it>set. The score of each left out target, along with the number of regions with a score equal to or greater than this value in the scan using the full alignment, is shown in Table <tblr tid="T3">3</tblr>.</p>
            <tbl id="T3">
               <title>
                  <p>Table 3</p>
               </title>
               <caption>
                  <p>Leave one out analysis</p>
               </caption>
               <tblbdy cols="3">
                  <r>
                     <c cspan="3" ca="center">
                        <p>
                           <b><it>Drosophila melanogaster</it>, miR-7</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>Target</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>LOO score</b>
                        </p>
                     </c>
                     <c ca="right">
                        <p>&#10878; <b>LOO score</b></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CG12487.3/223&#8211;241</p>
                     </c>
                     <c ca="left">
                        <p>0.946</p>
                     </c>
                     <c ca="right">
                        <p>94</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CG5185.3/279&#8211;297</p>
                     </c>
                     <c ca="left">
                        <p>1.000</p>
                     </c>
                     <c ca="right">
                        <p>34</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CG3096.3/152&#8211;170</p>
                     </c>
                     <c ca="left">
                        <p>1.000</p>
                     </c>
                     <c ca="right">
                        <p>34</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CG12487.3/250&#8211;268</p>
                     </c>
                     <c ca="left">
                        <p>1.000</p>
                     </c>
                     <c ca="right">
                        <p>34</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CG3166.3/1100&#8211;1118</p>
                     </c>
                     <c ca="left">
                        <p>0.951</p>
                     </c>
                     <c ca="right">
                        <p>76</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CG6096.3/103&#8211;121</p>
                     </c>
                     <c ca="left">
                        <p>1.000</p>
                     </c>
                     <c ca="right">
                        <p>34</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CG8346.3/78&#8211;96</p>
                     </c>
                     <c ca="left">
                        <p>0.966</p>
                     </c>
                     <c ca="right">
                        <p>58</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CG5185.3/334&#8211;352</p>
                     </c>
                     <c ca="left">
                        <p>1.000</p>
                     </c>
                     <c ca="right">
                        <p>34</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CG6494.3/447&#8211;465</p>
                     </c>
                     <c ca="left">
                        <p>0.919</p>
                     </c>
                     <c ca="right">
                        <p>155</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CG6096.3/24&#8211;42</p>
                     </c>
                     <c ca="left">
                        <p>1.000</p>
                     </c>
                     <c ca="right">
                        <p>34</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CG6096.3/68&#8211;86</p>
                     </c>
                     <c ca="left">
                        <p>0.961</p>
                     </c>
                     <c ca="right">
                        <p>65</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CG8328.3/63&#8211;81</p>
                     </c>
                     <c ca="left">
                        <p>0.773</p>
                     </c>
                     <c ca="right">
                        <p>2015</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CG3166.3/1586&#8211;1602</p>
                     </c>
                     <c ca="left">
                        <p>0.855</p>
                     </c>
                     <c ca="right">
                        <p>393</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CG3166.3/29&#8211;46</p>
                     </c>
                     <c ca="left">
                        <p>0.845</p>
                     </c>
                     <c ca="right">
                        <p>513</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CG3166.3/1294&#8211;1312</p>
                     </c>
                     <c ca="left">
                        <p>0.861</p>
                     </c>
                     <c ca="right">
                        <p>521</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="3" ca="center">
                        <p>
                           <b><it>Caenorhabditis elegans</it>, let-7</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>Target</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>LOO score</b>
                        </p>
                     </c>
                     <c ca="right">
                        <p>&#10878; <b>LOO score</b></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ZK792.6/247&#8211;264</p>
                     </c>
                     <c ca="left">
                        <p>0.959</p>
                     </c>
                     <c ca="right">
                        <p>3561</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>F38A6.1a/271&#8211;288</p>
                     </c>
                     <c ca="left">
                        <p>1.000</p>
                     </c>
                     <c ca="right">
                        <p>1708</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>C18D1.1.1/526&#8211;542</p>
                     </c>
                     <c ca="left">
                        <p>0.906</p>
                     </c>
                     <c ca="right">
                        <p>10458</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ZK792.6/666&#8211;683</p>
                     </c>
                     <c ca="left">
                        <p>0.959</p>
                     </c>
                     <c ca="right">
                        <p>3522</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ZK792.6/458&#8211;475</p>
                     </c>
                     <c ca="left">
                        <p>0.929</p>
                     </c>
                     <c ca="right">
                        <p>7311</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>F38A6.1a/133&#8211;150</p>
                     </c>
                     <c ca="left">
                        <p>0.874</p>
                     </c>
                     <c ca="right">
                        <p>19177</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>C01G8.9a/21&#8211;38</p>
                     </c>
                     <c ca="left">
                        <p>0.850</p>
                     </c>
                     <c ca="right">
                        <p>23906</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ZK792.6/132&#8211;148</p>
                     </c>
                     <c ca="left">
                        <p>0.859</p>
                     </c>
                     <c ca="right">
                        <p>20570</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>C01G8.9a/159&#8211;175</p>
                     </c>
                     <c ca="left">
                        <p>0.813</p>
                     </c>
                     <c ca="right">
                        <p>30895</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ZK792.6/190&#8211;207</p>
                     </c>
                     <c ca="left">
                        <p>0.807</p>
                     </c>
                     <c ca="right">
                        <p>41812</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>C12C8.3a/693&#8211;709</p>
                     </c>
                     <c ca="left">
                        <p>0.791</p>
                     </c>
                     <c ca="right">
                        <p>39369</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>C12C8.3a/742&#8211;757</p>
                     </c>
                     <c ca="left">
                        <p>1.000</p>
                     </c>
                     <c ca="right">
                        <p>1499</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ZK792.6/484&#8211;499</p>
                     </c>
                     <c ca="left">
                        <p>0.898</p>
                     </c>
                     <c ca="right">
                        <p>10232</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>F11A1.3a/1007&#8211;1021</p>
                     </c>
                     <c ca="left">
                        <p>0.948</p>
                     </c>
                     <c ca="right">
                        <p>4658</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ZK792.6/343&#8211;361</p>
                     </c>
                     <c ca="left">
                        <p>0.955</p>
                     </c>
                     <c ca="right">
                        <p>4352</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="3" ca="center">
                        <p>
                           <b><it>Drosophila melanogaster</it>, miR-4</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>Target</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>LOO score</b>
                        </p>
                     </c>
                     <c ca="right">
                        <p>&#10878; <b>LOO score</b></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CG6096.3/135&#8211;154</p>
                     </c>
                     <c ca="left">
                        <p>0.755</p>
                     </c>
                     <c ca="right">
                        <p>3118</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CG8328.3/27&#8211;45</p>
                     </c>
                     <c ca="left">
                        <p>1.000</p>
                     </c>
                     <c ca="right">
                        <p>8</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CG3096.3/33&#8211;52</p>
                     </c>
                     <c ca="left">
                        <p>0.929</p>
                     </c>
                     <c ca="right">
                        <p>161</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CG3096.3/138&#8211;157</p>
                     </c>
                     <c ca="left">
                        <p>0.877</p>
                     </c>
                     <c ca="right">
                        <p>473</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CG5185.3/46&#8211;65</p>
                     </c>
                     <c ca="left">
                        <p>0.960</p>
                     </c>
                     <c ca="right">
                        <p>64</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CG12487.3/188&#8211;208</p>
                     </c>
                     <c ca="left">
                        <p>0.820</p>
                     </c>
                     <c ca="right">
                        <p>1298</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CG12487.3/62&#8211;82</p>
                     </c>
                     <c ca="left">
                        <p>0.871</p>
                     </c>
                     <c ca="right">
                        <p>627</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>CG6096.3/210&#8211;230</p>
                     </c>
                     <c ca="left">
                        <p>0.908</p>
                     </c>
                     <c ca="right">
                        <p>207</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="3" ca="center">
                        <p>
                           <b><it>Caenorhabditis elegans</it>, miR-84</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>Target</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>LOO score</b>
                        </p>
                     </c>
                     <c ca="right">
                        <p>&#10878; <b>LOO score</b></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ZK792.6/126&#8211;148</p>
                     </c>
                     <c ca="left">
                        <p>0.804</p>
                     </c>
                     <c ca="right">
                        <p>4970</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ZK792.6/187&#8211;207</p>
                     </c>
                     <c ca="left">
                        <p>0.552</p>
                     </c>
                     <c ca="right">
                        <p>132626</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ZK792.6/249&#8211;264</p>
                     </c>
                     <c ca="left">
                        <p>0.947</p>
                     </c>
                     <c ca="right">
                        <p>355</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ZK792.6/342&#8211;361</p>
                     </c>
                     <c ca="left">
                        <p>0.761</p>
                     </c>
                     <c ca="right">
                        <p>12552</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ZK792.6/460&#8211;475</p>
                     </c>
                     <c ca="left">
                        <p>0.858</p>
                     </c>
                     <c ca="right">
                        <p>2012</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ZK792.6/479&#8211;499</p>
                     </c>
                     <c ca="left">
                        <p>0.739</p>
                     </c>
                     <c ca="right">
                        <p>18375</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ZK792.6/665&#8211;683</p>
                     </c>
                     <c ca="left">
                        <p>0.726</p>
                     </c>
                     <c ca="right">
                        <p>15846</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>"Target" gives validated target sequence accession/start-end; "miRNA" gives miRNA targeting that region; "LOO score" gives score of the left out sequence; "&#10878; LOO score" gives mean number of regions scoring equal to or greater than the left out sequence.</p>
               </tblfn>
            </tbl>
            <p>The <it>SBM </it>method appears to show a greater degree of accuracy in the <it>D. melanogaster </it>miR-7 results. Here the mean score of the left out target is 0.9385 and the mean number of target regions scoring greater than or equal to the left out sequence is 273 (0.006% of the total UTR regions scanned). The <it>C. elegans </it>let-7 scan indicates a lower degree of specificity, with a mean score of 0.9032 and a mean of 14869 regions scoring greater than or equal to the left out validated target sequence. This represents 0.598% of the sequence database that was searched. For <it>D. melanogaster </it>miR-4 the <it>SBM </it>method gave a mean score of 0.890 with a mean of 745 target regions scoring greater than or equal to the left out sequence (0.017% of the total UTR regions scanned), and for <it>C. elegans </it>miR-84 it gave a mean score of 0.770 with a mean of 26677 target regions scoring greater than or equal to the left out sequence (1.074% of the total UTR regions scanned). The decrease in specificity in the <it>C. elegans </it>miR-84 results is largely due to a single leave one out test in which 132626 regions scored at least as high as the left out sequence (which received a score of 0.552).</p>
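            <p>As a check on the arithmetic, the miR-84 means quoted above can be re-derived from the per-target rows of the leave one out table. The short Python sketch below is illustrative only and is not part of the <it>SBM </it>software; the variable and function names are our own. It also recomputes the mean count with the single weakest test left out, to show how strongly that one test influences the mean.</p>
            <p>
```python
# Per-target leave one out results for C. elegans miR-84,
# copied from the table above: (score, regions scoring at least as high).
loo_results = [
    (0.804, 4970),
    (0.552, 132626),
    (0.947, 355),
    (0.761, 12552),
    (0.858, 2012),
    (0.739, 18375),
    (0.726, 15846),
]


def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    values = list(values)
    return sum(values) / len(values)


mean_score = mean(score for score, _ in loo_results)
mean_count = mean(count for _, count in loo_results)

# Same mean count with the single weakest test (score 0.552) left out.
outlier_score = min(score for score, _ in loo_results)
mean_count_without_outlier = mean(
    count for score, count in loo_results if score != outlier_score
)

print(f"mean LOO score: {mean_score:.3f}")
print(f"mean regions at or above LOO score: {mean_count:.0f}")
print(f"same, without the weakest test: {mean_count_without_outlier:.0f}")
```
            </p>
            <p>Run on these rows, the sketch reproduces the reported mean score of 0.770 and mean count of 26677; without the weakest test the mean count falls to roughly 9018.</p>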
            <p>Overall, the lowering of the threshold required to detect a word not in the input set results in a moderate increase in the number of reported hits, which is indicative of a high specificity even with the reduced threshold.</p>
            <p>In order to assess the performance of the algorithm when few known targets are provided in the input alignment, we re-ran the <it>C. elegans let-7 </it>and <it>D. melanogaster miR-7 </it>scans, but this time split each of the alignments of 15 validated targets into two subalignments containing 8 and 7 sequences respectively. Table <tblr tid="T4">4</tblr> shows that as the number of sequences used to build the <it>SBM </it>decreases, so does the mean score of the left out sequences. This indicates, as might be expected, that as the number of sequences left out of the alignment increases, the specificity decreases.</p>
            <tbl id="T4">
               <title>
                  <p>Table 4</p>
               </title>
               <caption>
                  <p>Leave several out analysis</p>
               </caption>
               <tblbdy cols="5">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="right">
                        <p>
                           <b>15 targets</b>
                        </p>
                     </c>
                     <c ca="right">
                        <p>
                           <b>14 targets</b>
                        </p>
                     </c>
                     <c ca="right">
                        <p>
                           <b>8 targets</b>
                        </p>
                     </c>
                     <c ca="right">
                        <p>
                           <b>7 targets</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mean score <it>C. elegans let-7</it></p>
                     </c>
                     <c ca="right">
                        <p>1.000</p>
                     </c>
                     <c ca="right">
                        <p>0.903</p>
                     </c>
                     <c ca="right">
                        <p>0.851</p>
                     </c>
                     <c ca="right">
                        <p>0.810</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mean number returned <it>C. elegans let-7</it></p>
                     </c>
                     <c ca="right">
                        <p>1708</p>
                     </c>
                     <c ca="right">
                        <p>14869</p>
                     </c>
                     <c ca="right">
                        <p>18032</p>
                     </c>
                     <c ca="right">
                        <p>17225</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mean score <it>D. melanogaster miR-7</it></p>
                     </c>
                     <c ca="right">
                        <p>1.000</p>
                     </c>
                     <c ca="right">
                        <p>0.938</p>
                     </c>
                     <c ca="right">
                        <p>0.908</p>
                     </c>
                     <c ca="right">
                        <p>0.890</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mean number returned <it>D. melanogaster miR-7</it></p>
                     </c>
                     <c ca="right">
                        <p>28</p>
                     </c>
                     <c ca="right">
                        <p>273</p>
                     </c>
                     <c ca="right">
                        <p>509</p>
                     </c>
                     <c ca="right">
                        <p>138</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>Shows mean scores and mean number of regions scoring above maximal consistent threshold for alignments containing 15, 14, 8 and 7 validated targets.</p>
               </tblfn>
            </tbl>
         </sec>
         <sec>
            <st>
               <p>Comparison with miRanda</p>
            </st>
            <p>We also compared the performance of the <it>SBM </it>method with miRanda v1.9, a commonly used target prediction tool <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. miRanda takes a single miRNA sequence as input and searches a sequence dataset for potential target regions. It uses two criteria to detect potential target sites: the alignment score and the minimum free energy (MFE) of the miRNA bound to the potential target sequence.</p>
            <p>In order to obtain results with miRanda that could be meaningfully compared with the <it>SBM </it>method, we used miRanda to score every potential target site across each of the UTR sequences. To do this, we split each of the UTRs into 30 nt sequence windows covering the entire length of each UTR and used these windows as our sequence database for the miRanda scan. Since the same target region may be scored more than once using this approach, we removed any duplicate regions from the results before the comparison. By default, miRanda uses relatively stringent threshold values which do not necessarily recover all known target regions, i.e. classification is not consistent. For this reason, miRanda was run using a negative score threshold and a positive energy threshold, which allowed us to obtain a wide distribution of scores and to ensure consistency.</p>
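            <p>The windowing and duplicate removal described above could be sketched as follows. This is a minimal Python illustration under stated assumptions, not the script used for the comparison: the window length of 30 nt comes from the text, whereas the step between windows, the tuple layout of a hit, the rule of keeping the highest score per region, and the function names <it>utr_windows </it>and <it>deduplicate </it>are all hypothetical.</p>
            <p>
```python
def utr_windows(utr, size=30, step=1):
    """Yield (start, window) pairs that together cover the whole UTR.

    size=30 follows the text; step=1 is an assumption made for this sketch.
    """
    if size >= len(utr):
        # UTR no longer than one window: scan it as a single piece.
        yield 0, utr
        return
    last_start = len(utr) - size
    for start in range(0, last_start + 1, step):
        yield start, utr[start:start + size]
    if last_start % step != 0:
        # Add a final window so that the 3' end is always covered.
        yield last_start, utr[last_start:]


def deduplicate(hits):
    """Collapse hits on the same (utr_id, start, end) region.

    Keeping the highest score per region is an assumption of this sketch.
    """
    best = {}
    for utr_id, start, end, score in hits:
        key = (utr_id, start, end)
        if key not in best or score > best[key]:
            best[key] = score
    return best


# Toy example with a hypothetical 40 nt UTR.
utr = "ACGU" * 10
windows = list(utr_windows(utr, size=30, step=5))
print([start for start, _ in windows])

# The same region reported from two overlapping windows is kept once.
hits = [("utr1", 12, 30, 131.0), ("utr1", 12, 30, 131.0), ("utr1", 3, 21, 119.0)]
print(deduplicate(hits))
```
            </p>
            <p>Overlapping windows are what cause one region to be reported several times, which is why the results are keyed on UTR coordinates before any counts are compared.</p>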
            <p>Table <tblr tid="T5">5</tblr> provides an overview of the miRanda comparison; the full results can be found in the supplementary materials [see Additional file <supplr sid="S1">1</supplr>]. In general, the <it>SBM </it>method compared favourably with miRanda. This is not unexpected, as we incorporate additional information into our searches. For example, the <it>cel-let-7 </it>results show that an average of 14869 regions had a score that was at least as high as the left out sequence using <it>SBM</it>, whereas an average of 92332 regions scored at least as high as the validated target using miRanda. This difference was more pronounced in the <it>dme-miR-7 </it>results, where an average of 273 sequences scored equal to or better than the left out sequences using <it>SBM </it>and an average of 8868 sequences scored at least as high as the validated target using miRanda. The <it>SBM </it>method returned an average of 745 sequences scoring equal to or better than the left out sequence for <it>dme-miR-4</it>, in comparison to an average of 11488 sequences that scored at least as high as the validated target using miRanda. An average of 26677 target regions were returned using the <it>SBM </it>method for <it>cel-miR-84</it>, compared with 190693 using miRanda.</p>
            <tbl id="T5">
               <title>
                  <p>Table 5</p>
               </title>
               <caption>
                  <p>Summary of results for the leave one out analysis</p>
               </caption>
               <tblbdy cols="8">
                  <r>
                     <c ca="center">
                        <p>
                           <b>miRNA</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>LOO score</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>&#10878; <b>LOO score</b></p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>miRanda(s)</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>&#10878; <b>miRanda(s)</b></p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>miRanda(e)</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>&#10878; <b>miRanda(e)</b></p>
                     </c>
                     <c ca="center">
                        <p>&#10878; <b>miRanda(se)</b></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="8">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>cel-let-7</p>
                     </c>
                     <c ca="center">
                        <p>0.903</p>
                     </c>
                     <c ca="center">
                        <p>14869</p>
                     </c>
                     <c ca="center">
                        <p>119</p>
                     </c>
                     <c ca="center">
                        <p>92332</p>
                     </c>
                     <c ca="center">
                        <p>-15.46</p>
                     </c>
                     <c ca="center">
                        <p>60266</p>
                     </c>
                     <c ca="center">
                        <p>23992</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>cel-miR-84</p>
                     </c>
                     <c ca="center">
                        <p>0.770</p>
                     </c>
                     <c ca="center">
                        <p>26677</p>
                     </c>
                     <c ca="center">
                        <p>106</p>
                     </c>
                     <c ca="center">
                        <p>190693</p>
                     </c>
                     <c ca="center">
                        <p>-10.19</p>
                     </c>
                     <c ca="center">
                        <p>150137</p>
                     </c>
                     <c ca="center">
                        <p>48538</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>dme-miR-7</p>
                     </c>
                     <c ca="center">
                        <p>0.938</p>
                     </c>
                     <c ca="center">
                        <p>273</p>
                     </c>
                     <c ca="center">
                        <p>159</p>
                     </c>
                     <c ca="center">
                        <p>8868</p>
                     </c>
                     <c ca="center">
                        <p>-21.69</p>
                     </c>
                     <c ca="center">
                        <p>7227</p>
                     </c>
                     <c ca="center">
                        <p>2129</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>dme-miR-4</p>
                     </c>
                     <c ca="center">
                        <p>0.890</p>
                     </c>
                     <c ca="center">
                        <p>745</p>
                     </c>
                     <c ca="center">
                        <p>131</p>
                     </c>
                     <c ca="center">
                        <p>11488</p>
                     </c>
                     <c ca="center">
                        <p>-8.51</p>
                     </c>
                     <c ca="center">
                        <p>184134</p>
                     </c>
                     <c ca="center">
                        <p>5325</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>"miRNA" gives miRBase accession of the miRNA sequence; "LOO score" gives mean score of the targets left out of the <it>SBM</it>; "&#10878; LOO score" gives mean number of regions scoring equal to or greater than the left out sequence; "miRanda(s)" gives raw score of the miRanda hit of the lowest scoring target region; "&#10878; miRanda(s)" gives number of regions returned using the maximal consistent score threshold; "miRanda(e)" gives minimum free energy (MFE) of the miRanda hit of the least stable target region; "&#10878; miRanda(e)" gives number of regions returned using the maximal consistent MFE threshold; "&#10878; miRanda(se)" gives number of regions returned using the maximal consistent combined score and MFE threshold.</p>
               </tblfn>
            </tbl>
            <p>We determined the maximal consistent threshold for miRanda results by filtering out all candidates with an alignment score lower than that of the lowest scoring validated target. The remaining candidates were then filtered further by removing any sequence with an MFE greater than the MFE of the highest (least stable) of the validated targets. The number of regions returned using the maximal consistent threshold in miRanda was 23992 for <it>cel-let-7</it>, in contrast to the 1708 returned using the <it>SBM </it>method with maximal consistent threshold. For <it>cel-miR-84</it>, 48538 regions were recovered compared with 123 using <it>SBM</it>; for <it>dme-miR-4</it>, 5325 in comparison to 23 with <it>SBM</it>; and for <it>dme-miR-7</it>, 2129, with the <it>SBM </it>method returning 28.</p>
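            <p>The two-stage filter described above can be written compactly. The following Python sketch is illustrative only: it assumes each miRanda hit has already been parsed into a (region identifier, alignment score, MFE) tuple, and the function name <it>maximal_consistent_filter </it>and the toy values are hypothetical rather than taken from our data.</p>
            <p>
```python
def maximal_consistent_filter(candidates, validated):
    """Keep candidates that pass both thresholds set by the validated targets.

    Each item is a (region_id, alignment_score, mfe) tuple.
    """
    # Lowest alignment score among the validated targets.
    min_score = min(score for _, score, _ in validated)
    # Highest (least stable) MFE among the validated targets.
    max_mfe = max(mfe for _, _, mfe in validated)
    return [
        (region_id, score, mfe)
        for region_id, score, mfe in candidates
        if score >= min_score and max_mfe >= mfe
    ]


# Hypothetical toy data; region names and values are made up for illustration.
validated = [("known_1", 150.0, -20.0), ("known_2", 119.0, -15.5)]
candidates = validated + [
    ("cand_1", 130.0, -10.0),  # fails the MFE threshold
    ("cand_2", 100.0, -25.0),  # fails the score threshold
    ("cand_3", 125.0, -18.0),  # passes both
]

kept = maximal_consistent_filter(candidates, validated)
print([region_id for region_id, _, _ in kept])
```
            </p>
            <p>Because both thresholds are taken from the validated targets themselves, every validated target passes the filter by construction, which is what makes the threshold consistent.</p>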
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <p>We have presented a new method, <it>SBM</it>, that allows the use of miRNA target site sequences in addition to the miRNA sequence itself to search for novel target sites. We have demonstrated its application to target prediction for a variety of miRNA examples from different organisms and have shown that it performs well in comparison to miRanda. Many computational methods for target prediction tend to suffer from a lack of specificity <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. The <it>SBM </it>method allows the use of all known target sequences in the search, and is designed to provide maximum specificity whilst recovering all members present in the starting alignment. Thus, as the number of experimentally validated miRNA targets grows, the <it>SBM </it>method should provide an attractive addition to the available miRNA target site detection methods.</p>
         <p>Many current target prediction techniques are based on algorithms with fixed parameters (such as base pairing rules or binding energies) that are used to assess potential targets by matching them to the miRNA sequence. These algorithms are designed to reflect molecular target recognition mechanisms that are assumed to apply to miRNA target recognition in general. Tailoring them to reflect mechanisms that are specific to a given miRNA is difficult or impossible. In contrast, the <it>SBM </it>method can capture aspects of specific binding mechanisms by extracting this information from the set of validated target site sequences. This also makes the method generic, in that it can be applied to any organism without assuming prior knowledge of specific target recognition mechanisms.</p>
         <p>Due to the small number of validated targets for each miRNA, the maximal consistent threshold used in the <it>SBM </it>method is rather stringent. We chose this threshold to facilitate comparison of the method to miRanda. For many applications, lowering the threshold to increase sensitivity at the cost of some specificity may be advisable. The specificity advantage of the <it>SBM </it>method can be expected to be partly independent of the threshold: a classifier that attains a high level of specificity at a given threshold can be assumed to retain some of that advantage when the threshold is moderately relaxed.</p>
         <p>As with all scoring matrix approaches, the <it>SBM </it>method is limited by the quality of the input data. Firstly, if a false positive target sequence is provided as input, the method will be adversely affected; therefore, only experimentally validated targets should generally be used as input. Secondly, the quality of the input alignment is extremely important, and a poor quality alignment will lead to poor performance. miRNAs are relatively short (~21 nt), which means that in many cases they can be aligned quite accurately using multiple alignment algorithms such as ClustalW <abbrgrp><abbr bid="B25">25</abbr></abbrgrp> and MUSCLE <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. In some cases, however, the conservation between sites targeted by the same miRNA is very low, meaning that an accurate sequence alignment is hard to produce using automated methods. In such cases it may be preferable to hand-curate alignments in order to ensure quality and obtain optimal <it>SBM </it>results. Thirdly, although the short length of miRNAs also allows for the integration of gapped alignments in the <it>SBM </it>method, the method will only search for the gap patterns contained in the input alignment. Thus, if targets contain insertion/deletion patterns that are not represented in the input alignment, they may receive a lower score or even be missed completely, depending on the threshold used in the search.</p>
         <p>Several miRNA target prediction systems have implemented post-processing steps in order to increase their specificity. The most commonly used filtering approach is to look for cross-species conservation of target sites. Here, target sites that appear not to be conserved between multiple species are filtered out from the search results, removing false positives and increasing specificity. This type of approach could be applied to results obtained with <it>SBM </it>to further increase the specificity of target predictions. However, we note that this might also lead to a reduction in sensitivity, as it is now known that miRNAs themselves are not always conserved between related species (e.g. <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>). Another possibility is to post-process based on target site accessibility. It has recently been shown that taking into account target site accessibility in the 3' UTR can improve target prediction accuracy <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>. For instance, if a predicted target site is part of a stable secondary structure (and is therefore already involved in base-pairing), it is less likely that the miRNA will be able to bind to the target and cause translational repression of the mRNA.</p>
         <p>In conclusion, we have presented a promising new method for miRNA target prediction, <it>SBM</it>, that employs a generic scoring matrix approach and incorporates experimentally validated targets. Since the number of validated targets is constantly growing, <it>SBM </it>should provide a useful new addition to the current target prediction toolbox.</p>
         <suppl id="S2">
            <title>
               <p>Additional file 2</p>
            </title>
            <text>
               <p>tableS2 &#8211; Number of overlapping predictions. Summary table showing the number of target regions predicted by <it>SBM </it>and miRanda using default parameters and their overlap.</p>
            </text>
            <file name="1748-7188-3-3-S2.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
      </sec>
      <sec>
         <st>
            <p>Availability and Requirements</p>
         </st>
         <p>The code, together with documentation and examples, is freely available for download from <url>http://www.cmp.uea.ac.uk/~jtk/stackbm/</url>.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We would like to thank the referees for their helpful comments. Vincent Moulton thanks the Biotechnology and Biological Sciences Research Council (grant BB/E004091/1) for its support.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Role of microRNAs in plant and animal development</p>
            </title>
            <aug>
               <au>
                  <snm>Carrington</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Ambros</snm>
                  <fnm>V</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2003</pubdate>
            <volume>301</volume>
            <issue>5631</issue>
            <fpage>336</fpage>
            <lpage>338</lpage>
            <url>http://dx.doi.org/10.1126/science.1085242</url>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1085242</pubid>
                  <pubid idtype="pmpid" link="fulltext">12869753</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>MicroRNA-cancer connection: the beginning of a new tale</p>
            </title>
            <aug>
               <au>
                  <snm>Calin</snm>
                  <fnm>GA</fnm>
               </au>
               <au>
                  <snm>Croce</snm>
                  <fnm>CM</fnm>
               </au>
            </aug>
            <source>Cancer Res</source>
            <pubdate>2006</pubdate>
            <volume>66</volume>
            <issue>15</issue>
            <fpage>7390</fpage>
            <lpage>7394</lpage>
            <url>http://dx.doi.org/10.1158/0008-5472.CAN-06-0800</url>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1158/0008-5472.CAN-06-0800</pubid>
                  <pubid idtype="pmpid" link="fulltext">16885332</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>MicroRNAs in plants</p>
            </title>
            <aug>
               <au>
                  <snm>Reinhart</snm>
                  <fnm>BJ</fnm>
               </au>
               <au>
                  <snm>Weinstein</snm>
                  <fnm>EG</fnm>
               </au>
               <au>
                  <snm>Rhoades</snm>
                  <fnm>MW</fnm>
               </au>
               <au>
                  <snm>Bartel</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Bartel</snm>
                  <fnm>DP</fnm>
               </au>
            </aug>
            <source>Genes Dev</source>
            <pubdate>2002</pubdate>
            <volume>16</volume>
            <issue>13</issue>
            <fpage>1616</fpage>
            <lpage>1626</lpage>
            <url>http://dx.doi.org/10.1101/gad.1004402</url>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">186362</pubid>
                  <pubid idtype="pmpid" link="fulltext">12101121</pubid>
                  <pubid idtype="doi">10.1101/gad.1004402</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>An RNA-directed nuclease mediates post-transcriptional gene silencing in Drosophila cells</p>
            </title>
            <aug>
               <au>
                  <snm>Hammond</snm>
                  <fnm>SM</fnm>
               </au>
               <au>
                  <snm>Bernstein</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Beach</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Hannon</snm>
                  <fnm>GJ</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2000</pubdate>
            <volume>404</volume>
            <issue>6775</issue>
            <fpage>293</fpage>
            <lpage>296</lpage>
            <url>http://dx.doi.org/10.1038/35005107</url>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35005107</pubid>
                  <pubid idtype="pmpid" link="fulltext">10749213</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Functions of microRNAs and related small RNAs in plants</p>
            </title>
            <aug>
               <au>
                  <snm>Mallory</snm>
                  <fnm>AC</fnm>
               </au>
               <au>
                  <snm>Vaucheret</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2006</pubdate>
            <issue>38 Suppl</issue>
            <fpage>S31</fpage>
            <lpage>S36</lpage>
            <url>http://dx.doi.org/10.1038/ng1791</url>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng1791</pubid>
                  <pubid idtype="pmpid" link="fulltext">16736022</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>miRNAs and apoptosis: RNAs to die for</p>
            </title>
            <aug>
               <au>
                  <snm>Jovanovic</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hengartner</snm>
                  <fnm>MO</fnm>
               </au>
            </aug>
            <source>Oncogene</source>
            <pubdate>2006</pubdate>
            <volume>25</volume>
            <issue>46</issue>
            <fpage>6176</fpage>
            <lpage>6187</lpage>
            <url>http://dx.doi.org/10.1038/sj.onc.1209912</url>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/sj.onc.1209912</pubid>
                  <pubid idtype="pmpid" link="fulltext">17028597</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Spatio-temporal accumulation of microRNAs is highly coordinated in developing plant tissues</p>
            </title>
            <aug>
               <au>
                  <snm>V&#225;l&#243;czi</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>V&#225;rallyay</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Kauppinen</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Burgy&#225;n</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Havelda</snm>
                  <fnm>Z</fnm>
               </au>
            </aug>
            <source>Plant J</source>
            <pubdate>2006</pubdate>
            <volume>47</volume>
            <fpage>140</fpage>
            <lpage>151</lpage>
            <url>http://dx.doi.org/10.1111/j.1365-313X.2006.02766.x</url>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1111/j.1365-313X.2006.02766.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">16824182</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>The diverse functions of microRNAs in animal development and disease</p>
            </title>
            <aug>
               <au>
                  <snm>Kloosterman</snm>
                  <fnm>WP</fnm>
               </au>
               <au>
                  <snm>Plasterk</snm>
                  <fnm>RHA</fnm>
               </au>
            </aug>
            <source>Dev Cell</source>
            <pubdate>2006</pubdate>
            <volume>11</volume>
            <issue>4</issue>
            <fpage>441</fpage>
            <lpage>450</lpage>
            <url>http://dx.doi.org/10.1016/j.devcel.2006.09.009</url>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.devcel.2006.09.009</pubid>
                  <pubid idtype="pmpid" link="fulltext">17011485</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>miRBase: microRNA sequences, targets and gene nomenclature</p>
            </title>
            <aug>
               <au>
                  <snm>Griffiths-Jones</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Grocock</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>van Dongen</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Bateman</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Enright</snm>
                  <fnm>AJ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <issue>34 Database</issue>
            <fpage>D140</fpage>
            <lpage>D144</lpage>
            <url>http://dx.doi.org/10.1093/nar/gkj112</url>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1347474</pubid>
                  <pubid idtype="pmpid" link="fulltext">16381832</pubid>
                  <pubid idtype="doi">10.1093/nar/gkj112</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>MicroRNA targets in Drosophila</p>
            </title>
            <aug>
               <au>
                  <snm>Enright</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>John</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Gaul</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Tuschl</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Sander</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Marks</snm>
                  <fnm>DS</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2003</pubdate>
            <volume>5</volume>
            <fpage>R1</fpage>
            <url>http://dx.doi.org/10.1186/gb-2003-5-1-r1</url>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">395733</pubid>
                  <pubid idtype="pmpid" link="fulltext">14709173</pubid>
                  <pubid idtype="doi">10.1186/gb-2003-5-1-r1</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>miRU: an automated plant miRNA target prediction server</p>
            </title>
            <aug>
               <au>
                  <snm>Zhang</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <issue>33 Web Server</issue>
            <fpage>W701</fpage>
            <lpage>W704</lpage>
            <url>http://dx.doi.org/10.1093/nar/gki383</url>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1160144</pubid>
                  <pubid idtype="pmpid" link="fulltext">15980567</pubid>
                  <pubid idtype="doi">10.1093/nar/gki383</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>RNAhybrid: microRNA target prediction easy, fast and flexible</p>
            </title>
            <aug>
               <au>
                  <snm>Kr&#252;ger</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Rehmsmeier</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <issue>34 Web Server</issue>
            <fpage>W451</fpage>
            <lpage>W454</lpage>
            <url>http://dx.doi.org/10.1093/nar/gkl243</url>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1538877</pubid>
                  <pubid idtype="pmpid" link="fulltext">16845047</pubid>
                  <pubid idtype="doi">10.1093/nar/gkl243</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Prediction of microRNA targets</p>
            </title>
            <aug>
               <au>
                  <snm>Mazi&#232;re</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Enright</snm>
                  <fnm>AJ</fnm>
               </au>
            </aug>
            <source>Drug Discov Today</source>
            <pubdate>2007</pubdate>
            <volume>12</volume>
            <issue>11&#8211;12</issue>
            <fpage>452</fpage>
            <lpage>458</lpage>
            <url>http://dx.doi.org/10.1016/j.drudis.2007.04.002</url>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.drudis.2007.04.002</pubid>
                  <pubid idtype="pmpid" link="fulltext">17532529</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Human MicroRNA targets</p>
            </title>
            <aug>
               <au>
                  <snm>John</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Enright</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Aravin</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Tuschl</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Sander</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Marks</snm>
                  <fnm>DS</fnm>
               </au>
            </aug>
            <source>PLoS Biol</source>
            <pubdate>2004</pubdate>
            <volume>2</volume>
            <issue>11</issue>
            <fpage>e363</fpage>
            <url>http://dx.doi.org/10.1371/journal.pbio.0020363</url>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">521178</pubid>
                  <pubid idtype="pmpid" link="fulltext">15502875</pubid>
                  <pubid idtype="doi">10.1371/journal.pbio.0020363</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>microRNA target predictions in animals</p>
            </title>
            <aug>
               <au>
                  <snm>Rajewsky</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2006</pubdate>
            <issue>38 Suppl</issue>
            <fpage>S8</fpage>
            <lpage>S13</lpage>
            <url>http://dx.doi.org/10.1038/ng1798</url>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng1798</pubid>
                  <pubid idtype="pmpid" link="fulltext">16736023</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Binding matrix: a novel approach for binding site recognition</p>
            </title>
            <aug>
               <au>
                  <snm>Kim</snm>
                  <fnm>JT</fnm>
               </au>
               <au>
                  <snm>Gewehr</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Martinetz</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>J Bioinform Comput Biol</source>
            <pubdate>2004</pubdate>
            <volume>2</volume>
            <issue>2</issue>
            <fpage>289</fpage>
            <lpage>307</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1142/S0219720004000569</pubid>
                  <pubid idtype="pmpid" link="fulltext">15297983</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Improved parameters for prediction of RNA structure</p>
            </title>
            <aug>
               <au>
                  <snm>Turner</snm>
                  <fnm>DH</fnm>
               </au>
               <au>
                  <snm>Sugimoto</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Jaeger</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Longfellow</snm>
                  <fnm>CE</fnm>
               </au>
               <au>
                  <snm>Freier</snm>
                  <fnm>SM</fnm>
               </au>
               <au>
                  <snm>Kierzek</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Cold Spring Harb Symp Quant Biol</source>
            <pubdate>1987</pubdate>
            <volume>52</volume>
            <fpage>123</fpage>
            <lpage>133</lpage>
            <xrefbib>
               <pubid idtype="pmpid">2456874</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Integrated Functional and Bioinformatics Approach for the Identification and Experimental Verification of RNA Signals: Application to HIV-1 INS</p>
            </title>
            <aug>
               <au>
                  <snm>Wolff</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Brack-Werner</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Neumann</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Werner</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Schneider</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <fpage>2839</fpage>
            <lpage>2851</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">156724</pubid>
                  <pubid idtype="pmpid" link="fulltext">12771211</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg390</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>DNA Binding Sites: Representation and Discovery</p>
            </title>
            <aug>
               <au>
                  <snm>Stormo</snm>
                  <fnm>GD</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2000</pubdate>
            <volume>16</volume>
            <fpage>16</fpage>
            <lpage>23</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/16.1.16</pubid>
                  <pubid idtype="pmpid" link="fulltext">10812473</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>One-Class Classification with Subgaussians</p>
            </title>
            <aug>
               <au>
                  <snm>Madany Mamlouk</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kim</snm>
                  <fnm>JT</fnm>
               </au>
               <au>
                  <snm>Barth</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Brauckmann</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Martinetz</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>DAGM Symposium</source>
            <publisher>Berlin Heidelberg: Springer Verlag</publisher>
            <editor>Michaelis B, Krell G</editor>
            <pubdate>2003</pubdate>
            <fpage>346</fpage>
            <lpage>353</lpage>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Prediction and verification of microRNA targets by MovingTargets, a highly adaptable prediction method</p>
            </title>
            <aug>
               <au>
                  <snm>Burgler</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Macdonald</snm>
                  <fnm>PM</fnm>
               </au>
            </aug>
            <source>BMC Genomics</source>
            <pubdate>2005</pubdate>
            <volume>6</volume>
            <fpage>88</fpage>
            <url>http://dx.doi.org/10.1186/1471-2164-6-88</url>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1180435</pubid>
                  <pubid idtype="pmpid" link="fulltext">15943864</pubid>
                  <pubid idtype="doi">10.1186/1471-2164-6-88</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>The Python Programming Language</p>
            </title>
            <url>http://www.python.org</url>
         </bibl>
         <bibl id="B23">
            <aug>
               <au>
                  <cnm>R Development Core Team</cnm>
               </au>
            </aug>
            <source>R: A language and environment for statistical computing</source>
            <publisher>R Foundation for Statistical Computing, Vienna, Austria</publisher>
            <pubdate>2004</pubdate>
            <url>http://www.R-project.org</url>
            <note>[ISBN 3-900051-07-0].</note>
         </bibl>
         <bibl id="B24">
            <title>
               <p>TarBase: A comprehensive database of experimentally supported animal microRNA targets</p>
            </title>
            <aug>
               <au>
                  <snm>Sethupathy</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Corda</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Hatzigeorgiou</snm>
                  <fnm>AG</fnm>
               </au>
            </aug>
            <source>RNA</source>
            <pubdate>2006</pubdate>
            <volume>12</volume>
            <issue>2</issue>
            <fpage>192</fpage>
            <lpage>197</lpage>
            <url>http://dx.doi.org/10.1261/rna.2239606</url>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1370898</pubid>
                  <pubid idtype="pmpid" link="fulltext">16373484</pubid>
                  <pubid idtype="doi">10.1261/rna.2239606</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Multiple sequence alignment with the Clustal series of programs</p>
            </title>
            <aug>
               <au>
                  <snm>Chenna</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Sugawara</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Koike</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Lopez</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Gibson</snm>
                  <fnm>TJ</fnm>
               </au>
               <au>
                  <snm>Higgins</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Thompson</snm>
                  <fnm>JD</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <issue>13</issue>
            <fpage>3497</fpage>
            <lpage>3500</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">168907</pubid>
                  <pubid idtype="pmpid" link="fulltext">12824352</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg500</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>EnsMart: a generic system for fast and flexible access to biological data</p>
            </title>
            <aug>
               <au>
                  <snm>Kasprzyk</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Keefe</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Smedley</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>London</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Spooner</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Melsopp</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Hammond</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Rocca-Serra</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Cox</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Birney</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2004</pubdate>
            <volume>14</volume>
            <fpage>160</fpage>
            <lpage>169<