<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2105-13-S10-S14</ui>
   <ji>1471-2105</ji>
   <fm>
      <dochead>Proceedings</dochead>
      <bibl>
         <title>
            <p>Algorithms: simultaneous error-correction and rooting for gene tree reconciliation and the gene duplication problem</p>
         </title>
         <aug>
            <au ca="yes" id="A1"><snm>G&#243;recki</snm><fnm>Pawel</fnm><insr iid="I1"/><email>gorecki@mimuw.edu.pl</email></au>
            <au id="A2"><snm>Eulenstein</snm><fnm>Oliver</fnm><insr iid="I2"/><email>oeulenst@cs.iastate.edu</email></au>
         </aug>
         <insg>
            <ins id="I1"><p>Institute of Informatics, University of Warsaw, Warsaw, 02-097, Poland</p></ins>
            <ins id="I2"><p>Department of Computer Science, Iowa State University, Ames, 50011, USA</p></ins>
         </insg>
         <source>BMC Bioinformatics</source>
         
         
         <supplement><title><p>Selected articles from the 7th International Symposium on Bioinformatics Research and Applications (ISBRA'11)</p></title><editor>Jianer Chen, Ion Mandoiu, Raj Sunderraman, Jianxin Wang and Alexander Zelikovsky</editor><note>Proceedings</note></supplement><conference><title><p>7th International Symposium on Bioinformatics Research and Applications (ISBRA'11)</p></title><location>Changsha, China</location><date-range>27-29 May 2011</date-range><url>http://www.cs.gsu.edu/isbra11/</url></conference><issn>1471-2105</issn>
         <pubdate>2012</pubdate>
         <volume>13</volume>
         <issue>Suppl 10</issue>
         <fpage>S14</fpage>
         <url>http://www.biomedcentral.com/1471-2105/13/S10/S14</url>
         <xrefbib><pubidlist><pubid idtype="pmpid">22759419</pubid><pubid idtype="doi">10.1186/1471-2105-13-S10-S14</pubid></pubidlist></xrefbib>
      </bibl>
      <history><pub><date><day>25</day><month>6</month><year>2012</year></date></pub></history>
      <cpyrt><year>2012</year><collab>G&#243;recki and Eulenstein; licensee BioMed Central Ltd.</collab><note>This is an open access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note></cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Evolutionary methods are increasingly challenged by the wealth of fast growing resources of genomic sequence information. Evolutionary events, like gene duplication, loss, and deep coalescence, account more than ever for incongruence between gene trees and the actual species tree. Gene tree reconciliation addresses this fundamental problem by invoking the minimum number of gene duplications and losses that reconcile a rooted gene tree with a rooted species tree. However, the reconciliation process is highly sensitive to topological error and wrong rooting of the gene tree; it requires an error-free and correctly rooted gene tree, a condition that is not met by most gene trees in practice. Thus, despite the promises of gene tree reconciliation, its applicability in practice is severely limited.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>We introduce the problem of reconciling unrooted and erroneous gene trees by simultaneously rooting and error-correcting them, and describe an efficient algorithm for this problem. Moreover, we introduce an error-corrected version of the gene duplication problem, a standard application of gene tree reconciliation. Because the original version of this problem is NP-hard, we introduce an effective heuristic for our error-corrected version. Our experimental results suggest that our error-correcting approaches for unrooted input trees can significantly improve the accuracy of gene tree reconciliation and of species tree inference under the gene duplication problem. Furthermore, our algorithm for error-correcting reconciliation is efficient enough to handle truly large-scale phylogenetic studies.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusions</p>
               </st>
               <p>The presented error-correction approach is a crucial step towards making gene tree reconciliation more robust, and thus towards improving the accuracy of applications that fundamentally rely on gene tree reconciliation, like the inference of gene-duplication supertrees.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>The wealth of newly sequenced genomes has provided us with an unprecedented resource of information for phylogenetic studies that will have extensive implications for a host of issues in biology, ecology, and medicine, and that promises even more. Yet, before such phylogenies can be reliably inferred, challenging problems that came along with the newly sequenced genomes have to be overcome. Evolutionary biologists have long realized that gene-duplication and subsequent loss, a fundamental evolutionary process <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>, can largely obfuscate phylogenetic inference <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. Gene-duplication can form complex evolutionary histories of genes, called gene trees, whose topologies are traditionally used to derive species trees. This approach relies on the assumption that the topologies of gene trees are consistent with the topology of the species tree. However, genes that evolve from different copies of ancestral gene-duplications frequently become extinct, resulting in gene trees with correct topologies that are nevertheless inconsistent with the topology of the actual species tree (see Figure <figr fid="F1">1</figr>). In many such cases phylogenetic information from the gene trees is indispensable and may still be recovered using gene tree reconciliation.</p>
         <fig id="F1"><title><p>Figure 1</p></title><caption><p>Rooted reconciliation</p></caption><text>
   <p><b>Rooted reconciliation</b>. An lca-mapping <it>M </it>from the gene tree <inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula> into the species tree <inline-formula><graphic file="1471-2105-13-S10-S14-i1.gif"/></inline-formula> and the corresponding embedding. <it>M </it>is shown for the internal nodes of <inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula>.</p>
</text><graphic file="1471-2105-13-S10-S14-1"/></fig>
         <sec>
            <st>
               <p>Related work</p>
            </st>
            <p>Gene tree reconciliation is a well-studied method for resolving topological inconsistencies between a gene tree and a trusted species tree <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr></abbrgrp>. Inconsistencies are resolved by invoking gene-duplication and loss events that reconcile the gene tree to be consistent with the actual species tree. Such events not only reconcile gene trees, but also lay the foundation for a variety of evolutionary applications including ortholog/paralog annotation of genes, locating episodes of gene-duplications in species trees <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr></abbrgrp>, reconstructing domain decompositions <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>, and species supertree construction <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr></abbrgrp>.</p>
            <p>A major problem in the application of gene tree reconciliation is its high sensitivity to error-prone gene trees. Even seemingly insignificant errors can largely mislead the reconciliation process and, typically undetected, lead to the inference of incorrect phylogenies (e.g., <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B15">15</abbr></abbrgrp>). Errors in gene trees are often topological errors and rooting errors. A topological error is an incorrect topology of the gene tree, which can be caused by the inference process (e.g., noise in the underlying sequence data) or by the inference method itself (e.g., heuristic results). This problem has been addressed for rooted gene trees by 'correcting the error'; that is, editing the given tree such that the number of invoked gene-duplications and losses is minimized <abbrgrp><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr></abbrgrp>. However, most inference methods used in practice return only unrooted gene trees (e.g., parsimony and maximum likelihood based methods) that have to be rooted for the gene tree reconciliation process. A rooting error is a wrongly chosen root in an unrooted gene tree. Whereas rooting can typically be achieved in species trees by outgroup analysis, this approach may not be possible for gene trees if there is a history of gene duplication and loss <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>. Other rooting approaches like midpoint rooting or molecular clock rooting assume a constant rate of evolution that is often unrealistic. However, rooting problems can be bypassed by identifying roots that minimize the invoked number of gene duplications and losses <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr></abbrgrp>.</p>
            <p>In summary, even a small topological error or a slightly misplaced root can lead to the incorrect identification of enormous numbers of gene duplications and losses, and therefore largely mislead the reconciliation process. Consequently, gene tree reconciliation requires gene trees that are free of error and correctly rooted at the same time <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. However, as previous work has incorporated topological error-correction only separately from correctly rooting gene trees into the reconciliation process <abbrgrp><abbr bid="B16">16</abbr><abbr bid="B18">18</abbr></abbrgrp>, this process can still be misled.</p>
         </sec>
         <sec>
            <st>
               <p>Our contribution</p>
            </st>
            <p>We address the problem of reconciling erroneous and unrooted gene trees by error-correcting and rooting them at the same time. Solving this problem efficiently is a crucial step towards making gene tree reconciliation more robust, and thus towards improving the accuracy of applications that rely on gene tree reconciliation, such as the construction of gene-duplication supertrees. We introduce the problem and design an efficient algorithm that facilitates a much more precise gene tree reconciliation, even for large-scale data sets. Our algorithm detects and corrects errors in unrooted gene trees, and thus spares biologists the difficulty and uncertainty of handling erroneous gene trees and rooting them correctly. The presented experimental results suggest that our novel reconciliation algorithms can identify and correct topological error in unrooted input gene trees, and at the same time root them optimally.</p>
            <p>Our algorithm is designed to search for the correct and rooted tree of a given unrooted tree in local search neighborhoods of the given tree. The size of these neighborhoods is described by a positive integer <it>k </it>that allows the search to be fine-tuned. While in theory <it>k </it>can be large, it is assumed that gene trees have only small topological error, which typically can be captured by small values of <it>k</it>. For a fixed but freely choosable integer <it>k </it>the runtime of our algorithm is <it>O</it>(<it>l<sup>k </sup></it>+ max(<it>n, m</it>)), where <it>n </it>and <it>m </it>are the sizes of the gene tree and the species tree, respectively, and <it>l </it>is the number of edges in the gene tree that potentially contain an error (such edges will be called <it>weak</it>). Thus, for a small error, which is expressed by <it>k </it>= 1, our algorithm runs in linear time. Our experiments show that error-correction runs of the algorithm for <it>k </it>= 3 are still possible even for trees with a large number of weak edges (e.g., <it>l </it>= 200) on a standard workstation configuration.</p>
            <p>Further, we address the problem of constructing rooted supertrees by reconciling unrooted and erroneous gene trees with assigned weak edges, a key problem in illuminating the role and effect of gene duplication and loss in shaping the evolution of organisms. We introduce the problem and develop an effective local search heuristic that makes the construction of more accurate supertrees possible and allows a much better postulation of gene duplication histories. Our experimental results demonstrate that our approach is effective in identifying gene duplication histories given erroneous gene trees and producing more accurate supertrees under gene tree reconciliation.</p>
         </sec>
         <sec>
            <st>
               <p>Duplication-loss model</p>
            </st>
            <p>We introduce the fundamentals of the classical duplication-loss model. Our definitions are mostly adopted from <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>. For a more detailed introduction to the duplication-loss model we refer the interested reader to <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B5">5</abbr><abbr bid="B10">10</abbr><abbr bid="B20">20</abbr></abbrgrp>.</p>
            <p>Let &#8464; be the set of species consisting of <it>N </it>&gt; 0 elements. An <it>unrooted gene tree </it>is an undirected acyclic graph in which each node has degree 3 (internal nodes) or 1 (leaves), and the leaves are labeled by the elements from &#8464;. A <it>species tree </it><inline-formula><graphic file="1471-2105-13-S10-S14-i1.gif"/></inline-formula> is a rooted binary tree with <it>N </it>leaves uniquely labeled by the elements from &#8464;. In some cases, a node of a tree will be referred to by its "cluster", that is, the labels of the leaves in its subtree. For instance, a species tree (<it>a</it>, (<it>b, c</it>)) has 5 nodes denoted by: <it>a, b, c, bc </it>and <it>abc</it>. A <it>rooted gene tree </it>is a rooted binary tree with leaves labeled by the elements from &#8464;. We denote the set of internal nodes of a tree <it>T </it>by int(<it>T</it>).</p>
            <p>Let <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i2"><m:mi mathvariant="script">S</m:mi>
<m:mo class="MathClass-rel">=</m:mo>
<m:mrow>
   <m:mo class="MathClass-open">&#10216;</m:mo>
   <m:mrow>
      <m:msub>
         <m:mrow>
            <m:mi>V</m:mi>
         </m:mrow>
         <m:mrow>
            <m:mi mathvariant="script">S</m:mi>
         </m:mrow>
      </m:msub>
      <m:mo class="MathClass-punc">,</m:mo>
      <m:msub>
         <m:mrow>
            <m:mi>E</m:mi>
         </m:mrow>
         <m:mrow>
            <m:mi mathvariant="script">S</m:mi>
         </m:mrow>
      </m:msub>
   </m:mrow>
   <m:mo class="MathClass-close">&#10217;</m:mo>
</m:mrow>
</m:math></inline-formula> be a <it>species tree</it>. <inline-formula><graphic file="1471-2105-13-S10-S14-i1.gif"/></inline-formula> can be viewed as an upper semilattice with + a binary least upper bound operation and &#8868; the top element, that is, the root. In particular for <it>a</it>, <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i3"><m:mi>b</m:mi>
<m:mo class="MathClass-rel">&#8712;</m:mo>
<m:msub>
   <m:mrow>
      <m:mi>V</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi mathvariant="script">S</m:mi>
   </m:mrow>
</m:msub>
</m:math></inline-formula>, <it>a </it>&lt;<it>b </it>means that <it>a </it>and <it>b </it>are on the same path from the root, with <it>b </it>being closer to the root than <it>a</it>. We define the <it>comparability predicate D</it>(<it>a, b</it>) = 1, if <it>a </it>&#8804; <it>b </it>or <it>b </it>&#8804; <it>a </it>and <it>D</it>(<it>a, b</it>) = 0, when <it>a </it>and <it>b </it>are incomparable. The <it>distance function &#961;</it>(<it>a, b</it>) is used to denote the number of edges on the unique (non-directed) path connecting <it>a </it>and <it>b</it>.</p>
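            <p>As an illustration of these definitions, the following Python sketch computes the least upper bound <it>a </it>+ <it>b</it>, the comparability predicate <it>D </it>and the distance <it>&#961; </it>on the species tree (<it>a</it>, (<it>b, c</it>)). The nested-tuple encoding of the tree, the naming of nodes by concatenated single-character leaf labels, and the helper names index_tree, lca, comparable and rho are assumptions made for this example only; they are not taken from the original method description.</p>
            <p><![CDATA[
```python
def index_tree(tree):
    """Index a nested-tuple species tree; nodes are named by their leaf cluster."""
    parent, depth = {}, {}

    def walk(node, d):
        if isinstance(node, str):              # a leaf is its own cluster
            name, leaves = node, [node]
        else:
            kids = [walk(child, d + 1) for child in node]
            leaves = sorted(kids[0][1] + kids[1][1])
            name = "".join(leaves)             # e.g. "bc" for the parent of b and c
            for kid_name, _ in kids:
                parent[kid_name] = name
        depth[name] = d
        return name, leaves

    root, _ = walk(tree, 0)
    parent[root] = None
    return parent, depth


PARENT, DEPTH = index_tree(("a", ("b", "c")))


def lca(a, b):
    """Least upper bound a + b: climb from the deeper node until both walks meet."""
    while a != b:
        if DEPTH[a] >= DEPTH[b]:
            a = PARENT[a]
        else:
            b = PARENT[b]
    return a


def comparable(a, b):
    """Comparability predicate D(a, b): 1 if a <= b or b <= a, otherwise 0."""
    return 1 if lca(a, b) in (a, b) else 0


def rho(a, b):
    """Number of edges on the unique path connecting a and b."""
    top = lca(a, b)
    return DEPTH[a] + DEPTH[b] - 2 * DEPTH[top]


print(lca("b", "c"), comparable("b", "c"), rho("b", "c"))
print(lca("b", "abc"), comparable("b", "abc"), rho("a", "b"))
```
]]></p>
            <p>For this tree the sketch gives <it>b </it>+ <it>c </it>= <it>bc</it>, <it>D</it>(<it>b, c</it>) = 0 and <it>&#961;</it>(<it>b, c</it>) = 2, whereas <it>b </it>and the root <it>abc </it>are comparable and <it>&#961;</it>(<it>a, b</it>) = 3.</p>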
            <p>We call distinct nodes <it>a</it>, <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i3"><m:mi>b</m:mi><m:mo class="MathClass-rel">&#8712;</m:mo><m:msub><m:mrow><m:mi>V</m:mi></m:mrow><m:mrow><m:mi mathvariant="script">S</m:mi></m:mrow></m:msub></m:math></inline-formula> <it>siblings </it>when <it>a </it>+ <it>b </it>is the parent of both <it>a </it>and <it>b</it>. For <it>a</it>, <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i3"><m:mi>b</m:mi><m:mo class="MathClass-rel">&#8712;</m:mo><m:msub><m:mrow><m:mi>V</m:mi></m:mrow><m:mrow><m:mi mathvariant="script">S</m:mi></m:mrow></m:msub></m:math></inline-formula> let <b>Sb</b>(<it>a, b</it>) be the set of nodes defined by the following recursive rule: <b>(i) Sb</b>(<it>a, b</it>) = &#8709; if <it>a </it>= <it>b </it>or <it>a </it>and <it>b </it>are siblings, <b>(ii) Sb</b>(<it>a, b</it>) = {<it>c</it>} &#8746; <b>Sb</b>(<it>a </it>+ <it>c, b</it>), if <it>a </it>&lt;<it>b </it>or <it>a </it>+ <it>c </it>&lt;<it>a </it>+ <it>b</it>; here <it>c </it>is the sibling of <it>a</it>, and <b>(iii) Sb</b>(<it>a, b</it>) = <b>Sb</b>(<it>b, a</it>) otherwise.</p>
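            <p>The recursive rule for <b>Sb </b>can be sketched in Python as follows. The example species tree ((<it>a, b</it>), (<it>c</it>, (<it>d, e</it>))), the nested-tuple encoding, and all helper names are assumptions chosen for illustration. The function identity_holds additionally checks, for every pair of nodes of this one tree, that the size of <b>Sb</b>(<it>a, b</it>) equals <it>&#961;</it>(<it>a, b</it>) - 2 &#183; (1 - <it>D</it>(<it>a, b</it>)); it is a sanity check on the example, not a proof.</p>
            <p><![CDATA[
```python
def index_tree(tree):
    """Index a nested-tuple species tree; nodes are named by their leaf cluster."""
    parent, depth, children = {}, {}, {}

    def walk(node, d):
        if isinstance(node, str):
            name, leaves = node, [node]
        else:
            kids = [walk(child, d + 1) for child in node]
            leaves = sorted(kids[0][1] + kids[1][1])
            name = "".join(leaves)
            children[name] = [kid_name for kid_name, _ in kids]
            for kid_name in children[name]:
                parent[kid_name] = name
        depth[name] = d
        return name, leaves

    root, _ = walk(tree, 0)
    parent[root] = None
    return parent, depth, children


PARENT, DEPTH, CHILDREN = index_tree((("a", "b"), ("c", ("d", "e"))))


def lca(a, b):
    """Least upper bound a + b."""
    while a != b:
        if DEPTH[a] >= DEPTH[b]:
            a = PARENT[a]
        else:
            b = PARENT[b]
    return a


def comparable(a, b):
    """D(a, b): 1 if one node is an ancestor of (or equal to) the other."""
    return 1 if lca(a, b) in (a, b) else 0


def rho(a, b):
    """Number of edges on the path between a and b."""
    return DEPTH[a] + DEPTH[b] - 2 * DEPTH[lca(a, b)]


def sibling(a):
    """The other child of the parent of a (a must not be the root)."""
    first, second = CHILDREN[PARENT[a]]
    return second if first == a else first


def sb(a, b):
    """Sb(a, b), following rules (i)-(iii) literally."""
    if a == b or PARENT[a] == PARENT[b]:               # rule (i)
        return set()
    top = lca(a, b)
    # rule (ii): a < b, or the parent of a lies strictly below a + b
    if top == b or (top != a and PARENT[a] != top):
        return {sibling(a)} | sb(PARENT[a], b)
    return sb(b, a)                                    # rule (iii)


def identity_holds():
    """Check |Sb(a, b)| = rho(a, b) - 2 * (1 - D(a, b)) for all node pairs."""
    return all(len(sb(a, b)) == rho(a, b) - 2 * (1 - comparable(a, b))
               for a in DEPTH for b in DEPTH)


print(sorted(sb("a", "d")), identity_holds())
```
]]></p>
            <p>In this example <b>Sb</b>(<it>a, d</it>) consists of the nodes <it>b</it>, <it>c </it>and <it>e</it>, which are exactly the subtrees hanging off the path between <it>a </it>and <it>d </it>below their least upper bound.</p>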
            <p>By <it>L</it>(<it>a, b</it>) we denote the number of elements in <b>Sb</b>(<it>a, b</it>). Observe that <it>L</it>(<it>a, b</it>) = <it>&#961;</it>(<it>a, b</it>) - 2 &#183; (1 - <it>D</it>(<it>a, b</it>)). Let <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i4"><m:mi>M</m:mi>
<m:mo class="MathClass-rel">:</m:mo>
<m:msub>
   <m:mrow>
      <m:mi>V</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi mathvariant="script">G</m:mi>
   </m:mrow>
</m:msub>
<m:mo class="MathClass-rel">&#8594;</m:mo>
<m:msub>
   <m:mrow>
      <m:mi>V</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi mathvariant="script">S</m:mi>
   </m:mrow>
</m:msub>
</m:math></inline-formula> be the <it>least common ancestor (lca) mapping </it>from the rooted <inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula> into <inline-formula><graphic file="1471-2105-13-S10-S14-i1.gif"/></inline-formula> that preserves the labeling of the leaves. Formally, if <it>v </it>is a leaf in <inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula> then <it>M</it>(<it>v</it>) is the node in <inline-formula><graphic file="1471-2105-13-S10-S14-i1.gif"/></inline-formula> labeled by the label of <it>v</it>. If <it>v </it>is an internal node in <inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula> with two children <it>a, b</it>, then <it>M</it>(<it>v</it>) = <it>M</it>(<it>a</it>) + <it>M</it>(<it>b</it>). An example is depicted in Figure <figr fid="F1">1</figr>.</p>
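            <p>A minimal Python sketch of the lca-mapping is given below. It assumes the species tree (<it>a</it>, (<it>b, c</it>)) stored as a child-to-parent dictionary and a rooted gene tree written as nested tuples of leaf labels; the names SPECIES_PARENT, ancestors, lca and lca_mapping are hypothetical and serve only this illustration.</p>
            <p><![CDATA[
```python
# Species tree (a, (b, c)); each node is named by its cluster of leaf labels.
SPECIES_PARENT = {"a": "abc", "b": "bc", "c": "bc", "bc": "abc", "abc": None}


def ancestors(node):
    """Path from a species node up to the root, the node itself included."""
    path = []
    while node is not None:
        path.append(node)
        node = SPECIES_PARENT[node]
    return path


def lca(a, b):
    """Least upper bound a + b in the species tree."""
    seen = set(ancestors(a))
    return next(x for x in ancestors(b) if x in seen)


def lca_mapping(gene_tree):
    """Return (M(root), rows), where rows lists (subtree, M(subtree)) in post-order."""
    if isinstance(gene_tree, str):
        # A leaf maps to the species leaf carrying the same label.
        return gene_tree, [(gene_tree, gene_tree)]
    left, right = gene_tree
    m_left, rows_left = lca_mapping(left)
    m_right, rows_right = lca_mapping(right)
    m_here = lca(m_left, m_right)          # M(v) = M(a) + M(b)
    return m_here, rows_left + rows_right + [(gene_tree, m_here)]


ROOT_IMAGE, ROWS = lca_mapping((("b", "c"), ("a", "b")))
for subtree, image in ROWS:
    print(subtree, "->", image)
```
]]></p>
            <p>The mapping is computed bottom-up, so every internal node is visited once; the naive lca used here walks to the root on every call and is chosen for brevity, not efficiency.</p>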
            <p>In this general setting let us assume that we are given a <it>cost function </it><inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i6"><m:mi>&#958;</m:mi>
<m:mo class="MathClass-rel">:</m:mo>
<m:msub>
   <m:mrow>
      <m:mi>V</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi mathvariant="script">G</m:mi>
   </m:mrow>
</m:msub>
<m:mo class="MathClass-bin">&#215;</m:mo>
<m:msub>
   <m:mrow>
      <m:mi>V</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi mathvariant="script">S</m:mi>
   </m:mrow>
</m:msub>
<m:mo class="MathClass-rel">&#8594;</m:mo>
<m:mi mathvariant="bold">R</m:mi>
</m:math></inline-formula> which for all nodes <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i7"><m:mi>v</m:mi>
<m:mo class="MathClass-rel">&#8712;</m:mo>
<m:msub>
   <m:mrow>
      <m:mi>V</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi mathvariant="script">G</m:mi>
   </m:mrow>
</m:msub>
</m:math></inline-formula>, <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i8"><m:mi>a</m:mi>
<m:mo class="MathClass-rel">&#8712;</m:mo>
<m:msub>
   <m:mrow>
      <m:mi>V</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi mathvariant="script">S</m:mi>
   </m:mrow>
</m:msub>
</m:math></inline-formula> assigns a real <it>&#958;</it>(<it>v, a</it>) representing a contribution to node <it>a </it>which comes from <it>v </it>when reconciling <inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula> with <inline-formula><graphic file="1471-2105-13-S10-S14-i1.gif"/></inline-formula>. Having <it>&#958; </it>we can define <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i9"><m:mrow>
   <m:mi>&#954;</m:mi>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:mi>v</m:mi>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:msub>
      <m:mrow>
         <m:mo mathsize="big"> &#8721;</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mi>a</m:mi>
      </m:mrow>
   </m:msub>
   <m:mi>&#958;</m:mi>
   <m:mfenced close=")" open="(" separators="">
      <m:mrow>
         <m:mi>v</m:mi>
         <m:mo class="MathClass-punc">,</m:mo>
         <m:mstyle class="text">
            <m:mtext class="textsf" mathvariant="sans-serif">&#160;</m:mtext>
         </m:mstyle>
         <m:mi>a</m:mi>
      </m:mrow>
   </m:mfenced>
</m:mrow>
</m:math></inline-formula> to be the total contribution from <it>v </it>in the reconciliation of <inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula> with <inline-formula><graphic file="1471-2105-13-S10-S14-i1.gif"/></inline-formula>. We call <it>&#954; </it>a <it>contribution </it>function. Finally, <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i10"><m:mi>&#963;</m:mi>
<m:mo class="MathClass-rel">=</m:mo>
<m:msub>
   <m:mrow>
      <m:mo class="MathClass-op"> &#8721;</m:mo>
   </m:mrow>
   <m:mrow>
      <m:mi>v</m:mi>
   </m:mrow>
</m:msub>
<m:mi>&#954;</m:mi>
<m:mfenced close=")" open="(" separators="">
   <m:mrow>
      <m:mi>v</m:mi>
   </m:mrow>
</m:mfenced>
</m:math></inline-formula> is the total cost of reconciliation of <inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula> with <inline-formula><graphic file="1471-2105-13-S10-S14-i1.gif"/></inline-formula>.</p>
            <p>Now we present examples of cost functions that are used in the duplication model. We assume that if <it>v </it>is an internal node in <inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula> then <it>w</it><sub>1 </sub>and <it>w</it><sub>2 </sub>are its children. The <it>Duplication cost </it>function is defined as follows: <it>&#958;<sup>D</sup></it>(<it>v, a</it>) = 1 if <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i11"><m:mi>v</m:mi>
<m:mo class="MathClass-rel">&#8712;</m:mo>
<m:mstyle class="text">
   <m:mtext class="textsf" mathvariant="sans-serif">int</m:mtext>
</m:mstyle>
<m:mspace class="nbsp" width="1em"/>
<m:mrow>
   <m:mo class="MathClass-open">(</m:mo>
   <m:mrow>
      <m:mi mathvariant="script">G</m:mi>
   </m:mrow>
   <m:mo class="MathClass-close">)</m:mo>
</m:mrow>
</m:math></inline-formula> and <it>M</it>(<it>v</it>) = <it>M</it>(<it>w<sub>i</sub></it>) = <it>a </it>for some <it>i</it>, and <it>&#958;<sup>D</sup></it>(<it>v, a</it>) = 0 otherwise. The <it>Loss cost </it>function: <it>&#958;<sup>L</sup></it>(<it>v, a</it>) = 1 if <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i11"><m:mi>v</m:mi><m:mo class="MathClass-rel">&#8712;</m:mo><m:mstyle class="text"><m:mtext class="textsf" mathvariant="sans-serif">int</m:mtext></m:mstyle><m:mspace class="nbsp" width="1em"/><m:mrow><m:mo class="MathClass-open">(</m:mo><m:mrow><m:mi mathvariant="script">G</m:mi></m:mrow><m:mo class="MathClass-close">)</m:mo></m:mrow></m:math></inline-formula> and <it>a </it>&#8712; <b>Sb </b>(<it>M</it>(<it>w</it><sub>1</sub>), <it>M</it>(<it>w</it><sub>2</sub>)), and <it>&#958;<sup>L</sup></it>(<it>v, a</it>) = 0 otherwise. It can be proved that if <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i11"><m:mi>v</m:mi><m:mo class="MathClass-rel">&#8712;</m:mo><m:mstyle class="text"><m:mtext class="textsf" mathvariant="sans-serif">int</m:mtext></m:mstyle><m:mspace class="nbsp" width="1em"/><m:mrow><m:mo class="MathClass-open">(</m:mo><m:mrow><m:mi mathvariant="script">G</m:mi></m:mrow><m:mo class="MathClass-close">)</m:mo></m:mrow></m:math></inline-formula> then <it>&#954;<sup>D</sup></it>(<it>v</it>) = <it>D</it>(<it>M</it>(<it>w</it><sub>1</sub>), <it>M</it>(<it>w</it><sub>2</sub>)) and <it>&#954;<sup>L </sup></it>(<it>v</it>) = <it>L</it>(<it>M</it>(<it>w</it><sub>1</sub>), <it>M</it>(<it>w</it><sub>2</sub>)) (in both cases 0 if <it>v </it>is a leaf).</p>
            <p>A node <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i12"><m:mi>v</m:mi>
<m:mo class="MathClass-rel">&#8712;</m:mo>
<m:msub>
   <m:mrow>
      <m:mi>V</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi mathvariant="script">G</m:mi>
   </m:mrow>
</m:msub>
</m:math></inline-formula> is called a duplication <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B13">13</abbr></abbrgrp> if <it>&#954;<sup>D</sup></it>(<it>v</it>) = 1. Moreover, <it>&#954;<sup>L</sup></it>(<it>v</it>) = <it>l</it>(<it>v</it>), where <it>l</it>(<it>v</it>) is the number of gene losses associated with <it>v</it>. It can be proved that <it>&#963;<sup>D </sup></it>and <it>&#963;<sup>L </sup></it>are the minimal numbers of gene duplications and gene losses (respectively) required to reconcile (or to embed) <inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula> with <inline-formula><graphic file="1471-2105-13-S10-S14-i1.gif"/></inline-formula>. See <abbrgrp><abbr bid="B18">18</abbr></abbrgrp> for more details. An example of an embedding is depicted in Figure <figr fid="F1">1</figr>.</p>
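            <p>The total costs <it>&#963;<sup>D </sup></it>and <it>&#963;<sup>L </sup></it>can be sketched in Python by summing, over the internal nodes of a rooted gene tree, the contributions <it>D</it>(<it>M</it>(<it>w</it><sub>1</sub>), <it>M</it>(<it>w</it><sub>2</sub>)) and <it>L</it>(<it>M</it>(<it>w</it><sub>1</sub>), <it>M</it>(<it>w</it><sub>2</sub>)) = <it>&#961; </it>- 2 &#183; (1 - <it>D</it>). The species tree (<it>a</it>, (<it>b, c</it>)), the nested-tuple gene trees and the names SPECIES_PARENT, ancestors, lca, depth and costs are assumptions of this example.</p>
            <p><![CDATA[
```python
# Species tree (a, (b, c)) as a child-to-parent dictionary (cluster names).
SPECIES_PARENT = {"a": "abc", "b": "bc", "c": "bc", "bc": "abc", "abc": None}


def ancestors(node):
    """Path from a species node up to the root, the node itself included."""
    path = []
    while node is not None:
        path.append(node)
        node = SPECIES_PARENT[node]
    return path


def lca(a, b):
    """Least upper bound a + b in the species tree."""
    seen = set(ancestors(a))
    return next(x for x in ancestors(b) if x in seen)


def depth(node):
    """Number of edges between a species node and the root."""
    return len(ancestors(node)) - 1


def costs(gene_tree):
    """Return (M(root), sigma_D, sigma_L) for a rooted gene tree (nested tuples)."""
    if isinstance(gene_tree, str):
        return gene_tree, 0, 0                      # leaves contribute nothing
    (m1, d1, l1), (m2, d2, l2) = costs(gene_tree[0]), costs(gene_tree[1])
    m = lca(m1, m2)
    comparable = 1 if m in (m1, m2) else 0          # D(M(w1), M(w2))
    rho = depth(m1) + depth(m2) - 2 * depth(m)      # path length in the species tree
    kappa_d = comparable                            # duplication contribution
    kappa_l = rho - 2 * (1 - comparable)            # loss contribution L(M(w1), M(w2))
    return m, d1 + d2 + kappa_d, l1 + l2 + kappa_l


print(costs(("a", ("b", "c"))))
print(costs((("a", "b"), "c")))
```
]]></p>
            <p>A gene tree that is identical to the species tree needs no events, while the gene tree ((<it>a, b</it>), <it>c</it>) requires one duplication and three losses to be reconciled with (<it>a</it>, (<it>b, c</it>)).</p>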
         </sec>
         <sec>
            <st>
               <p>Introduction to unrooted reconciliation</p>
            </st>
            <p>Here we highlight some results from <abbrgrp><abbr bid="B18">18</abbr></abbrgrp> that are used for the design of our algorithm. From now on, we assume that <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i87"><m:mrow>
   <m:mi mathvariant="script">G</m:mi>
   <m:mo>=</m:mo>
   <m:mo>&#9001;</m:mo>
   <m:msub>
      <m:mi>V</m:mi>
      <m:mi mathvariant="script">G</m:mi>
   </m:msub>
   <m:mo>,</m:mo>
   <m:msub>
      <m:mi>E</m:mi>
      <m:mi mathvariant="script">G</m:mi>
   </m:msub>
   <m:msup>
      <m:mrow/>
      <m:mo>&#160;</m:mo>
   </m:msup>
   <m:mo>&#9002;</m:mo>
</m:mrow>
</m:math></inline-formula> is an unrooted gene tree. We define a rooting of <inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula> by selecting an edge <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i13"><m:mi>e</m:mi>
<m:mo class="MathClass-rel">&#8712;</m:mo>
<m:msub>
   <m:mrow>
      <m:mi>E</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi mathvariant="script">G</m:mi>
   </m:mrow>
</m:msub>
</m:math></inline-formula> on which the root is to be placed. Such a rooted tree will be denoted by <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i14"><m:msub>
   <m:mrow>
      <m:mi mathvariant="script">G</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi>e</m:mi>
   </m:mrow>
</m:msub>
</m:math></inline-formula>, where <it>v</it><sub>* </sub>is a new node defining the root. To distinguish between rootings of <inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula>, the symbols defined in the previous section for rooted gene trees will be extended by inserting the index <it>e</it>. Observe that the mapping of the root of <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i14"><m:msub><m:mrow><m:mi mathvariant="script">G</m:mi></m:mrow><m:mrow><m:mi>e</m:mi></m:mrow></m:msub></m:math></inline-formula> is independent of <it>e</it>. Without loss of generality the following is assumed: <b>(A1) </b><inline-formula><graphic file="1471-2105-13-S10-S14-i1.gif"/></inline-formula> and <inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula> have at least one internal node and <b>(A2) </b><it>M<sub>e</sub></it>(<it>v</it><sub>*</sub>)=&#8868;; that is, the root of every rooting is mapped into the root of <inline-formula><graphic file="1471-2105-13-S10-S14-i1.gif"/></inline-formula> (we may always consider the subtree of the species tree rooted in <it>M<sub>e</sub></it>(<it>v</it><sub>*</sub>) with no change of the cost).</p>
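            <p>The notion of a rooting can be illustrated by brute force: the Python sketch below places the root on every edge of a small unrooted gene tree and reconciles each rooting with the species tree (<it>a</it>, (<it>b, c</it>)). This is a naive enumeration for illustration only, not the algorithm developed in this article. The adjacency dictionary ADJ, the leaf labelling LABEL, the internal node names u and v, and all function names are assumptions of the example.</p>
            <p><![CDATA[
```python
# Species tree (a, (b, c)) as a child-to-parent dictionary (cluster names).
SPECIES_PARENT = {"a": "abc", "b": "bc", "c": "bc", "bc": "abc", "abc": None}

# Unrooted gene tree: internal nodes u, v; leaves x1..x4 carry species labels.
ADJ = {"u": ["x1", "x2", "v"], "v": ["x3", "x4", "u"],
       "x1": ["u"], "x2": ["u"], "x3": ["v"], "x4": ["v"]}
LABEL = {"x1": "a", "x2": "b", "x3": "b", "x4": "c"}


def ancestors(node):
    """Path from a species node up to the root, the node itself included."""
    path = []
    while node is not None:
        path.append(node)
        node = SPECIES_PARENT[node]
    return path


def lca(a, b):
    """Least upper bound a + b in the species tree."""
    seen = set(ancestors(a))
    return next(x for x in ancestors(b) if x in seen)


def costs(tree):
    """Return (M(root), duplications, losses) for a rooted nested-tuple gene tree."""
    if isinstance(tree, str):
        return tree, 0, 0
    (m1, d1, l1), (m2, d2, l2) = costs(tree[0]), costs(tree[1])
    m = lca(m1, m2)
    comparable = 1 if m in (m1, m2) else 0
    rho = len(ancestors(m1)) + len(ancestors(m2)) - 2 * len(ancestors(m))
    return m, d1 + d2 + comparable, l1 + l2 + rho - 2 * (1 - comparable)


def subtree(node, came_from):
    """Rooted subtree hanging from `node` when it is entered from `came_from`."""
    if node in LABEL:
        return LABEL[node]
    left, right = [n for n in ADJ[node] if n != came_from]
    return (subtree(left, node), subtree(right, node))


def root_at(v, w):
    """Rooting obtained by placing the new root on the edge {v, w}."""
    return (subtree(v, w), subtree(w, v))


def edges():
    """Each undirected edge once, as an ordered pair of node names."""
    return [(v, w) for v in ADJ for w in ADJ[v] if v < w]


RESULTS = {edge: costs(root_at(*edge)) for edge in edges()}
for edge, (root_image, dups, losses) in RESULTS.items():
    print(edge, root_image, dups, losses)
```
]]></p>
            <p>In this example every rooting maps its root to <it>abc</it>, the root of the species tree, in line with assumption (A2), while the duplication and loss counts differ between rootings.</p>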
            <p>First, we transform <inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula> into a directed graph <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i15"><m:mrow>
   <m:mover accent="false">
      <m:mrow>
         <m:mi mathvariant="script">G</m:mi>
      </m:mrow>
      <m:mo class="MathClass-op">^</m:mo>
   </m:mover>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:mrow>
      <m:mo class="MathClass-open">&#10216;</m:mo>
      <m:mrow>
         <m:msub>
            <m:mrow>
               <m:mi>V</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi mathvariant="script">G</m:mi>
            </m:mrow>
         </m:msub>
         <m:mo class="MathClass-punc">,</m:mo>
         <m:mover accent="false">
            <m:mrow>
               <m:msub>
                  <m:mrow>
                     <m:mi>E</m:mi>
                  </m:mrow>
                  <m:mrow>
                     <m:mi mathvariant="script">G</m:mi>
                  </m:mrow>
               </m:msub>
            </m:mrow>
            <m:mo class="MathClass-op">^</m:mo>
         </m:mover>
      </m:mrow>
      <m:mo class="MathClass-close">&#10217;</m:mo>
   </m:mrow>
</m:mrow>
</m:math></inline-formula> where <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i16"><m:mrow>
   <m:mover accent="false">
      <m:mrow>
         <m:msub>
            <m:mrow>
               <m:mi>E</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi mathvariant="script">G</m:mi>
            </m:mrow>
         </m:msub>
      </m:mrow>
      <m:mo class="MathClass-op">^</m:mo>
   </m:mover>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:mrow>
      <m:mo class="MathClass-open">{</m:mo>
      <m:mrow>
         <m:mrow>
            <m:mo class="MathClass-open">&#10216;</m:mo>
            <m:mrow>
               <m:mi>v</m:mi>
               <m:mo class="MathClass-punc">,</m:mo>
               <m:mspace class="tmspace" width="2.77695pt"/>
               <m:mi>w</m:mi>
            </m:mrow>
            <m:mo class="MathClass-close">&#10217;</m:mo>
         </m:mrow>
         <m:mo class="MathClass-rel">|</m:mo>
         <m:mspace class="tmspace" width="2.77695pt"/>
         <m:mrow>
            <m:mo class="MathClass-open">{</m:mo>
            <m:mrow>
               <m:mi>v</m:mi>
               <m:mo class="MathClass-punc">,</m:mo>
               <m:mspace class="tmspace" width="2.77695pt"/>
               <m:mi>w</m:mi>
            </m:mrow>
            <m:mo class="MathClass-close">}</m:mo>
         </m:mrow>
         <m:mo class="MathClass-rel">&#8712;</m:mo>
         <m:msub>
            <m:mrow>
               <m:mi>E</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi mathvariant="script">G</m:mi>
            </m:mrow>
         </m:msub>
      </m:mrow>
      <m:mo class="MathClass-close">}</m:mo>
   </m:mrow>
</m:mrow>
</m:math></inline-formula>. In other words, each undirected edge {<it>v, w</it>} in <inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula> is replaced in <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i17"><m:mrow>
   <m:mover accent="false">
      <m:mrow>
         <m:mi mathvariant="script">G</m:mi>
      </m:mrow>
      <m:mo class="MathClass-op">^</m:mo>
   </m:mover>
</m:mrow>
</m:math></inline-formula> by a pair of directed edges &#9001;<it>v, w</it>&#9002; and &#9001;<it>w, v</it>&#9002;.</p>
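            <p>As an illustration, the doubling of edges can be sketched in Python as follows. This is a minimal sketch and not the authors' implementation; the function name directed_edges and the representation of the gene tree as a list of node pairs are assumptions made for this example.</p>

```python
def directed_edges(undirected_edges):
    """Replace each undirected edge {v, w} by the directed edges (v, w) and (w, v).

    undirected_edges: iterable of 2-tuples of node names (hypothetical input format).
    """
    result = set()
    for v, w in undirected_edges:
        result.add((v, w))
        result.add((w, v))
    return result


# A gene tree with a single internal node u and three leaves.
example = directed_edges([("g1", "u"), ("g2", "u"), ("g3", "u")])
```

            <p>For a tree with three undirected edges, the sketch yields six directed edges, two per original edge.</p>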
            <p>Edges in <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i17"><m:mrow><m:mover accent="false"><m:mrow><m:mi mathvariant="script">G</m:mi></m:mrow><m:mo class="MathClass-op">^</m:mo></m:mover></m:mrow></m:math></inline-formula> are labeled by nodes of <inline-formula><graphic file="1471-2105-13-S10-S14-i1.gif"/></inline-formula> as follows. If <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i18"><m:mi>v</m:mi>
<m:mo class="MathClass-rel">&#8712;</m:mo>
<m:msub>
   <m:mrow>
      <m:mi>V</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi mathvariant="script">G</m:mi>
   </m:mrow>
</m:msub>
</m:math></inline-formula> is a leaf labeled by <it>a</it>, then the edge <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i19"><m:mrow>
   <m:mrow>
      <m:mo class="MathClass-open">&#10216;</m:mo>
      <m:mrow>
         <m:mi>v</m:mi>
         <m:mo class="MathClass-punc">,</m:mo>
         <m:mi>w</m:mi>
      </m:mrow>
      <m:mo class="MathClass-close">&#10217;</m:mo>
   </m:mrow>
   <m:mo class="MathClass-rel">&#8712;</m:mo>
   <m:mover accent="false">
      <m:mrow>
         <m:msub>
            <m:mrow>
               <m:mi>E</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi mathvariant="script">G</m:mi>
            </m:mrow>
         </m:msub>
      </m:mrow>
      <m:mo class="MathClass-op">^</m:mo>
   </m:mover>
</m:mrow>
</m:math></inline-formula> is labeled by <it>a</it>. When <it>v </it>is an internal node in <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i17"><m:mrow><m:mover accent="false"><m:mrow><m:mi mathvariant="script">G</m:mi></m:mrow><m:mo class="MathClass-op">^</m:mo></m:mover></m:mrow></m:math></inline-formula> we assume that &#9001;<it>w</it><sub>1</sub>, <it>v</it>&#9002; and &#9001;<it>w</it><sub>2</sub>, <it>v</it>&#9002; are labeled by <it>b</it><sub>1 </sub>and <it>b</it><sub>2</sub>, respectively. Then the edge <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i20"><m:mrow>
   <m:mrow>
      <m:mo class="MathClass-open">&#10216;</m:mo>
      <m:mrow>
         <m:mi>v</m:mi>
         <m:mo class="MathClass-punc">,</m:mo>
         <m:msub>
            <m:mrow>
               <m:mi>w</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mn>3</m:mn>
            </m:mrow>
         </m:msub>
      </m:mrow>
      <m:mo class="MathClass-close">&#10217;</m:mo>
   </m:mrow>
   <m:mo class="MathClass-rel">&#8712;</m:mo>
   <m:mover accent="false">
      <m:mrow>
         <m:msub>
            <m:mrow>
               <m:mi>E</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi mathvariant="script">G</m:mi>
            </m:mrow>
         </m:msub>
      </m:mrow>
      <m:mo class="MathClass-op">^</m:mo>
   </m:mover>
   <m:mo class="MathClass-punc">,</m:mo>
</m:mrow>
</m:math></inline-formula> such that <it>w</it><sub>3 </sub>&#8800; <it>w</it><sub>1 </sub>and <it>w</it><sub>3 </sub>&#8800; <it>w</it><sub>2 </sub>is labeled by <it>b</it><sub>1 </sub>+ <it>b</it><sub>2</sub>. This labeling will be used to explore mappings of rootings of <inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula>. An edge {<it>v, w</it>} in <inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula> is called <it>asymmetric </it>if exactly one of the labels of &#9001;<it>v, w</it>&#9002; and &#9001;<it>w, v</it>&#9002; in <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i17"><m:mrow><m:mover accent="false"><m:mrow><m:mi mathvariant="script">G</m:mi></m:mrow><m:mo class="MathClass-op">^</m:mo></m:mover></m:mrow></m:math></inline-formula> is equal to &#8868;; otherwise it is called <it>symmetric</it>.</p>
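            <p>The labeling rule and the symmetric/asymmetric distinction can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the names make_lca, label_edges and is_asymmetric are hypothetical, the species tree is given as a child-to-parent dictionary, the lca is computed naively by walking to the root (not in constant time), and every internal gene tree node is assumed to have exactly three neighbors.</p>

```python
def make_lca(parent):
    """Return a naive lca function for a rooted tree given as a child -> parent dict."""
    def ancestors(a):
        path = [a]
        while path[-1] in parent:
            path.append(parent[path[-1]])
        return path

    def lca(a, b):
        seen = set(ancestors(a))
        for x in ancestors(b):  # walks upward, so the first hit is the lowest one
            if x in seen:
                return x
        raise ValueError("nodes are not in the same tree")

    return lca


def label_edges(adj, leaf_label, lca):
    """Label every directed edge (v, w) of the doubled gene tree.

    adj: node -> list of neighbours (one for a leaf, three for an internal node)
    leaf_label: leaf -> species tree node
    """
    labels = {}

    def label(v, w):
        if (v, w) not in labels:
            if v in leaf_label:
                labels[(v, w)] = leaf_label[v]
            else:
                # the two incoming edges of v other than the one coming from w
                w1, w2 = [u for u in adj[v] if u != w]
                labels[(v, w)] = lca(label(w1, v), label(w2, v))
        return labels[(v, w)]

    for v in adj:
        for w in adj[v]:
            label(v, w)
    return labels


def is_asymmetric(labels, v, w, top):
    """True if exactly one of the labels of (v, w) and (w, v) equals the species root."""
    return (labels[(v, w)] == top) != (labels[(w, v)] == top)


# Species tree ((a, b), c) with root "top"; gene tree with internal nodes u and x.
species_parent = {"a": "ab", "b": "ab", "ab": "top", "c": "top"}
adj = {"g1": ["u"], "g2": ["u"], "g3": ["x"], "g4": ["x"],
       "u": ["g1", "g2", "x"], "x": ["u", "g3", "g4"]}
leaf_label = {"g1": "a", "g2": "b", "g3": "c", "g4": "a"}
labels = label_edges(adj, leaf_label, make_lca(species_parent))
```

            <p>The recursion memoizes each directed edge, so every label is computed once; for large trees an iterative traversal would avoid deep recursion.</p>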
            <p>Every internal node <it>v</it>, and its neighbors in <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i17"><m:mrow><m:mover accent="false"><m:mrow><m:mi mathvariant="script">G</m:mi></m:mrow><m:mo class="MathClass-op">^</m:mo></m:mover></m:mrow></m:math></inline-formula> define a subtree of <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i21"><m:mrow>
   <m:mover accent="false">
      <m:mrow>
         <m:msub>
            <m:mrow>
               <m:mi>E</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi mathvariant="script">G</m:mi>
            </m:mrow>
         </m:msub>
      </m:mrow>
      <m:mo class="MathClass-op">^</m:mo>
   </m:mover>
</m:mrow>
</m:math></inline-formula>, called a <it>star </it>with center <it>v</it>, as depicted in Figure <figr fid="F2">2</figr>. The edges &#9001;<it>v, w<sub>i</sub></it>&#9002; are called <it>outgoing</it>, while the edges &#9001;<it>w<sub>i</sub>, v</it>&#9002; are called <it>incoming</it>. We will refer to the undirected edge {<it>v, w<sub>i</sub></it>} as <it>e<sub>i</sub></it>, for <it>i </it>= 1, 2, 3.</p>
            <fig id="F2"><title><p>Figure 2</p></title><caption><p>Unrooted reconciliation</p></caption><text>
   <p><b>Unrooted reconciliation</b>. a) A star in <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i17"><m:mrow><m:mover accent="false"><m:mrow><m:mi mathvariant="script">G</m:mi></m:mrow><m:mo class="MathClass-op">^</m:mo></m:mover></m:mrow></m:math></inline-formula>. b) Types of edges. c) All possible types of stars. We use simplified notation instead of the full topology.</p>
</text><graphic file="1471-2105-13-S10-S14-2"/></fig>
            <p>There are several types of possible star topologies based on the labeling (for proofs and details see <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>): (S1) a star has one incoming edge labeled by &#8868; and two outgoing edges labeled by &#8868;, and these edges are connected to the three neighbors of the center, (S2) a star has exactly two outgoing edges labeled by &#8868;, (S3) a star has all outgoing edges and exactly one incoming edge labeled by &#8868;, (S4) a star has all edges labeled by &#8868;, and (S5) a star has all outgoing edges and exactly two incoming edges labeled by &#8868;. Figure <figr fid="F2">2</figr> illustrates the star topologies.</p>
            <p>In summary, stars are basic 'puzzle-like' units from which unrooted gene trees can be assembled. However, not all star compositions represent a gene tree. For instance, there is no gene tree with three stars of type S2. It follows from <abbrgrp><abbr bid="B18">18</abbr></abbrgrp> (see Lemma 4) that we need the following additional condition: (C1) if a gene tree has two stars of type S2, then they share a common edge.</p>
            <p>We now summarize the main result of <abbrgrp><abbr bid="B18">18</abbr></abbrgrp> (see Theorem 1 for more details). Let <inline-formula><graphic file="1471-2105-13-S10-S14-i1.gif"/></inline-formula> be a species tree and <inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula> be an unrooted gene tree. The set of optimal edges, that is, candidates for best rootings, is defined as follows: <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i22"><m:mrow>
   <m:mi mathvariant="bold">M</m:mi>
   <m:mi>i</m:mi>
   <m:msub>
      <m:mrow>
         <m:mi mathvariant="bold">n</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mi mathvariant="script">G</m:mi>
      </m:mrow>
   </m:msub>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:mrow>
      <m:mo class="MathClass-open">{</m:mo>
      <m:mrow>
         <m:mi>e</m:mi>
         <m:mo class="MathClass-rel">&#8712;</m:mo>
         <m:msub>
            <m:mrow>
               <m:mi>E</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi mathvariant="script">G</m:mi>
            </m:mrow>
         </m:msub>
         <m:mo class="MathClass-rel">|</m:mo>
         <m:msubsup>
            <m:mrow>
               <m:mi>&#963;</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>e</m:mi>
            </m:mrow>
            <m:mrow>
               <m:msub>
                  <m:mrow>
                     <m:mi>M</m:mi>
                  </m:mrow>
                  <m:mrow>
                     <m:mi>&#945;</m:mi>
                     <m:mo class="MathClass-punc">,</m:mo>
                     <m:mi>&#946;</m:mi>
                  </m:mrow>
               </m:msub>
            </m:mrow>
         </m:msubsup>
         <m:mstyle class="text">
            <m:mtext class="textsf" mathvariant="sans-serif">&#160;is&#160;minimal</m:mtext>
         </m:mstyle>
      </m:mrow>
      <m:mo class="MathClass-close">}</m:mo>
   </m:mrow>
</m:mrow>
</m:math></inline-formula>, where <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i23"><m:mrow>
   <m:msubsup>
      <m:mrow>
         <m:mi>&#963;</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mi>e</m:mi>
      </m:mrow>
      <m:mrow>
         <m:msub>
            <m:mrow>
               <m:mi>M</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>&#945;</m:mi>
               <m:mo class="MathClass-punc">,</m:mo>
               <m:mi>&#946;</m:mi>
            </m:mrow>
         </m:msub>
      </m:mrow>
   </m:msubsup>
</m:mrow>
</m:math></inline-formula> is the total cost for the weighted mutation cost defined by <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i24"><m:mrow>
   <m:msubsup>
      <m:mrow>
         <m:mi>&#958;</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mi>e</m:mi>
      </m:mrow>
      <m:mrow>
         <m:msub>
            <m:mrow>
               <m:mi>M</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>&#945;</m:mi>
               <m:mo class="MathClass-punc">,</m:mo>
               <m:mi>&#946;</m:mi>
            </m:mrow>
         </m:msub>
      </m:mrow>
   </m:msubsup>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:mi>v</m:mi>
         <m:mo class="MathClass-punc">,</m:mo>
         <m:mspace class="tmspace" width="2.77695pt"/>
         <m:mi>a</m:mi>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:mi>&#945;</m:mi>
   <m:mspace class="tmspace" width="2.77695pt"/>
   <m:mo class="MathClass-bin">&#8901;</m:mo>
   <m:mspace class="tmspace" width="2.77695pt"/>
   <m:msubsup>
      <m:mrow>
         <m:mi>&#958;</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mi>e</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mi>D</m:mi>
      </m:mrow>
   </m:msubsup>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:mi>v</m:mi>
         <m:mo class="MathClass-punc">,</m:mo>
         <m:mspace class="tmspace" width="2.77695pt"/>
         <m:mi>a</m:mi>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
   <m:mo class="MathClass-bin">+</m:mo>
   <m:mi>&#946;</m:mi>
   <m:mspace class="tmspace" width="2.77695pt"/>
   <m:mo class="MathClass-bin">&#8901;</m:mo>
   <m:mspace class="tmspace" width="2.77695pt"/>
   <m:msubsup>
      <m:mrow>
         <m:mi>&#958;</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mi>e</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mi>L</m:mi>
      </m:mrow>
   </m:msubsup>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:mi>v</m:mi>
         <m:mo class="MathClass-punc">,</m:mo>
         <m:mspace class="tmspace" width="2.77695pt"/>
         <m:mi>a</m:mi>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
</m:mrow>
</m:math></inline-formula>, <it>e </it>is an edge in <inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula> and <it>&#945;, &#946; </it>are two positive reals. Then <b>(M1) </b>if <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i25"><m:mo class="MathClass-rel">|</m:mo>
<m:mi mathvariant="bold">M</m:mi>
<m:mi>i</m:mi>
<m:msub>
   <m:mrow>
      <m:mi mathvariant="bold">n</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi mathvariant="script">G</m:mi>
   </m:mrow>
</m:msub>
<m:mo class="MathClass-rel">|</m:mo>
<m:mo class="MathClass-rel">&gt;</m:mo>
<m:mn>1</m:mn>
</m:math></inline-formula>, then <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i88"><m:mi mathvariant="bold">M</m:mi>
<m:mi>i</m:mi>
<m:msub>
   <m:mrow>
      <m:mi mathvariant="bold">n</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi mathvariant="script">G</m:mi>
   </m:mrow>
</m:msub>
</m:math></inline-formula> consists of all edges present in all stars of type S4 or S5, <b>(M2) </b>if <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i26"><m:mo class="MathClass-rel">|</m:mo>
<m:mi mathvariant="bold">M</m:mi>
<m:mi>i</m:mi>
<m:msub>
   <m:mrow>
      <m:mi mathvariant="bold">n</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi mathvariant="script">G</m:mi>
   </m:mrow>
</m:msub>
<m:mo class="MathClass-rel">|</m:mo>
<m:mo class="MathClass-rel">=</m:mo>
<m:mn>1</m:mn>
</m:math></inline-formula>, then <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i88"><m:mi mathvariant="bold">M</m:mi><m:mi>i</m:mi><m:msub><m:mrow><m:mi mathvariant="bold">n</m:mi></m:mrow><m:mrow><m:mi mathvariant="script">G</m:mi></m:mrow></m:msub></m:math></inline-formula> contains exactly one symmetric edge that is present in a star of type S2 or S3. From the above statements, (C1), and the star topologies, we can easily determine <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i88"><m:mi mathvariant="bold">M</m:mi><m:mi mathvariant="bold">i</m:mi><m:msub><m:mrow><m:mi mathvariant="bold">n</m:mi></m:mrow><m:mrow><m:mi mathvariant="script">G</m:mi></m:mrow></m:msub></m:math></inline-formula>. More precisely, the star edges outside <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i88"><m:mi mathvariant="bold">M</m:mi><m:mi mathvariant="bold">i</m:mi><m:msub><m:mrow><m:mi mathvariant="bold">n</m:mi></m:mrow><m:mrow><m:mi mathvariant="script">G</m:mi></m:mrow></m:msub></m:math></inline-formula> are asymmetric and share the same direction. Thus, to find an optimal edge it is sufficient to follow the direction of non-&#8868; edges in <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i17"><m:mrow><m:mover accent="false"><m:mrow><m:mi mathvariant="script">G</m:mi></m:mrow><m:mo class="MathClass-op">^</m:mo></m:mover></m:mrow></m:math></inline-formula>.</p>
            <p>Now we summarize the time complexity of this procedure. It follows from <abbrgrp><abbr bid="B21">21</abbr></abbrgrp> that a single lca-query (that is, <it>a </it>+ <it>b </it>for nodes <it>a </it>and <it>b </it>in <inline-formula><graphic file="1471-2105-13-S10-S14-i1.gif"/></inline-formula>) can be computed in constant time after an initial preprocessing step requiring <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i27"><m:mi>O</m:mi>
<m:mrow>
   <m:mo class="MathClass-open">(</m:mo>
   <m:mrow>
      <m:mo class="MathClass-rel">|</m:mo>
      <m:mi mathvariant="script">S</m:mi>
      <m:mo class="MathClass-rel">|</m:mo>
   </m:mrow>
   <m:mo class="MathClass-close">)</m:mo>
</m:mrow>
</m:math></inline-formula> time. Other structures like <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i17"><m:mrow><m:mover accent="false"><m:mrow><m:mi mathvariant="script">G</m:mi></m:mrow><m:mo class="MathClass-op">^</m:mo></m:mover></m:mrow></m:math></inline-formula> with the labeling can be computed in <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i28"><m:mi>O</m:mi>
<m:mrow>
   <m:mo class="MathClass-open">(</m:mo>
   <m:mrow>
      <m:mo class="MathClass-rel">|</m:mo>
      <m:mi mathvariant="script">G</m:mi>
      <m:mo class="MathClass-rel">|</m:mo>
   </m:mrow>
   <m:mo class="MathClass-close">)</m:mo>
</m:mrow>
</m:math></inline-formula> time. Finding an optimal edge in <inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula> has the same complexity. In summary, an optimal edge (rooting) and the minimal cost can be computed in linear time. See <abbrgrp><abbr bid="B18">18</abbr></abbrgrp> for more details and other properties.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <p>First, we describe our algorithm for computing the optimal cost and the set of optimal edges after one nearest neighbor interchange (NNI) operation performed on an unrooted gene tree, and then extend it to the general case with <it>k </it>NNI operations. For the definition of NNI, see Definition 1 and Figure <figr fid="F3">3</figr>.</p>
         <fig id="F3"><title><p>Figure 3</p></title><caption><p>NNI</p></caption><text>
   <p><b>NNI</b>. A single NNI on <inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula> and <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i17"><m:mrow><m:mover accent="false"><m:mrow><m:mi mathvariant="script">G</m:mi></m:mrow><m:mo class="MathClass-op">^</m:mo></m:mover></m:mrow></m:math></inline-formula>. On the left <it>e</it><sub>i </sub>and <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i83"><m:msubsup>
   <m:mrow>
      <m:mi>e</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi>i</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi>&#8242;</m:mi>
   </m:mrow>
</m:msubsup>
</m:math></inline-formula> (for <it>i </it>= 0, ... , 4) denote edges in <inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula> and its NNI-neighbor <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i84"><m:msup>
   <m:mrow>
      <m:mi mathvariant="script">G</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi>&#8242;</m:mi>
   </m:mrow>
</m:msup>
</m:math></inline-formula>, respectively. On the right, each node <it>a<sub>i </sub></it>denotes the labeling of edges in <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i17"><m:mrow><m:mover accent="false"><m:mrow><m:mi mathvariant="script">G</m:mi></m:mrow><m:mo class="MathClass-op">^</m:mo></m:mover></m:mrow></m:math></inline-formula>. The notation <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i85"><m:mrow>
   <m:msub>
      <m:mrow>
         <m:mi>&#257;</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mi>i</m:mi>
      </m:mrow>
   </m:msub>
</m:mrow>
</m:math></inline-formula> denotes the lca-mapping of the complementary subtrees; for instance, <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i86"><m:mrow>
   <m:msub>
      <m:mrow>
         <m:mi>&#257;</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mn>3</m:mn>
      </m:mrow>
   </m:msub>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:msub>
      <m:mrow>
         <m:mi>a</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mn>1</m:mn>
      </m:mrow>
   </m:msub>
   <m:mo class="MathClass-bin">+</m:mo>
   <m:msub>
      <m:mrow>
         <m:mi>a</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mn>2</m:mn>
      </m:mrow>
   </m:msub>
   <m:mo class="MathClass-bin">+</m:mo>
   <m:msub>
      <m:mrow>
         <m:mi>a</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mn>4</m:mn>
      </m:mrow>
   </m:msub>
</m:mrow>
</m:math></inline-formula>, etc. For brevity, we omit each subtree <it>T<sub>i </sub></it>attached to <it>w<sub>i </sub></it>in the left diagram.</p>
</text><graphic file="1471-2105-13-S10-S14-3"/></fig>
         <sec>
            <st>
               <p>Algorithm</p>
            </st>
            <p>Now we show that a single NNI operation can be completed in constant time if all structures required for computing the optimal rootings are already constructed. First, let us assume that the following are given: (a) two positive reals <it>&#945; </it>and <it>&#946; </it>and a species tree <inline-formula><graphic file="1471-2105-13-S10-S14-i1.gif"/></inline-formula>, (b) an lca structure for <inline-formula><graphic file="1471-2105-13-S10-S14-i1.gif"/></inline-formula> that answers lca-queries in constant time, (c) an unrooted gene tree <inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula>, (d) <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i17"><m:mrow><m:mover accent="false"><m:mrow><m:mi mathvariant="script">G</m:mi></m:mrow><m:mo class="MathClass-op">^</m:mo></m:mover></m:mrow></m:math></inline-formula> with the labeling of edges, (e) <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i88"><m:mi mathvariant="bold">M</m:mi><m:mi mathvariant="bold">i</m:mi><m:msub><m:mrow><m:mi mathvariant="bold">n</m:mi></m:mrow><m:mrow><m:mi mathvariant="script">G</m:mi></m:mrow></m:msub></m:math></inline-formula>, the set of optimal edges, and (f) <it>&#963;</it>, the minimal total weighted mutation cost. As observed in the previous section, (b) and (d)-(f) can be computed in <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i29"><m:mi>O</m:mi>
<m:mrow>
   <m:mo class="MathClass-open">(</m:mo>
   <m:mrow>
      <m:mtext>max</m:mtext>
      <m:mrow>
         <m:mo class="MathClass-open">(</m:mo>
         <m:mrow>
            <m:mo class="MathClass-rel">|</m:mo>
            <m:mi mathvariant="script">S</m:mi>
            <m:mo class="MathClass-rel">|</m:mo>
            <m:mo class="MathClass-punc">,</m:mo>
            <m:mo class="MathClass-rel">|</m:mo>
            <m:mi mathvariant="script">G</m:mi>
            <m:mo class="MathClass-rel">|</m:mo>
         </m:mrow>
         <m:mo class="MathClass-close">)</m:mo>
      </m:mrow>
   </m:mrow>
   <m:mo class="MathClass-close">)</m:mo>
</m:mrow>
</m:math></inline-formula> time. Now we show that (c)-(f) can be updated in constant time after a single NNI operation.</p>
            <p><it>NNI operation (c) and the update of lca-mappings (d)</it>.</p>
            <p><b>Definition 1</b>. <it>(Single NNI operation) An NNI operation transforms a gene tree <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i30"><m:mi mathvariant="script">G</m:mi>
<m:mo class="MathClass-rel">=</m:mo>
<m:mrow>
   <m:mo class="MathClass-open">(</m:mo>
   <m:mrow>
      <m:mrow>
         <m:mo class="MathClass-open">(</m:mo>
         <m:mrow>
            <m:msub>
               <m:mrow>
                  <m:mi>T</m:mi>
               </m:mrow>
               <m:mrow>
                  <m:mn>1</m:mn>
               </m:mrow>
            </m:msub>
            <m:mo class="MathClass-punc">,</m:mo>
            <m:msub>
               <m:mrow>
                  <m:mi>T</m:mi>
               </m:mrow>
               <m:mrow>
                  <m:mn>2</m:mn>
               </m:mrow>
            </m:msub>
         </m:mrow>
         <m:mo class="MathClass-close">)</m:mo>
      </m:mrow>
      <m:mo class="MathClass-punc">,</m:mo>
      <m:mrow>
         <m:mo class="MathClass-open">(</m:mo>
         <m:mrow>
            <m:msub>
               <m:mrow>
                  <m:mi>T</m:mi>
               </m:mrow>
               <m:mrow>
                  <m:mn>3</m:mn>
               </m:mrow>
            </m:msub>
            <m:mo class="MathClass-punc">,</m:mo>
            <m:msub>
               <m:mrow>
                  <m:mi>T</m:mi>
               </m:mrow>
               <m:mrow>
                  <m:mn>4</m:mn>
               </m:mrow>
            </m:msub>
         </m:mrow>
         <m:mo class="MathClass-close">)</m:mo>
      </m:mrow>
   </m:mrow>
   <m:mo class="MathClass-close">)</m:mo>
</m:mrow>
</m:math></inline-formula> into </it><inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i31"><m:msup>
   <m:mrow>
      <m:mi mathvariant="script">G</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi>&#8242;</m:mi>
   </m:mrow>
</m:msup>
<m:mo class="MathClass-rel">=</m:mo>
<m:mrow>
   <m:mo class="MathClass-open">(</m:mo>
   <m:mrow>
      <m:mrow>
         <m:mo class="MathClass-open">(</m:mo>
         <m:mrow>
            <m:msub>
               <m:mrow>
                  <m:mi>T</m:mi>
               </m:mrow>
               <m:mrow>
                  <m:mn>2</m:mn>
               </m:mrow>
            </m:msub>
            <m:mo class="MathClass-punc">,</m:mo>
            <m:msub>
               <m:mrow>
                  <m:mi>T</m:mi>
               </m:mrow>
               <m:mrow>
                  <m:mn>3</m:mn>
               </m:mrow>
            </m:msub>
         </m:mrow>
         <m:mo class="MathClass-close">)</m:mo>
      </m:mrow>
      <m:mo class="MathClass-punc">,</m:mo>
      <m:mrow>
         <m:mo class="MathClass-open">(</m:mo>
         <m:mrow>
            <m:msub>
               <m:mrow>
                  <m:mi>T</m:mi>
               </m:mrow>
               <m:mrow>
                  <m:mn>1</m:mn>
               </m:mrow>
            </m:msub>
            <m:mo class="MathClass-punc">,</m:mo>
            <m:msub>
               <m:mrow>
                  <m:mi>T</m:mi>
               </m:mrow>
               <m:mrow>
                  <m:mn>4</m:mn>
               </m:mrow>
            </m:msub>
         </m:mrow>
         <m:mo class="MathClass-close">)</m:mo>
      </m:mrow>
   </m:mrow>
   <m:mo class="MathClass-close">)</m:mo>
</m:mrow>
</m:math></inline-formula><it>, where the T<sub>i</sub> are (rooted) subtrees of </it><inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula><it>. The edge that connects the roots of </it>(<it>T</it><sub>1</sub>, <it>T</it><sub>2</sub>) <it>and </it>(<it>T</it><sub>3</sub>, <it>T</it><sub>4</sub>) <it>in </it><inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula> <it>is denoted by e</it><sub>0 </sub><it>and called the </it>center edge. <it>For each i </it>= 1, 2, 3, 4 <it>we assume the following: w<sub>i </sub>is the root of T<sub>i</sub>, e<sub>i </sub>is the edge connecting w<sub>i </sub>with e</it><sub>0</sub><it>, and a<sub>i </sub>is the lca-mapping of T<sub>i</sub></it>. <it>Similarly, we define the </it>center edge <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i32"><m:msubsup>
   <m:mrow>
      <m:mi>e</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mn>0</m:mn>
   </m:mrow>
   <m:mrow>
      <m:mi>&#8242;</m:mi>
   </m:mrow>
</m:msubsup>
</m:math></inline-formula> <it>and <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i33"><m:msubsup>
   <m:mrow>
      <m:mi>e</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi>i</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi>&#8242;</m:mi>
   </m:mrow>
</m:msubsup>
</m:math></inline-formula> in </it><inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i34"><m:msup>
   <m:mrow>
      <m:mi mathvariant="script">G</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi>&#8242;</m:mi>
   </m:mrow>
</m:msup>
</m:math></inline-formula>.</p>
            <p>An NNI operation is depicted in Figure <figr fid="F3">3</figr> with the transformation of <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i17"><m:mrow><m:mover accent="false"><m:mrow><m:mi mathvariant="script">G</m:mi></m:mrow><m:mo class="MathClass-op">^</m:mo></m:mover></m:mrow></m:math></inline-formula> into <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i35"><m:mrow>
   <m:mover accent="false">
      <m:mrow>
         <m:msup>
            <m:mrow>
               <m:mi mathvariant="script">G</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>&#8242;</m:mi>
            </m:mrow>
         </m:msup>
      </m:mrow>
      <m:mo class="MathClass-op">^</m:mo>
   </m:mover>
</m:mrow>
</m:math></inline-formula>. This notation will be used from now on. Note that there is a second NNI operation, in which <inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula> is replaced with ((<it>T</it><sub>1</sub>, <it>T</it><sub>3</sub>), (<it>T</it><sub>2</sub>, <it>T</it><sub>4</sub>)). It is defined analogously and is therefore omitted here. Observe that the NNI operation (without updating the lca-mappings) can be performed in constant time for both trees.</p>
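            <p>The constant-time topology change can be sketched in Python as follows. This is a minimal sketch, assuming the tree is stored as a dictionary from each node to the set of its neighbors; the function name nni_swap and the node names u and x for the endpoints of the center edge are hypothetical. Exchanging the subtrees rooted at w1 and w3 turns ((T1, T2), (T3, T4)) into ((T2, T3), (T1, T4)).</p>

```python
def nni_swap(adj, u, x, w1, w3):
    """Exchange the subtrees rooted at w1 (attached to u) and w3 (attached to x).

    adj: node -> set of neighbours; {u, x} is the center edge.
    Only four adjacency sets are touched, so the work is independent of tree size.
    """
    adj[u].remove(w1)
    adj[w1].remove(u)
    adj[x].remove(w3)
    adj[w3].remove(x)
    adj[u].add(w3)
    adj[w3].add(u)
    adj[x].add(w1)
    adj[w1].add(x)


# Center edge {u, x}; w1, w2 hang off u and w3, w4 hang off x.
adj = {"u": {"w1", "w2", "x"}, "x": {"u", "w3", "w4"},
       "w1": {"u"}, "w2": {"u"}, "w3": {"x"}, "w4": {"x"}}
nni_swap(adj, "u", "x", "w1", "w3")
```

            <p>The second NNI operation mentioned above would correspond, in this sketch, to calling the same function with w2 and w3.</p>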
            <p>The right part of Figure <figr fid="F3">3</figr> depicts the transformation of <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i17"><m:mrow><m:mover accent="false"><m:mrow><m:mi mathvariant="script">G</m:mi></m:mrow><m:mo class="MathClass-op">^</m:mo></m:mover></m:mrow></m:math></inline-formula>. Observe that the labels of the incoming and outgoing edges attached to each <it>w<sub>i </sub></it>in <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i17"><m:mrow><m:mover accent="false"><m:mrow><m:mi mathvariant="script">G</m:mi></m:mrow><m:mo class="MathClass-op">^</m:mo></m:mover></m:mrow></m:math></inline-formula> do not change during this operation. Lemma 1 follows directly from this observation.</p>
            <p><b>Lemma 1</b>. <it>An NNI operation changes only the labels of the center edge</it>.</p>
            <p>We conclude that updating <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i17"><m:mrow><m:mover accent="false"><m:mrow><m:mi mathvariant="script">G</m:mi></m:mrow><m:mo class="MathClass-op">^</m:mo></m:mover></m:mrow></m:math></inline-formula> requires only two lca-queries, and therefore can be performed in constant time.</p>
            <p><it>Reconstruction of optimal edges (e)</it>. We analyze the changes of the optimal set of edges <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i36"><m:mi mathvariant="bold">M</m:mi>
<m:mi mathvariant="bold">i</m:mi>
<m:msub>
   <m:mrow>
      <m:mi mathvariant="bold">n</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi mathvariant="script">G</m:mi>
   </m:mrow>
</m:msub>
</m:math></inline-formula>. To this end, we distinguish several cases according to how the optimal set of edges relates to the set of edges incident to the nodes of the center edge. Let <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i37"><m:msub>
   <m:mrow>
      <m:mi>C</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi mathvariant="script">G</m:mi>
   </m:mrow>
</m:msub>
<m:mo class="MathClass-rel">=</m:mo>
<m:msub>
   <m:mrow>
      <m:mrow>
         <m:mo class="MathClass-open">{</m:mo>
         <m:mrow>
            <m:msub>
               <m:mrow>
                  <m:mi>e</m:mi>
               </m:mrow>
               <m:mrow>
                  <m:mi>i</m:mi>
               </m:mrow>
            </m:msub>
         </m:mrow>
         <m:mo class="MathClass-close">}</m:mo>
      </m:mrow>
   </m:mrow>
   <m:mrow>
      <m:mi>i</m:mi>
      <m:mo class="MathClass-rel">=</m:mo>
      <m:mn>0</m:mn>
      <m:mo class="MathClass-punc">,</m:mo>
      <m:mo class="MathClass-op">&#8230;</m:mo>
      <m:mo class="MathClass-punc">,</m:mo>
      <m:mn>4</m:mn>
   </m:mrow>
</m:msub>
</m:math></inline-formula>.</p>
            <p>For convenience, assume that the NNI operation replaces <it>e<sub>i </sub></it>with <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i38"><m:msubsup>
   <m:mrow>
      <m:mi>e</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi>i</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi>&#8242;</m:mi>
   </m:mrow>
</m:msubsup>
</m:math></inline-formula> as indicated in Figure <figr fid="F3">3</figr>. We call two disjoint edges from <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i39"><m:msub>
   <m:mrow>
      <m:mi>C</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi mathvariant="script">G</m:mi>
   </m:mrow>
</m:msub>
</m:math></inline-formula> <it>semi-alternating </it>if they share a common node after the NNI operation. In Figure <figr fid="F3">3</figr>, the pairs {<it>e</it><sub>1</sub>, <it>e</it><sub>4</sub>} and {<it>e</it><sub>2</sub>, <it>e</it><sub>3</sub>} are semi-alternating. For two edges <it>a </it>and <it>b </it>that are incident to the same node, let &#8902;(<it>a, b</it>) be the set of three edges defining the unique star that contains <it>a </it>and <it>b</it>.</p>
            <p><b>Lemma 2</b>. <it>Assuming that e<sub>i </sub>is replaced by <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i40"><m:msubsup>
   <m:mrow>
      <m:mi>e</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi>i</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi>&#8242;</m:mi>
   </m:mrow>
</m:msubsup>
</m:math></inline-formula> after the NNI operation, the set of optimal edges requires no additional changes if and only if one of the following conditions is satisfied:</it></p>
            <p>
               <inline-formula>
                  <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i41"><m:mrow>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:mi mathvariant="bold">E</m:mi>
         <m:mi mathvariant="bold">Q</m:mi>
         <m:mn mathvariant="bold">1</m:mn>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
   <m:mspace class="tmspace" width="2.77695pt"/>
   <m:mi mathvariant="bold">M</m:mi>
   <m:mi mathvariant="bold">i</m:mi>
   <m:msub>
      <m:mrow>
         <m:mi mathvariant="bold">n</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mi mathvariant="script">G</m:mi>
      </m:mrow>
   </m:msub>
   <m:mo class="MathClass-bin">&#8745;</m:mo>
   <m:msub>
      <m:mrow>
         <m:mi>C</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mi mathvariant="script">G</m:mi>
      </m:mrow>
   </m:msub>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:mi>&#8709;</m:mi>
   <m:mo class="MathClass-punc">,</m:mo>
</m:mrow>
</m:math>
               </inline-formula>
            </p>
            <p><inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i42"><m:mrow>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:mi mathvariant="bold">E</m:mi>
         <m:mi mathvariant="bold">Q</m:mi>
         <m:mn mathvariant="bold">2</m:mn>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
   <m:mspace class="tmspace" width="2.77695pt"/>
   <m:mi mathvariant="bold">M</m:mi>
   <m:mi mathvariant="bold">i</m:mi>
   <m:msub>
      <m:mrow>
         <m:mi mathvariant="bold">n</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mi mathvariant="script">G</m:mi>
      </m:mrow>
   </m:msub>
   <m:mo class="MathClass-rel">&#8839;</m:mo>
   <m:msub>
      <m:mrow>
         <m:mi>C</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mi mathvariant="script">G</m:mi>
      </m:mrow>
   </m:msub>
</m:mrow>
</m:math></inline-formula><it> and each pair of semi-alternating edges contains at least one symmetric edge</it>,</p>
            <p><b><inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i43"><m:mrow>
   <m:mo class="MathClass-open">(</m:mo>
   <m:mrow>
      <m:mi mathvariant="bold">E</m:mi>
      <m:mi mathvariant="bold">Q</m:mi>
      <m:mn mathvariant="bold">3</m:mn>
   </m:mrow>
   <m:mo class="MathClass-close">)</m:mo>
</m:mrow>
<m:mspace class="tmspace" width="2.77695pt"/>
<m:mi mathvariant="bold">M</m:mi>
<m:mi mathvariant="bold">i</m:mi>
<m:msub>
   <m:mrow>
      <m:mi mathvariant="bold">n</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi mathvariant="script">G</m:mi>
   </m:mrow>
</m:msub>
</m:math></inline-formula></b><it> consists only of the center edge</it>,</p>
            <p><inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i44"><m:mrow>
   <m:mo class="MathClass-open">(</m:mo>
   <m:mrow>
      <m:mi mathvariant="bold">E</m:mi>
      <m:mi mathvariant="bold">Q</m:mi>
      <m:mn mathvariant="bold">4</m:mn>
   </m:mrow>
   <m:mo class="MathClass-close">)</m:mo>
</m:mrow>
<m:mspace class="tmspace" width="2.77695pt"/>
<m:mi mathvariant="bold">M</m:mi>
<m:mi mathvariant="bold">i</m:mi>
<m:msub>
   <m:mrow>
      <m:mi mathvariant="bold">n</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi mathvariant="script">G</m:mi>
   </m:mrow>
</m:msub>
<m:mo class="MathClass-bin">&#8745;</m:mo>
<m:msub>
   <m:mrow>
      <m:mi>C</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi mathvariant="script">G</m:mi>
   </m:mrow>
</m:msub>
<m:mo class="MathClass-rel">=</m:mo>
<m:mrow>
   <m:mo class="MathClass-open">{</m:mo>
   <m:mrow>
      <m:msub>
         <m:mrow>
            <m:mi>e</m:mi>
         </m:mrow>
         <m:mrow>
            <m:mi>i</m:mi>
         </m:mrow>
      </m:msub>
   </m:mrow>
   <m:mo class="MathClass-close">}</m:mo>
</m:mrow>
</m:math></inline-formula><it> for some i &gt; </it>0 <it>and the center is asymmetric after the NNI operation</it>.</p>
            <p>Proof: (EQ1) All edges in <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i70"><m:mrow>
   <m:msub>
      <m:mrow>
         <m:mi>C</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mi mathvariant="script">G</m:mi>
      </m:mrow>
   </m:msub>
</m:mrow>
</m:math></inline-formula> are asymmetric (two stars of type S1). Then, after the NNI operation, <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i45"><m:msubsup>
   <m:mrow>
      <m:mi>e</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mn>0</m:mn>
   </m:mrow>
   <m:mrow>
      <m:mi>&#8242;</m:mi>
   </m:mrow>
</m:msubsup>
</m:math></inline-formula> is asymmetric and (<inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i46"><m:msub>
   <m:mrow>
      <m:mi>C</m:mi>
   </m:mrow>
   <m:mrow>
      <m:msup>
         <m:mrow>
            <m:mi mathvariant="script">G</m:mi>
         </m:mrow>
         <m:mrow>
            <m:mi>&#8242;</m:mi>
         </m:mrow>
      </m:msup>
   </m:mrow>
</m:msub>
</m:math></inline-formula> has two stars of type S1). (EQ2) <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i47"><m:msub>
   <m:mrow>
      <m:mi>C</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi mathvariant="script">G</m:mi>
   </m:mrow>
</m:msub>
</m:math></inline-formula> consists of two stars of type S4/S5 and at most two asymmetric edges. It follows from EQ2 that the asymmetric edges in <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i46"><m:msub><m:mrow><m:mi>C</m:mi></m:mrow><m:mrow><m:msup><m:mrow><m:mi mathvariant="script">G</m:mi></m:mrow><m:mrow><m:mi>&#8242;</m:mi></m:mrow></m:msup></m:mrow></m:msub></m:math></inline-formula> cannot form a star of any type other than S5. Together with M1, it follows that <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i46"><m:msub><m:mrow><m:mi>C</m:mi></m:mrow><m:mrow><m:msup><m:mrow><m:mi mathvariant="script">G</m:mi></m:mrow><m:mrow><m:mi>&#8242;</m:mi></m:mrow></m:msup></m:mrow></m:msub></m:math></inline-formula> is optimal. (EQ3) By M1, the center is symmetric in <inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula>. It remains symmetric after the NNI operation. From C1 and M2, <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i48"><m:mi mathvariant="bold">M</m:mi>
<m:mi mathvariant="bold">i</m:mi>
<m:msub>
   <m:mrow>
      <m:mi mathvariant="bold">n</m:mi>
   </m:mrow>
   <m:mrow>
      <m:msup>
         <m:mrow>
            <m:mi mathvariant="script">G</m:mi>
         </m:mrow>
         <m:mrow>
            <m:mi>&#8242;</m:mi>
         </m:mrow>
      </m:msup>
   </m:mrow>
</m:msub>
</m:math></inline-formula> consists of the center edge. (EQ4) Note that the type of <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i49"><m:mo class="MathClass-bin">&#8902;</m:mo>
<m:mrow>
   <m:mo class="MathClass-open">(</m:mo>
   <m:mrow>
      <m:msubsup>
         <m:mrow>
            <m:mi>e</m:mi>
         </m:mrow>
         <m:mrow>
            <m:mi>i</m:mi>
         </m:mrow>
         <m:mrow>
            <m:mi>&#8242;</m:mi>
         </m:mrow>
      </m:msubsup>
      <m:mo class="MathClass-punc">,</m:mo>
      <m:msubsup>
         <m:mrow>
            <m:mi>e</m:mi>
         </m:mrow>
         <m:mrow>
            <m:mn>0</m:mn>
         </m:mrow>
         <m:mrow>
            <m:mi>&#8242;</m:mi>
         </m:mrow>
      </m:msubsup>
   </m:mrow>
   <m:mo class="MathClass-close">)</m:mo>
</m:mrow>
</m:math></inline-formula> is <it>S</it>1, <it>S</it>2 or <it>S</it>3.</p>
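            <p>The four conditions of Lemma 2 can be read as a simple case test. The Python sketch below is one possible encoding under stated assumptions: edges are hashable labels, the set of symmetric edges (before the NNI operation) and the symmetry of the center edge after the operation are supplied by the caller, and the function name <it>lemma2_case</it> and all parameter names are hypothetical.</p>

```python
def lemma2_case(min_edges, center_edges, e0, symmetric,
                semi_alternating, center_symmetric_after):
    """Return "EQ1", "EQ2", "EQ3" or "EQ4" if that condition holds, else None.

    min_edges              -- optimal edge set before the NNI operation
    center_edges           -- the five edges e0, ..., e4 (e0 is the center)
    symmetric              -- edges that are symmetric before the operation
    semi_alternating       -- pairs of semi-alternating edges
    center_symmetric_after -- True if the center edge is symmetric afterwards
    """
    common = min_edges.intersection(center_edges)
    # (EQ1) no optimal edge is incident to the center edge.
    if not common:
        return "EQ1"
    # (EQ2) all five edges are optimal and every semi-alternating pair
    # contains at least one symmetric edge.
    if min_edges.issuperset(center_edges) and all(
        a in symmetric or b in symmetric for a, b in semi_alternating
    ):
        return "EQ2"
    # (EQ3) the optimal set is exactly the center edge.
    if min_edges == {e0}:
        return "EQ3"
    # (EQ4) exactly one non-center edge e_i is optimal and the center
    # edge is asymmetric after the operation.
    if len(common) == 1 and e0 not in common and not center_symmetric_after:
        return "EQ4"
    return None
```

            <p>A return value of <it>None</it> in this sketch means that none of the four conditions holds, so the optimal set has to be updated.</p>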
            <p><b>Lemma 3 </b>(NE1). <it>If </it><inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i50"><m:mi mathvariant="bold">M</m:mi>
<m:mi mathvariant="bold">i</m:mi>
<m:msub>
   <m:mrow>
      <m:mi mathvariant="bold">n</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi mathvariant="script">G</m:mi>
   </m:mrow>
</m:msub>
<m:mo class="MathClass-rel">&#8839;</m:mo>
<m:msub>
   <m:mrow>
      <m:mi>C</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi mathvariant="script">G</m:mi>
   </m:mrow>
</m:msub>
</m:math></inline-formula><it> and there exists a pair </it>{<it>e<sub>i</sub>, e<sub>j</sub></it>} <it>of asymmetric semi-alternating edges, then </it><inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i51"><m:mi mathvariant="bold">M</m:mi>
<m:mi mathvariant="bold">i</m:mi>
<m:msub>
   <m:mrow>
      <m:mi mathvariant="bold">n</m:mi>
   </m:mrow>
   <m:mrow>
      <m:msup>
         <m:mrow>
            <m:mi mathvariant="script">G</m:mi>
         </m:mrow>
         <m:mrow>
            <m:mi>&#8242;</m:mi>
         </m:mrow>
      </m:msup>
   </m:mrow>
</m:msub>
<m:mo class="MathClass-rel">=</m:mo>
<m:mi mathvariant="bold">M</m:mi>
<m:mi mathvariant="bold">i</m:mi>
<m:msub>
   <m:mrow>
      <m:mi mathvariant="bold">n</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi mathvariant="script">G</m:mi>
   </m:mrow>
</m:msub>
<m:mo class="MathClass-bin">\</m:mo>
<m:msub>
   <m:mrow>
      <m:mi>C</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi mathvariant="script">G</m:mi>
   </m:mrow>
</m:msub>
<m:mo class="MathClass-bin">&#8746;</m:mo>
<m:mrow>
   <m:mo class="MathClass-open">(</m:mo>
   <m:mrow>
      <m:msub>
         <m:mrow>
            <m:mi>C</m:mi>
         </m:mrow>
         <m:mrow>
            <m:msup>
               <m:mrow>
                  <m:mi mathvariant="script">G</m:mi>
               </m:mrow>
               <m:mrow>
                  <m:mi>&#8242;</m:mi>
               </m:mrow>
            </m:msup>
         </m:mrow>
      </m:msub>
      <m:mo class="MathClass-bin">\</m:mo>
      <m:mrow>
         <m:mo class="MathClass-open">{</m:mo>
         <m:mrow>
            <m:msubsup>
               <m:mrow>
                  <m:mi>e</m:mi>
               </m:mrow>
               <m:mrow>
                  <m:mi>i</m:mi>
               </m:mrow>
               <m:mrow>
                  <m:mi>&#8242;</m:mi>
               </m:mrow>
            </m:msubsup>
            <m:mo class="MathClass-punc">,</m:mo>
            <m:mspace class="tmspace" width="2.77695pt"/>
            <m:msubsup>
               <m:mrow>
                  <m:mi>e</m:mi>
               </m:mrow>
               <m:mrow>
                  <m:mi>j</m:mi>
               </m:mrow>
               <m:mrow>
                  <m:mi>&#8242;</m:mi>
               </m:mrow>
            </m:msubsup>
         </m:mrow>
         <m:mo class="MathClass-close">}</m:mo>
      </m:mrow>
   </m:mrow>
   <m:mo class="MathClass-close">)</m:mo>
</m:mrow>
</m:math></inline-formula>.</p>
            <p>Proof: The type of <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i52"><m:mo class="MathClass-bin">&#8902;</m:mo>
<m:mrow>
   <m:mo class="MathClass-open">(</m:mo>
   <m:mrow>
      <m:msubsup>
         <m:mrow>
            <m:mi>e</m:mi>
         </m:mrow>
         <m:mrow>
            <m:mi>i</m:mi>
         </m:mrow>
         <m:mrow>
            <m:mi>&#8242;</m:mi>
         </m:mrow>
      </m:msubsup>
      <m:mo class="MathClass-punc">,</m:mo>
      <m:msubsup>
         <m:mrow>
            <m:mi>e</m:mi>
         </m:mrow>
         <m:mrow>
            <m:mi>j</m:mi>
         </m:mrow>
         <m:mrow>
            <m:mi>&#8242;</m:mi>
         </m:mrow>
      </m:msubsup>
   </m:mrow>
   <m:mo class="MathClass-close">)</m:mo>
</m:mrow>
</m:math></inline-formula> is S1 or S3 and the other star has type S4 or S5. By M2 <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i53"><m:msubsup>
   <m:mrow>
      <m:mi>e</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi>i</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi>&#8242;</m:mi>
   </m:mrow>
</m:msubsup>
</m:math></inline-formula> and <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i54"><m:msubsup>
   <m:mrow>
      <m:mi>e</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi>j</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi>&#8242;</m:mi>
   </m:mrow>
</m:msubsup>
</m:math></inline-formula> are not optimal.</p>
            <p><b>Lemma 4 </b>(NE2). <it>If <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i55"><m:mi mathvariant="bold">M</m:mi>
<m:mi mathvariant="bold">i</m:mi>
<m:msub>
   <m:mrow>
      <m:mi mathvariant="bold">n</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi mathvariant="script">G</m:mi>
   </m:mrow>
</m:msub>
<m:mo class="MathClass-bin">&#8745;</m:mo>
<m:msub>
   <m:mrow>
      <m:mi>C</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi mathvariant="script">G</m:mi>
   </m:mrow>
</m:msub>
<m:mo class="MathClass-rel">=</m:mo>
<m:mrow>
   <m:mo class="MathClass-open">{</m:mo>
   <m:mrow>
      <m:msub>
         <m:mrow>
            <m:mi>e</m:mi>
         </m:mrow>
         <m:mrow>
            <m:mi>i</m:mi>
         </m:mrow>
      </m:msub>
   </m:mrow>
   <m:mo class="MathClass-close">}</m:mo>
</m:mrow>
</m:math></inline-formula> for some i &gt; </it>0 <it>and the center is symmetric after the NNI operation, then </it><inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i56"><m:mrow>
   <m:mi mathvariant="bold">M</m:mi>
   <m:mi mathvariant="bold">i</m:mi>
   <m:msub>
      <m:mrow>
         <m:mi mathvariant="bold">n</m:mi>
      </m:mrow>
      <m:mrow>
         <m:msup>
            <m:mrow>
               <m:mi mathvariant="script">G</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>&#8242;</m:mi>
            </m:mrow>
         </m:msup>
      </m:mrow>
   </m:msub>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:mi mathvariant="bold">M</m:mi>
   <m:mi mathvariant="bold">i</m:mi>
   <m:msub>
      <m:mrow>
         <m:mi mathvariant="bold">n</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mi mathvariant="script">G</m:mi>
      </m:mrow>
   </m:msub>
   <m:mo class="MathClass-bin">\</m:mo>
   <m:mrow>
      <m:mo class="MathClass-open">{</m:mo>
      <m:mrow>
         <m:msub>
            <m:mrow>
               <m:mi>e</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>i</m:mi>
            </m:mrow>
         </m:msub>
      </m:mrow>
      <m:mo class="MathClass-close">}</m:mo>
   </m:mrow>
   <m:mo class="MathClass-bin">&#8746;</m:mo>
   <m:mo class="MathClass-bin">&#8902;</m:mo>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:msubsup>
            <m:mrow>
               <m:mi>e</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mn>0</m:mn>
            </m:mrow>
            <m:mrow>
               <m:mi>&#8242;</m:mi>
            </m:mrow>
         </m:msubsup>
         <m:mo class="MathClass-punc">,</m:mo>
         <m:mspace class="tmspace" width="2.77695pt"/>
         <m:msubsup>
            <m:mrow>
               <m:mi>e</m:mi>
            </m:mrow>
            <m:mrow>
                <m:mi>i</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>&#8242;</m:mi>
            </m:mrow>
         </m:msubsup>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
</m:mrow>
</m:math></inline-formula>.</p>
            <p>Proof: In this case <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i57"><m:msubsup>
   <m:mrow>
      <m:mi>e</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mn>0</m:mn>
   </m:mrow>
   <m:mrow>
      <m:mi>&#8242;</m:mi>
   </m:mrow>
</m:msubsup>
</m:math></inline-formula> has two arrows and <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i58"><m:mo class="MathClass-bin">&#8902;</m:mo>
<m:mrow>
   <m:mo class="MathClass-open">(</m:mo>
   <m:mrow>
      <m:msubsup>
         <m:mrow>
            <m:mi>e</m:mi>
         </m:mrow>
         <m:mrow>
            <m:mn>0</m:mn>
         </m:mrow>
         <m:mrow>
            <m:mi>&#8242;</m:mi>
         </m:mrow>
      </m:msubsup>
      <m:mo class="MathClass-punc">,</m:mo>
      <m:msubsup>
         <m:mrow>
            <m:mi>e</m:mi>
         </m:mrow>
         <m:mrow>
            <m:mi>i</m:mi>
         </m:mrow>
         <m:mrow>
            <m:mi>&#8242;</m:mi>
         </m:mrow>
      </m:msubsup>
   </m:mrow>
   <m:mo class="MathClass-close">)</m:mo>
</m:mrow>
</m:math></inline-formula> is of type S5.</p>
            <p><b>Lemma 5</b>. <it>Assume that <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i59"><m:mi mathvariant="bold">M</m:mi>
<m:mi mathvariant="bold">i</m:mi>
<m:msub>
   <m:mrow>
      <m:mi mathvariant="bold">n</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi mathvariant="script">G</m:mi>
   </m:mrow>
</m:msub>
<m:mo class="MathClass-bin">&#8745;</m:mo>
<m:msub>
   <m:mrow>
      <m:mi>C</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi mathvariant="script">G</m:mi>
   </m:mrow>
</m:msub>
<m:mo class="MathClass-rel">=</m:mo>
<m:mrow>
   <m:mo class="MathClass-open">{</m:mo>
   <m:mrow>
      <m:msub>
         <m:mrow>
            <m:mi>e</m:mi>
         </m:mrow>
         <m:mrow>
            <m:mn>0</m:mn>
         </m:mrow>
      </m:msub>
      <m:mo class="MathClass-punc">,</m:mo>
      <m:msub>
         <m:mrow>
            <m:mi>e</m:mi>
         </m:mrow>
         <m:mrow>
            <m:mi>i</m:mi>
         </m:mrow>
      </m:msub>
      <m:mo class="MathClass-punc">,</m:mo>
      <m:msub>
         <m:mrow>
            <m:mi>e</m:mi>
         </m:mrow>
         <m:mrow>
            <m:mi>j</m:mi>
         </m:mrow>
      </m:msub>
   </m:mrow>
   <m:mo class="MathClass-close">}</m:mo>
</m:mrow>
</m:math></inline-formula>, where i &#8800; </it>0.</p>
            <p><b>(NE3) </b><it>If both e<sub>i </sub>and e<sub>j </sub>are symmetric then </it><inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i60"><m:mrow>
   <m:mi mathvariant="bold">M</m:mi>
   <m:mi mathvariant="bold">i</m:mi>
   <m:msub>
      <m:mrow>
         <m:mi mathvariant="bold">n</m:mi>
      </m:mrow>
      <m:mrow>
         <m:msup>
            <m:mrow>
               <m:mi mathvariant="script">G</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>&#8242;</m:mi>
            </m:mrow>
         </m:msup>
      </m:mrow>
   </m:msub>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:mi mathvariant="bold">M</m:mi>
   <m:mi mathvariant="bold">i</m:mi>
   <m:msub>
      <m:mrow>
         <m:mi mathvariant="bold">n</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mi mathvariant="script">G</m:mi>
      </m:mrow>
   </m:msub>
   <m:mo class="MathClass-bin">\</m:mo>
   <m:msub>
      <m:mrow>
         <m:mi>C</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mi mathvariant="script">G</m:mi>
      </m:mrow>
   </m:msub>
   <m:mo class="MathClass-bin">&#8746;</m:mo>
   <m:msub>
      <m:mrow>
         <m:mi>C</m:mi>
      </m:mrow>
      <m:mrow>
         <m:msup>
            <m:mrow>
               <m:mi mathvariant="script">G</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>&#8242;</m:mi>
            </m:mrow>
         </m:msup>
      </m:mrow>
   </m:msub>
</m:mrow>
</m:math></inline-formula>.</p>
            <p><b>(NE4) </b><it>If e<sub>j </sub>is asymmetric and <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i57"><m:msubsup><m:mrow><m:mi>e</m:mi></m:mrow><m:mrow><m:mn>0</m:mn></m:mrow><m:mrow><m:mi>&#8242;</m:mi></m:mrow></m:msubsup></m:math></inline-formula> is symmetric then </it><inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i61"><m:mrow>
   <m:mi mathvariant="bold">M</m:mi>
   <m:mi mathvariant="bold">i</m:mi>
   <m:msub>
      <m:mrow>
         <m:mi mathvariant="bold">n</m:mi>
      </m:mrow>
      <m:mrow>
         <m:msup>
            <m:mrow>
               <m:mi mathvariant="script">G</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>&#8242;</m:mi>
            </m:mrow>
         </m:msup>
      </m:mrow>
   </m:msub>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:mi mathvariant="bold">M</m:mi>
   <m:mi mathvariant="bold">i</m:mi>
   <m:msub>
      <m:mrow>
         <m:mi mathvariant="bold">n</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mi mathvariant="script">G</m:mi>
      </m:mrow>
   </m:msub>
   <m:mo class="MathClass-bin">\</m:mo>
   <m:msub>
      <m:mrow>
         <m:mi>C</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mi mathvariant="script">G</m:mi>
      </m:mrow>
   </m:msub>
   <m:mo class="MathClass-bin">&#8746;</m:mo>
   <m:mo class="MathClass-bin">&#8902;</m:mo>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:msubsup>
            <m:mrow>
               <m:mi>e</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mn>0</m:mn>
            </m:mrow>
            <m:mrow>
               <m:mi>&#8242;</m:mi>
            </m:mrow>
         </m:msubsup>
         <m:mo class="MathClass-punc">,</m:mo>
         <m:mspace class="tmspace" width="2.77695pt"/>
         <m:msubsup>
            <m:mrow>
               <m:mi>e</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>i</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>&#8242;</m:mi>
            </m:mrow>
         </m:msubsup>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
</m:mrow>
</m:math></inline-formula>.</p>
            <p><b>(NE5) </b><it>If both e<sub>j </sub>and <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i57"><m:msubsup><m:mrow><m:mi>e</m:mi></m:mrow><m:mrow><m:mn>0</m:mn></m:mrow><m:mrow><m:mi>&#8242;</m:mi></m:mrow></m:msubsup></m:math></inline-formula> are asymmetric then <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i62"><m:mrow>
   <m:mi mathvariant="bold">M</m:mi>
   <m:mi mathvariant="bold">i</m:mi>
   <m:msub>
      <m:mrow>
         <m:mi mathvariant="bold">n</m:mi>
      </m:mrow>
      <m:mrow>
         <m:msup>
            <m:mrow>
               <m:mi mathvariant="script">G</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>&#8242;</m:mi>
            </m:mrow>
         </m:msup>
      </m:mrow>
   </m:msub>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:mi mathvariant="bold">M</m:mi>
   <m:mi mathvariant="bold">i</m:mi>
   <m:msub>
      <m:mrow>
         <m:mi mathvariant="bold">n</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mi mathvariant="script">G</m:mi>
      </m:mrow>
   </m:msub>
   <m:mo class="MathClass-bin">\</m:mo>
   <m:msub>
      <m:mrow>
         <m:mi>C</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mi mathvariant="script">G</m:mi>
      </m:mrow>
   </m:msub>
   <m:mo class="MathClass-bin">&#8746;</m:mo>
   <m:mrow>
      <m:mo class="MathClass-open">{</m:mo>
      <m:mrow>
         <m:msubsup>
            <m:mrow>
               <m:mi>e</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>i</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>&#8242;</m:mi>
            </m:mrow>
         </m:msubsup>
      </m:mrow>
      <m:mo class="MathClass-close">}</m:mo>
   </m:mrow>
</m:mrow>
</m:math></inline-formula></it>.</p>
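            <p>The three update rules of Lemma 5 amount to a few set operations. The following Python sketch is an illustration under stated assumptions rather than the authors' implementation: edges are hashable labels, the star &#8902;(e&#8242;<sub>0</sub>, e&#8242;<sub>i</sub>) and all symmetry flags are supplied by the caller, set difference is applied before union, and the function name <it>lemma5_update</it> and its parameter names are hypothetical.</p>

```python
def lemma5_update(min_edges, c_old, c_new, ei_new, star_e0_ei_new,
                  ei_symmetric, ej_symmetric, e0_new_symmetric):
    """Return (rule, updated optimal set) for the situation of Lemma 5.

    min_edges        -- optimal edge set before the NNI operation
    c_old, c_new     -- the five center-related edges before and after
    ei_new           -- the edge that replaces e_i
    star_e0_ei_new   -- the three edges of the star through e0' and e_i'
    ei_symmetric, ej_symmetric -- symmetry of e_i and e_j before the operation
    e0_new_symmetric -- symmetry of the center edge after the operation
    """
    # Every rule first discards the old edges around the center edge.
    kept = min_edges.difference(c_old)
    if ei_symmetric and ej_symmetric:
        # (NE3) all five new edges become optimal.
        return "NE3", kept.union(c_new)
    if not ej_symmetric and e0_new_symmetric:
        # (NE4) only the star through e0' and e_i' becomes optimal.
        return "NE4", kept.union(star_e0_ei_new)
    if not ej_symmetric and not e0_new_symmetric:
        # (NE5) only e_i' becomes optimal.
        return "NE5", kept.union({ei_new})
    # Combination not covered by the three rules as stated.
    return None, None
```

            <p>Because each rule touches a bounded number of edges, such an update takes expected constant time when the optimal set is stored as a hash set.</p>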
            <p>Proof: Note that {<it>e</it><sub>0</sub>, <it>e<sub>i</sub>, e<sub>j</sub></it>} must be a star in <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i63"><m:mi mathvariant="script">G</m:mi></m:math></inline-formula>. (NE3) &#8902;(<it>e<sub>i</sub>, e<sub>j</sub></it>) has type S4 or S5. After the transformation the two stars <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i64"><m:mrow>
   <m:mo class="MathClass-bin">&#8902;</m:mo>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:msubsup>
            <m:mrow>
               <m:mi>e</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mn>0</m:mn>
            </m:mrow>
            <m:mrow>
               <m:mi>&#8242;</m:mi>
            </m:mrow>
         </m:msubsup>
         <m:mo class="MathClass-punc">,</m:mo>
         <m:msubsup>
            <m:mrow>
               <m:mi>e</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>i</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>&#8242;</m:mi>
            </m:mrow>
         </m:msubsup>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
</m:mrow>
</m:math></inline-formula> and <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i65"><m:mrow>
   <m:mo class="MathClass-bin">&#8902;</m:mo>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:msubsup>
            <m:mrow>
               <m:mi>e</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mn>0</m:mn>
            </m:mrow>
            <m:mrow>
               <m:mi>&#8242;</m:mi>
            </m:mrow>
         </m:msubsup>
         <m:mo class="MathClass-punc">,</m:mo>
         <m:msubsup>
            <m:mrow>
               <m:mi>e</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>j</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>&#8242;</m:mi>
            </m:mrow>
         </m:msubsup>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
</m:mrow>
</m:math></inline-formula> have type S5. Both are optimal in <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i66"><m:msup><m:mrow><m:mi mathvariant="script">G</m:mi></m:mrow><m:mrow><m:mi>&#8242;</m:mi></m:mrow></m:msup></m:math></inline-formula>. (NE4) &#8902;(<it>e<sub>i</sub>, e<sub>j</sub></it>) has type S5. After the transformation <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i67"><m:mrow>
   <m:mo class="MathClass-bin">&#8902;</m:mo>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:msubsup>
            <m:mrow>
               <m:mi>e</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mn>0</m:mn>
            </m:mrow>
            <m:mrow>
               <m:mi>&#8242;</m:mi>
            </m:mrow>
         </m:msubsup>
         <m:mo class="MathClass-punc">,</m:mo>
         <m:msubsup>
            <m:mrow>
               <m:mi>e</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>i</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>&#8242;</m:mi>
            </m:mrow>
         </m:msubsup>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
</m:mrow>
</m:math></inline-formula> has type S5 and <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i68"><m:mrow>
   <m:mo class="MathClass-bin">&#8902;</m:mo>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:msubsup>
            <m:mrow>
               <m:mi>e</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mn>0</m:mn>
            </m:mrow>
            <m:mrow>
               <m:mi>&#8242;</m:mi>
            </m:mrow>
         </m:msubsup>
         <m:mo class="MathClass-punc">,</m:mo>
         <m:msubsup>
            <m:mrow>
               <m:mi>e</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>j</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>&#8242;</m:mi>
            </m:mrow>
         </m:msubsup>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
</m:mrow>
</m:math></inline-formula> has type S3. Only the first is optimal in <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i69"><m:mrow>
   <m:msup>
      <m:mrow>
         <m:mi mathvariant="script">G</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mi>&#8242;</m:mi>
      </m:mrow>
   </m:msup>
   <m:mo class="MathClass-bin">&#8901;</m:mo>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:mstyle class="text">
            <m:mtext class="textsf" mathvariant="sans-serif">NE5</m:mtext>
         </m:mstyle>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
   <m:mo class="MathClass-bin">&#8902;</m:mo>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:msub>
            <m:mrow>
               <m:mi>e</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>i</m:mi>
            </m:mrow>
         </m:msub>
         <m:mo class="MathClass-punc">,</m:mo>
         <m:msub>
            <m:mrow>
               <m:mi>e</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>j</m:mi>
            </m:mrow>
         </m:msub>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
</m:mrow>
</m:math></inline-formula> has type S5 while the other star in <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i70"><m:mrow><m:msub><m:mrow><m:mi>C</m:mi></m:mrow><m:mrow><m:mi mathvariant="script">G</m:mi></m:mrow></m:msub></m:mrow></m:math></inline-formula> has type S3. After the transformation only <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i71"><m:mrow>
   <m:msubsup>
      <m:mrow>
         <m:mi>e</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mi>i</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mi>&#8242;</m:mi>
      </m:mrow>
   </m:msubsup>
</m:mrow>
</m:math></inline-formula> remains symmetric in <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i72"><m:mrow>
   <m:msub>
      <m:mrow>
         <m:mi>C</m:mi>
      </m:mrow>
      <m:mrow>
         <m:msup>
            <m:mrow>
               <m:mi mathvariant="script">G</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>&#8242;</m:mi>
            </m:mrow>
         </m:msup>
      </m:mrow>
   </m:msub>
</m:mrow>
</m:math></inline-formula>; therefore, it is the only optimal edge in <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i73"><m:mrow>
   <m:msub>
      <m:mrow>
         <m:mi>C</m:mi>
      </m:mrow>
      <m:mrow>
         <m:msup>
            <m:mrow>
               <m:mi mathvariant="script">G</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>&#8242;</m:mi>
            </m:mrow>
         </m:msup>
      </m:mrow>
   </m:msub>
</m:mrow>
</m:math></inline-formula>.</p>
            <p><it>Computing the optimal cost (f)</it>. Observe that from Lemmas 2-5 at least one optimal edge remains optimal after the NNI operation. Therefore, to compute the difference in costs between optimal rootings of <inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula> and <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i74"><m:mrow>
   <m:msup>
      <m:mrow>
         <m:mi mathvariant="script">G</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mi>&#8242;</m:mi>
      </m:mrow>
   </m:msup>
</m:mrow>
</m:math></inline-formula> we start with the cost analysis for the rootings of such an edge.</p>
            <p>First, we introduce a function for computing the cost differences. Consider three nodes <it>x, y, z </it>of some rooted gene tree such that <it>x </it>and <it>y </it>are siblings and their parent (denoted by <it>xy</it>) is a sibling of <it>z</it>. In other words, we can denote this subtree by ((<it>x, y</it>), <it>z</it>). Then, the partial contribution of ((<it>x, y</it>), <it>z</it>) to the total weighted mutation cost can be described as follows: <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i75"><m:mrow>
   <m:msub>
      <m:mrow>
         <m:mo mathsize="big">&#8721;</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mi>a</m:mi>
         <m:mo class="MathClass-rel">&#8712;</m:mo>
         <m:mi mathvariant="script">S</m:mi>
      </m:mrow>
   </m:msub>
   <m:mi>&#945;</m:mi>
   <m:mo class="MathClass-bin">*</m:mo>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:msup>
            <m:mrow>
               <m:mi>&#958;</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>D</m:mi>
            </m:mrow>
         </m:msup>
         <m:mrow>
            <m:mo class="MathClass-open">(</m:mo>
            <m:mrow>
               <m:mi>x</m:mi>
               <m:mi>y</m:mi>
               <m:mo class="MathClass-punc">,</m:mo>
               <m:mspace class="tmspace" width="2.77695pt"/>
               <m:mi>a</m:mi>
            </m:mrow>
            <m:mo class="MathClass-close">)</m:mo>
         </m:mrow>
         <m:mo class="MathClass-bin">+</m:mo>
         <m:msup>
            <m:mrow>
               <m:mi>&#958;</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>D</m:mi>
            </m:mrow>
         </m:msup>
         <m:mrow>
            <m:mo class="MathClass-open">(</m:mo>
            <m:mrow>
               <m:mi>x</m:mi>
               <m:mi>y</m:mi>
               <m:mi>z</m:mi>
               <m:mo class="MathClass-punc">,</m:mo>
               <m:mspace class="tmspace" width="2.77695pt"/>
               <m:mi>a</m:mi>
            </m:mrow>
            <m:mo class="MathClass-close">)</m:mo>
         </m:mrow>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
   <m:mo class="MathClass-bin">+</m:mo>
   <m:mi>&#946;</m:mi>
   <m:mo class="MathClass-bin">*</m:mo>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:msup>
            <m:mrow>
               <m:mi>&#958;</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>L</m:mi>
            </m:mrow>
         </m:msup>
         <m:mrow>
            <m:mo class="MathClass-open">(</m:mo>
            <m:mrow>
               <m:mi>x</m:mi>
               <m:mi>y</m:mi>
               <m:mo class="MathClass-punc">,</m:mo>
               <m:mspace class="tmspace" width="2.77695pt"/>
               <m:mi>a</m:mi>
            </m:mrow>
            <m:mo class="MathClass-close">)</m:mo>
         </m:mrow>
         <m:mo class="MathClass-bin">+</m:mo>
         <m:msup>
            <m:mrow>
               <m:mi>&#958;</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>L</m:mi>
            </m:mrow>
         </m:msup>
         <m:mrow>
            <m:mo class="MathClass-open">(</m:mo>
            <m:mrow>
               <m:mi>x</m:mi>
               <m:mi>y</m:mi>
               <m:mi>z</m:mi>
               <m:mo class="MathClass-punc">,</m:mo>
               <m:mspace class="tmspace" width="2.77695pt"/>
               <m:mi>a</m:mi>
            </m:mrow>
            <m:mo class="MathClass-close">)</m:mo>
         </m:mrow>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
</m:mrow>
</m:math></inline-formula>. Assume that <it>x, y </it>and <it>z </it>are mapped into <it>a, b </it>and <it>c </it>(from the species tree), respectively. It can be proved from the definition of <it>&#958;<sup>D </sup></it>and <it>&#958;<sup>L </sup></it>that the above contribution equals: <it>&#981;</it>(<it>a, b, c</it>) = <it>&#945; </it>* (<it>D</it>(<it>a, b</it>) + <it>D</it>(<it>a </it>+ <it>b, c</it>)) + <it>&#946; </it>* (<it>L</it>(<it>a, b</it>) + <it>L</it>(<it>a </it>+ <it>b, c</it>)). Now, assume that a single NNI operation changes ((<it>x, y</it>), <it>z</it>) into (<it>x</it>, (<it>y, z</it>)). It should be clear that the cost difference is given by: &#916;<sub>3</sub>(<it>a, b, c</it>) = <it>&#981;</it>(<it>c, b, a</it>) - <it>&#981;</it>(<it>a, b, c</it>). Similarly, we can define a cost difference when a single NNI operation changes ((<it>x, y</it>), (<it>z, v</it>)) into ((<it>x, v</it>), (<it>y, z</it>)). Assume that <it>v </it>is mapped into <it>d</it>. Then, the cost contribution of the first subtree is <it>&#981;</it>'(<it>a, b, c, d</it>) = <it>&#981;</it>(<it>a, b, c </it>+ <it>d</it>) + <it>&#945; </it>* <it>D</it>(<it>c, d</it>) + <it>&#946; </it>* <it>L</it>(<it>c, d</it>). The cost difference is given by: &#916;<sub>4</sub>(<it>a, b, c, d</it>) = <it>&#981;</it>'(<it>a, d, b, c</it>) - <it>&#981;</it>'(<it>a, b, c, d</it>).</p>
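            <p>The cost functions defined above can be sketched in Python as follows. This is a minimal illustration and not the URec implementation: the functions <it>D</it> and <it>L</it> and the operation <it>a </it>+ <it>b</it> (called <it>join</it> here) are supplied by the caller, and the toy versions used in the example are hypothetical placeholders, not the definitions of the duplication and loss costs used in this article.</p>

```python
from typing import Callable, Hashable, Tuple

Node = Hashable                       # a species-tree node, in any hashable encoding
PairCost = Callable[[Node, Node], float]


def make_cost_functions(alpha: float, beta: float,
                        D: PairCost, L: PairCost,
                        join: Callable[[Node, Node], Node]) -> Tuple:
    """Build phi, delta3, phi_prime and delta4 from caller-supplied D, L and join."""

    def phi(a, b, c):
        # Contribution of the rooted subtree ((x, y), z) with x, y, z mapped to a, b, c.
        ab = join(a, b)
        return alpha * (D(a, b) + D(ab, c)) + beta * (L(a, b) + L(ab, c))

    def delta3(a, b, c):
        # Cost change when an NNI turns ((x, y), z) into (x, (y, z)).
        return phi(c, b, a) - phi(a, b, c)

    def phi_prime(a, b, c, d):
        # Contribution of ((x, y), (z, v)) with v mapped to d.
        return phi(a, b, join(c, d)) + alpha * D(c, d) + beta * L(c, d)

    def delta4(a, b, c, d):
        # Cost change when an NNI turns ((x, y), (z, v)) into ((x, v), (y, z)).
        return phi_prime(a, d, b, c) - phi_prime(a, b, c, d)

    return phi, delta3, phi_prime, delta4


# Toy placeholders (assumptions for illustration only): nodes are frozensets of
# species labels, join is set union, D flags nested sets, and L is always zero.
def toy_join(a, b):
    return a | b


def toy_D(a, b):
    return 1.0 if a.issubset(b) or b.issubset(a) else 0.0


def toy_L(a, b):
    return 0.0


phi, delta3, phi_prime, delta4 = make_cost_functions(1.0, 1.0, toy_D, toy_L, toy_join)
A, B, AB, C = frozenset("A"), frozenset("B"), frozenset("AB"), frozenset("C")
```

            <p>With these toy placeholders, delta3(A, B, AB) evaluates to 1.0 and delta3(AB, B, A) to -1.0, which reflects that reversing the NNI operation reverses the sign of the cost difference.</p>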
            <p><b>Lemma 6</b>. <it>If the center edge is optimal and remains optimal after the NNI operation then the cost difference equals </it>&#916;<sub>4</sub>(<it>a</it><sub>1</sub>, <it>a</it><sub>2</sub>, <it>a</it><sub>3</sub>, <it>a</it><sub>4</sub>)<it>, where a<sub>i </sub>(for i </it>= 1, 2, 3, 4<it>) is the mapping as indicated in </it>Figure <figr fid="F3">3</figr><it/>.</p>
            <p>As mentioned, the above lemma can be proved by comparing the rootings placed on the center edges in <inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula> and <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i74"><m:mrow><m:msup><m:mrow><m:mi mathvariant="script">G</m:mi></m:mrow><m:mrow><m:mi>&#8242;</m:mi></m:mrow></m:msup></m:mrow></m:math></inline-formula>. Lemma 6 gives a solution for the cases EQ2, EQ3, NE1 and NE3. The next lemma gives a solution for the remaining cases.</p>
            <p><b>Lemma 7</b>. <it>If for some i &gt; </it>0 <it>there exists an optimal edge in T<sub>i </sub></it>&#8746; {<it>e<sub>i</sub></it>} <it>that remains optimal after the NNI operation (under the assumption that e<sub>i </sub>is replaced by <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i76"><m:mrow>
   <m:msubsup>
      <m:mrow>
         <m:mi>e</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mi>i</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mi>&#8242;</m:mi>
      </m:mrow>
   </m:msubsup>
</m:mrow>
</m:math></inline-formula>) then the cost difference is </it>&#916;<sub>3</sub>(<it>a</it><sub>4</sub>, <it>a</it><sub>3</sub>, <it>a</it><sub>2</sub>) <it>if i </it>= 1<it/>, &#916;<sub>3</sub>(<it>a</it><sub>3</sub>, <it>a</it><sub>4</sub>, <it>a</it><sub>1</sub>) <it>if i </it>= 2<it/>, &#916;<sub>3</sub>(<it>a</it><sub>2</sub>, <it>a</it><sub>1</sub>, <it>a</it><sub>4</sub>) <it>if i </it>= 3 <it>and </it>&#916;<sub>3</sub>(<it>a</it><sub>1</sub>, <it>a</it><sub>2</sub>, <it>a</it><sub>3</sub>) <it>if i </it>= 4<it/>.</p>
            <p>Similarly to Lemma 6, we can prove Lemma 7 by comparing the rootings of <it>e<sub>i </sub></it>and <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i77"><m:msubsup>
   <m:mrow>
      <m:mi>e</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi>i</m:mi>
   </m:mrow>
   <m:mrow>
      <m:mi>&#8242;</m:mi>
   </m:mrow>
</m:msubsup>
</m:math></inline-formula>.</p>
            <p><it>Error correction algorithm</it>. Finally, we can present the algorithm for computing the optimal weighted mutation cost for a given gene tree and its <it>k</it>-NNI neighborhood. See Figure <figr fid="F4">4</figr> for details. It should be clear that the complexity of this algorithm is <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i78"><m:mi>O</m:mi>
<m:mrow>
   <m:mo class="MathClass-open">(</m:mo>
   <m:mrow>
      <m:mo class="MathClass-rel">|</m:mo>
      <m:mi mathvariant="script">G</m:mi>
      <m:msup>
         <m:mrow>
            <m:mo class="MathClass-rel">|</m:mo>
         </m:mrow>
         <m:mrow>
            <m:mi>k</m:mi>
         </m:mrow>
      </m:msup>
      <m:mo class="MathClass-bin">+</m:mo>
      <m:mtext>max</m:mtext>
      <m:mrow>
         <m:mo class="MathClass-open">(</m:mo>
         <m:mrow>
            <m:mo class="MathClass-rel">|</m:mo>
            <m:mi mathvariant="script">G</m:mi>
            <m:mo class="MathClass-rel">|</m:mo>
            <m:mo class="MathClass-punc">,</m:mo>
            <m:mo class="MathClass-rel">|</m:mo>
            <m:mi mathvariant="script">S</m:mi>
            <m:mo class="MathClass-rel">|</m:mo>
         </m:mrow>
         <m:mo class="MathClass-close">)</m:mo>
      </m:mrow>
   </m:mrow>
   <m:mo class="MathClass-close">)</m:mo>
</m:mrow>
</m:math></inline-formula>. We say that a gene tree has <it>errors </it>if the optimal cost is computed for one of its NNI variants. Otherwise, we say that a gene tree <it>does not require corrections</it>. Please note that for the special case of <it>k </it>= 1, this algorithm runs in linear time (see also our preliminary article <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>).</p>
            <fig id="F4"><title><p>Figure 4</p></title><caption><p>Algorithm</p></caption><text>
   <p><b>Algorithm</b>. Optimal weighted cost for <inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula> and its <it>k</it>-NNI neighborhood.</p>
</text><graphic file="1471-2105-13-S10-S14-4"/></fig>
         </sec>
         <sec>
            <st>
               <p>General reconstruction problems</p>
            </st>
            <p>We present several approaches to problems of error correction and phylogeny reconstruction. Let us assume that <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i79"><m:mrow>
   <m:msub>
      <m:mrow>
         <m:mi>&#963;</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mi>&#945;</m:mi>
         <m:mo class="MathClass-punc">,</m:mo>
         <m:mi>&#946;</m:mi>
         <m:mo class="MathClass-punc">,</m:mo>
         <m:mi>k</m:mi>
      </m:mrow>
   </m:msub>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:mi mathvariant="script">S</m:mi>
         <m:mo class="MathClass-punc">,</m:mo>
         <m:mspace class="tmspace" width="2.77695pt"/>
         <m:mi mathvariant="script">G</m:mi>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
</m:mrow>
</m:math></inline-formula> is the cost computed by the algorithm from Figure <figr fid="F4">4</figr>, where <it>&#945;, &#946; </it>&gt; 0, <it>k </it>&#8805; 0, <inline-formula><graphic file="1471-2105-13-S10-S14-i1.gif"/></inline-formula> is a rooted species tree and <inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula> is an unrooted gene tree.</p>
            <p><b>Problem 1 </b>(<it>k</it>NNIC)<b/>. <it>Given a rooted species tree </it><inline-formula><graphic file="1471-2105-13-S10-S14-i1.gif"/></inline-formula> <it>and a set of unrooted gene trees G, compute the total cost </it><inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i80"><m:mrow>
   <m:msub>
      <m:mrow>
         <m:mo mathsize="big">&#8721;</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mi mathvariant="script">G</m:mi>
         <m:mo class="MathClass-rel">&#8712;</m:mo>
         <m:mi>G</m:mi>
      </m:mrow>
   </m:msub>
   <m:msub>
      <m:mrow>
         <m:mi>&#963;</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mi>&#945;</m:mi>
         <m:mo class="MathClass-punc">,</m:mo>
         <m:mi>&#946;</m:mi>
         <m:mo class="MathClass-punc">,</m:mo>
         <m:mi>k</m:mi>
      </m:mrow>
   </m:msub>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:mi mathvariant="script">S</m:mi>
         <m:mo class="MathClass-punc">,</m:mo>
         <m:mspace class="tmspace" width="2.77695pt"/>
         <m:mi mathvariant="script">G</m:mi>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
</m:mrow>
</m:math></inline-formula>.</p>
            <p>The <it>k</it>NNIC problem can be solved in polynomial time by an iterative application of our algorithm. Additionally, we can reconstruct the optimal rootings as well as the correct topology of each gene tree. Please note that for <it>k </it>= 0 (no error correction), we have the cost inference problem for the reconciliation of an unrooted gene tree with a rooted species tree <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>.</p>
            <p><b>Problem 2 </b>(<it>k</it>NNIST). <it>Given a set of unrooted gene trees G, find the species tree <inline-formula><graphic file="1471-2105-13-S10-S14-i1.gif"/></inline-formula> that minimizes the total cost </it><inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i81"><m:mrow>
   <m:msub>
      <m:mrow>
         <m:mo mathsize="big">&#8721;</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mi mathvariant="script">G</m:mi>
         <m:mo class="MathClass-rel">&#8712;</m:mo>
         <m:mi>G</m:mi>
      </m:mrow>
   </m:msub>
   <m:msub>
      <m:mrow>
         <m:mi>&#963;</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mi>&#945;</m:mi>
         <m:mo class="MathClass-punc">,</m:mo>
         <m:mi>&#946;</m:mi>
         <m:mo class="MathClass-punc">,</m:mo>
         <m:mi>k</m:mi>
      </m:mrow>
   </m:msub>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:mi mathvariant="script">S</m:mi>
         <m:mo class="MathClass-punc">,</m:mo>
         <m:mspace class="tmspace" width="2.77695pt"/>
         <m:mi mathvariant="script">G</m:mi>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
</m:mrow>
</m:math></inline-formula>.</p>
            <p>The complexity of the <it>k</it>NNIST problem is unknown. However, similar problems for the duplication model are NP-hard <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>. Therefore, we developed heuristics for the <it>k</it>NNIST problem and used them in our experiments.</p>
            <p>In applications, there is typically no need to search over all NNI variants of a gene tree. For instance, a good candidate for an NNI operation is <it>a weak edge</it>. A weak edge is usually defined on the basis of its length, where a short length indicates weakness. To formalize this property, let us assume that each edge in a gene tree <inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula> has a length. We call an edge <it>e </it>in <inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula> <it>weak </it>if the length of <it>e </it>is smaller than <it>&#969;</it>, where <it>&#969; </it>is a non-negative real number. Now we can define variants of <it>k</it>NNIC and <it>k</it>NNIST denoted by <it>&#969;</it>-<it>k</it>NNIC and <it>&#969;</it>-<it>k</it>NNIST, respectively, where the NNI operations are performed on weak edges only. These straightforward definitions are omitted. Please note that the time complexity of the algorithm with NNIs limited to weak edges is <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S10-S14-i82"><m:mi>O</m:mi>
<m:mrow>
   <m:mo class="MathClass-open">(</m:mo>
   <m:mrow>
      <m:msup>
         <m:mrow>
            <m:mi>l</m:mi>
         </m:mrow>
         <m:mrow>
            <m:mi>k</m:mi>
         </m:mrow>
      </m:msup>
      <m:mo class="MathClass-bin">+</m:mo>
      <m:mtext>max</m:mtext>
      <m:mrow>
         <m:mo class="MathClass-open">(</m:mo>
         <m:mrow>
            <m:mo class="MathClass-rel">|</m:mo>
            <m:mi mathvariant="script">G</m:mi>
            <m:mo class="MathClass-rel">|</m:mo>
            <m:mo class="MathClass-punc">,</m:mo>
            <m:mo class="MathClass-rel">|</m:mo>
            <m:mi mathvariant="script">S</m:mi>
            <m:mo class="MathClass-rel">|</m:mo>
         </m:mrow>
         <m:mo class="MathClass-close">)</m:mo>
      </m:mrow>
   </m:mrow>
   <m:mo class="MathClass-close">)</m:mo>
</m:mrow>
</m:math></inline-formula>, where <it>l </it>is the number of weak edges in <inline-formula><graphic file="1471-2105-13-S10-S14-i5.gif"/></inline-formula>.</p>
         </sec>
         <sec>
            <st>
               <p>Software</p>
            </st>
            <p>The unrooted reconciliation algorithm <abbrgrp><abbr bid="B18">18</abbr></abbrgrp> and its data structures are implemented in the program URec <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>. Our algorithm partially depends on these data structures and was therefore implemented as a significantly extended version of URec. Additionally, we implemented a hill climbing heuristic to solve <it>k</it>NNIST and <it>&#969;</it>-<it>k</it>NNIST.</p>
            <p>Software and datasets from our experiments are made freely available through <url>http://bioputer.mimuw.edu.pl/~gorecki/ec</url>.</p>
         </sec>
         <sec>
            <st>
               <p>Experimental results and discussion</p>
            </st>
            <sec>
               <st>
                  <p>Data preparation</p>
               </st>
               <p>First, we inferred 4133 unrooted gene trees with branch lengths from nine yeast genomes contained in the Genolevures 3 data set <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>, which contains protein sequences from the following nine yeast species: <it>C. glabrata </it>(4957 protein sequences, abbreviation CAGL), <it>S. cerevisiae </it>(5396, SACE), <it>Z. rouxii </it>(4840, ZYRO), <it>S. kluyveri </it>(5074, SAKL), <it>K. thermotolerans </it>(4933, KLTH), <it>K. lactis </it>(4851, KLLA), <it>Y. lipolytica </it>(4781, YALI), <it>D. hansenii </it>(5006, DEHA) and <it>E. gossypii </it>(4527, ERGO).</p>
               <p>We aligned the protein sequences of each gene family with the program TCoffee <abbrgrp><abbr bid="B25">25</abbr></abbrgrp> using the default parameter settings. Then, maximum likelihood (unrooted) gene trees were computed from the alignments by using proml from the phylip software package. The original species tree of these yeasts <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>, here denoted by G3, is shown in Figure <figr fid="F5">5</figr>.</p>
               <fig id="F5"><title><p>Figure 5</p></title><caption><p>Yeasts phylogeny</p></caption><text>
   <p><b>Yeasts phylogeny</b>. Species tree topologies. G3 - original phylogeny of the Genolevures 3 data set <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>. 1NNIEC - optimal rooted species tree inferred from gene trees with all possible 1-NNI error corrections. NOEC - optimal species tree for the yeast gene trees with no NNI operations (cost 64413, no corrections). Rank denotes the position of a tree on the sorted list of the best trees. The trees below are inferred from other <it>&#969;</it>-<it>k</it>NNIST experiments (see next figures). Please note that NOEC, G3, a1 and a2 are rooted variants of the same unrooted tree. A similar property holds for 1NNIEC, b1 and b2.</p>
</text><graphic file="1471-2105-13-S10-S14-5"/></fig>
            </sec>
            <sec>
               <st>
                  <p>Inferring optimal species trees</p>
               </st>
               <p>The optimal species tree reconstructed with error corrections (1NNIST optimization problem) is depicted in Figure <figr fid="F5">5</figr> and denoted by 1NNIEC. This tree differs from G3 in the rooting and in the middle clade with KLLA and ERGO. Additionally, using the heuristic, we inferred an optimal species tree with no error corrections (0NNIST optimization), denoted here by NOEC. All the trees in this figure score highly under each of the optimization schemes.</p>
            </sec>
            <sec>
               <st>
                  <p>From weak edges to species trees</p>
               </st>
               <p>In the previous experiment, the NNI operations were performed on almost every gene tree in the optimal solution and with no restrictions on the edges. In order to reconstruct the trees more accurately, we performed experiments for <it>&#969;</it>-<it>k</it>NNIST optimization with various <it>&#969; </it>parameters and subsets of gene trees. The filtering of gene trees was determined by an integer <it>&#956; </it>&gt; 0 that defines the maximum number of allowed weak edges in a single gene tree. Each gene tree that did not satisfy this condition was rejected.</p>
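               <p>The weak-edge criterion and the filtering by <it>&#956; </it>can be sketched as follows. This is a minimal illustration and not the URec implementation: each gene tree is represented only by a hypothetical list of its edge lengths, and the function names and the toy input are assumptions introduced here.</p>

```python
from typing import Dict, List, Tuple


def weak_edges(edge_lengths: List[float], omega: float) -> List[int]:
    """Return indices of edges whose length is strictly smaller than omega."""
    return [i for i, length in enumerate(edge_lengths) if omega > length]


def filter_gene_trees(trees: Dict[str, List[float]], omega: float,
                      mu: int) -> Tuple[List[str], List[str]]:
    """Split trees into (kept, rejected); a tree with more than mu weak edges is rejected."""
    kept, rejected = [], []
    for name, edge_lengths in trees.items():
        if len(weak_edges(edge_lengths, omega)) > mu:
            rejected.append(name)
        else:
            kept.append(name)
    return kept, rejected


# Hypothetical toy input: tree names mapped to made-up edge lengths.
toy_trees = {"g1": [0.05, 0.30, 0.12], "g2": [0.01, 0.02, 0.03]}
kept, rejected = filter_gene_trees(toy_trees, omega=0.15, mu=2)
```

               <p>In this toy input, g1 has two weak edges and is kept, while g2 has three and is rejected. For <it>&#969; </it>= 0 no edge is weak, since lengths are non-negative, so no NNI operations are permitted and no tree is rejected.</p>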
               <p>Figures <figr fid="F6">6</figr> and <figr fid="F7">7</figr> depict a summary of error correction experiments for weak edges. For each <it>&#969; </it>and <it>&#956; </it>we performed 20 runs of the <it>&#969;</it>-<it>k</it>NNIST heuristic for finding the optimal species tree in the set of gene trees filtered by <it>&#956;</it>. The optimal species trees are depicted in the diagram, where each cell represents the result of a single <it>&#969;</it>-<it>k</it>NNIST experiment. We observed that G3, 1NNIEC and NOEC are significantly well represented in the set of optimal species trees in <it>&#969;</it>-1NNIST experiments, while in <it>&#969;</it>-2NNIST and <it>&#969;</it>-3NNIST experiments only G3 and NOEC were detected. Note that the original yeast phylogeny (G3, black squares in Figures <figr fid="F6">6</figr> and <figr fid="F7">7</figr>) is inferred for <it>&#969; </it>= 0.1-0.2 (in other words approx. 30-40% of edges are weak, see Figure <figr fid="F8">8</figr>) and <it>&#956; </it>&#8805; 10 in most experiments. In particular for <it>&#969; </it>= 0.15 and <it>&#956; </it>= 10, 364 gene trees were rejected (see Figure <figr fid="F9">9</figr>). These results significantly support the G3 phylogeny. Please note that the results for the standard unrooted reconciliation algorithms without error correction are located in the first column of diagrams (<it>&#969; </it>= 0).</p>
               <fig id="F6"><title><p>Figure 6</p></title><caption><p><b><it>&#969;</it></b>-1NNIST and <b><it>&#969;</it></b>-2NNIST experiments</p></caption><text>
   <p><b><b><it>&#969;</it></b>-1NNIST and <b><it>&#969;</it></b>-2NNIST experiments</b>. A summary of <it>&#969;</it>-1NNIST (top) and <it>&#969;</it>-2NNIST experiments (bottom) for <it>&#969; </it>= 0, 0.02, 0.04, ... , 0.98, <it>&#956; </it>= 2, 3, ... , 20. Optimal species trees found by the heuristics. Please note that in some cases two optimal trees were found.</p>
</text><graphic file="1471-2105-13-S10-S14-6"/></fig>
               <fig id="F7"><title><p>Figure 7</p></title><caption><p><b><it>&#969;</it></b>-3NNIST experiments</p></caption><text>
   <p><b><b><it>&#969;</it></b>-3NNIST experiments</b>. A summary of <it>&#969;</it>-3NNIST experiments for <it>&#969; </it>= 0, 0.02, 0.04, ... , 0.48 and <it>&#956; </it>= 2, 3, ... , 20.</p>
</text><graphic file="1471-2105-13-S10-S14-7"/></fig>
               <fig id="F8"><title><p>Figure 8</p></title><caption><p>Branch lengths</p></caption><text>
   <p><b>Branch lengths</b>. Histogram of branch lengths.</p>
</text><graphic file="1471-2105-13-S10-S14-8"/></fig>
               <fig id="F9"><title><p>Figure 9</p></title><caption><p>Rejected gene trees</p></caption><text>
   <p><b>Rejected gene trees</b>. The number of rejected trees as a function of <it>&#956; </it>and <it>&#969;</it>.</p>
</text><graphic file="1471-2105-13-S10-S14-9"/></fig>
            </sec>
            <sec>
               <st>
                  <p>From trusted species tree to weak edges in gene trees - automated and manual curation</p>
               </st>
               <p>Assume that the set of unrooted gene trees and the rooted (trusted) species tree <inline-formula><graphic file="1471-2105-13-S10-S14-i1.gif"/></inline-formula> are given. Then we can state the following problem: find <it>&#969; </it>and <it>&#956; </it>such that <inline-formula><graphic file="1471-2105-13-S10-S14-i1.gif"/></inline-formula> is the optimal species tree in the <it>&#969;</it>-<it>k</it>NNIST problem for the set of gene trees filtered by <it>&#956;</it>. For instance, in our dataset, if we assume that G3 is a given correct phylogeny of yeasts, then from the diagrams (Figures <figr fid="F6">6</figr> and <figr fid="F7">7</figr>) one can determine appropriate values of <it>&#969; </it>and <it>&#956; </it>that yield G3 as optimal. In other words, we can automatically determine weak edges by <it>&#969; </it>and filter gene trees by <it>&#956;</it>. This approach can be applied in tree curation procedures to correct errors in an automated way as well as to find candidates (rejected trees) for further manual curation. For instance, in the previous case, when <it>&#969; </it>= 0.1 and <it>&#956; </it>= 10, we have 3164 trees that can be corrected and rooted by our algorithm, while the 364 rejected trees could be candidates for further manual correction.</p>
            </sec>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <p>We present novel theoretical and practical results on the problem of error correction and phylogeny reconstruction. In particular, we describe a polynomial time and space algorithm that simultaneously solves the problem of correcting topological errors in unrooted gene trees and the problem of rooting unrooted gene trees. The algorithm allows us to efficiently perform experiments on truly large-scale datasets available for yeast genomes. Our experiments suggest that our algorithm can be used (i) to detect errors, (ii) to infer a correct phylogeny of species in the presence of weak edges in gene trees, and (iii) to help in tree curation procedures.</p>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>We introduced a novel polynomial time algorithm for error-corrected and unrooted gene tree reconciliation. Experiments on yeast genomes suggest that an implementation of our algorithm can greatly improve the accuracy of gene tree reconciliation and thus curate error-prone gene trees. Moreover, we use our error-corrected reconciliation to make the gene duplication problem, a standard application of gene tree reconciliation, more robust. We conjecture that the error-corrected gene duplication problem is intrinsically hard to solve, since the gene duplication problem is already NP-hard. Therefore, we introduced an effective heuristic for the error-corrected gene duplication problem. Our experimental results for a wide range of error-correction tests on the yeast phylogeny show that our error-corrected reconciliations result in improved predictions of invoked gene duplication and loss events, which in turn allow more accurate phylogenies to be inferred.</p>
         <p>The presented error correction is based on gene-species tree reconciliation using gene duplication and loss. However, there are other major evolutionary mechanisms that yield gene tree topologies inconsistent with the actual species tree topology, such as horizontal gene transfer and deep coalescence. Gene tree reconciliation using these mechanisms is highly sensitive to topological error, similar to gene tree reconciliation under gene duplication and loss. Future work will focus on the development of algorithms that can also reconcile unrooted and erroneous gene trees using horizontal gene transfer and deep coalescence.</p>
      </sec>
      <sec>
         <st>
            <p>Competing interests</p>
         </st>
         <p>The authors declare that they have no competing interests.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>PG and OE were responsible for algorithm design and writing the paper. PG implemented the programs, and performed the experimental evaluation and the analysis of the results. Both authors read and approved the final manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>The reviewers provided several valuable comments that improved the presentation. This work was conducted in part with support from the Gene Tree Reconciliation Working Group at NIMBioS through NSF award #EF-0832858. PG was partially supported by MNiSW grant N N301 065236, and OE was supported in part by NSF awards #0830012 and #10117189.</p>
            <p>This article has been published as part of <it>BMC Bioinformatics </it>Volume 13 Supplement 10, 2012: "Selected articles from the 7th International Symposium on Bioinformatics Research and Applications (ISBRA'11)". The full contents of the supplement are available online at <url>http://www.biomedcentral.com/bmcbioinformatics/supplements/13/S10</url>.</p>
         </sec>
      </ack>
      <refgrp><bibl id="B1"><aug><au><snm>Graur</snm><fnm>D</fnm></au><au><snm>Li</snm><fnm>WH</fnm></au></aug><source>Fundamentals of Molecular Evolution</source><publisher>Sinauer Associates</publisher><edition>2</edition><pubdate>2000</pubdate><url>http://www.amazon.com/exec/obidos/redirect?tag=citeulike07-20\&amp;path=ASIN/0878932666</url></bibl><bibl id="B2"><title><p>Maps between trees and cladistic analysis of historical associations among genes, organisms, and areas</p></title><aug><au><snm>Page</snm><fnm>RDM</fnm></au></aug><source>Systematic Biology</source><pubdate>1994</pubdate><volume>43</volume><fpage>58</fpage><lpage>77</lpage></bibl><bibl id="B3"><title><p>Reconciling a gene tree to a species tree under the duplication cost model</p></title><aug><au><snm>Bonizzoni</snm><fnm>P</fnm></au><au><snm>Della Vedova</snm><fnm>G</fnm></au><au><snm>Dondi</snm><fnm>R</fnm></au></aug><source>Theoretical Computer Science</source><pubdate>2005</pubdate><volume>347</volume><issue>1-2</issue><fpage>36</fpage><lpage>53</lpage><xrefbib><pubid idtype="doi">10.1016/j.tcs.2005.05.016</pubid></xrefbib></bibl><bibl id="B4"><title><p>Duplication-Based Measures of Difference Between Gene and Species Trees</p></title><aug><au><snm>Eulenstein</snm><fnm>O</fnm></au><au><snm>Mirkin</snm><fnm>B</fnm></au><au><snm>Vingron</snm><fnm>M</fnm></au></aug><source>J Comput Biol</source><pubdate>1998</pubdate><volume>5</volume><fpage>135</fpage><lpage>148</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1089/cmb.1998.5.135</pubid><pubid idtype="pmpid">9541877</pubid></pubidlist></xrefbib></bibl><bibl id="B5"><title><p>Fitting the Gene Lineage into its Species Lineage, a Parsimony Strategy Illustrated by Cladograms Constructed from Globin 
Sequences</p></title><aug><au><snm>Goodman</snm><fnm>M</fnm></au><au><snm>Czelusniak</snm><fnm>J</fnm></au><au><snm>Moore</snm><fnm>GW</fnm></au><au><snm>Romero-Herrera</snm><fnm>AE</fnm></au><au><snm>Matsuda</snm><fnm>G</fnm></au></aug><source>Systematic Zoology</source><pubdate>1979</pubdate><volume>28</volume><issue>2</issue><fpage>132</fpage><lpage>163</lpage><xrefbib><pubid idtype="doi">10.2307/2412519</pubid></xrefbib></bibl><bibl id="B6"><title><p>A Biologically Consistent Model for Comparing Molecular Phylogenies</p></title><aug><au><snm>Mirkin</snm><fnm>B</fnm></au><au><snm>Muchnik</snm><fnm>IB</fnm></au><au><snm>Smith</snm><fnm>TF</fnm></au></aug><source>J Comput Biol</source><pubdate>1995</pubdate><volume>2</volume><issue>4</issue><fpage>493</fpage><lpage>507</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1089/cmb.1995.2.493</pubid><pubid idtype="pmpid">8634901</pubid></pubidlist></xrefbib></bibl><bibl id="B7"><title><p>Inferring angiosperm phylogeny from EST data with widespread gene duplication</p></title><aug><au><snm>Sanderson</snm><fnm>M</fnm></au><au><snm>McMahon</snm><fnm>M</fnm></au></aug><source>BMC Evolutionary Biology</source><pubdate>2007</pubdate><volume>7</volume><issue>Suppl 1</issue><url>http://dx.doi.org/10.1186/1471-2148-7-S1-S3</url></bibl><bibl id="B8"><title><p>The multiple gene duplication problem revisited</p></title><aug><au><snm>Bansal</snm><fnm>MS</fnm></au><au><snm>Eulenstein</snm><fnm>O</fnm></au></aug><source>Bioinformatics</source><pubdate>2008</pubdate><volume>24</volume><issue>13</issue><fpage>i132</fpage><lpage>8</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/bioinformatics/btn150</pubid><pubid idtype="pmcid">2718628</pubid><pubid idtype="pmpid" link="fulltext">18586705</pubid></pubidlist></xrefbib></bibl><bibl id="B9"><title><p>On the Multiple Gene Duplication 
Problem</p></title><aug><au><snm>Fellows</snm><fnm>MR</fnm></au><au><snm>Hallett</snm><fnm>MT</fnm></au><au><snm>Stege</snm><fnm>U</fnm></au></aug><source>ISAAC, Volume 1533 of LNCS</source><editor>Chwa KY, Ibarra OH, Springer</editor><pubdate>1998</pubdate><fpage>347</fpage><lpage>356</lpage></bibl><bibl id="B10"><title><p>Reconstruction of ancient molecular phylogeny</p></title><aug><au><snm>Guig&#243;</snm><fnm>R</fnm></au><au><snm>Muchnik</snm><fnm>IB</fnm></au><au><snm>Smith</snm><fnm>TF</fnm></au></aug><source>Molecular Phylogenetics and Evolution</source><pubdate>1996</pubdate><volume>6</volume><issue>2</issue><fpage>189</fpage><lpage>213</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1006/mpev.1996.0071</pubid><pubid idtype="pmpid" link="fulltext">8899723</pubid></pubidlist></xrefbib></bibl><bibl id="B11"><title><p>Reconstructing Domain Compositions of Ancestral Multi-domain Proteins</p></title><aug><au><snm>Behzadi</snm><fnm>B</fnm></au><au><snm>Vingron</snm><fnm>M</fnm></au></aug><source>Comparative Genomics, Volume 4205 of LNCS</source><publisher>Springer</publisher><editor>Bourque G, El-Mabrouk N</editor><pubdate>2006</pubdate><fpage>1</fpage><lpage>10</lpage></bibl><bibl id="B12"><title><p>Heuristics for the Gene-Duplication Problem: A &#920;(<it>n</it>) Speed-Up for the Local Search</p></title><aug><au><snm>Bansal</snm><fnm>MS</fnm></au><au><snm>Burleigh</snm><fnm>GJ</fnm></au><au><snm>Eulenstein</snm><fnm>O</fnm></au><au><snm>Wehe</snm><fnm>A</fnm></au></aug><source>RECOMB, Volume 4453 of LNCS</source><publisher>Springer</publisher><pubdate>2007</pubdate><fpage>238</fpage><lpage>252</lpage></bibl><bibl id="B13"><title><p>From Gene Trees to Species Trees</p></title><aug><au><snm>Ma</snm><fnm>B</fnm></au><au><snm>Li</snm><fnm>M</fnm></au><au><snm>Zhang</snm><fnm>L</fnm></au></aug><source>SIAM Journal on Computing</source><pubdate>2000</pubdate><volume>30</volume><issue>3</issue><fpage>729</fpage><lpage>752</lpage><xrefbib><pubid 
idtype="doi">10.1137/S0097539798343362</pubid></xrefbib></bibl><bibl id="B14"><title><p>GeneTree: comparing gene and species phylogenies using reconciled trees</p></title><aug><au><snm>Page</snm><fnm>RDM</fnm></au></aug><source>Bioinformatics</source><pubdate>1998</pubdate><volume>14</volume><issue>9</issue><fpage>819</fpage><lpage>820</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/bioinformatics/14.9.819</pubid><pubid idtype="pmpid" link="fulltext">9918954</pubid></pubidlist></xrefbib></bibl><bibl id="B15"><title><p>Bias in phylogenetic tree reconciliation methods: implications for vertebrate genome evolution</p></title><aug><au><snm>Hahn</snm><fnm>MW</fnm></au></aug><source>Genome biology</source><pubdate>2007</pubdate><volume>8</volume><issue>7</issue><fpage>R141</fpage><url>http://dx.doi.org/10.1186/gb-2007-8-7-r141</url><xrefbib><pubidlist><pubid idtype="doi">10.1186/gb-2007-8-7-r141</pubid><pubid idtype="pmcid">2323230</pubid><pubid idtype="pmpid" link="fulltext">17634151</pubid></pubidlist></xrefbib></bibl><bibl id="B16"><title><p>NOTUNG: a program for dating gene duplications and optimizing gene family trees</p></title><aug><au><snm>Chen</snm><fnm>K</fnm></au><au><snm>Durand</snm><fnm>D</fnm></au><au><snm>Farach-Colton</snm><fnm>M</fnm></au></aug><source>J Comput Biol</source><pubdate>2000</pubdate><volume>7</volume><issue>3-4</issue><fpage>429</fpage><lpage>447</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1089/106652700750050871</pubid><pubid idtype="pmpid" link="fulltext">11108472</pubid></pubidlist></xrefbib></bibl><bibl id="B17"><title><p>A Hybrid Micro-Macroevolutionary Approach to Gene Tree Reconstruction</p></title><aug><au><snm>Durand</snm><fnm>D</fnm></au><au><snm>Halldorsson</snm><fnm>BV</fnm></au><au><snm>Vernot</snm><fnm>B</fnm></au></aug><source>J Comput 
Biol</source><pubdate>2006</pubdate><volume>13</volume><issue>2</issue><fpage>320</fpage><lpage>335</lpage><url>http://dx.doi.org/10.1089/cmb.2006.13.320</url><xrefbib><pubidlist><pubid idtype="doi">10.1089/cmb.2006.13.320</pubid><pubid idtype="pmpid" link="fulltext">16597243</pubid></pubidlist></xrefbib></bibl><bibl id="B18"><title><p>Inferring phylogeny from whole genomes</p></title><aug><au><snm>G&#243;recki</snm><fnm>P</fnm></au><au><snm>Tiuryn</snm><fnm>J</fnm></au></aug><source>Bioinformatics</source><pubdate>2007</pubdate><volume>23</volume><issue>2</issue><fpage>e116</fpage><lpage>22</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/bioinformatics/btl296</pubid><pubid idtype="pmpid" link="fulltext">17237078</pubid></pubidlist></xrefbib></bibl><bibl id="B19"><title><p>Dup-Tree: a program for large-scale phylogenetic analyses using gene tree parsimony</p></title><aug><au><snm>Wehe</snm><fnm>A</fnm></au><au><snm>Bansal</snm><fnm>MS</fnm></au><au><snm>Burleigh</snm><fnm>GJ</fnm></au><au><snm>Eulenstein</snm><fnm>O</fnm></au></aug><source>Bioinformatics</source><pubdate>2008</pubdate><volume>24</volume><issue>13</issue><fpage>1540</fpage><lpage>1541</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/bioinformatics/btn230</pubid><pubid idtype="pmpid" link="fulltext">18474508</pubid></pubidlist></xrefbib></bibl><bibl id="B20"><title><p>Reconciling phylogenetic trees</p></title><aug><au><snm>Eulenstein</snm><fnm>O</fnm></au><au><snm>Huzurbazar</snm><fnm>S</fnm></au><au><snm>Liberles</snm><fnm>D</fnm></au></aug><source>Evolution After Gene Duplication</source><publisher>Dittmar, Liberles, Wiley</publisher><pubdate>2010</pubdate></bibl><bibl id="B21"><aug><au><snm>Bender</snm><fnm>MA</fnm></au><au><snm>Farach-Colton</snm><fnm>M</fnm></au></aug><source>The LCA Problem Revisited LATIN, Volume 1776 of LNCS</source><publisher>Springer</publisher><editor>Gonnet GH, Panario D, Viola A</editor><pubdate>2000</pubdate><fpage>88</fpage><lpage>94</lpage></bibl><bibl 
id="B22"><title><p>A Linear Time Algorithm for Error-Corrected Reconciliation of Unrooted Gene Trees</p></title><aug><au><snm>G&#243;recki</snm><fnm>P</fnm></au><au><snm>Eulenstein</snm><fnm>O</fnm></au></aug><source>Bioinformatics Research and Applications, Volume 6674 of Lecture Notes in Computer Science</source><publisher>Springer Berlin/Heidelberg</publisher><editor>Chen J, Wang J, Zelikovsky A</editor><pubdate>2011</pubdate><fpage>148</fpage><lpage>159</lpage></bibl><bibl id="B23"><title><p>URec: a system for unrooted reconciliation</p></title><aug><au><snm>G&#243;recki</snm><fnm>P</fnm></au><au><snm>Tiuryn</snm><fnm>J</fnm></au></aug><source>Bioinformatics</source><pubdate>2007</pubdate><volume>23</volume><issue>4</issue><fpage>511</fpage><lpage>512</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/bioinformatics/btl634</pubid><pubid idtype="pmpid" link="fulltext">17182699</pubid></pubidlist></xrefbib></bibl><bibl id="B24"><title><p>G&#233;nolevures: protein families and synteny among complete hemiascomycetous yeast proteomes and genomes</p></title><aug><au><snm>Sherman</snm><fnm>DJ</fnm></au><au><snm>Martin</snm><fnm>T</fnm></au><au><snm>Nikolski</snm><fnm>M</fnm></au><au><snm>Cayla</snm><fnm>C</fnm></au><au><snm>Souciet</snm><fnm>JL</fnm></au><au><snm>Durrens</snm><fnm>P</fnm></au></aug><source>Nucleic Acids Research</source><pubdate>2009</pubdate><volume>37</volume><issue>suppl 1</issue><fpage>D550</fpage><lpage>D554</lpage><url>http://nar.oxfordjournals.org/content/37/suppl_1/D550.abstract</url><xrefbib><pubidlist><pubid idtype="pmcid">2686504</pubid><pubid idtype="pmpid" link="fulltext">19015150</pubid></pubidlist></xrefbib></bibl><bibl id="B25"><title><p>T-Coffee: a novel method for fast and accurate multiple sequence alignment</p></title><aug><au><snm>Notredame</snm><fnm>C</fnm></au><au><snm>Higgins</snm><fnm>DG</fnm></au><au><snm>Heringa</snm><fnm>J</fnm></au></aug><source>J Mol 
Biol</source><pubdate>2000</pubdate><volume>302</volume><fpage>205</fpage><lpage>217</lpage><url>http://dx.doi.org/10.1006/jmbi.2000.4042</url><xrefbib><pubidlist><pubid idtype="doi">10.1006/jmbi.2000.4042</pubid><pubid idtype="pmpid" link="fulltext">10964570</pubid></pubidlist></xrefbib></bibl></refgrp>
   </bm>
</art>