<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2148-9-108</ui>
   <ji>1471-2148</ji>
   <fm>
      <dochead>Software</dochead>
      <bibl>
         <title>
            <p>PhyloExplorer: a web server to validate, explore and query phylogenetic trees</p>
         </title>
         <aug>
            <au id="A1" ca="yes">
               <snm>Ranwez</snm>
               <fnm>Vincent</fnm>
               <insr iid="I1"/>
               <email>vincent.ranwez@univ-montp2.fr</email>
            </au>
            <au id="A2">
               <snm>Clairon</snm>
               <fnm>Nicolas</fnm>
               <insr iid="I1"/>
               <email>clairon@gmail.com</email>
            </au>
            <au id="A3">
               <snm>Delsuc</snm>
               <fnm>Fr&#233;d&#233;ric</fnm>
               <insr iid="I1"/>
               <email>frederic.delsuc@univ-montp2.fr</email>
            </au>
            <au id="A4">
               <snm>Pourali</snm>
               <fnm>Saeed</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
               <email>spourali@yahoo.com</email>
            </au>
            <au id="A5">
               <snm>Auberval</snm>
               <fnm>Nicolas</fnm>
               <insr iid="I1"/>
               <email>nicolas.auberval@gmail.com</email>
            </au>
            <au id="A6">
               <snm>Diser</snm>
               <fnm>Sorel</fnm>
               <insr iid="I1"/>
               <email>sorel.diser@gmail.com</email>
            </au>
            <au id="A7" ca="yes">
               <snm>Berry</snm>
               <fnm>Vincent</fnm>
               <insr iid="I2"/>
               <email>vberry@lirmm.fr</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Institut des Sciences de l'Evolution (ISEM, UMR 5554 CNRS), Universit&#233; Montpellier II, Place E. Bataillon &#8211; 34095 Montpellier Cedex 05, France</p>
            </ins>
            <ins id="I2">
               <p>Equipe M&#233;thodes et Algorithmes pour la Bioinformatique, LIRMM (UMR 5506 CNRS), Universit&#233; Montpellier II, Place E Bataillon &#8211; 34095 Montpellier, France</p>
            </ins>
         </insg>
         <source>BMC Evolutionary Biology</source>
         <issn>1471-2148</issn>
         <pubdate>2009</pubdate>
         <volume>9</volume>
         <issue>1</issue>
         <fpage>108</fpage>
         <url>http://www.biomedcentral.com/1471-2148/9/108</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">19450253</pubid>
               <pubid idtype="doi">10.1186/1471-2148-9-108</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>03</day>
               <month>6</month>
               <year>2008</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>18</day>
               <month>5</month>
               <year>2009</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>18</day>
               <month>5</month>
               <year>2009</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2009</year>
         <collab>Ranwez et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Many important problems in evolutionary biology require molecular phylogenies to be reconstructed. Phylogenetic trees must then be manipulated for subsequent inclusion in publications or analyses such as supertree inference and tree comparisons. However, no tool is currently available to facilitate the management of tree collections providing, for instance: standardisation of taxon names among trees with respect to a reference taxonomy; selection of relevant subsets of trees or sub-trees according to a taxonomic query; or simply computation of descriptive statistics on the collection. Moreover, although several databases of phylogenetic trees exist, there is currently no easy way to find trees that are both relevant and complementary to a given collection of trees.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>We propose a tool to facilitate assessment and management of phylogenetic tree collections. Given an input collection of rooted trees, PhyloExplorer provides facilities for obtaining statistics describing the collection, correcting invalid taxon names, extracting taxonomically relevant parts of the collection using a dedicated query language, and identifying related trees in the TreeBASE database.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>PhyloExplorer is a simple and interactive website implemented through underlying Python libraries and MySQL databases. It is available at: <url>http://www.ncbi.orthomam.univ-montp2.fr/phyloexplorer/</url> and the source code can be downloaded from: <url>http://code.google.com/p/taxomanie/</url>.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="bmc" subtype="user_supplied_xml" id="endnote"/>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <sec>
            <st>
               <p>Motivation</p>
            </st>
            <p>Evolutionary biologists increasingly have to deal with large numbers of phylogenetic trees, and user-friendly bioinformatic tools are needed for handling large tree collections. Such tools are especially relevant for tasks that are cumbersome to perform manually, such as validating taxon names using a reference taxonomy, providing taxon sampling statistics, or pruning source trees so that they only contain taxa from a taxonomic group of interest. Such questions are crucial in phylogenomics, where phylogenies are mainly obtained by concatenating gene sequences into large molecular datasets (e.g. supermatrix approach) or by combining individual trees inferred separately on each gene (e.g. supertree approach). Several of the problems addressed here in a tree framework depend only on the taxa that the trees contain. The solutions described are thus also suitable for many other kinds of data (e.g. sequence alignments, morphological measures), for which taxonomic representation is essential.</p>
         </sec>
         <sec>
            <st>
               <p>Problems to solve</p>
            </st>
            <p>Knowing how many different taxa belonging to a particular taxonomic group (e.g. placentals) are represented in a tree collection, determining how many trees contain at least one placental, finding these trees, and pruning them of every taxon except mammals are all tasks that cannot be performed without mapping user taxa to a reference taxonomy. Computing such a mapping is relatively complicated <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>, but the result enables powerful tree searches, pruning functions and relevant statistics.</p>
            <sec>
               <st>
                  <p>Dealing with taxon names</p>
               </st>
                <p>Because phylogenies come from many sources and no single reference taxonomic framework exists, the taxon names they contain are often taxonomically invalid, misspelled, and/or supplemented with information about the primary data, such as sequence accession numbers, geographical origins of the samples, or marker identifiers. An initial step must therefore be performed to standardize names, so that software analyzing the collection does not treat different labels for the same taxon as different taxa. Moreover, mapping to proper scientific names may fail for some taxon names. This problem has already been pointed out <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> and was recently addressed for the TreeBASE repository <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. Indeed, TreeBASE is a very helpful collaborative resource but it has been largely under-exploited, partly due to its lack of taxonomic consistency <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>.</p>
            </sec>
            <sec>
               <st>
                  <p>Obtaining statistics on taxon coverage in a tree collection</p>
               </st>
               <p>As the number of trees and considered taxa grows, statistics become crucial for depicting the collection content, and thus measuring its relevance for a given phylogenetic problem. In a phylogenomic context, having hundreds of genes for dozens of taxa gives rise to simple, but fundamental questions, such as: How sparse is the dataset? Is the taxonomic sampling homogeneous among the main taxonomic groups? How complete is the taxonomic coverage of the dataset? How many genes provide information for a taxon that appeared in an unexpected position in the final phylogeny? The frequency of appearance of each taxon, although useful, is less informative than a trees &#215; taxa presence matrix. Such a data availability matrix <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> clearly highlights under-represented taxa and provides an intuitive picture of the dataset sparseness. However, it is not sufficient to grasp the taxonomic coverage of a given group.</p>
            </sec>
            <sec>
               <st>
                  <p>Finding trees with taxa of interest</p>
               </st>
                <p>With a large collection of trees at hand, a key task is to find those containing a relevant taxon sample with respect to a targeted biological question. Indeed, as the number of phylogenomic projects increases, it is becoming easier to access large datasets used in previous phylogenomic studies or stored in dedicated repositories such as TreeBASE <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>, Homolens <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>, TreeFAM <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>, OrthoMam <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>, and EnsEMBL <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>. For example, when studying the phylogeny of rodents, one may want to isolate trees with at least four rodents from TreeBASE or from databases developed from larger-scale phylogenomic analyses (e.g. focused on mammals or Eutheria). A tool for exploring tree collections should therefore incorporate a feature allowing complex queries on trees according to taxonomy.</p>
            </sec>
            <sec>
               <st>
                  <p>Pruning trees according to taxa of interest</p>
               </st>
                <p>Even trees containing several relevant taxa may contain numerous additional taxa that are deemed irrelevant for subsequent analyses. These irrelevant parts should be pruned from the trees, both to speed up subsequent analyses and to avoid interference with the signal of the core data to be analyzed. It is thus useful to have a tool allowing automatic pruning of numerous source trees so that they only contain taxa belonging to the taxonomic groups of interest.</p>
            </sec>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Implementation</p>
         </st>
         <sec>
            <st>
               <p>Taxon naming convention</p>
            </st>
            <p>In order to manage tree collections, taxon names first have to be identified and then placed in a reference taxonomic scheme. PhyloExplorer allows the user to choose between two such schemes: the NCBI Taxonomy <abbrgrp><abbr bid="B9">9</abbr></abbrgrp> and the Catalogue of Life 2008 Annual Checklist <abbrgrp><abbr bid="B10">10</abbr></abbrgrp> from the Integrated Taxonomic Information System (ITIS) <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>. Most published phylogenies use a liberal naming scheme that combines taxonomic information with other kinds of information such as gene names, sequence identifiers or geographical origins. For example, when a tree inferred using the BRCA1 gene contains a domestic mouse among its taxa, its label can be either: mouse, BRCA1_mouse, Musmusculus_NM_015745, and so on. This freedom may allow the user to uncover some relevant information, but it also impedes simple automatic determination of the taxon represented by the given name. We use a naming convention to facilitate this determination without any loss of generality. The principle is to use separator characters chosen by the user to distinguish among the various pieces of information encompassed in the taxon name. The complete name is thus split into several words, with the first ones being used to identify this taxon within the reference taxonomy while others can be used freely for storing any other kind of information. We adopted this naming convention because it is so frequently used in TreeBASE that it almost seems to be a <it>de facto </it>standard. Most taxonomic terms consist of a single word (e.g. higher-level taxa), but some consist of two distinct terms to reflect the Linnean "genus-species" classification or to avoid ambiguity. Indeed, the reference taxonomy may contain ambiguous names because of name homonymy and synonymy. For example, <it>Echinops </it>is both a plant genus and a mammal genus. 
In this case, two distinct taxa have the same genus name and additional information can be provided between "&lt;" and ">" to overcome this ambiguity. In the NCBI taxonomy, <it>Echinops </it>is annotated as either "<it>Echinops </it>&lt; mammal >" or "<it>Echinops </it>&lt; plant >". When a taxon name is composed of several words, PhyloExplorer first checks if the whole name corresponds to a term of the reference taxonomy. If the check fails, the last word is ignored and the thus-obtained shorter name is checked against the taxonomy &#8211; this shortening operation is repeated until the name is found within the taxonomy or is reduced to an empty string. For example, the following names respect our naming convention: mus, mus_BRCA1, mus_musculus_BRCA1_France, echinops_&lt;mammal>_BRCA1, when using the underscore character as a separator.</p>
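            <p>The shortening procedure described above can be illustrated with a minimal Python sketch. It is not taken from the PhyloExplorer source code: the function name resolve_taxon, the toy taxonomy and the choice to compare names in lower case are all assumptions made for illustration, and the angle-bracket disambiguation of homonyms is not handled.</p>

```python
def resolve_taxon(label, taxonomy, separator="_"):
    """Map a leaf label to a name of the reference taxonomy.

    `taxonomy` is a hypothetical stand-in for the reference taxonomy: a set of
    lower-case names whose words are joined by single spaces.
    Returns (matched_name, remaining_words), or (None, all_words) when no
    prefix of the label is found in the taxonomy.
    """
    words = [word for word in label.split(separator) if word]
    # Check the whole name first, then ignore the last word and retry,
    # until a match is found or no word is left.
    for end in range(len(words), 0, -1):
        candidate = " ".join(words[:end]).lower()
        if candidate in taxonomy:
            return candidate, words[end:]
    return None, words


# Toy taxonomy for illustration only (not real NCBI or ITIS content).
TOY_TAXONOMY = {"mus", "mus musculus"}

if __name__ == "__main__":
    for example in ("mus", "mus_BRCA1", "mus_musculus_BRCA1_France", "BRCA1_mouse"):
        print(example, resolve_taxon(example, TOY_TAXONOMY))
```

            <p>In this sketch, a label such as BRCA1_mouse is left unmapped because its first word is not a taxonomic term, which is the situation the naming convention is meant to avoid.</p>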
         </sec>
         <sec>
            <st>
               <p>Mapping TreeBASE</p>
            </st>
            <p>PhyloExplorer also provides a function that allows users to perform taxonomic queries on trees stored in TreeBASE <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. For this purpose, the content of TreeBASE was obtained from its ftp server <url>ftp://www.treebase.org/pub/treebase</url>. The 3,696 studies available as of September 2008, containing a total of 6,237 trees and 93,013 taxa, were collected. Since the trees available in TreeBASE have been provided by different scientists without adopting a common naming convention, a considerable proportion of the taxon names are not scientific names present in the reference taxonomies. For TreeBASE taxon names not present in the NCBI taxonomy (i.e. fewer than 33%), we relied on the TBMap tables, which translate most TreeBASE taxa into proper scientific names based on a number of taxonomic databases (IPNI, uBio, ITIS, RDMP) <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. This allowed identification of a further 8% of the taxon names.</p>
         </sec>
         <sec>
            <st>
               <p>PhyloExplorer implementation overview</p>
            </st>
            <p>PhyloExplorer is written in Python and implemented with the Django web framework <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. Django is a stable and powerful system that is able to handle thousands of page views per second and to associate a graphical user interface with complex manipulations of an underlying database. In our case, the database is composed of relational tables encoding the NCBI and ITIS taxonomies and a version of TreeBASE curated as detailed above. The database is handled by the MySQL database management system <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>. Although a Django project can consist of several applications, PhyloExplorer currently has only one, called "DjangoPhyloCore". "PhyloCore" is a wrapper around this library that enables users to access it outside any Django project, i.e. to use its facilities under the Python interpreter as if it were a standard library.</p>
            <p>The "PhyloCore" library is composed of several classes, shown as a UML diagram in Fig. <figr fid="F1">1</figr>. Most of them allow interaction with the corresponding tables in the database, since Django imposes a class for each table of the database. The class "Tree" is used to store phylogenies as individual objects, including the Newick format and other associated properties such as a name or the tree's rooting status.</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>UML diagram of the PhyloCore library structure of PhyloExplorer</p>
               </caption>
               <text>
                  <p><b>UML diagram of the PhyloCore library structure of PhyloExplorer</b>.</p>
               </text>
               <graphic file="1471-2148-9-108-1"/>
            </fig>
            <p>"TreeCollection" is the main class of the project and allows storage of user tree collections, TreeBASE itself, and sets of trees extracted from TreeBASE through the querying system. The queries are performed through functions of this class. The "TreeCollection" class also provides functions to filter trees by pruning them (e.g. restricting them to a taxonomic group of interest) and to obtain statistics on taxon coverage among trees. The "Taxonomy" class references all taxa of the current reference taxonomy (NCBI or ITIS) as well as their synonyms, homonyms, and hierarchical relationships between higher-level taxa. The "Rank" class encodes the taxonomic rank of the taxa (kingdom, phylum, class, family, genus, species), exactly as in the ITIS database <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>. The "TaxonomyReference" class provides useful functions common to trees and tree collection querying facilities. Lastly, the "BadTaxa" class stores taxa from currently managed tree collections that are not found in the chosen reference taxonomy.</p>
            <p>The "PhyloCore" facilities can be accessed through a graphical interface provided by the PhyloExplorer web server. This server allows several users to simultaneously query "PhyloCore" and its database. The web server was developed using the CherryPy object-oriented HTTP framework for Python <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. The NetworkX Python package <abbrgrp><abbr bid="B15">15</abbr></abbrgrp> enables implementation of the browsable tree structures. PhyloExplorer relies on the PHY.FI graphical tool <abbrgrp><abbr bid="B16">16</abbr></abbrgrp> for displaying labeled trees and also links to the PhyloWidget applet <abbrgrp><abbr bid="B17">17</abbr></abbrgrp> for a less static tree display. Taxon pictures are extracted from the corresponding Wikispecies <abbrgrp><abbr bid="B18">18</abbr></abbrgrp> and Wikipedia <abbrgrp><abbr bid="B19">19</abbr></abbrgrp> pages. The PhyloExplorer source code is distributed under the CeCILL license version 3 <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>, a French variant of the GNU GPL licence adopted by the Centre National de la Recherche Scientifique (CNRS). It can be downloaded from the following Google Code project page: <url>http://code.google.com/p/taxomanie/</url>.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Results and Discussion</p>
         </st>
         <sec>
            <st>
               <p>Existing tools: software and websites</p>
            </st>
            <p>Many software packages can display and handle single user trees or collections of trees, such as the long-standing classic Treeview <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>, the sophisticated TreeDyn <abbrgrp><abbr bid="B22">22</abbr></abbrgrp> and the rising star Dendroscope <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>, among others. Websites aiming at interactively displaying phylogenetic trees, such as PHY.FI <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>, iTOL <abbrgrp><abbr bid="B24">24</abbr></abbrgrp> and PhyloWidget <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>, have also recently flourished. However, most of these tools are tree rendering programs that only allow the user to manage trees graphically and, in some cases, their associated annotations. Database-oriented web servers such as TreeFam <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and PhyloFinder <abbrgrp><abbr bid="B25">25</abbr></abbrgrp> permit taxonomic querying and retrieval of phylogenetic trees from dedicated databases, whereas others, like GRUNT <abbrgrp><abbr bid="B26">26</abbr></abbrgrp> and Summary Tree Explorer <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>, implement some tree filtering and pruning features according to the taxonomy. However, none of these latter tools can be used to upload and explore user phylogenies or obtain detailed summary statistics on user tree collections. The only programs providing basic statistics on data availability are Clann <abbrgrp><abbr bid="B28">28</abbr></abbrgrp> for tree collections in the context of supertree reconstruction, and PhyloTA Browser <abbrgrp><abbr bid="B29">29</abbr></abbrgrp> for molecular datasets. In fact, PhyloFinder <abbrgrp><abbr bid="B25">25</abbr></abbrgrp> is conceptually the closest to the PhyloExplorer tool presented here, but it is currently restricted to trees stored in TreeBASE and does not provide taxonomic coverage statistics or support complex taxonomic queries.</p>
         </sec>
         <sec>
            <st>
               <p>A brief overview of PhyloExplorer</p>
            </st>
            <p>A typical use of PhyloExplorer begins by uploading a tree collection to the website in Newick or Nexus format. A simple taxon list can also be entered as a trivial multifurcated tree. This allows users to deal with various kinds of data (e.g. sequence alignments, morphological measures), for which the taxonomic representation is important. The trees are then parsed and taxon labels are mapped against scientific names from the chosen reference taxonomy (either NCBI or ITIS) with listed homonyms and synonyms. For automatic correction, the user is provided with a list of alternatives for up to ten unrecognized names. The corresponding excerpt of the reference taxonomy, including only mapped taxa, is then displayed. Two statistics are provided at each taxonomic rank. The first states the number of user trees containing a representative of the corresponding taxonomic group. The second states the number of representatives of this taxonomic group encountered in the user tree collection. Each mapped taxon is provided with a link to the corresponding entry in the NCBI or ITIS database. For pedagogical purposes, available pictures from Wikispecies <abbrgrp><abbr bid="B18">18</abbr></abbrgrp> or Wikipedia <abbrgrp><abbr bid="B19">19</abbr></abbrgrp> pop up when browsing the taxonomic excerpt. Links to the corresponding iSpecies <abbrgrp><abbr bid="B30">30</abbr></abbrgrp> and Wikispecies <abbrgrp><abbr bid="B18">18</abbr></abbrgrp> pages are also provided for each taxon. A taxa &#215; trees matrix scoring the presence or absence of taxa in trees of the user collection can be generated to visualize the taxonomic overlap among trees. 
The user can then perform several operations on the input collection, such as: browsing the taxonomy excerpt corresponding to each input tree; restricting its trees to a subset of species by a simple mouse click; mining a relevant subset of trees through a simple querying language; and locating trees in TreeBASE matching taxa of the input collection.</p>
         </sec>
         <sec>
            <st>
               <p>Mapping and correcting taxon names</p>
            </st>
            <p>When PhyloExplorer is provided with a tree collection, either by simple upload or by searching TreeBASE, every leaf label is mapped to terms contained in the reference taxonomy. These terms include scientific and common names with their official taxonomic synonyms and homonyms. The lack of a single universal reference taxonomy <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> led us to consider several existing alternatives. By relying on the comprehensive, widely used, and up-to-date NCBI and ITIS taxonomic projects, we hope to meet the needs of a broad spectrum of evolutionary biologists. The PhyloExplorer taxon naming facility helps to detect misspelled or taxonomically incorrect taxon names and proposes corrections according to known synonymies. Leaf labels that cannot be mapped to the reference taxonomy are listed, as well as those having an ambiguous taxonomic name. As automatic correction is error-prone, the user is prompted to correct problematic taxon names by selecting the appropriate name among close alternatives present in the reference taxonomy. PhyloExplorer then automatically corrects taxon names accordingly in the whole tree collection. Corrected trees can be downloaded afterwards, as well as the detailed list of corrections.</p>
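            <p>How close alternatives are selected is not detailed here. As a purely illustrative sketch, and not the method used by PhyloExplorer, candidate corrections for an unmapped label could be ranked with the string similarity measure of the Python standard library (difflib.get_close_matches). The function name suggest_names, its parameters and the cutoff value are assumptions.</p>

```python
from difflib import get_close_matches


def suggest_names(label, taxonomy_names, max_suggestions=5, cutoff=0.8):
    """Return taxonomy names that closely resemble an unmapped leaf label.

    Hypothetical helper: similarity is the difflib ratio computed on
    lower-cased strings, and `cutoff` is an arbitrary illustrative threshold.
    """
    by_lowercase = {name.lower(): name for name in taxonomy_names}
    matches = get_close_matches(
        label.lower(), list(by_lowercase), n=max_suggestions, cutoff=cutoff
    )
    # Give back the names with their original capitalisation, best match first.
    return [by_lowercase[match] for match in matches]


if __name__ == "__main__":
    # Toy name list for illustration only.
    names = ["Mus musculus", "Homo sapiens"]
    print(suggest_names("Mus musclus", names))
```

            <p>The final choice is left to the user in any case, which is consistent with the semi-automatic correction described above.</p>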
            <p>For example, we checked the 77-taxon metazoan phylogenetic tree from the recent study of Dunn <it>et al</it>. <abbrgrp><abbr bid="B31">31</abbr></abbrgrp> (Fig. <figr fid="F2">2</figr>, Additional file <supplr sid="S1">1</supplr>). Using the NCBI reference taxonomy resulted in successful mapping of 76 leaf labels, including one homonym and three synonyms (Fig. <figr fid="F2">2A</figr>). Only one label (mertensiid_sp) could not be found in the reference taxonomy because it represents an English common name for an unidentified member of the family Mertensiidae. PhyloExplorer correctly suggested this name correction (Fig. <figr fid="F2">2A</figr>). By contrast, when using ITIS as the reference taxonomy, only one synonym and 13 labels that could not be mapped were revealed (Fig. <figr fid="F2">2B</figr>). This illustrates the differences in name mapping resulting from different taxonomic schemes. We hope that semi-automating this tedious, but nevertheless essential, task will encourage researchers to check the validity of their trees before depositing them in phylogenetic tree databases such as TreeBASE. To our knowledge, this tree proofing stage is easier with PhyloExplorer than with other existing tools.</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Snapshots of PhyloExplorer's name correcting feature</p>
               </caption>
               <text>
                  <p><b>Snapshots of PhyloExplorer's name correcting feature</b>. The 77-taxon metazoan phylogenetic tree obtained by Dunn <it>et al</it>. <abbrgrp><abbr bid="B31">31</abbr></abbrgrp> [see Additional file <supplr sid="S1">1</supplr>] is checked against the NCBI (A) and ITIS (B) reference taxonomies. Note the many differences in name mapping induced by the use of a different taxonomic scheme.</p>
               </text>
               <graphic file="1471-2148-9-108-2"/>
            </fig>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Snapshots of PhyloExplorer's tree collection description and restriction features</p>
               </caption>
               <text>
                  <p><b>Snapshots of PhyloExplorer's tree collection description and restriction features</b>. The 146-individual gene tree collection from the phylogenomic study of Delsuc <it>et al</it>. <abbrgrp><abbr bid="B32">32</abbr></abbrgrp> [see Additional file <supplr sid="S2">2</supplr>] has been uploaded. A) PhyloExplorer produces statistics depicting the user tree collection in the form of simple tree size and taxon distribution histograms. B) Data availability matrix scoring the presence/absence of taxa in trees from the collection. C) Excerpt of the reference taxonomy (here NCBI) containing all mapped taxa from the tree collection with associated statistics at each node. The excerpt gives, for each taxonomic group, the number of its representatives found within the tree collection and, for each taxon, the number of user trees in which it is represented. Here, all members of Cnidaria are selected from checkboxes to restrict the trees to members of this phylum. An illustrative picture automatically pops up from the Wikispecies page for Cnidaria. D) Summary NCBI excerpt of the tree collection restricted to Cnidaria with adjusted statistics.</p>
               </text>
               <graphic file="1471-2148-9-108-3"/>
            </fig>
            <suppl id="S1">
               <title>
                  <p>Additional file 1</p>
               </title>
               <text>
                  <p><b>77-taxon metazoan phylogenetic tree</b>. This file contains the metazoan phylogenetic tree obtained by Dunn <it>et al</it>. <abbrgrp><abbr bid="B31">31</abbr></abbrgrp> in nexus format.</p>
               </text>
               <file name="1471-2148-9-108-S1.txt">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S2">
               <title>
                  <p>Additional file 2</p>
               </title>
               <text>
                  <p><b>Collection of 146 phylogenetic trees</b>. This file contains the tree collection (146 trees) from the phylogenomic study of Delsuc <it>et al</it>. <abbrgrp><abbr bid="B32">32</abbr></abbrgrp> in nexus format.</p>
               </text>
               <file name="1471-2148-9-108-S2.txt">
                  <p>Click here for file</p>
               </file>
            </suppl>
         </sec>
         <sec>
            <st>
               <p>Describing, browsing and restricting tree collections</p>
            </st>
            <p>As an illustration, we considered the collection of 146 individual gene trees from the phylogenomic study of Delsuc <it>et al</it>. <abbrgrp><abbr bid="B32">32</abbr></abbrgrp> (Additional file <supplr sid="S2">2</supplr>). This example is of particular interest because its trees contain both binomial species names and higher taxonomic rank names. The latter labels designate chimerical taxa assembled from gene sequences of different representatives, as is often the case in supermatrix studies <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>. Once this collection has been uploaded, PhyloExplorer produces simple statistics to depict it within the "Statistics &amp; queries" view. Both tree size and taxon frequency distributions are summarized through graphical plots (Fig. <figr fid="F3">3A</figr>). The general information states that this dataset consists of a total of 146 trees containing 38 distinct taxa. The tree size distribution indicates that the smallest tree contains 18 taxa. The taxon frequency distribution reflects the substantial overlap among these trees, with no taxon appearing in fewer than 42 of the 146 trees and most taxa occurring in more than a hundred different trees. When selecting the "Matrix trees &#215; taxa" view, a data availability matrix is constructed from the collection (Fig. <figr fid="F3">3B</figr>). This representation provides a general overview of the degree of overlap among trees from the collection in the form of a matrix scoring the presence (blue squares) or absence (white squares) of each taxon in each tree. This matrix can easily be browsed by hovering the mouse pointer over a matrix cell to display its associated taxon and tree name information. The detailed data availability matrix can also be downloaded in csv format (compatible with most spreadsheet programs) for use as supplementary material in supertree and supermatrix studies, for example.</p>
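            <p>To make the structure of such a data availability matrix concrete, the following Python sketch builds a trees by taxa presence/absence table and serialises it as csv. It is an illustration under simplifying assumptions rather than PhyloExplorer code: each tree is reduced to its set of leaf names, and the function names and toy data are hypothetical.</p>

```python
import csv
import io


def presence_matrix(tree_taxa):
    """Build a trees x taxa presence/absence matrix.

    `tree_taxa` maps a tree name to the taxon names found at its leaves
    (a simplified stand-in for parsed Newick or Nexus trees).
    Returns the sorted taxon list and one row per tree: [name, 0/1, ...].
    """
    taxa = sorted({taxon for leaves in tree_taxa.values() for taxon in leaves})
    rows = []
    for tree_name in sorted(tree_taxa):
        leaves = set(tree_taxa[tree_name])
        rows.append([tree_name] + [1 if taxon in leaves else 0 for taxon in taxa])
    return taxa, rows


def matrix_to_csv(taxa, rows):
    """Serialise the matrix as csv text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["tree"] + taxa)
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Toy collection for illustration only.
    collection = {
        "geneA": ["Mus musculus", "Homo sapiens"],
        "geneB": ["Homo sapiens"],
    }
    print(matrix_to_csv(*presence_matrix(collection)))
```

            <p>Column sums of this table give the taxon frequency distribution and row sums give the tree size distribution.</p>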
            <p>Back in the "Statistics &amp; queries" view, a summary excerpt of the reference taxonomy containing all mapped taxa is provided as an interactive tree (Fig. <figr fid="F3">3C</figr>). For educational purposes and to facilitate browsing of the taxonomy, images from Wikispecies <abbrgrp><abbr bid="B18">18</abbr></abbrgrp> or Wikipedia <abbrgrp><abbr bid="B19">19</abbr></abbrgrp> are automatically displayed when the mouse pointer passes over a taxonomic rank or taxon. The summary excerpt indicates, for each taxonomic rank, the number of its representatives contained in the tree collection and the number of user trees in which members of this taxon are represented. Based on these indications, it can easily be seen that four different cnidarians are represented in the whole tree collection and that 146 trees contain at least one cnidarian. A simple click on the first number allows the user to prune trees so that they only contain cnidarians, while clicking on the second will restrict the collection to the subset of trees containing at least one cnidarian. This latter function allows the user to easily restrict the tree collection to subsets of trees containing members of a particular taxonomic group or a terminal taxon of interest. PhyloExplorer also enables more complex and flexible tree restrictions by selecting relevant taxa through the corresponding checkboxes. Once the internal nodes of interest are selected, PhyloExplorer restricts the trees of the current collection to these taxa (Fig. <figr fid="F3">3D</figr>). The summary statistics are updated accordingly, and the modified collection can then be browsed through the taxonomic excerpt and downloaded in Newick format.</p>
            <p>The "Individual trees" view offers the option of browsing trees from the collection individually. A list of all individual trees is given with links for displaying each tree using the PhyloWidget <abbrgrp><abbr bid="B17">17</abbr></abbrgrp> applet (Fig. <figr fid="F4">4A</figr>). Provided that trees have been named in the Nexus formatted collection, individual trees are listed under these names. Once a particular tree is selected, e.g. the tree from the <it>if2p </it>gene in the current example, its internal nodes are automatically labelled by PhyloExplorer with taxonomic ranks when they can be unambiguously inferred. The resulting decorated tree is rendered using PHY.FI <abbrgrp><abbr bid="B16">16</abbr></abbrgrp> within the "Image" tab. The "Browsable taxonomy excerpt" tab allows the user to browse the tree in the reference taxonomy context with taxon image pop-ups and links to the iSpecies <abbrgrp><abbr bid="B30">30</abbr></abbrgrp> and Wikispecies <abbrgrp><abbr bid="B18">18</abbr></abbrgrp> pages of each taxon (Fig. <figr fid="F4">4B</figr>). Finally, the "Newick format" tab provides access to the labelled tree in Newick format (Fig. <figr fid="F4">4C</figr>).</p>
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>Snapshots of PhyloExplorer's individual tree browsing and representation features</p>
               </caption>
               <text>
                  <p><b>Snapshots of PhyloExplorer's individual tree browsing and representation features</b>. A) Browsable tree list for the collection of 146 individual gene trees of Delsuc <it>et al</it>. <abbrgrp><abbr bid="B32">32</abbr></abbrgrp> with links to PhyloWidget <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. The 21-taxon tree inferred from the gene <it>if2p </it>is displayed using PHY.FI <abbrgrp><abbr bid="B16">16</abbr></abbrgrp> with internal nodes labelled according to the NCBI reference taxonomy. B) Browsable NCBI excerpt corresponding to the <it>if2p </it>gene tree. C) Labelled <it>if2p </it>gene tree in Newick format.</p>
               </text>
               <graphic file="1471-2148-9-108-4"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Performing complex taxonomic queries on tree collections</p>
            </st>
            <p>Finding a particular phylogeny in phylogenetic tree databases, or more generally in tree collections, is often like trying to find a needle in a haystack. Indeed, though confronted with phylogenetic tree databases containing thousands of trees, the user is usually offered only basic query capabilities. Recent efforts have been made in the right direction in the particular case of TreeBASE with tools designed to taxonomically validate the trees (TBMap) <abbrgrp><abbr bid="B3">3</abbr></abbrgrp> and provide enhanced query options (PhyloFinder) <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. However, a general system allowing complex taxonomic queries to be performed on taxonomically validated tree collections is still lacking. PhyloExplorer now offers this possibility.</p>
            <p>For instance, we extracted a tree collection from the OrthoMaM database <abbrgrp><abbr bid="B7">7</abbr></abbrgrp> by performing a request aimed at selecting trees inferred from slowly evolving markers (rate &#8804; 0.5) of reasonable size (length > 1000 nucleotide sites). This led to a collection of 79 trees which was uploaded into PhyloExplorer (Fig. <figr fid="F5">5A</figr>, Additional file <supplr sid="S3">3</supplr>). This tree collection contains a total of 25 mammalian taxa for which complete genome sequences are available in the Ensembl database <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>. A question of primary interest in mammalian phylogenetics is the order in which diversification among the four major placental groups (Afrotheria, Xenarthra, Euarchontoglires and Laurasiatheria) occurred &#8211; a problem that is directly dependent upon the position of the placental root <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>. When studying the position of the root of the placental tree, it is better to consider trees or molecular datasets containing at least six eutherians and at least one representative of each major placental clade, plus a marsupial or monotreme outgroup. Such trees can be located very easily in PhyloExplorer by applying the following query to the OrthoMaM trees under the "Search trees" tab: {EUARCHONTOGLIRES}>0 and {LAURASIATHERIA}>0 and {AFROTHERIA}>0 and {XENARTHRA}>0 and ({METATHERIA}>0 or {MONOTREMATA}>0) and {EUTHERIA} > 6. For each tree T of the current collection, PhyloExplorer replaces {TAXONOMIC_GROUP} by the number of representatives of this taxonomic group within T. Tree T matches the query if the resulting Boolean expression is true. The above query returned 23 candidate trees, each providing an even sampling of the major placental clades (Fig. <figr fid="F5">5B</figr>). The user can then conduct a supertree analysis on the indicated subcollection with an external tool. Alternatively, the user can collect the corresponding datasets in the OrthoMaM database to construct a supermatrix with an optimal distribution of missing data with regard to the biological question at issue.</p>
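            <p>The evaluation rule just described (substitute each {TAXONOMIC_GROUP} by a count, then test the Boolean expression) can be sketched as follows in Python. This is a minimal illustration under stated assumptions, not PhyloExplorer's actual parser: the LINEAGE table, the function names count_members and matches, and the restriction to the comparison operators ">", ">=" and "==" are all hypothetical choices made for the example.</p>

```python
import re

# Hypothetical lineage table: leaf taxon -> higher groups containing it.
LINEAGE = {
    "Homo sapiens": {"eutheria", "euarchontoglires"},
    "Mus musculus": {"eutheria", "euarchontoglires"},
    "Bos taurus": {"eutheria", "laurasiatheria"},
    "Loxodonta africana": {"eutheria", "afrotheria"},
    "Dasypus novemcinctus": {"eutheria", "xenarthra"},
    "Monodelphis domestica": {"metatheria"},
}

GROUP = re.compile(r"\{([^{}]+)\}")
# After substitution only numbers, and/or/not, a few comparisons,
# parentheses and whitespace may remain (an assumption of this sketch).
SAFE = re.compile(r"(?:\d+|and|or|not|>=|==|>|[()\s])+")


def count_members(group, leaves):
    """Count the leaves of one tree that belong to the named group."""
    group = group.strip().lower()
    return sum(1 for leaf in leaves if group in LINEAGE.get(leaf, set()))


def matches(query, leaves):
    """Return True if the tree with these leaves satisfies the query."""
    expr = GROUP.sub(lambda m: str(count_members(m.group(1), leaves)), query)
    if not SAFE.fullmatch(expr):
        raise ValueError("unsupported query: " + query)
    # The expression now holds only whitelisted tokens, so eval is contained.
    return bool(eval(expr, {"__builtins__": {}}, {}))


query = ("{EUARCHONTOGLIRES}>0 and {LAURASIATHERIA}>0 and "
         "({METATHERIA}>0 or {MONOTREMATA}>0) and {EUTHERIA} > 2")
trees = {
    "tree1": set(LINEAGE),
    "tree2": {"Homo sapiens", "Mus musculus"},
}
selected = [name for name, leaves in trees.items() if matches(query, leaves)]
print(selected)
```

            <p>In this toy collection only tree1 is selected, since tree2 contains no laurasiatherian and no outgroup. Checking the substituted expression against a whitelist before evaluating it is a design choice of the sketch; a production system would more likely use a dedicated expression parser.</p>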
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>Snapshots of PhyloExplorer's statistics and queries feature</p>
               </caption>
               <text>
                  <p><b>Snapshots of PhyloExplorer's statistics and queries feature</b>. A) Statistics and data availability matrix for a mammalian tree collection containing 79 trees extracted from the OrthoMaM database <abbrgrp><abbr bid="B7">7</abbr></abbrgrp> [see Additional file <supplr sid="S3">3</supplr>]. B) Updated statistics and data availability matrix for the 23 trees filtered from the initial collection using the query: {EUARCHONTOGLIRES}>0 and {LAURASIATHERIA}>0 and {AFROTHERIA}>0 and {XENARTHRA}>0 and ({METATHERIA}>0 or {MONOTREMATA}>0) and {EUTHERIA} > 6.</p>
               </text>
               <graphic file="1471-2148-9-108-5"/>
            </fig>
            <suppl id="S3">
               <title>
                  <p>Additional file 3</p>
               </title>
               <text>
                  <p><b>Collection of 79 mammalian phylogenetic trees</b>. This file contains the mammalian tree collection (79 trees) extracted from the OrthoMaM database <abbrgrp><abbr bid="B7">7</abbr></abbrgrp> in Nexus format.</p>
               </text>
               <file name="1471-2148-9-108-S3.txt">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>PhyloExplorer's query language can also be used to mine trees from the TBMap-curated version of TreeBASE. The same request for mammalian trees on TreeBASE results in 13 trees containing a total of 210 distinct taxa from which a supertree can be reconstructed. This querying of TreeBASE is also helpful for supermatrix studies, as primary data matrices can be downloaded from the TreeBASE website once the relevant study numbers have been identified using PhyloExplorer. Finally, the "Pruning trees" tab provides an easy way to restrict tree collections by defining taxonomic filters. Once filters have been defined, trees can either be restricted to include only the taxa matching these filters or, alternatively, be pruned of the taxa defined in these filters. This function can be particularly useful for testing the influence of taxon sampling on diversification studies, which require handling and statistical analysis of very large trees <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>.</p>
         </sec>
         <sec>
            <st>
               <p>Measuring the taxonomic content and coverage of TreeBASE</p>
            </st>
            <p>The TBMap-curated version of TreeBASE <abbrgrp><abbr bid="B2">2</abbr></abbrgrp> we use includes 6,237 trees. However, its taxonomic coverage is highly irregular, with some groups being much better represented than others. Moreover, islands of taxa exist whose constituent taxa are not linked by any tree to other taxa. This has, in particular, been pointed out by Page <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>, who provided several graphical views of the situation as of 2004. However, a more detailed and live picture can be obtained as TreeBASE is available as a specific collection in PhyloExplorer. The complete TreeBASE tree collection can be queried taxonomically from the PhyloExplorer front page (Fig. <figr fid="F6">6A</figr>). This enables the user to query and investigate its taxonomic coverage much more closely and dynamically than is possible when using the TreeBASE website interface or even the dedicated PhyloFinder tool <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. Note, however, that some taxa appearing in TreeBASE cannot be mapped to our reference taxonomies despite the use of the TBMap-curated version of TreeBASE <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. Also, the query result depends on the chosen reference taxonomy. For example, the simple query {mammalia}>0 retrieves all trees from TreeBASE that contain at least one mammal (Fig. <figr fid="F6">6A</figr>). Using the NCBI reference taxonomy results in 553 trees containing 5,371 different labels, of which 5,075 (including 23 homonyms and 176 synonyms) are successfully mapped to 3,660 distinct taxa (Fig. <figr fid="F6">6B</figr>). In contrast, using the ITIS reference taxonomy returns 566 trees containing 6,404 different labels, of which 5,054 (including 10 homonyms, 127 synonyms and 1 vernacular name) are successfully mapped to 3,677 distinct taxa (Fig. <figr fid="F6">6C</figr>). We plan to update the TreeBASE content of our site annually.</p>
            <fig id="F6">
               <title>
                  <p>Figure 6</p>
               </title>
               <caption>
                  <p>Snapshots of PhyloExplorer's TreeBASE statistics and query features</p>
               </caption>
               <text>
                  <p><b>Snapshots of PhyloExplorer's TreeBASE statistics and query features</b>. A) PhyloExplorer home page showing the simple query {mammalia}>0 that allows users to retrieve all trees containing at least one mammalian taxon from our TBMap-curated <abbrgrp><abbr bid="B3">3</abbr></abbrgrp> version of TreeBASE <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. B) Taxonomic mapping of the resulting TreeBASE collection using the NCBI reference taxonomy. C) Taxonomic mapping of the resulting TreeBASE collection using the ITIS reference taxonomy.</p>
               </text>
               <graphic file="1471-2148-9-108-6"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Educational value of the web server</p>
            </st>
            <p>PhyloExplorer also provides a useful educational tool for exploring a taxonomic group that the user is not familiar with. This feature could be of particular value when preparing practical courses for undergraduates. Indeed, when a tree or a list of taxa of interest is uploaded, PhyloExplorer constructs an excerpted version of the corresponding taxonomic tree that can be browsed interactively. Internal nodes of the taxonomic excerpt tree are labelled by the name of the smallest taxonomic rank containing all taxa belonging to this group. For example, PhyloExplorer can be used to quickly view and browse the taxonomy of taxa represented in the NCBI Trace Archives <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>. Pictures of taxa available in Wikispecies <abbrgrp><abbr bid="B18">18</abbr></abbrgrp> can be displayed by positioning the mouse pointer over the corresponding taxon name. Clicking on a taxon name redirects the user to the full taxon description in the NCBI Taxonomy Browser <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>, which indicates the number of nucleotide sequences available for that particular taxon. PhyloExplorer also provides a link to automatically perform a search for each taxon in the iSpecies engine <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>, which returns information on the geographic distribution, available pictures on the web, and associated bibliography. A link is also given for each taxon to the corresponding Wikispecies page for additional information.</p>
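            <p>Labelling an internal node with the smallest taxonomic rank containing all of its taxa amounts to finding their deepest common ancestor in the reference taxonomy. The Python sketch below illustrates the idea on a tiny, hypothetical fragment of a taxonomy; the PARENT table and the function names lineage and smallest_common_group are assumptions made for the example and do not reflect PhyloExplorer's implementation. A non-empty list of taxa is assumed.</p>

```python
# Hypothetical child -> parent links for a tiny taxonomy fragment.
PARENT = {
    "Homo sapiens": "Primates",
    "Mus musculus": "Rodentia",
    "Primates": "Euarchontoglires",
    "Rodentia": "Euarchontoglires",
    "Bos taurus": "Laurasiatheria",
    "Euarchontoglires": "Eutheria",
    "Laurasiatheria": "Eutheria",
}


def lineage(taxon):
    """Return the path from a taxon up to the root, taxon first."""
    path = [taxon]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def smallest_common_group(taxa):
    """Return the least inclusive group that contains every taxon in taxa."""
    paths = [lineage(taxon) for taxon in taxa]
    shared = set(paths[0]).intersection(*paths[1:])
    # Walking up from the first taxon, the first shared name is the deepest.
    for name in paths[0]:
        if name in shared:
            return name
    return None


print(smallest_common_group(["Homo sapiens", "Mus musculus"]))
print(smallest_common_group(["Homo sapiens", "Mus musculus", "Bos taurus"]))
```

            <p>With this toy taxonomy, the two rodent and primate species are grouped under Euarchontoglires, and adding the laurasiatherian moves the label up to Eutheria.</p>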
            <p>Taxon pictures are of great educational value in many cases in which a phylogeny is to be shown or described. It is indeed common practice to display such taxon images at the tips of phylogenies in slide presentations, posters, and publications. Such pictures can be found on dedicated websites such as Wikispecies or Animal Diversity Web <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>, or through search engines such as iSpecies. However, manually typing each taxon name in these websites can be cumbersome and is prone to spelling errors. PhyloExplorer automates this task by providing a feature to get available pictures from Wikispecies for all terminal taxa of an input tree or tree collection and for all taxa of a taxon list. Shifting to the "Taxon images" view prompts PhyloExplorer to display, by default, available images for the first 20 taxa. Annotated thumbnails of retrieved images are returned and the collection of full-size images can be viewed as a slide show (Fig. <figr fid="F7">7</figr>). A link is also provided to launch the search for all taxon images in a separate window of the browser. The pictures can then be downloaded for direct use in scientific presentations and lectures. An interactive tree viewer program like Dendroscope <abbrgrp><abbr bid="B23">23</abbr></abbrgrp> can also directly upload these pictures for display at the tips of trees by mapping taxon names with image filenames.</p>
            <fig id="F7">
               <title>
                  <p>Figure 7</p>
               </title>
               <caption>
                  <p>Snapshot of PhyloExplorer's image search feature</p>
               </caption>
               <text>
                  <p><b>Snapshot of PhyloExplorer's image search feature</b>. Taxon images are searched for the first 20 taxa of the OrthoMaM-based tree collection [see Additional file <supplr sid="S3">3</supplr>]. The 18 images found are displayed as thumbnails and the full-size picture of the armadillo (<it>Dasypus</it>) is viewed within a slideshow where all available images can be browsed.</p>
               </text>
               <graphic file="1471-2148-9-108-7"/>
            </fig>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>Combining user phylogenies with a reference taxonomy allows PhyloExplorer to propose advanced facilities to taxonomically explore, correct, query and filter user tree collections. Various cumbersome operations currently performed by hand can thus be automated &#8211; in the best case &#8211; before publishing phylogenies in papers or repositories. PhyloExplorer's features are available through a simple interactive web interface. In addition, the taxonomic querying system can also be applied to a TBMap-curated version of TreeBASE <abbrgrp><abbr bid="B3">3</abbr></abbrgrp> which can thus be efficiently mined with respect to an input tree collection or to the user's interests. We also hope that PhyloExplorer will benefit the phylogenetic community as a tool to increase the taxonomic validity of trees submitted to phylogenetic tree databases. This would greatly enhance the usefulness of such databases in meta-analyses. Moreover, by providing a powerful taxonomic query system that enables the mining of user collections or databases containing hundreds to thousands of trees, PhyloExplorer could eventually be of great assistance to researchers interested in performing large-scale supertree and supermatrix analyses. PhyloExplorer's educational potential should also be underlined as it might be particularly helpful for preparing undergraduate courses and boosting public awareness of taxonomy and phylogenetics. Possible developments of PhyloExplorer include allowing users to upload their own custom reference taxonomic schemes, downloading standardised taxon image collections, adding more tree handling operations such as automatic re-rooting, and implementing topological queries.</p>
      </sec>
      <sec>
         <st>
            <p>Availability and requirements</p>
         </st>
         <p>Project name: PhyloExplorer</p>
         <p>Project home page: <url>http://www.ncbi.orthomam.univ-montp2.fr/phyloexplorer/</url></p>
         <p>Code available at: <url>http://code.google.com/p/taxomanie/</url></p>
         <p>Operating system(s): Platform independent</p>
         <p>Other requirements: None.</p>
         <p>Licence: CeCILL v3.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>VR initiated the project. FD, VB, and VR supervised the project. NA, NC and SD, three Master's students at University Montpellier II, contributed to the implementation of the Python library and the development of the first website version. NC and VR wrote an updated version of the website. SP, a Master's student at University Montpellier II, extracted the trees from TreeBASE and converted them according to the TBMap and Catalogue of Life resources. FD provided the illustrative biological examples. FD, VB and VR wrote the manuscript. All authors read and approved the final manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We would like to thank Khalid Belkhir and Alexandre Dehne-Garcia for their help in implementing the web server. We also thank the anonymous reviewers for helpful comments which enabled many improvements to PhyloExplorer. This work has been supported by grants from the University Montpellier II Scientific Council and the French National Research Agency (PhylAriane project: ANR-08-EMER-011). This is contribution ISEM 2009-050 of the Institut des Sciences de l'Evolution de Montpellier (UMR5554-CNRS).</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>A Taxonomic Search Engine: federating taxonomic databases using web services</p>
            </title>
            <aug>
               <au>
                  <snm>Page</snm>
                  <fnm>RD</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>6</volume>
            <fpage>48</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">555944</pubid>
                  <pubid idtype="pmpid" link="fulltext">15757517</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-6-48</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>TreeBASE: a prototype database of phylogenetic analyses and an interactive tool for browsing the phylogeny of life</p>
            </title>
            <aug>
               <au>
                  <snm>Sanderson</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Donoghue</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Piel</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Eriksson</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Amer Jour Bot</source>
            <pubdate>1994</pubdate>
            <volume>81</volume>
            <fpage>183</fpage>
            <xrefbib>
               <pubid idtype="doi">10.2307/2445299</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>TBMap: a taxonomic perspective on the phylogenetic database TreeBASE</p>
            </title>
            <aug>
               <au>
                  <snm>Page</snm>
                  <fnm>RD</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2007</pubdate>
            <volume>8</volume>
            <fpage>158</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1885449</pubid>
                  <pubid idtype="pmpid" link="fulltext">17511869</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-8-158</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>The challenge of constructing large phylogenetic trees</p>
            </title>
            <aug>
               <au>
                  <snm>Sanderson</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Driskell</snm>
                  <fnm>AC</fnm>
               </au>
            </aug>
            <source>Trends in Plant Science</source>
            <pubdate>2003</pubdate>
            <volume>8</volume>
            <fpage>374</fpage>
            <lpage>379</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S1360-1385(03)00165-1</pubid>
                  <pubid idtype="pmpid" link="fulltext">12927970</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Hovergen &#8211; a Database of Homologous Vertebrate Genes</p>
            </title>
            <aug>
               <au>
                  <snm>Duret</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Mouchiroud</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Gouy</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Research</source>
            <pubdate>1994</pubdate>
            <volume>22</volume>
            <fpage>2360</fpage>
            <lpage>2365</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">523695</pubid>
                  <pubid idtype="pmpid" link="fulltext">8036164</pubid>
                  <pubid idtype="doi">10.1093/nar/22.12.2360</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>TreeFam: 2008 update</p>
            </title>
            <aug>
               <au>
                  <snm>Ruan</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>ZZ</fnm>
               </au>
               <au>
                  <snm>Coghlan</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Coin</snm>
                  <fnm>LJM</fnm>
               </au>
               <au>
                  <snm>Guo</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Heriche</snm>
                  <fnm>JK</fnm>
               </au>
               <au>
                  <snm>Hu</snm>
                  <fnm>YF</fnm>
               </au>
               <au>
                  <snm>Kristiansen</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>RQ</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nucleic Acids Research</source>
            <pubdate>2008</pubdate>
            <volume>36</volume>
            <fpage>D735</fpage>
            <lpage>D740</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2238856</pubid>
                  <pubid idtype="pmpid" link="fulltext">18056084</pubid>
                  <pubid idtype="doi">10.1093/nar/gkm1005</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>OrthoMaM: a database of orthologous genomic markers for placental mammal phylogenetics</p>
            </title>
            <aug>
               <au>
                  <snm>Ranwez</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Delsuc</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Ranwez</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Belkhir</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Tilak</snm>
                  <fnm>MK</fnm>
               </au>
               <au>
                  <snm>Douzery</snm>
                  <fnm>EJ</fnm>
               </au>
            </aug>
            <source>BMC Evol Biol</source>
            <pubdate>2007</pubdate>
            <volume>7</volume>
            <fpage>241</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2249597</pubid>
                  <pubid idtype="pmpid" link="fulltext">18053139</pubid>
                  <pubid idtype="doi">10.1186/1471-2148-7-241</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Ensembl 2009</p>
            </title>
            <aug>
               <au>
                  <snm>Hubbard</snm>
                  <fnm>TJP</fnm>
               </au>
               <au>
                  <snm>Aken</snm>
                  <fnm>BL</fnm>
               </au>
               <au>
                  <snm>Ayling</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ballester</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Beal</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Bragin</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Brent</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Clapham</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Clarke</snm>
                  <fnm>L</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nucleic Acids Research</source>
            <pubdate>2009</pubdate>
            <volume>37</volume>
            <fpage>D690</fpage>
            <lpage>D697</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/nar/gkn828</pubid>
                  <pubid idtype="pmpid" link="fulltext">19033362</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Database resources of the National Center for Biotechnology Information</p>
            </title>
            <aug>
               <au>
                  <snm>Wheeler</snm>
                  <fnm>DL</fnm>
               </au>
               <au>
                  <snm>Chappey</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Lash</snm>
                  <fnm>AE</fnm>
               </au>
               <au>
                  <snm>Leipe</snm>
                  <fnm>DD</fnm>
               </au>
               <au>
                  <snm>Madden</snm>
                  <fnm>TL</fnm>
               </au>
               <au>
                  <snm>Schuler</snm>
                  <fnm>GD</fnm>
               </au>
               <au>
                  <snm>Tatusova</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Rapp</snm>
                  <fnm>BA</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2000</pubdate>
            <volume>28</volume>
            <fpage>10</fpage>
            <lpage>14</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">102437</pubid>
                  <pubid idtype="pmpid" link="fulltext">10592169</pubid>
                  <pubid idtype="doi">10.1093/nar/28.1.10</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Catalogue of Life: 2008 Annual Checklist</p>
            </title>
            <url>http://www.catalogueoflife.org</url>
         </bibl>
         <bibl id="B11">
            <title>
               <p>ITIS: Integrated Taxonomic Information System</p>
            </title>
            <url>http://www.itis.gov</url>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Django: a High-level Python Web framework</p>
            </title>
            <url>http://www.djangoproject.com</url>
         </bibl>
         <bibl id="B13">
            <title>
               <p>MySQL: the World's Most Popular Open Source Database</p>
            </title>
            <url>http://www.mysql.com</url>
         </bibl>
         <bibl id="B14">
            <title>
               <p>CherryPy: a Pythonic, Object-oriented HTTP Framework</p>
            </title>
            <url>http://www.cherrypy.org</url>
         </bibl>
         <bibl id="B15">
            <title>
               <p>NetworkX: a Python Package for the Creation, Manipulation, and Study of the Structure, Dynamics, and Functions of Complex Networks</p>
            </title>
            <url>http://networkx.lanl.gov</url>
         </bibl>
         <bibl id="B16">
            <title>
               <p>PHY.FI: fast and easy online creation and manipulation of phylogeny color figures</p>
            </title>
            <aug>
               <au>
                  <snm>Fredslund</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2006</pubdate>
            <volume>7</volume>
            <fpage>315</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1513607</pubid>
                  <pubid idtype="pmpid" link="fulltext">16792795</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-7-315</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>PhyloWidget: web-based visualizations for the tree of life</p>
            </title>
            <aug>
               <au>
                  <snm>Jordan</snm>
                  <fnm>GE</fnm>
               </au>
               <au>
                  <snm>Piel</snm>
                  <fnm>WH</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2008</pubdate>
            <volume>24</volume>
            <fpage>1641</fpage>
            <lpage>1642</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btn235</pubid>
                  <pubid idtype="pmpid" link="fulltext">18487241</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Wikispecies: a Free Directory of Life</p>
            </title>
            <url>http://species.wikimedia.org</url>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Wikipedia: the Free Encyclopedia</p>
            </title>
            <url>http://www.wikipedia.org/</url>
         </bibl>
         <bibl id="B20">
            <title>
               <p>CeCILL: Licence Fran&#231;aise de Logiciel Libre</p>
            </title>
            <url>http://www.cecill.info/index.en.html</url>
         </bibl>
         <bibl id="B21">
            <title>
               <p>TreeView: an application to display phylogenetic trees on personal computers</p>
            </title>
            <aug>
               <au>
                  <snm>Page</snm>
                  <fnm>RD</fnm>
               </au>
            </aug>
            <source>Comput Appl Biosci</source>
            <pubdate>1996</pubdate>
            <volume>12</volume>
            <fpage>357</fpage>
            <lpage>358</lpage>
            <xrefbib>
               <pubid idtype="pmpid">8902363</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>TreeDyn: towards dynamic graphics and annotations for analyses of trees</p>
            </title>
            <aug>
               <au>
                  <snm>Chevenet</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Brun</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Banuls</snm>
                  <fnm>AL</fnm>
               </au>
               <au>
                  <snm>Jacq</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Christen</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2006</pubdate>
            <volume>7</volume>
            <fpage>439</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1615880</pubid>
                  <pubid idtype="pmpid" link="fulltext">17032440</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-7-439</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Dendroscope: An interactive viewer for large phylogenetic trees</p>
            </title>
            <aug>
               <au>
                  <snm>Huson</snm>
                  <fnm>DH</fnm>
               </au>
               <au>
                  <snm>Richter</snm>
                  <fnm>DC</fnm>
               </au>
               <au>
                  <snm>Rausch</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Dezulian</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Franz</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Rupp</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2007</pubdate>
            <volume>8</volume>
            <fpage>460</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2216043</pubid>
                  <pubid idtype="pmpid" link="fulltext">18034891</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-8-460</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Interactive Tree Of Life (iTOL): an online tool for phylogenetic tree display and annotation</p>
            </title>
            <aug>
               <au>
                  <snm>Letunic</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Bork</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2007</pubdate>
            <volume>23</volume>
            <fpage>127</fpage>
            <lpage>128</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btl529</pubid>
                  <pubid idtype="pmpid" link="fulltext">17050570</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>PhyloFinder: an intelligent search engine for phylogenetic tree databases</p>
            </title>
            <aug>
               <au>
                  <snm>Chen</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Burleigh</snm>
                  <fnm>JG</fnm>
               </au>
               <au>
                  <snm>Bansal</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Fernandez-Baca</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>BMC Evol Biol</source>
            <pubdate>2008</pubdate>
            <volume>8</volume>
            <fpage>90</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2362120</pubid>
                  <pubid idtype="pmpid" link="fulltext">18366717</pubid>
                  <pubid idtype="doi">10.1186/1471-2148-8-90</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Automated group assignment in large phylogenetic trees using GRUNT: GRouping, Ungrouping, Naming Tool</p>
            </title>
            <aug>
               <au>
                  <snm>Dalevi</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>DeSantis</snm>
                  <fnm>TZ</fnm>
               </au>
               <au>
                  <snm>Fredslund</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Andersen</snm>
                  <fnm>GL</fnm>
               </au>
               <au>
                  <snm>Markowitz</snm>
                  <fnm>VM</fnm>
               </au>
               <au>
                  <snm>Hugenholtz</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2007</pubdate>
            <volume>8</volume>
            <fpage>402</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2228325</pubid>
                  <pubid idtype="pmpid" link="fulltext">17949484</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-8-402</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Interactive visualization software for exploring phylogenetic trees and clades</p>
            </title>
            <aug>
               <au>
                  <snm>Derthick</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2008</pubdate>
            <volume>24</volume>
            <fpage>868</fpage>
            <lpage>869</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btn038</pubid>
                  <pubid idtype="pmpid" link="fulltext">18263642</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Clann: investigating phylogenetic information through supertree analyses</p>
            </title>
            <aug>
               <au>
                  <snm>Creevey</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>McInerney</snm>
                  <fnm>JO</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>21</volume>
            <fpage>390</fpage>
            <lpage>392</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bti020</pubid>
                  <pubid idtype="pmpid" link="fulltext">15374874</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>The PhyLoTA Browser: processing GenBank for molecular phylogenetics research</p>
            </title>
            <aug>
               <au>
                  <snm>Sanderson</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Boss</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Cranston</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Wehe</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Syst Biol</source>
            <pubdate>2008</pubdate>
            <volume>57</volume>
            <fpage>335</fpage>
            <lpage>346</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1080/10635150802158688</pubid>
                  <pubid idtype="pmpid" link="fulltext">18570030</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>iSpecies: a Species Search Engine</p>
            </title>
            <url>http://ispecies.org</url>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Broad phylogenomic sampling improves resolution of the animal tree of life</p>
            </title>
            <aug>
               <au>
                  <snm>Dunn</snm>
                  <fnm>CW</fnm>
               </au>
               <au>
                  <snm>Hejnol</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Matus</snm>
                  <fnm>DQ</fnm>
               </au>
               <au>
                  <snm>Pang</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Browne</snm>
                  <fnm>WE</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Seaver</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Rouse</snm>
                  <fnm>GW</fnm>
               </au>
               <au>
                  <snm>Obst</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Edgecombe</snm>
                  <fnm>GD</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>2008</pubdate>
            <volume>452</volume>
            <fpage>745</fpage>
            <lpage>749</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature06614</pubid>
                  <pubid idtype="pmpid" link="fulltext">18322464</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Tunicates and not cephalochordates are the closest living relatives of vertebrates</p>
            </title>
            <aug>
               <au>
                  <snm>Delsuc</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Brinkmann</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Chourrout</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Philippe</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2006</pubdate>
            <volume>439</volume>
            <fpage>965</fpage>
            <lpage>968</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature04336</pubid>
                  <pubid idtype="pmpid" link="fulltext">16495997</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>SCaFoS: a tool for selection, concatenation and fusion of sequences for phylogenomics</p>
            </title>
            <aug>
               <au>
                  <snm>Roure</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Rodriguez-Ezpeleta</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Philippe</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>BMC Evol Biol</source>
            <pubdate>2007</pubdate>
            <volume>7</volume>
            <issue>Suppl 1</issue>
            <fpage>S2</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1796611</pubid>
                  <pubid idtype="pmpid" link="fulltext">17288575</pubid>
                  <pubid idtype="doi">10.1186/1471-2148-7-S1-S2</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>Molecular phylogeny of living xenarthrans and the impact of character and taxon sampling on the placental tree rooting</p>
            </title>
            <aug>
               <au>
                  <snm>Delsuc</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Scally</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Madsen</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Stanhope</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>de Jong</snm>
                  <fnm>WW</fnm>
               </au>
               <au>
                  <snm>Catzeflis</snm>
                  <fnm>FM</fnm>
               </au>
               <au>
                  <snm>Springer</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Douzery</snm>
                  <fnm>EJP</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2002</pubdate>
            <volume>19</volume>
            <fpage>1656</fpage>
            <lpage>1671</lpage>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Estimating diversification rates from phylogenetic information</p>
            </title>
            <aug>
               <au>
                  <snm>Ricklefs</snm>
                  <fnm>RE</fnm>
               </au>
            </aug>
            <source>Trends Ecol Evol</source>
            <pubdate>2007</pubdate>
            <volume>22</volume>
            <fpage>601</fpage>
            <lpage>610</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.tree.2007.06.013</pubid>
                  <pubid idtype="pmpid" link="fulltext">17963995</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>The NCBI Trace Archives</p>
            </title>
            <url>http://www.ncbi.nlm.nih.gov/Traces/home/</url>
         </bibl>
         <bibl id="B37">
            <title>
               <p>The NCBI Taxonomy Browser</p>
            </title>
            <url>http://www.ncbi.nlm.nih.gov/Taxonomy/</url>
         </bibl>
         <bibl id="B38">
            <title>
               <p>Animal Diversity Web</p>
            </title>
            <url>http://animaldiversity.ummz.umich.edu/site/index.html</url>
         </bibl>
      </refgrp>
   </bm>
</art>

