<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2148-9-37</ui>
   <ji>1471-2148</ji>
   <fm>
      <dochead>Methodology article</dochead>
      <bibl>
         <title>
            <p>Mega-phylogeny approach for comparative biology: an alternative to supertree and supermatrix approaches</p>
         </title>
         <aug>
            <au id="A1" ca="yes">
               <snm>Smith</snm>
               <mi>A</mi>
               <fnm>Stephen</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
               <email>sasmith@nescent.org</email>
            </au>
            <au id="A2">
               <snm>Beaulieu</snm>
               <mi>M</mi>
               <fnm>Jeremy</fnm>
               <insr iid="I2"/>
               <email>jeremy.beaulieu@yale.edu</email>
            </au>
            <au id="A3">
               <snm>Donoghue</snm>
               <mi>J</mi>
               <fnm>Michael</fnm>
               <insr iid="I2"/>
               <email>michael.donoghue@yale.edu</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>National Evolutionary Synthesis Center, 2024 W Main St A200, Durham, NC 27705, USA</p>
            </ins>
            <ins id="I2">
               <p>Department of Ecology and Evolutionary Biology, Yale University, PO Box 208105, New Haven, CT 06520, USA</p>
            </ins>
         </insg>
         <source>BMC Evolutionary Biology</source>
         <issn>1471-2148</issn>
         <pubdate>2009</pubdate>
         <volume>9</volume>
         <issue>1</issue>
         <fpage>37</fpage>
         <url>http://www.biomedcentral.com/1471-2148/9/37</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">19210768</pubid>
               <pubid idtype="doi">10.1186/1471-2148-9-37</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>22</day>
               <month>10</month>
               <year>2008</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>11</day>
               <month>2</month>
               <year>2009</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>11</day>
               <month>2</month>
               <year>2009</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2009</year>
         <collab>Smith et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Biology has increasingly recognized the necessity to build and utilize larger phylogenies to address broad evolutionary questions. Large phylogenies have facilitated the discovery of differential rates of molecular evolution between trees and herbs. They have helped us understand the diversification patterns of mammals as well as the patterns of seed evolution. In addition to these broad evolutionary questions, there is increasing awareness of the importance of large phylogenies for addressing conservation issues such as biodiversity hotspots and response to global change. Two major classes of methods have been employed to accomplish the large tree-building task: supertrees and supermatrices. Although these methods are continually being developed, they have yet to be made fully accessible to comparative biologists, making extremely large trees rare.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>Here we describe and demonstrate a modified supermatrix method termed mega-phylogeny that uses databased sequences as well as taxonomic hierarchies to make extremely large trees with denser matrices than supermatrices. The two major challenges facing large-scale supermatrix phylogenetics are assembling large data matrices from databases and reconstructing trees from those datasets. The mega-phylogeny approach addresses the former, as the latter is accomplished by employing recently developed methods that have greatly reduced the run time of large phylogeny construction. We present an algorithm that requires relatively little human intervention. The implemented algorithm is demonstrated with a dataset and phylogeny for Asterales (within Campanulidae) containing 4954 species and 12,033 sites and an <it>rbcL </it>matrix for green plants (Viridiplantae) with 13,533 species and 1,401 sites.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>By examining much larger phylogenies, patterns emerge that were otherwise unseen. The phylogeny of Viridiplantae successfully reconstructs major relationships of vascular plants that previously required many more genes. These demonstrations underscore the importance of using large phylogenies to uncover important evolutionary patterns and we present a fast and simple method for constructing these phylogenies.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>All species on Earth &#8211; current estimates exceed 1.8 million &#8211; are related through common ancestors in the evolutionary Tree of Life. The construction of this phylogeny is a major endeavor for biology and largely now depends on the unprecedented growth of molecular sequence data available in public databases. Efforts focused on single clades, whole genome sequencing, genomic library construction (ESTs, BACs), and large collaborative efforts, such as NSF's Assembling the Tree of Life project, are contributing to the fast-paced growth of public databases, with more than 92 million sequences stored in the current release of GenBank (release 167). Current efforts to infer really large phylogenetic trees center on data combination using so-called supertree [e.g., <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>] and supermatrix methods [e.g., <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp>] as opposed to using a single gene (or multiple genes) sampled very widely across taxa [e.g., <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr></abbrgrp>]. For example, recent large-scale database-enabled phylogenetic analyses employing these approaches have shed light on the radiation and early evolution of mammals <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>, and the phylogenetic diversity of Bacteria <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. Recent advances in phylogenetic tree-building methods have provided the necessary first steps in approaching the problem of producing large and comprehensive phylogenetic trees <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr></abbrgrp>. However, assembling large datasets from databases remains a critical problem upstream of the tree-building process.</p>
         <p>Supertree methods compile many source trees with partially overlapping taxa into a single comprehensive tree <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp>. Generally, each source topology is converted into a data matrix and combined with other topological matrices. Many different algorithms exist for creating the final supertree, including MRP (matrix representation with parsimony; <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>), MRF (matrix representation with flipping; <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>), MinCut <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>, and modified MinCut <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. Although straightforward, supertree methods are not without their limitations, including problems related to data independence (the same data can contribute to more than one source tree), "signal enhancement" (novel relationships in supertrees contradicting one or several source trees; <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>), and the assessment of uncertainty and confidence in relationships <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. In addition, supertrees are strictly topological, thus requiring sequence data to obtain useful branch lengths <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. Most importantly, however, supertrees do not directly rely on the primary data for tree inference, making novel topologies suspect. Perhaps due to these limitations, and despite active development of methodologies [e.g., <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>], few large supertrees for diverse groups have been successfully constructed (but see <abbrgrp><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr></abbrgrp>).</p>
         <p>In supermatrix methods, on the other hand, trees are inferred directly from the sequence data through the construction of a large multiple sequence alignment for simultaneous analysis of the final data matrix <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. Because few genes are sampled very completely across many taxa, supermatrix methods often sacrifice completeness in the interest of size. In fact, one of the largest supermatrices, with >2000 tips, had 95% missing data <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. Other supermatrix analyses have focused on the number of gene regions and not on the number of species <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr></abbrgrp>. The construction of a large supermatrix involves a number of computationally challenging steps including, but not limited to, database operations, BLAST comparisons, sequence clustering, multiple sequence alignment, and combining data sets. An exhaustive discussion of these steps is presented elsewhere <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>, but each will be briefly touched upon as it relates to the approach presented here. Typically, sequences have been deposited in a database and all-by-all sequence comparisons with BLAST are conducted to assemble sequence clusters based on similarity. Methods for this step include agglomerative procedures, like single linkage clustering (e.g. blastclust) and stochastic methods (e.g. Tribe-MCL; <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>). Clustered sequences are then submitted to multiple sequence alignment (MSA). There are a host of other procedures that can be conducted once multiple sequence alignments are produced, especially related to identifying sequence orthology. Multi-locus datasets are created from individual alignments that do not have "too many" missing entries using a bipartite graph of taxa and loci and combining with bicliques or quasi-bicliques <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr></abbrgrp>.</p>
         <p>Each step described above is computationally difficult and has rarely been discussed in the context of what might be optimal for the final goal of tree construction. Despite the computational difficulties and potential shortcomings of specific steps in their construction, supermatrix methods allow for simultaneous data analysis. Also, unlike supertrees, they do not suffer from data independence or "signal enhancement" problems, and, at least in principle, confidence can be assessed using standard bootstrapping approaches. However, problems related to missing data and assessing the quality of the trees produced persist. Tools addressing certain steps of supermatrix construction are beginning to become available (e.g., Phylota; <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>) and some notable large trees have been successfully produced <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. However, tools for constructing supermatrices are not readily available for comparative biologists, and rather few large matrices for specific clades have been successfully analyzed. Nevertheless, supermatrix methods have made enormous strides forward and recent discussions have begun to center on methods that combine elements of both supertree and supermatrix approaches [e.g. <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr></abbrgrp>].</p>
         <p>The method introduced here, which we refer to as a "mega-phylogeny", is most similar to supermatrix methods, but differs from previous methods used to create large matrices in a number of ways. The mega-phylogeny method relies on the user identifying the gene regions of interest by presenting actual examples of the gene region and the breadth of molecular diversity of that gene within the clade of interest. Also, the mega-phylogeny method employs profile alignments to combine alignments of orthologous gene regions that would either be poorly aligned if done across a broad taxonomic group or would be broken up by clustering analyses. The mega-phylogeny approach can quickly create enormous phylogenetic matrices as more data from the same gene may be used, and the problems associated with sequence saturation are specifically attenuated. The first demonstration of this method <abbrgrp><abbr bid="B27">27</abbr></abbrgrp> produced phylogenies for plant clades from 366 species and 11,374 sites (Dipsacales) to 4657 species and 22,391 sites (Commelinidae).</p>
         <p>Here, we describe this new approach and its current implementation. We also present example phylogenies for two plant clades created using our method: an Asterales phylogeny containing 4954 species and five gene regions and an <it>rbcL </it>phylogeny of green plants (Viridiplantae) comprising 13,533 species.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>Implemented Pipeline</p>
            </st>
            <p>The basic steps for a mega-phylogeny include (1) designating the clade of interest, (2) identifying the gene region(s) of interest, (3) recording the extent of molecular diversity of the gene region in the clade of interest, (4) recording the threshold of coverage and identity to be used for orthology tests, (5) narrowing the possible sequences with a very broad term search [optional], (6) removing all potential sequences that are not members of the clade of interest, (7) testing orthology by BLASTing each potential sequence to each gene region identified for the breadth and removing those sequences that differ by more than the established threshold, (8) identifying sequences that should be reverse complemented, (9) removing sequences for duplicate taxon names, keeping the sequence with the best coverage and identity, and (10) testing for saturation. If the sequences are saturated, they are subdivided using the next available subclade and the saturation test (step 10) is repeated on each subdivision. Finally, once all of the sequences are in an alignment or exist as singletons (i.e. are not found to be contained in any subdivisions), each alignment is profiled to a master alignment. This procedure can be repeated for each gene region of interest. If multiple gene regions are used, these are then concatenated into a large matrix and the phylogeny inferred.</p>
            <p>We implemented this pipeline in Python (vers. 2.5) with the BioPython (vers. 1.48) module and using the BioSQL (vers. 1.0.1) database schema. Each mega-phylogeny matrix assembly analysis presented here was run on a Linux laptop with 1 GB RAM and a 2.4 GHz dual-core processor. The phylogenetic analyses were conducted on an eight-way SMP Linux computer with 2.4 GHz processors and 32 GB of RAM using RAxML (vers. 7.0.4; <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>). The steps that are novel for matrix assembly are described briefly below.</p>
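            <p>Step 9 of the pipeline (one sequence per taxon name) can be illustrated with a minimal sketch. The record layout and the function name <it>best_per_taxon</it> are hypothetical, and the tie-breaking rule (coverage first, then identity) is an assumption, since the text does not say how the two criteria are combined.</p>
```python
# Illustrative sketch only, not the authors' implementation.
# Each hypothetical hit record is (taxon, accession, coverage, identity).

def best_per_taxon(hits):
    """Keep one sequence per taxon name.

    Assumption: "best" means highest coverage, with identity used
    to break ties. The source text does not specify the rule.
    """
    best = {}
    for taxon, accession, coverage, identity in hits:
        current = best.get(taxon)
        if current is None or (coverage, identity) > (current[1], current[2]):
            best[taxon] = (accession, coverage, identity)
    return best


if __name__ == "__main__":
    example = [
        ("Aster alpinus", "X1", 0.80, 0.90),
        ("Aster alpinus", "X2", 0.90, 0.85),
        ("Bellis perennis", "Y1", 0.70, 0.95),
    ]
    for taxon, record in sorted(best_per_taxon(example).items()):
        print(taxon, record)
```
            <p>A weighted score combining coverage and identity would be an equally plausible reading of the step; only the comparison inside the loop would change.</p>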
         </sec>
         <sec>
            <st>
               <p>Orthology</p>
            </st>
            <p>Determining whether sequences are orthologous is a challenge for large tree construction. Supermatrix methods have attempted to overcome this problem by identifying orthologous sets of sequences using clustering techniques <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B4">4</abbr></abbrgrp>, but these can be time consuming and are typically not developed with the goal of large phylogeny assembly [e.g., <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>]. Here, we determine orthologous sequences using designated sequences representing the breadth of variation observed in the gene region of interest across the clade of interest. We BLAST all of the potential sequences from the database against these designated sequences and retain those that match within a set threshold (i.e. according to both coverage and identity). At this stage, reverse complements are corrected by determining which direction best matches the designated regions of interest. Instead of N &#215; N comparisons among all potentially useful sequences, only N &#215; n comparisons are necessary, where n is the number of example sequences used to represent the region. This dramatically shortened the run time of the algorithm and generally produced denser matrices.</p>
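            <p>The N &#215; n screening and orientation step can be sketched as follows. This is an illustration under stated assumptions, not the published code: the <it>score</it> argument stands in for a BLAST comparison and is expected to return a (coverage, identity) pair, <it>toy_score</it> is a deliberately naive ungapped substitute used only so the example runs, and the default thresholds are placeholders rather than values from the text.</p>
```python
# Illustrative sketch only. In the pipeline described above, BLAST
# supplies coverage and identity; here a toy scorer plays that role.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def toy_score(query, reference):
    """Naive ungapped comparison returning (coverage, identity)."""
    overlap = min(len(query), len(reference))
    if overlap == 0:
        return 0.0, 0.0
    matches = sum(1 for a, b in zip(query, reference) if a == b)
    return overlap / len(reference), matches / overlap


def screen_candidates(candidates, references, score,
                      min_coverage=0.5, min_identity=0.5):
    """Compare each of N candidates with each of n reference sequences.

    A candidate is kept, in its best-matching orientation, if any
    reference matches it at or above both (placeholder) thresholds.
    """
    kept = {}
    for name, seq in candidates.items():
        best = None
        for ref in references:
            for oriented in (seq, reverse_complement(seq)):
                coverage, identity = score(oriented, ref)
                if coverage >= min_coverage and identity >= min_identity:
                    if best is None or (coverage, identity) > best[0]:
                        best = ((coverage, identity), oriented)
        if best is not None:
            kept[name] = best[1]
    return kept


if __name__ == "__main__":
    refs = ["ACGTACGTAC"]
    cands = {"fwd": "ACGTACGTAC", "rev": "GTACGTACGT", "junk": "TTTTTTTTTT"}
    print(screen_candidates(cands, refs, toy_score, 0.8, 0.8))
```
            <p>Because each candidate is compared only with the n designated examples, the number of comparisons grows linearly with the number of database sequences.</p>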
         </sec>
         <sec>
            <st>
               <p>Profile alignments</p>
            </st>
            <p>One major problem for large matrices using broadly sampled sequences or smaller matrices with quickly evolving sequences is that multiple sequence alignments become more challenging as sequences become more divergent <abbrgrp><abbr bid="B28">28</abbr><abbr bid="B29">29</abbr></abbrgrp>. Almost all multiple sequence alignment algorithms build a phylogeny during the estimation procedures <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. The phylogenies built for multiple sequence alignment are often based on model-corrected or raw pair-wise distances. These methods are susceptible to problems with saturation (i.e. multiple mutations at the same site for the same organism) and are therefore much less accurate for large and broadly sampled alignments that are likely to contain very distantly related sequences. The quality of multiple sequence alignments can have a dramatic impact on the accuracy of the phylogenies produced [e.g., <abbrgrp><abbr bid="B31">31</abbr><abbr bid="B32">32</abbr><abbr bid="B33">33</abbr></abbrgrp>]. As a result, other supermatrix methods sidestep multiple sequence alignment by employing clustering techniques to determine "alignable regions." Such clustering techniques have allowed for the assembly of very large matrices of sufficiently similar sequences <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B4">4</abbr><abbr bid="B34">34</abbr></abbrgrp>. This approach can be dramatically affected by the parameters used during clustering, sometimes resulting in multiple informative clusters for slower gene regions (e.g. two large <it>rbcL </it>clusters for Ericales in Phylota).</p>
            <p>We combine the analysis of sequence saturation with recent advances in multiple profile-to-profile alignment methodology. A profile alignment is an algorithmic approach to identifying structural elements that are highly conserved between different alignments <abbrgrp><abbr bid="B34">34</abbr><abbr bid="B35">35</abbr><abbr bid="B36">36</abbr></abbrgrp>. To accomplish this, separate alignments are aligned together while preserving the columns in the individual alignments. Newer profile alignment programs allow for more flexibility in profile alignment procedures (e.g. MAFFT; <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>). In our case, we separate sequences into subgroups of aligned sequences based on the degree of sequence saturation. For example, if the algorithm determines that the most inclusive group of sequences is saturated, then the group is broken up into less inclusive groups using the next level in the taxonomic hierarchy. In a Linnaean taxonomic system, if an "order" is found to be saturated, it would be broken into "families". Each smaller subset of sequences is then re-aligned and the saturation reassessed. This process continues iteratively to less inclusive groups until sequences no longer appear saturated and these alignments are then stored. We note, however, that the taxonomic groups used in this procedure need not correspond to ranks in the Linnaean hierarchy, but should simply be hierarchically nested (as in the NCBI taxonomy). Grouping using a rank-free classification (PhyloCode; <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>) could be used and will be possible once a database of phylogenetic names is implemented and usable. After every sequence has been either placed in an alignment or placed as a "singleton," the individual alignments are then profiled to a larger alignment. The order of the profiling can be random, optimized to find the best order, or aided by a hierarchical "guide" tree (e.g. first aligning more closely related matrices). Currently, we employ highly conservative guide trees based on published studies to carry out profile alignments.</p>
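            <p>The iterative subdivision described above can be sketched as a recursion over a nested taxonomy. The dictionary layout, the function names, and the treatment of sequences attached directly to a saturated taxon (each becomes a singleton) are assumptions made for illustration; <it>is_saturated</it> is any user-supplied test, such as a MAD threshold.</p>
```python
# Illustrative sketch only. A taxon is a hypothetical dict with keys
# "name", optional "sequences" and optional "children".

def collect_sequences(node):
    """Gather all sequences at or below a taxon."""
    seqs = list(node.get("sequences", []))
    for child in node.get("children", []):
        seqs.extend(collect_sequences(child))
    return seqs


def split_until_unsaturated(node, is_saturated):
    """Return (taxon name, sequences) blocks to be aligned separately.

    A taxon is kept whole if its sequences are not saturated or if it
    has no less inclusive groups left; otherwise each child taxon is
    examined in turn.
    """
    seqs = collect_sequences(node)
    if not seqs:
        return []
    children = node.get("children", [])
    if not children or not is_saturated(seqs):
        return [(node["name"], seqs)]
    blocks = []
    # Assumption: sequences with no subgroup become singletons.
    for seq in node.get("sequences", []):
        blocks.append((node["name"], [seq]))
    for child in children:
        blocks.extend(split_until_unsaturated(child, is_saturated))
    return blocks


if __name__ == "__main__":
    order = {
        "name": "Order",
        "sequences": ["o1"],
        "children": [
            {"name": "FamA", "sequences": ["a1", "a2"]},
            {"name": "FamB", "sequences": ["b1"]},
        ],
    }
    # Toy saturation test: more than two sequences counts as saturated.
    print(split_until_unsaturated(order, lambda seqs: len(seqs) > 2))
```
            <p>Each returned block would then be aligned on its own before the profile-to-profile step.</p>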
         </sec>
         <sec>
            <st>
               <p>Assessing saturation</p>
            </st>
            <p>We introduce a simple method based on dispersion statistics to rapidly detect saturation across a set of sequence data. Dispersion (an indicator of spread) is assessed on the one-dimensional Euclidean distance between the raw pair-wise sequence distances and those corrected according to a Jukes-Cantor model of molecular substitution. A one-dimensional Euclidean distance is the absolute difference between two points. Our measure of dispersion is based on the median and is commonly referred to as the median absolute deviation (MAD) and given by</p>
            <p>
               <display-formula>MAD = 1.4826 &#215; Med (| x<sub>i </sub>- Med (x)|),</display-formula>
            </p>
            <p>where x<sub>i</sub> is the Euclidean distance for the <it>i</it>th pair of sequences, Med (x) is the median of all pair-wise Euclidean distances, and the outer median is taken over the absolute residuals about Med (x). The constant 1.4826 makes MAD a consistent estimator of the standard deviation for normally distributed data <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>. Thus, in our use, the larger the MAD, the larger the overall spread in the Euclidean distances &#8211; that is, above a certain value the assumed nucleotide substitution model is no longer adequately accounting for the rate variation exhibited by pair-wise distances among species.</p>
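            <p>A minimal sketch of the MAD calculation, written from the definitions above, is given here. The function names are hypothetical, columns containing gaps or ambiguous characters are ignored, and pairs whose raw distance reaches 0.75 (where the Jukes-Cantor correction is undefined) are skipped; both of those choices are assumptions, as the text does not describe how such cases were handled. The default threshold of 0.01 is the approximate value reported from the simulations in this section.</p>
```python
# Illustrative sketch only, not the authors' implementation.
import math
from itertools import combinations
from statistics import median


def p_distance(a, b):
    """Raw proportion of differing sites between two aligned sequences."""
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        return None
    return sum(1 for x, y in pairs if x != y) / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor corrected distance; None when p >= 0.75."""
    if p >= 0.75:
        return None
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def saturation_mad(alignment):
    """MAD of |corrected - raw| distances over all sequence pairs."""
    diffs = []
    for a, b in combinations(alignment, 2):
        p = p_distance(a, b)
        if p is None:
            continue
        d = jukes_cantor(p)
        if d is None:
            continue  # assumption: undefined corrections are skipped
        diffs.append(abs(d - p))
    if not diffs:
        return None
    centre = median(diffs)
    return 1.4826 * median(abs(x - centre) for x in diffs)


def is_saturated(alignment, threshold=0.01):
    """Flag an alignment whose MAD exceeds the chosen threshold."""
    mad = saturation_mad(alignment)
    return mad is not None and mad > threshold


if __name__ == "__main__":
    aligned = ["ACGTACGTACGT", "ACGTACGAACGT", "ACGAACGTACTT", "TCGTACGTACGA"]
    print(saturation_mad(aligned), is_saturated(aligned))
```
            <p>Using the median twice is what makes the statistic insensitive to a few extreme pair-wise comparisons.</p>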
            <p>We performed a simple simulation study to explore the behavior of MAD. First, we wanted to determine a threshold for subdividing sequences into smaller alignments. In addition, we wanted to compare MAD with alternative measures of dispersion based on the sample mean (i.e. mean square error, MSE; root mean square error, RMSE). Sequence data were simulated across randomly constructed 20- and 100-tip phylogenies. Different rates of molecular evolution were simulated by scaling the total tree length in increments of 0.10, starting from 0.10 and stopping at 2.0. All molecular simulations were carried out using Seq-Gen (vers. 1.3.2; <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>).</p>
            <p>The results from these simulations clearly highlight the utility of MAD. First, unlike MSE or RMSE, MAD does not require an underlying Gaussian distribution, which is useful as the distribution of Euclidean distances becomes skewed as the degree of sequence divergence increases. A second advantage, and perhaps the most critical, is that MAD appears stable when sequence divergence is unrealistically high (e.g. tree length scaled by a factor of 2; Figure <figr fid="F1">1</figr>). This situation is also analogous to the presence of outliers that have well-known influences on dispersion statistics based on the mean. Because MAD does not require an explicit distribution and is the 50<sup>th </sup>percentile of the residual variation, it has the inherent property of being robust to outliers (Figure <figr fid="F1">1G, H</figr>). Finally, our simulations indicate that a MAD exceeding ~0.01 provided a conservative indication of a saturation level necessitating a profile alignment scheme.</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Simulation exploring the behavior of MAD in relation to alternative measures of dispersion</p>
               </caption>
               <text>
                  <p><b>Simulation exploring the behavior of MAD in relation to alternative measures of dispersion</b>. Each panel is a simulation of sequence data on a balanced phylogeny of 20 tips (A, C, E, and G) or 100 tips (B, D, F, and H). A and B, total tree length scaled to 0.10. C and D, total tree length scaled to 0.25. E and F, total tree length scaled to 0.50. G and H, total tree length scaled to 2.00. Saturation was assessed by descriptors of dispersion on the one-dimensional Euclidean distance between the raw pair-wise sequence distances (uncorrected distance) and those corrected according to a Jukes-Cantor model of molecular substitution (corrected distance). Our simulations demonstrated that the non-parametric median absolute deviation (MAD) had several advantages in detecting saturation over alternative measures of dispersion based on the sample mean (i.e. mean square error, MSE; root mean square error, RMSE).</p>
               </text>
               <graphic file="1471-2148-9-37-1"/>
            </fig>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <sec>
            <st>
               <p>Asterales</p>
            </st>
            <p>Nearly 10% of all angiosperms are contained within the Asterales, a clade composed mainly of 12 recognized families, with most of the diversity attributed to just two families, Asteraceae (e.g. sunflowers, thistles) and Campanulaceae (e.g. <it>Lobelia </it>and relatives). The monophyly of Asterales is well supported despite uncertainty in its position within the more inclusive Campanulidae clade <abbrgrp><abbr bid="B41">41</abbr></abbrgrp>. There are roughly 5200 species of Asterales represented in GenBank, or roughly 20% of the entire clade. However, aside from studies of carefully selected exemplar taxa representing major lineages of Asterales <abbrgrp><abbr bid="B41">41</abbr><abbr bid="B42">42</abbr><abbr bid="B43">43</abbr><abbr bid="B44">44</abbr><abbr bid="B45">45</abbr><abbr bid="B46">46</abbr></abbrgrp>, a comprehensive phylogeny has never been produced for this clade. Here we apply our mega-phylogeny approach to the Asterales to reconstruct the most complete phylogeny of the group to date.</p>
            <p>Our Asterales sequence matrix comprised <it>rbcL</it>, <it>matK</it>, <it>trnL-F</it>, <it>trnK</it>, <it>ndhF</it>, and ITS. The combined matrix of 12,033 sites consisted of 90.959% gaps or missing sequence. However, the individual gene regions were more variable in gap or missing sequence composition: 98.043% in ETS, 36.348% in ITS, 98.188% in <it>matK</it>, 90.338% in <it>ndhF</it>, 92.597% in <it>rbcL</it>, 98.002% in <it>trnK</it>, and 81.445% in <it>trnL-F</it>. Of the five gene regions sampled, ITS was the best represented taxonomically (with 4242 species) and was the only region identified by our procedure as requiring profile-to-profile alignments. The MAD score indicated that the degree of ITS saturation varied among groups, but within-group alignments were never carried out above the traditional "tribal" level. This resulted in 180 separate within-group alignment files of differing hierarchical level.</p>
            <p>As an efficient means to direct the profile-to-profile alignments, we assembled a "guide" tree by compiling and grafting together published phylogenies (<it>sensu </it><abbrgrp><abbr bid="B47">47</abbr></abbrgrp>). Briefly, we first obtained a backbone phylogeny from Winkworth et al. <abbrgrp><abbr bid="B41">41</abbr></abbrgrp> and Lundberg and Bremer <abbrgrp><abbr bid="B44">44</abbr></abbrgrp> for the major lineages of Asterales. We then grafted trees based on more focused studies into the backbone tree. We started this process with the most inclusive clade and proceeded "inwards", adding more and more detailed analyses of included clades. Our final grafted tree was pruned down to correspond to the 180 alignment files output from our saturation analysis (see Additional file <supplr sid="S1">1</supplr>). This guide tree was then traversed in a post-order fashion, performing profile-to-profile alignments starting at the "terminals" and working recursively back to the root. The phylogeny was then inferred using RAxML (vers. 7.0.4; <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>), partitioning for each gene region using the GTR+GAMMA model of rate substitution.</p>
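            <p>The post-order traversal of the guide tree can be sketched as follows. The nested-tuple tree encoding and the function name are hypothetical, and <it>merge</it> is a placeholder for whatever external profile-to-profile aligner is called; the sketch shows only the order in which alignments are combined.</p>
```python
# Illustrative sketch only. A guide tree is a terminal name (str)
# or a tuple of subtrees; merge stands in for a profile aligner.

def profile_in_postorder(node, alignments, merge):
    """Combine alignments from the terminals back toward the root."""
    if isinstance(node, str):
        return alignments[node]
    # Resolve every child first (post-order), then merge left to right.
    resolved = [profile_in_postorder(child, alignments, merge)
                for child in node]
    combined = resolved[0]
    for other in resolved[1:]:
        combined = merge(combined, other)
    return combined


if __name__ == "__main__":
    guide = (("A", "B"), "C")
    blocks = {"A": "a", "B": "b", "C": "c"}

    def show_order(left, right):
        # Toy merge that records which profiles were joined.
        return "(" + left + "+" + right + ")"

    print(profile_in_postorder(guide, blocks, show_order))
```
            <p>With this ordering, the most closely related alignment files are profiled together before being joined to more distant groups.</p>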
            <suppl id="S1">
               <title>
                  <p>Additional file 1</p>
               </title>
               <text>
                  <p><b>Figure S1.</b> The final assembled "guide" tree used to direct the profile alignments across Asterales. Each "terminal" represents the name of one of the 180 alignment files output from our saturation analysis. We assembled a "guide" tree representing the relationships among the alignment files by compiling and grafting together published phylogenies. This guide tree was then traversed in a post-order manner, performing each profile-to-profile alignment working recursively back to the root.</p>
               </text>
               <file name="1471-2148-9-37-S1.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>Our final phylogeny includes 4954 tips with the branching within and among "families" being mostly consistent with previously published results (Figure <figr fid="F2">2</figr>). One exception concerns the early branching lineages of Asterales, involving the placement of Rousseaceae+Carpodetaceae, Campanulaceae, and Pentaphragmataceae. The current consensus recognizes a basal trichotomy among these three clades (<abbrgrp><abbr bid="B48">48</abbr></abbrgrp>; but see <abbrgrp><abbr bid="B41">41</abbr><abbr bid="B44">44</abbr></abbrgrp>). Our analysis recovered Rousseaceae+Carpodetaceae as the sister group of all other Asterales, within which Campanulaceae (including Lobeliaceae) is sister to the rest (Figure <figr fid="F2">2</figr>). This result has been recovered before <abbrgrp><abbr bid="B44">44</abbr></abbrgrp>, but other studies have suggested a basal split between Rousseaceae+Carpodetaceae plus Campanulaceae <abbrgrp><abbr bid="B41">41</abbr><abbr bid="B44">44</abbr></abbrgrp> and all the rest. Our analysis shows Pentaphragmataceae as sister to a clade comprising Stylidiaceae, Alseuosmiaceae, Argophyllaceae, Phellinaceae, Menyanthaceae, Goodeniaceae, Calyceraceae, and Asteraceae (Figure <figr fid="F2">2</figr>). A recent combined analysis of chloroplast and nuclear genes found strong support for this relationship <abbrgrp><abbr bid="B41">41</abbr></abbrgrp>.</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Maximum-likelihood phylogeny for 4954 species of Asterales</p>
               </caption>
               <text>
                  <p><b>Maximum-likelihood phylogeny for 4954 species of Asterales</b>. The data matrix was constructed using the mega-phylogeny method and includes DNA sequences for five genes: <it>rbcL, matK, trnL-F, ndhF</it>, and ITS. Each of the 12 major families of Asterales is labeled. We also note the placement of the "Doronicum" clade in relation to the tribe Senecioneae; although we assumed a sister relationship <it>a priori</it>, the phylogenetic analysis overruled this assumption, indicating that the two clades may be more distantly related. Pentaphragma, Pentaphragmataceae; Alseu, Alseuosmiaceae; Argo, Argophyllaceae; Phel, Phellinaceae.</p>
               </text>
               <graphic file="1471-2148-9-37-2"/>
            </fig>
            <p>Relationships within Asteraceae coincide well with the subfamilial classification of Panero and Funk <abbrgrp><abbr bid="B46">46</abbr></abbrgrp>. However, we note that our phylogeny does not support the monophyly of Wunderlichioideae, Stifftioideae, Mutisioideae, or Gochnatioideae (<it>sensu </it><abbrgrp><abbr bid="B46">46</abbr></abbrgrp>). Instead, we found these groups to be broken into smaller successive sister clades to the rest of Asteraceae.</p>
            <p>Several relationships, within major clades, are worth noting as they highlight the utility of our method. Based on the NCBI taxonomy, the profile-alignment portion of our algorithm assumed that <it>Campanula </it>was monophyletic. However, the MAD score detected extreme sequence variation that required profile alignments among species. This variation is an indication of extreme molecular differentiation, and in the case of <it>Campanula</it>, paraphyly, which is consistent with more focused systematic studies of Campanulaceae <abbrgrp><abbr bid="B49">49</abbr><abbr bid="B50">50</abbr></abbrgrp>. In several cases we also assumed sister relationships where the primary literature suggested low nodal support. For example, we profiled the genus <it>Doronicum </it>and the tribe Senecioneae as sister clades within the more inclusive Asteroideae, though there is generally low confidence in this hypothesis <abbrgrp><abbr bid="B51">51</abbr></abbrgrp>. The resulting tree showed these two clades to be more distantly related, as <it>Doronicum </it>is placed near the early branching lineages of Asteroideae (Figure <figr fid="F2">2</figr>). Taken together, these results show that even though we assume some phylogenetic relationships at the outset in doing the profile alignments, our results need not recover the same relationship within the final phylogenetic tree.</p>
         </sec>
         <sec>
            <st>
               <p>Green plants</p>
            </st>
            <p>The green plants (Viridiplantae) contain more than 350,000 species including green algae, liverworts, mosses, ferns, and seed plants, including the flowering plants. The early branches of the entire clade and of each major group of green plants have attracted extensive molecular work <abbrgrp><abbr bid="B52">52</abbr><abbr bid="B53">53</abbr><abbr bid="B54">54</abbr><abbr bid="B55">55</abbr></abbrgrp>. Two large clades of living green plants are supported: Streptophyta and Chlorophyta <abbrgrp><abbr bid="B52">52</abbr></abbrgrp>. Charophytes (stoneworts) have been supported as the sister group to land plants based on the inclusion of six genes <abbrgrp><abbr bid="B53">53</abbr><abbr bid="B55">55</abbr></abbrgrp>. Despite the large number of molecular studies that have focused on deep relationships within green plants, few studies with very large numbers of taxa have been conducted.</p>
            <p>Here, we create a mega-phylogeny of the chloroplast ribulose-bisphosphate carboxylase (<it>rbcL</it>) gene for all green plants. This well-sampled gene has been extensively examined in smaller studies throughout plants, especially in flowering plants [beginning with <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>]. Despite the widespread use of <it>rbcL</it>, the addition of other genes has generally been necessary to confidently reconstruct many relationships. Here, our goal was to construct the largest <it>rbcL </it>phylogeny for green plants while simultaneously accommodating saturation across the alignment.</p>
            <p>Over 16,000 <it>rbcL </it>sequences were found to be orthologous to the designated sequences sampled throughout flowering plants. Our final matrix with duplicate taxa removed consisted of 13,533 tips and 1401 nucleotide sites with 4.6238% of the matrix consisting of gaps. Our saturation analysis recognized 15 separate aligned subgroupings: Chlorophyta (486 sp.), Zygnemophyceae (131 sp.), Coleochaetophyceae (21 sp.), Charophyceae (34 sp.), Marchantiophyta (462 sp.), Bryophyta (570 sp.), Anthocerotophyta (56 sp.), Lycopodiopsida (48 sp.), Isoetopsida (101 sp.), Equisetophyta (18 sp.), Marattiopsida (59 sp.), Ophioglossopsida (29 sp.), Filicopsida (1624 sp.), Psilotophyta (3 sp.), and Spermatophyta (9900 sp.). These aligned subgroups were combined using profile alignments across a guide tree based on Donoghue <abbrgrp><abbr bid="B56">56</abbr></abbrgrp> and Cantino et al. <abbrgrp><abbr bid="B38">38</abbr></abbrgrp> (see additional file <supplr sid="S2">2</supplr>). The phylogeny was constructed using RAxML (vers. 7.0.4; <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>) with the GTR+GAMMA model of nucleotide substitution.</p>
            <suppl id="S2">
               <title>
                  <p>Additional file 2</p>
               </title>
               <text>
                  <p><b>The "guide" tree used to direct the profile alignments across green plants (Viridiplantae).</b> Similar to Fig S1, each terminal in the tree represents a separate alignment file output from our saturation analysis. Due to the large uncertainty at several nodes, members of a polytomy were profile aligned to find the best order.</p>
               </text>
               <file name="1471-2148-9-37-S2.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>A recent analysis by Qiu et al. <abbrgrp><abbr bid="B55">55</abbr></abbrgrp> compiled the most comprehensive dataset to date, using six genes and 193 species to resolve relationships of the four major land plant lineages: liverworts, hornworts, mosses, and vascular plants. They recover, with strong support, a resolution of successive sister clades starting with liverworts, then mosses, then hornworts, and vascular plants. Our maximum-likelihood tree of more than 13,000 species recovers this same relationship with the use of <it>rbcL </it>alone (Figure <figr fid="F3">3</figr>). However, within vascular plants, our trees differ in the placement of lycophytes. In the Qiu et al. <abbrgrp><abbr bid="B55">55</abbr></abbrgrp> tree, monilophytes are more closely related to seed plants than lycophytes are, and these relationships are well established based on other evidence (e.g., morphology <abbrgrp><abbr bid="B57">57</abbr></abbrgrp>, gene order and gene losses <abbrgrp><abbr bid="B58">58</abbr><abbr bid="B59">59</abbr></abbrgrp>). We find the lycophytes to be more closely related to seed plants, which is likely to be mistaken and reflects an artifact in the evolution of <it>rbcL</it>.</p>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Maximum-likelihood phylogeny for 13,533 species of green plants based on <it>rbcL </it>DNA sequences</p>
               </caption>
               <text>
                  <p><b>Maximum-likelihood phylogeny for 13,533 species of green plants based on <it>rbcL </it>DNA sequences</b>. The data matrix was constructed using the mega-phylogeny method; major clades are labeled and denoted with a star.</p>
               </text>
               <graphic file="1471-2148-9-37-3"/>
            </fig>
            <p>Our much larger phylogeny resolves some relationships by including more data in the form of more species instead of more genes. This has been documented previously but has rarely been tested on such a large scale <abbrgrp><abbr bid="B60">60</abbr><abbr bid="B61">61</abbr></abbrgrp>. With the inclusion of more taxa, other broad evolutionary patterns emerge [cf. <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>]. For example, in this case, the ferns appear to have faster rates of evolution than the other vascular plants. Further study is required to quantify this pattern and its importance, as the timing and rate of evolution of ferns have been interpreted in light of angiosperm evolution <abbrgrp><abbr bid="B62">62</abbr></abbrgrp>. With more taxa sampled, rate heterogeneity can become more apparent, raising an important issue about the possible effects of clade-specific rates on divergence-time estimates <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. Unfortunately, accurate estimates of divergence times using tens of thousands of species remain impractical.</p>
            <p>Another important result is that <it>rbcL </it>appears to be saturated across green plants. That is, despite the conservative nature of this coding region, when looking very broadly there are likely to be multiple mutations at sites throughout the gene causing either less accurate multiple sequence alignments or causing clustering methods to break up the matrix into smaller sections. Broad analyses of green plants will need to take this into account. Our analysis also demonstrates the limitations of conventional computers for analyzing large phylogenies. The matrix manipulation, tree construction, and tree rerooting required at least 8 GB of memory and were conducted on an 8 CPU machine. To build even larger matrices, more memory and faster machines will be essential.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion and conclusion</p>
         </st>
         <p>The examples presented here demonstrate the utility of our strategy for building large phylogenetic trees. The mega-phylogeny method is capable of producing large and somewhat denser phylogenetic matrices with the addition of human intervention in the selection of gene regions. These matrices can be a partitioned multi-locus dataset, as in the Asterales example, or a single-locus analysis of tens of thousands of terminals, as in the green plants. The size is limited only by computing power. Also, our examples illustrate how well sampled regions (such as ITS) that may be evolving too fast for traditional multiple sequence alignment may be included in broad phylogenetic analyses. Our mega-phylogeny approach also demonstrates that the addition of many more taxa can help resolve relationships where, traditionally, more genes would be required. A direct comparison of our mega-phylogeny method to trees constructed from supermatrix methods is difficult as the two approaches have somewhat different goals. Supermatrix methods can, as implemented by McMahon and Sanderson [<abbrgrp><abbr bid="B4">4</abbr></abbrgrp>, also see <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr></abbrgrp>], attempt to make the largest matrix from a database of sequences without specifically identifying particular regions of interest. The mega-phylogeny approach attempts to make the matrix with the largest number of taxa for any clade given the gene regions pre-identified as being of interest. This allows for the creation of somewhat denser matrices, faster. Because fewer gene regions would, in general, be used in the mega-phylogeny approach, partitioned likelihood analyses can be employed more easily with shorter run times. At the moment, there is no standard software for supermatrix methods that could be benchmarked against the mega-phylogeny approach.</p>
         <p>Our mega-phylogeny method will perhaps be most useful for comparative biology. In recent years there has been an emerging interest in compiling broad-scale datasets to identify general patterns and test specific hypotheses using a phylogenetic framework. For example, new and interesting patterns have emerged in topics ranging from molecular rates <abbrgrp><abbr bid="B27">27</abbr></abbrgrp> to ecophysiology <abbrgrp><abbr bid="B63">63</abbr><abbr bid="B64">64</abbr><abbr bid="B65">65</abbr></abbrgrp> to biodiversity <abbrgrp><abbr bid="B66">66</abbr></abbrgrp> and ecosystem processes <abbrgrp><abbr bid="B67">67</abbr></abbrgrp>. However, the level of resolution in the underlying phylogeny has limited many of these studies, with many phylogenies being constructed from multiple literature-based trees (e.g. Phylomatic; <abbrgrp><abbr bid="B68">68</abbr></abbrgrp>). Our method provides a means to construct large phylogenies from primary data, which we hope will facilitate more sophisticated and robust comparative analyses. This has been demonstrated with rate heterogeneity in plants <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. The limiting factor may soon become the ability of software for various comparative analyses to handle mega-phylogenies.</p>
         <sec>
            <st>
               <p>Modularity</p>
            </st>
            <p>The mega-phylogeny method is inherently modular, making each step easily extensible. For example, instead of using BLAST comparisons for sequence orthology tests, another test could be used. In fact, a modified clustering method, as is typically found in supermatrix construction, could be utilized. Additionally, instead of the MAD measurement for saturation, other measures could be devised. Concomitantly, any number of different taxonomic databases can be used when saturation is detected. We have relied on the NCBI hierarchical taxonomy, but increased precision might eventually be obtained by using a system containing many additional levels. The modularity of the mega-phylogeny approach encourages its longevity when better methods and approaches become available to address specific procedures underlying mega-phylogeny matrix creation.</p>
            <p>Modularity is especially important with respect to the guide trees involved in profile alignment, where the results from different guide trees (or no guide tree) can be compared. For example, there may be a published study of broadly sampled taxa included in the clade of interest for the mega-phylogeny approach. A profile alignment using this tree could easily be compared to one that consists only of basal polytomies, which will be profiled randomly. The use of guide trees for this step highlights the need for available definitive bifurcating trees for profile alignments, especially broadly sampled trees. From this perspective, the compilation of large-scale trees from published phylogenies (e.g., available on TreeBASE, <url>http://www.treebase.org</url>) becomes a highly relevant endeavor, not only from the standpoint of the initial guide tree but also as a basis for the comparison of results. Important differences could then highlight areas in special need of attention. For example, further attention is needed to the signal in <it>rbcL </it>that places lycophytes with seed plants.</p>
         </sec>
         <sec>
            <st>
               <p>Potential pitfalls</p>
            </st>
            <p>A key element of our mega-phylogeny method is its reliance on prior knowledge of phylogenetic relationships when performing profile-to-profile alignments. We assume that each group being aligned is monophyletic, which is potentially a problem once saturation is detected and less inclusive multiple sequence alignments are employed. However, despite such assumptions, our saturation analysis using the MAD statistic is not irreversibly susceptible to outliers and can detect extreme variation when, for example, a group is not monophyletic, as demonstrated with <it>Campanula </it>within our Asterales matrix. In this case, the MAD score suggested that the assumption of monophyly for <it>Campanula </it>was violated, and it emerged as paraphyletic in the final tree. Even though the assumption of monophyly is a potential problem, it is not always detrimental. Further work is needed to explore the sensitivity of the results to such assumptions. In the meantime, the approach highlights the need for taxonomic databases to most accurately reflect current best knowledge of phylogenetic relationships.</p>
            <p>Various problems identified in supermatrix construction may also pertain to the mega-phylogeny method. For example, there are problems with database-enabled phylogenetics that are hard or impossible to avoid, such as misidentification or mislabeling in GenBank <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. Sequence orthology tests can help identify such problems; however, outliers are likely to still cause difficulties in some matrices. Additionally, there are potential problems with "rogue taxa" that can lower resolution and support throughout the tree. However, rogue taxa are also a problem for supermatrices, and therefore the development of solutions will likely benefit both methods.</p>
         </sec>
         <sec>
            <st>
               <p>Extensions</p>
            </st>
            <p>It may be possible to incorporate diversity estimates for each taxonomic group, such that large clades represented by single (or a few) species for a particular gene could be excluded. This would likely reduce problems associated with rogue taxa. Although this information is not currently readily available, its inclusion could greatly increase the efficacy of the mega-phylogeny approach.</p>
            <p>Our method can also be extended to deal with the problem of sequence outliers. Unfortunately, the size of the matrices that can be constructed makes checking for outliers by hand impractical. But saturation statistics could be extended to identify these outliers in the individual gene regions. Although the orthology tests and reverse complement procedures identify the vast majority of problematic sequences, the MAD statistic has the potential to cleanse the datasets further, allowing for almost complete automation of large tree construction.</p>
            <p>Finally, the mega-phylogeny procedure can be parallelized. Many of the procedures related to sequence-to-sequence comparisons (e.g., orthology tests, reverse complements) can be easily distributed across multiple CPUs or computers. This is also true of some of the multiple sequence alignment calculations. Parallelizing these procedures would yield even faster mega-phylogeny analyses.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>SAS developed the approach and conducted the analyses. JMB developed the saturation tests. SAS, JMB, and MJD wrote and edited the manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>Valuable feedback was obtained from Jeff Oliver and David Tank. We are especially grateful for initial discussions on large tree methodology with Brian Moore. SAS was partially supported by the Cyberinfrastructure for Phylogenetic Research (CIPRES) program (NSF #EF-0331654) and by the National Evolutionary Synthesis Center (NESCent; NSF #EF-0423641). MJD and JMB have been supported through an NSF "Tree of Life" (ATOL) award.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>The delayed rise of present-day mammals</p>
            </title>
            <aug>
               <au>
                  <snm>Bininda-Emonds</snm>
                  <fnm>ORP</fnm>
               </au>
               <au>
                  <snm>Cardillo</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Jones</snm>
                  <fnm>KE</fnm>
               </au>
               <au>
                  <snm>MacPhee</snm>
                  <fnm>RDE</fnm>
               </au>
               <au>
                  <snm>Beck</snm>
                  <fnm>RMD</fnm>
               </au>
               <au>
                  <snm>Grenyer</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Price</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Vos</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Gittleman</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Purvis</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2007</pubdate>
            <volume>446</volume>
            <fpage>507</fpage>
            <lpage>512</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">17392779</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Prospects for building the tree of life from large sequence databases</p>
            </title>
            <aug>
               <au>
                  <snm>Driskell</snm>
                  <fnm>AC</fnm>
               </au>
               <au>
                  <snm>An&#233;</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Burleigh</snm>
                  <fnm>JG</fnm>
               </au>
               <au>
                  <snm>McMahon</snm>
                  <fnm>MM</fnm>
               </au>
               <au>
                  <snm>O'Meara</snm>
                  <fnm>BC</fnm>
               </au>
               <au>
                  <snm>Sanderson</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2004</pubdate>
            <volume>306</volume>
            <fpage>1172</fpage>
            <lpage>1174</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15539599</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Toward automatic reconstruction of a highly resolved tree of life</p>
            </title>
            <aug>
               <au>
                  <snm>Ciccarelli</snm>
                  <fnm>FD</fnm>
               </au>
               <au>
                  <snm>Doerks</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Mering</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Creevey</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Snel</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Bork</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2006</pubdate>
            <volume>311</volume>
            <fpage>1283</fpage>
            <lpage>1287</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16513982</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Phylogenetic supermatrix analysis of GenBank sequences from 2228 papilionoid legumes</p>
            </title>
            <aug>
               <au>
                  <snm>McMahon</snm>
                  <fnm>MM</fnm>
               </au>
               <au>
                  <snm>Sanderson</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>Systematic Biology</source>
            <pubdate>2006</pubdate>
            <volume>55</volume>
            <fpage>818</fpage>
            <lpage>836</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">17060202</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Phylogenetics of seed plants: an analysis of nucleotide sequences from the plastid gene rbcL</p>
            </title>
            <aug>
               <au>
                  <snm>Chase</snm>
                  <fnm>MW</fnm>
               </au>
               <au>
                  <snm>Soltis</snm>
                  <fnm>DE</fnm>
               </au>
               <au>
                  <snm>Olmstead</snm>
                  <fnm>RG</fnm>
               </au>
               <au>
                  <snm>Morgan</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Les</snm>
                  <fnm>DH</fnm>
               </au>
               <au>
                  <snm>Mishler</snm>
                  <fnm>BD</fnm>
               </au>
               <au>
                  <snm>Duvall</snm>
                  <fnm>MR</fnm>
               </au>
               <au>
                  <snm>Price</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Hills</snm>
                  <fnm>HG</fnm>
               </au>
               <au>
                  <snm>Qiu</snm>
                  <fnm>YL</fnm>
               </au>
               <au>
                  <snm>Kron</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Rettig</snm>
                  <fnm>JH</fnm>
               </au>
               <au>
                  <snm>Conti</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Palmer</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Manhart</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Sytsma</snm>
                  <fnm>KJ</fnm>
               </au>
               <au>
                  <snm>Michaels</snm>
                  <fnm>HJ</fnm>
               </au>
               <au>
                  <snm>Kress</snm>
                  <fnm>WJ</fnm>
               </au>
               <au>
                  <snm>Karol</snm>
                  <fnm>KG</fnm>
               </au>
               <au>
                  <snm>Clark</snm>
                  <fnm>WD</fnm>
               </au>
               <au>
                  <snm>Hedren</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Gaut</snm>
                  <fnm>BS</fnm>
               </au>
               <au>
                  <snm>Jansen</snm>
                  <fnm>RK</fnm>
               </au>
               <au>
                  <snm>Kim</snm>
                  <fnm>KJ</fnm>
               </au>
               <au>
                  <snm>Wimpee</snm>
                  <fnm>CF</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>JF</fnm>
               </au>
               <au>
                  <snm>Furnier</snm>
                  <fnm>GR</fnm>
               </au>
               <au>
                  <snm>Strauss</snm>
                  <fnm>SH</fnm>
               </au>
               <au>
                  <snm>Xiang</snm>
                  <fnm>QY</fnm>
               </au>
               <au>
                  <snm>Plunkett</snm>
                  <fnm>GM</fnm>
               </au>
               <au>
                  <snm>Soltis</snm>
                  <fnm>PS</fnm>
               </au>
               <au>
                  <snm>Swensen</snm>
                  <fnm>SM</fnm>
               </au>
               <au>
                  <snm>Williams</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Gadek</snm>
                  <fnm>PA</fnm>
               </au>
               <au>
                  <snm>Quinn</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Eguiarte</snm>
                  <fnm>LE</fnm>
               </au>
               <au>
                  <snm>Golenberg</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Learn</snm>
                  <fnm>GH</fnm>
                  <suf>Jr</suf>
               </au>
               <au>
                  <snm>Graham</snm>
                  <fnm>SW</fnm>
               </au>
               <au>
                  <snm>Barrett</snm>
                  <fnm>SCH</fnm>
               </au>
               <au>
                  <snm>Dayanandan</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Albert</snm>
                  <fnm>VA</fnm>
               </au>
            </aug>
            <source>Annals of the Missouri Botanical Garden</source>
            <pubdate>1993</pubdate>
            <volume>80</volume>
            <fpage>528</fpage>
            <lpage>580</lpage>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Angiosperm phylogeny based on <it>matK </it>sequence information</p>
            </title>
            <aug>
               <au>
                  <snm>Hilu</snm>
                  <fnm>KW</fnm>
               </au>
               <au>
                  <snm>Borsch</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Muller</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Soltis</snm>
                  <fnm>DE</fnm>
               </au>
               <au>
                  <snm>Soltis</snm>
                  <fnm>PS</fnm>
               </au>
               <au>
                  <snm>Savolainen</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Chase</snm>
                  <fnm>MW</fnm>
               </au>
               <au>
                  <snm>Powell</snm>
                  <fnm>MP</fnm>
               </au>
               <au>
                  <snm>Alice</snm>
                  <fnm>LA</fnm>
               </au>
               <au>
                  <snm>Evans</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Sauquet</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Neinhuis</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Slotta</snm>
                  <fnm>TAB</fnm>
               </au>
               <au>
                  <snm>Rohwer</snm>
                  <fnm>JG</fnm>
               </au>
               <au>
                  <snm>Campbell</snm>
                  <fnm>CS</fnm>
               </au>
               <au>
                  <snm>Chatrou</snm>
                  <fnm>LW</fnm>
               </au>
            </aug>
            <source>American Journal of Botany</source>
            <pubdate>2003</pubdate>
            <volume>90</volume>
            <fpage>1758</fpage>
            <lpage>1776</lpage>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Disk-covering, a fast-converging method for phylogenetic tree reconstruction</p>
            </title>
            <aug>
               <au>
                  <snm>Huson</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Nettles</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Warnow</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Journal of Computational Biology</source>
            <pubdate>1999</pubdate>
            <volume>6</volume>
            <fpage>369</fpage>
            <lpage>386</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10582573</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models</p>
            </title>
            <aug>
               <au>
                  <snm>Stamatakis</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2006</pubdate>
            <volume>22</volume>
            <fpage>2688</fpage>
            <lpage>2690</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16928733</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Genetic algorithm approaches for the phylogenetic analysis of large biological sequence datasets under the maximum likelihood criterion</p>
            </title>
            <aug>
               <au>
                  <snm>Zwickl</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <publisher>Ph.D. dissertation, The University of Texas at Austin</publisher>
            <pubdate>2006</pubdate>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Phylogenetic supertrees: assembling the trees of life</p>
            </title>
            <aug>
               <au>
                  <snm>Sanderson</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Purvis</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Henze</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Trends in Ecology and Evolution</source>
            <pubdate>1998</pubdate>
            <volume>13</volume>
            <fpage>105</fpage>
            <lpage>109</lpage>
         </bibl>
         <bibl id="B11">
            <title>
               <p>The evolution of supertrees</p>
            </title>
            <aug>
               <au>
                  <snm>Bininda-Emonds</snm>
                  <fnm>ORP</fnm>
               </au>
            </aug>
            <source>Trends Ecol Evol</source>
            <pubdate>2004</pubdate>
            <volume>19</volume>
            <issue>6</issue>
            <fpage>315</fpage>
            <lpage>322</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">16701277</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Combining trees as a way of combining data sets for phylogenetic inference, and the desirability of combining gene trees</p>
            </title>
            <aug>
               <au>
                  <snm>Baum</snm>
                  <fnm>BR</fnm>
               </au>
            </aug>
            <source>Taxon</source>
            <pubdate>1992</pubdate>
            <volume>41</volume>
            <fpage>3</fpage>
            <lpage>10</lpage>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Phylogenetic inference based on matrix representation of trees</p>
            </title>
            <aug>
               <au>
                  <snm>Ragan</snm>
                  <fnm>MA</fnm>
               </au>
            </aug>
            <source>Mol Phylogenet Evol</source>
            <pubdate>1992</pubdate>
            <volume>1</volume>
            <issue>1</issue>
            <fpage>53</fpage>
            <lpage>58</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid">1342924</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Supertrees by flipping</p>
            </title>
            <aug>
               <au>
                  <snm>Chen</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Eulenstein</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Fernandez</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Sanderson</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>Proceedings of COCOON 2002</source>
            <publisher>Springer-Verlag LNCS</publisher>
            <pubdate>2002</pubdate>
         </bibl>
         <bibl id="B15">
            <title>
               <p>A supertree method for rooted trees</p>
            </title>
            <aug>
               <au>
                  <snm>Semple</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Steel</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Discrete Appl Math</source>
            <pubdate>2000</pubdate>
            <volume>105</volume>
            <fpage>147</fpage>
            <lpage>158</lpage>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Modified MinCut supertrees</p>
            </title>
            <aug>
               <au>
                  <snm>Page</snm>
                  <fnm>RDM</fnm>
               </au>
            </aug>
            <source>WABI 2002</source>
            <publisher>LNCS</publisher>
            <editor>Guigo R, Gusfield D</editor>
            <pubdate>2002</pubdate>
            <fpage>537</fpage>
            <lpage>551</lpage>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Increasing data transparency and estimating phylogenetic uncertainty in supertrees: approaches using nonparametric bootstrapping</p>
            </title>
            <aug>
               <au>
                  <snm>Moore</snm>
                  <fnm>BR</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Donoghue</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>Syst Biol</source>
            <pubdate>2006</pubdate>
            <volume>55</volume>
            <issue>4</issue>
            <fpage>662</fpage>
            <lpage>676</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid">16969942</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>A phylogenetic supertree of the bats (Mammalia: Chiroptera)</p>
            </title>
            <aug>
               <au>
                  <snm>Jones</snm>
                  <fnm>KE</fnm>
               </au>
               <au>
                  <snm>Purvis</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>MacLarnon</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Bininda-Emonds</snm>
                  <fnm>OR</fnm>
               </au>
               <au>
                  <snm>Simmons</snm>
                  <fnm>NB</fnm>
               </au>
            </aug>
            <source>Biological Reviews</source>
            <pubdate>2002</pubdate>
            <volume>77</volume>
            <fpage>223</fpage>
            <lpage>259</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12056748</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>A supertree of early tetrapods</p>
            </title>
            <aug>
               <au>
                  <snm>Ruta</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Jeffery</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Coates</snm>
                  <fnm>MI</fnm>
               </au>
            </aug>
            <source>Proc Biol Sci</source>
            <pubdate>2003</pubdate>
            <volume>270</volume>
            <issue>1532</issue>
            <fpage>2507</fpage>
            <lpage>2516</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">14667343</pubid>
                  <pubid idtype="pmcid">1691537</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>The supermatrix approach to systematics</p>
            </title>
            <aug>
               <au>
                  <snm>de Queiroz</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Gatesy</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Trends Ecol Evol</source>
            <pubdate>2007</pubdate>
            <volume>22</volume>
            <issue>1</issue>
            <fpage>34</fpage>
            <lpage>41</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">17046100</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>An efficient algorithm for large-scale detection of protein families</p>
            </title>
            <aug>
               <au>
                  <snm>Enright</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Van Dongen</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ouzounis</snm>
                  <fnm>CA</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Research</source>
            <pubdate>2002</pubdate>
            <volume>30</volume>
            <fpage>1575</fpage>
            <lpage>1584</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">101833</pubid>
                  <pubid idtype="pmpid" link="fulltext">11917018</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Obtaining maximal concatenated phylogenetic data sets from large sequence databases</p>
            </title>
            <aug>
               <au>
                  <snm>Sanderson</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Driskell</snm>
                  <fnm>AC</fnm>
               </au>
               <au>
                  <snm>Ree</snm>
                  <fnm>RH</fnm>
               </au>
               <au>
                  <snm>Eulenstein</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Langley</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2003</pubdate>
            <volume>20</volume>
            <issue>7</issue>
            <fpage>1036</fpage>
            <lpage>1042</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">12777519</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Identifying optimal incomplete phylogenetic data sets from sequence databases</p>
            </title>
            <aug>
               <au>
                  <snm>Yan</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Burleigh</snm>
                  <fnm>JG</fnm>
               </au>
               <au>
                  <snm>Eulenstein</snm>
                  <fnm>O</fnm>
               </au>
            </aug>
            <source>Mol Phylogenet Evol</source>
            <pubdate>2005</pubdate>
            <volume>35</volume>
            <issue>2</issue>
            <fpage>528</fpage>
            <lpage>535</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">15878123</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>The PhyLoTA browser: processing GenBank for molecular phylogenetics research</p>
            </title>
            <aug>
               <au>
                  <snm>Sanderson</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Boss</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Cranston</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Wehe</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Systematic Biology</source>
            <pubdate>2008</pubdate>
            <volume>57</volume>
            <fpage>335</fpage>
            <lpage>346</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">18570030</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>SCaFoS: a tool for selection, concatenation and fusion of sequences for phylogenomics</p>
            </title>
            <aug>
               <au>
                  <snm>Roure</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Rodriguez-Ezpeleta</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Philippe</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>BMC Evolutionary Biology</source>
            <pubdate>2007</pubdate>
            <volume>7</volume>
            <fpage>S2</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1796611</pubid>
                  <pubid idtype="pmpid" link="fulltext">17288575</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>A likelihood look at the supermatrix-supertree controversy</p>
            </title>
            <aug>
               <au>
                  <snm>Ren</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Tanaka</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>Z</fnm>
               </au>
            </aug>
            <source>Gene</source>
            <inpress/>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">18502054</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Rates of molecular evolution linked to life history in flowering plants</p>
            </title>
            <aug>
               <au>
                  <snm>Smith</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Donoghue</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2008</pubdate>
            <volume>322</volume>
            <fpage>86</fpage>
            <lpage>89</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">18832643</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>The performance of several multiple-sequence alignment programs in relation to secondary-structure features for an rRNA sequence</p>
            </title>
            <aug>
               <au>
                  <snm>Hickson</snm>
                  <fnm>RE</fnm>
               </au>
               <au>
                  <snm>Simon</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Perrey</snm>
                  <fnm>SW</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2000</pubdate>
            <volume>17</volume>
            <issue>4</issue>
            <fpage>530</fpage>
            <lpage>539</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">10742045</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Universal trees based on large combined protein sequence data sets</p>
            </title>
            <aug>
               <au>
                  <snm>Brown</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Douady</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Italia</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Marshall</snm>
                  <fnm>WE</fnm>
               </au>
               <au>
                  <snm>Stanhope</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>Nature Genetics</source>
            <pubdate>2001</pubdate>
            <volume>28</volume>
            <fpage>281</fpage>
            <lpage>285</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11431701</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Multiple sequence alignment</p>
            </title>
            <aug>
               <au>
                  <snm>Edgar</snm>
                  <fnm>RC</fnm>
               </au>
               <au>
                  <snm>Batzoglou</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Current Opinion in Structural Biology</source>
            <pubdate>2006</pubdate>
            <volume>16</volume>
            <fpage>368</fpage>
            <lpage>373</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16679011</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>The archaea monophyly issue: A phylogeny of translational elongation factor G(2) sequences inferred from an optimized selection of alignment positions</p>
            </title>
            <aug>
               <au>
                  <snm>Cammarano</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Creti</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Sanangelantoni</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Palm</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>J Mol Evol</source>
            <pubdate>1999</pubdate>
            <volume>49</volume>
            <fpage>524</fpage>
            <lpage>537</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10486009</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Effects of sequence alignment and structural domains of ribosomal DNA on phylogeny reconstruction for the protozoan family Sarcocystidae</p>
            </title>
            <aug>
               <au>
                  <snm>Mugridge</snm>
                  <fnm>NB</fnm>
               </au>
               <au>
                  <snm>Morrison</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>J&#228;kel</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Heckeroth</snm>
                  <fnm>AR</fnm>
               </au>
               <au>
                  <snm>Tenter</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Johnson</snm>
                  <fnm>AM</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2000</pubdate>
            <volume>17</volume>
            <issue>12</issue>
            <fpage>1842</fpage>
            <lpage>1853</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">11110900</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Multiple sequence alignment accuracy and phylogenetic inference</p>
            </title>
            <aug>
               <au>
                  <snm>Ogden</snm>
                  <fnm>TH</fnm>
               </au>
               <au>
                  <snm>Rosenberg</snm>
                  <fnm>MS</fnm>
               </au>
            </aug>
            <source>Systematic Biology</source>
            <pubdate>2006</pubdate>
            <volume>55</volume>
            <fpage>314</fpage>
            <lpage>328</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16611602</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>Broad phylogenomic sampling improves the resolution of the animal tree of life</p>
            </title>
            <aug>
               <au>
                  <snm>Dunn</snm>
                  <fnm>CW</fnm>
               </au>
               <au>
                  <snm>Hejnol</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Matus</snm>
                  <fnm>DQ</fnm>
               </au>
               <au>
                  <snm>Pang</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Browne</snm>
                  <fnm>WE</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Seaver</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Rouse</snm>
                  <fnm>GW</fnm>
               </au>
               <au>
                  <snm>Obst</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Edgecombe</snm>
                  <fnm>GD</fnm>
               </au>
               <au>
                  <snm>S&#248;rensen</snm>
                  <fnm>MV</fnm>
               </au>
               <au>
                  <snm>Haddock</snm>
                  <fnm>SHD</fnm>
               </au>
               <au>
                  <snm>Schmidt-Rhaesa</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Okusu</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kristensen</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>Wheeler</snm>
                  <fnm>WC</fnm>
               </au>
               <au>
                  <snm>Martindale</snm>
                  <fnm>MQ</fnm>
               </au>
               <au>
                  <snm>Giribet</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2008</pubdate>
            <volume>452</volume>
            <fpage>745</fpage>
            <lpage>749</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">18322464</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Profile-profile alignment: a powerful tool for protein structure prediction</p>
            </title>
            <aug>
               <au>
                  <snm>von Ohsen</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Sommer</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Zimmer</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Pac Symp Biocomput</source>
            <pubdate>2003</pubdate>
            <fpage>252</fpage>
            <lpage>263</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid">12603033</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>MUSCLE: multiple sequence alignment with high accuracy and high throughput</p>
            </title>
            <aug>
               <au>
                  <snm>Edgar</snm>
                  <fnm>RC</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2004</pubdate>
            <volume>32</volume>
            <issue>5</issue>
            <fpage>1792</fpage>
            <lpage>1797</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">15034147</pubid>
                  <pubid idtype="pmcid">390337</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>MAFFT version 5: improvement in accuracy of multiple sequence alignment</p>
            </title>
            <aug>
               <au>
                  <snm>Katoh</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Kuma</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Toh</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Miyata</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Research</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <fpage>511</fpage>
            <lpage>518</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">548345</pubid>
                  <pubid idtype="pmpid" link="fulltext">15661851</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>Towards a phylogenetic nomenclature of Tracheophyta</p>
            </title>
            <aug>
               <au>
                  <snm>Cantino</snm>
                  <fnm>PD</fnm>
               </au>
               <au>
                  <snm>Doyle</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Graham</snm>
                  <fnm>SW</fnm>
               </au>
               <au>
                  <snm>Judd</snm>
                  <fnm>WS</fnm>
               </au>
               <au>
                  <snm>Olmstead</snm>
                  <fnm>RG</fnm>
               </au>
               <au>
                  <snm>Soltis</snm>
                  <fnm>DE</fnm>
               </au>
               <au>
                  <snm>Soltis</snm>
                  <fnm>PS</fnm>
               </au>
               <au>
                  <snm>Donoghue</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>Taxon</source>
            <pubdate>2007</pubdate>
            <volume>56</volume>
            <fpage>822</fpage>
            <lpage>846</lpage>
         </bibl>
         <bibl id="B39">
            <title>
               <p>Alternatives to the median absolute deviation</p>
            </title>
            <aug>
               <au>
                  <snm>Rousseeuw</snm>
                  <fnm>PJ</fnm>
               </au>
               <au>
                  <snm>Croux</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Journal of the American Statistical Association</source>
            <pubdate>1993</pubdate>
            <volume>88</volume>
            <fpage>1273</fpage>
            <lpage>1283</lpage>
         </bibl>
         <bibl id="B40">
            <title>
               <p>Seq-Gen: an application for the Monte Carlo simulation of DNA sequence evolution along phylogenetic trees</p>
            </title>
            <aug>
               <au>
                  <snm>Rambaut</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Grassly</snm>
                  <fnm>NC</fnm>
               </au>
            </aug>
            <source>Comput Appl Biosci</source>
            <pubdate>1997</pubdate>
            <volume>13</volume>
            <issue>3</issue>
            <fpage>235</fpage>
            <lpage>238</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid">9183526</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <title>
               <p>Resolving campanulid phylogeny: some preliminary insights</p>
            </title>
            <aug>
               <au>
                  <snm>Winkworth</snm>
                  <fnm>RC</fnm>
               </au>
               <au>
                  <snm>Lundberg</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Donoghue</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>Taxon</source>
            <pubdate>2008</pubdate>
            <volume>57</volume>
            <fpage>53</fpage>
            <lpage>65</lpage>
         </bibl>
         <bibl id="B42">
            <title>
               <p>Evolution of the Australasian families Alseuosmiaceae, Argophyllaceae, and Phellinaceae</p>
            </title>
            <aug>
               <au>
                  <snm>K&#229;rehed</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Lundberg</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Bremer</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Bremer</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>Systematic Botany</source>
            <pubdate>1999</pubdate>
            <volume>24</volume>
            <fpage>660</fpage>
            <lpage>682</lpage>
         </bibl>
         <bibl id="B43">
            <title>
               <p>The phylogeny of the Asteridae <it>sensu lato</it> based on chloroplast ndhF gene sequences</p>
            </title>
            <aug>
               <au>
                  <snm>Olmstead</snm>
                  <fnm>RG</fnm>
               </au>
               <au>
                  <snm>Kim</snm>
                  <fnm>KJ</fnm>
               </au>
               <au>
                  <snm>Jansen</snm>
                  <fnm>RK</fnm>
               </au>
               <au>
                  <snm>Wagstaff</snm>
                  <fnm>SJ</fnm>
               </au>
            </aug>
            <source>Mol Phylogenet Evol</source>
            <pubdate>2000</pubdate>
            <volume>16</volume>
            <issue>1</issue>
            <fpage>96</fpage>
            <lpage>112</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">10877943</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B44">
            <title>
               <p>A phylogenetic study of the order Asterales using one morphological and three molecular data sets</p>
            </title>
            <aug>
               <au>
                  <snm>Lundberg</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Bremer</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>International Journal of Plant Sciences</source>
            <pubdate>2003</pubdate>
            <volume>164</volume>
            <fpage>553</fpage>
            <lpage>578</lpage>
         </bibl>
         <bibl id="B45">
            <title>
               <p>A 567-taxon data set for angiosperms: the challenges posed by Bayesian analyses of large data sets</p>
            </title>
            <aug>
               <au>
                  <snm>Soltis</snm>
                  <fnm>DE</fnm>
               </au>
               <au>
                  <snm>Gitzendanner</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Soltis</snm>
                  <fnm>PS</fnm>
               </au>
            </aug>
            <source>International Journal of Plant Sciences</source>
            <pubdate>2007</pubdate>
            <volume>168</volume>
            <fpage>137</fpage>
            <lpage>157</lpage>
         </bibl>
         <bibl id="B46">
            <title>
               <p>The value of sampling anomalous taxa in phylogenetic studies: major clades of the Asteraceae revealed</p>
            </title>
            <aug>
               <au>
                  <snm>Panero</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Funk</snm>
                  <fnm>VA</fnm>
               </au>
            </aug>
            <source>Mol Phylogenet Evol</source>
            <pubdate>2008</pubdate>
            <volume>47</volume>
            <issue>2</issue>
            <fpage>757</fpage>
            <lpage>782</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">18375151</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B47">
            <title>
               <p>Phylogenetic analysis of dioecy in monocotyledons</p>
            </title>
            <aug>
               <au>
                  <snm>Weiblen</snm>
                  <fnm>GD</fnm>
               </au>
               <au>
                  <snm>Oyama</snm>
                  <fnm>RK</fnm>
               </au>
               <au>
                  <snm>Donoghue</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>The American Naturalist</source>
            <pubdate>2000</pubdate>
            <volume>155</volume>
            <fpage>46</fpage>
            <lpage>58</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10657176</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B48">
            <title>
               <p>Asterales: introduction and conspectus</p>
            </title>
            <aug>
               <au>
                  <snm>Kadereit</snm>
                  <fnm>JW</fnm>
               </au>
            </aug>
            <source>The families and genera of vascular plants: flowering plants, eudicots, Asterales</source>
            <publisher>Springer</publisher>
            <editor>Kadereit JW, Jeffrey C</editor>
            <pubdate>2007</pubdate>
            <fpage>1</fpage>
            <lpage>6</lpage>
         </bibl>
         <bibl id="B49">
            <title>
               <p>Phylogeny and biogeography of isophyllous species of <it>Campanula </it>(Campanulaceae) in the Mediterranean area</p>
            </title>
            <aug>
               <au>
                  <snm>Park</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Kovacic</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Liber</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Eddie</snm>
                  <fnm>WMM</fnm>
               </au>
               <au>
                  <snm>Schneeweiss</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Systematic Botany</source>
            <pubdate>2006</pubdate>
            <volume>31</volume>
            <fpage>862</fpage>
            <lpage>880</lpage>
         </bibl>
         <bibl id="B50">
            <title>
               <p>Natural delineation, molecular phylogeny and floral evolution in <it>Campanula</it></p>
            </title>
            <aug>
               <au>
                  <snm>Roquet</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Saez</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Aldasoro</snm>
                  <fnm>JJ</fnm>
               </au>
               <au>
                  <snm>Susanna</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Alarcon</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Garcia-Jacas</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>Systematic Botany</source>
            <pubdate>2008</pubdate>
            <volume>33</volume>
            <fpage>203</fpage>
            <lpage>217</lpage>
         </bibl>
         <bibl id="B51">
            <title>
               <p>An ITS phylogeny of tribe Senecioneae (Asteraceae) and a new delimitation of <it>Senecio </it>L.</p>
            </title>
            <aug>
               <au>
                  <snm>Pelser</snm>
                  <fnm>PB</fnm>
               </au>
               <au>
                  <snm>Nordenstam</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Kadereit</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>Watson</snm>
                  <fnm>LE</fnm>
               </au>
            </aug>
            <source>Taxon</source>
            <pubdate>2007</pubdate>
            <volume>56</volume>
            <fpage>1077</fpage>
            <lpage>1114</lpage>
         </bibl>
         <bibl id="B52">
            <title>
               <p>Ancestral chloroplast genome in <it>Mesostigma viride </it>reveals an early branch of green plant evolution</p>
            </title>
            <aug>
               <au>
                  <snm>Lemieux</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Otis</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Turmel</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2000</pubdate>
            <volume>403</volume>
            <fpage>649</fpage>
            <lpage>652</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10688199</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B53">
            <title>
               <p>The closest living relatives of land plants</p>
            </title>
            <aug>
               <au>
                  <snm>Karol</snm>
                  <fnm>KG</fnm>
               </au>
               <au>
                  <snm>McCourt</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>Cimino</snm>
                  <fnm>MT</fnm>
               </au>
               <au>
                  <snm>Delwiche</snm>
                  <fnm>CF</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2001</pubdate>
            <volume>294</volume>
            <fpage>2351</fpage>
            <lpage>2353</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11743201</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B54">
            <title>
               <p>Horsetails and ferns are a monophyletic group and the closest living relatives to seed plants</p>
            </title>
            <aug>
               <au>
                  <snm>Pryer</snm>
                  <fnm>KM</fnm>
               </au>
               <au>
                  <snm>Schneider</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>AR</fnm>
               </au>
               <au>
                  <snm>Cranfill</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Wolf</snm>
                  <fnm>PG</fnm>
               </au>
               <au>
                  <snm>Hunt</snm>
                  <fnm>JS</fnm>
               </au>
               <au>
                  <snm>Sipes</snm>
                  <fnm>SD</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2001</pubdate>
            <volume>409</volume>
            <fpage>618</fpage>
            <lpage>621</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11214320</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B55">
            <title>
               <p>The deepest divergences in land plants inferred from phylogenomic evidence</p>
            </title>
            <aug>
               <au>
                  <snm>Qiu</snm>
                  <fnm>YL</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>LB</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>ZD</fnm>
               </au>
               <au>
                  <snm>Knoop</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Groth-Malonek</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Dombrovska</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Kent</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Rest</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Estabrook</snm>
                  <fnm>GF</fnm>
               </au>
               <au>
                  <snm>Hendry</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Taylor</snm>
                  <fnm>DW</fnm>
               </au>
               <au>
                  <snm>Testa</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Ambros</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Crandall-Stotler</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Duff</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Stech</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Frey</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Quandt</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Davis</snm>
                  <fnm>CC</fnm>
               </au>
            </aug>
            <source>Proceedings of the National Academy of Sciences of the United States of America</source>
            <pubdate>2006</pubdate>
            <volume>103</volume>
            <fpage>15511</fpage>
            <lpage>15516</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1622854</pubid>
                  <pubid idtype="pmpid" link="fulltext">17030812</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B56">
            <title>
               <p>Immeasurable progress on the Tree of Life</p>
            </title>
            <aug>
               <au>
                  <snm>Donoghue</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>Assembling the Tree of Life</source>
            <publisher>Oxford University Press</publisher>
            <editor>Cracraft J, Donoghue MJ</editor>
            <pubdate>2004</pubdate>
            <fpage>548</fpage>
            <lpage>552</lpage>
         </bibl>
         <bibl id="B57">
            <title>
               <p>The origin and early diversification of land plants: a cladistic study</p>
            </title>
            <aug>
               <au>
                  <snm>Kenrick</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Crane</snm>
                  <fnm>PR</fnm>
               </au>
            </aug>
            <publisher>Smithsonian Institution Press</publisher>
            <pubdate>1997</pubdate>
         </bibl>
         <bibl id="B58">
            <title>
               <p>The first complete chloroplast genome sequence of a lycophyte, <it>Huperzia lucidula </it>(Lycopodiaceae)</p>
            </title>
            <aug>
               <au>
                  <snm>Wolf</snm>
                  <fnm>PG</fnm>
               </au>
               <au>
                  <snm>Karol</snm>
                  <fnm>KG</fnm>
               </au>
               <au>
                  <snm>Mandoli</snm>
                  <fnm>DF</fnm>
               </au>
               <au>
                  <snm>Kuehl</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Arumuganathan</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Ellis</snm>
                  <fnm>MW</fnm>
               </au>
               <au>
                  <snm>Mishler</snm>
                  <fnm>BD</fnm>
               </au>
               <au>
                  <snm>Kelch</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Olmstead</snm>
                  <fnm>RG</fnm>
               </au>
               <au>
                  <snm>Boore</snm>
                  <fnm>JL</fnm>
               </au>
            </aug>
            <source>Gene</source>
            <pubdate>2005</pubdate>
            <volume>350</volume>
            <fpage>117</fpage>
            <lpage>128</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15788152</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B59">
            <title>
               <p>The chloroplast genome from a lycophyte (microphyllophyte), <it>Selaginella uncinata</it>, has a unique inversion, transpositions and many gene losses</p>
            </title>
            <aug>
               <au>
                  <snm>Tsuji</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ueda</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Nishiyama</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Hasebe</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Yoshikawa</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Konagaya</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Nishiuchi</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Yamaguchi</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>Journal of Plant Research</source>
            <pubdate>2007</pubdate>
            <volume>120</volume>
            <fpage>281</fpage>
            <lpage>290</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">17297557</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B60">
            <title>
               <p>Taxonomic sampling, phylogenetic accuracy, and investigator bias</p>
            </title>
            <aug>
               <au>
                  <snm>Hillis</snm>
                  <fnm>DM</fnm>
               </au>
            </aug>
            <source>Systematic Biology</source>
            <pubdate>1998</pubdate>
            <volume>47</volume>
            <fpage>3</fpage>
            <lpage>8</lpage>
            <xrefbib>
               <pubid idtype="pmpid">12064238</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B61">
            <title>
               <p>Increased taxon sampling greatly reduces phylogenetic error</p>
            </title>
            <aug>
               <au>
                  <snm>Zwickl</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Hillis</snm>
                  <fnm>DM</fnm>
               </au>
            </aug>
            <source>Systematic Biology</source>
            <pubdate>2002</pubdate>
            <volume>51</volume>
            <fpage>588</fpage>
            <lpage>598</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12228001</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B62">
            <title>
               <p>Ferns diversified in the shadows of angiosperms</p>
            </title>
            <aug>
               <au>
                  <snm>Schneider</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Schuettpelz</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Pryer</snm>
                  <fnm>KM</fnm>
               </au>
               <au>
                  <snm>Cranfill</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Magallon</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Lupia</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2004</pubdate>
            <volume>428</volume>
            <fpage>553</fpage>
            <lpage>557</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15058303</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B63">
            <title>
               <p>A brief history of seed size</p>
            </title>
            <aug>
               <au>
                  <snm>Moles</snm>
                  <fnm>AT</fnm>
               </au>
               <au>
                  <snm>Ackerly</snm>
                  <fnm>DD</fnm>
               </au>
               <au>
                  <snm>Webb</snm>
                  <fnm>CO</fnm>
               </au>
               <au>
                  <snm>Tweddle</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Dickie</snm>
                  <fnm>JB</fnm>
               </au>
               <au>
                  <snm>Westoby</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2005</pubdate>
            <volume>307</volume>
            <fpage>576</fpage>
            <lpage>580</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15681384</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B64">
            <title>
               <p>Correlated evolution of genome size and seed mass</p>
            </title>
            <aug>
               <au>
                  <snm>Beaulieu</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Moles</snm>
                  <fnm>AT</fnm>
               </au>
               <au>
                  <snm>Leitch</snm>
                  <fnm>IJ</fnm>
               </au>
               <au>
                  <snm>Bennett</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Dickie</snm>
                  <fnm>JB</fnm>
               </au>
               <au>
                  <snm>Knight</snm>
                  <fnm>CA</fnm>
               </au>
            </aug>
            <source>New Phytologist</source>
            <pubdate>2007</pubdate>
            <volume>173</volume>
            <fpage>422</fpage>
            <lpage>437</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">17204088</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B65">
            <title>
               <p>Relationships among ecologically important dimensions of plant trait variation in seven neotropical forests</p>
            </title>
            <aug>
               <au>
                  <snm>Wright</snm>
                  <fnm>IJ</fnm>
               </au>
               <au>
                  <snm>Ackerly</snm>
                  <fnm>DD</fnm>
               </au>
               <au>
                  <snm>Bongers</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Harms</snm>
                  <fnm>KE</fnm>
               </au>
               <au>
                  <snm>Ibarra-Manriquez</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Martinez-Ramos</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Mazer</snm>
                  <fnm>SJ</fnm>
               </au>
               <au>
                  <snm>Muller-Landau</snm>
                  <fnm>HC</fnm>
               </au>
               <au>
                  <snm>Paz</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Pitman</snm>
                  <fnm>NCA</fnm>
               </au>
               <au>
                  <snm>Poorter</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Silman</snm>
                  <fnm>MR</fnm>
               </au>
               <au>
                  <snm>Vriesendorp</snm>
                  <fnm>CF</fnm>
               </au>
               <au>
                  <snm>Webb</snm>
                  <fnm>CO</fnm>
               </au>
               <au>
                  <snm>Westoby</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Wright</snm>
                  <fnm>SJ</fnm>
               </au>
            </aug>
            <source>Annals of Botany</source>
            <pubdate>2007</pubdate>
            <volume>99</volume>
            <fpage>1003</fpage>
            <lpage>1015</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16595553</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B66">
            <title>
               <p>Preserving the evolutionary potential of floras in biodiversity hotspots</p>
            </title>
            <aug>
               <au>
                  <snm>Forest</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Grenyer</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Rouget</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Davies</snm>
                  <fnm>TJ</fnm>
               </au>
               <au>
                  <snm>Cowling</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>Faith</snm>
                  <fnm>DP</fnm>
               </au>
               <au>
                  <snm>Balmford</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Manning</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Proche</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Bank</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Reeves</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Hedderson</snm>
                  <fnm>TAJ</fnm>
               </au>
               <au>
                  <snm>Savolainen</snm>
                  <fnm>V</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2007</pubdate>
            <volume>445</volume>
            <fpage>757</fpage>
            <lpage>760</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">17301791</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B67">
            <title>
               <p>The relevance of phylogeny to studies of global change</p>
            </title>
            <aug>
               <au>
                  <snm>Edwards</snm>
                  <fnm>EJ</fnm>
               </au>
               <au>
                  <snm>Still</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Donoghue</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>Trends Ecol Evol</source>
            <pubdate>2007</pubdate>
            <volume>22</volume>
            <issue>5</issue>
            <fpage>243</fpage>
            <lpage>249</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">17296242</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B68">
            <title>
               <p>Phylomatic: tree assembly for applied phylogenetics</p>
            </title>
            <aug>
               <au>
                  <snm>Webb</snm>
                  <fnm>CO</fnm>
               </au>
               <au>
                  <snm>Donoghue</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>Molecular Ecology Notes</source>
            <pubdate>2005</pubdate>
            <volume>5</volume>
            <fpage>181</fpage>
            <lpage>183</lpage>
         </bibl>
      </refgrp>
   </bm>
</art>
