<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
	<ui>gb-2005-6-5-r41</ui>
	<ji>GBJ</ji>
	<fm>
		<dochead>Research</dochead>
		<bibl>
			<title>
				<p>Genome-scale evidence of the nematode-arthropod clade</p>
			</title>
			<aug>
				<au id="A1">
					<snm>Dopazo</snm>
					<fnm>Hern&#225;n</fnm>
					<insr iid="I1"/>
					<email>hdopazo@ochoa.fib.es</email>
				</au>
				<au id="A2" ca="yes">
					<snm>Dopazo</snm>
					<fnm>Joaqu&#237;n</fnm>
					<insr iid="I2"/>
					<insr iid="I3"/>
					<email>jdopazo@ochoa.fib.es</email>
				</au>
			</aug>
			<insg>
				<ins id="I1">
					<p>Pharmacogenomics and Comparative Genomics Unit, Bioinformatics Department, Centro de Investigaci&#243;n Pr&#237;ncipe Felipe, Autopista del Saler 16, 46013 Valencia, Spain</p>
				</ins>
				<ins id="I2">
					<p>Functional Genomics Unit, Bioinformatics Department, Centro de Investigaci&#243;n Pr&#237;ncipe Felipe, Autopista del Saler 16, 46013 Valencia, Spain</p>
				</ins>
				<ins id="I3">
					<p>Functional Genomics Node, INB, Centro de Investigaci&#243;n Pr&#237;ncipe Felipe, Autopista del Saler 16, 46013 Valencia, Spain</p>
				</ins>
			</insg>
			<source>Genome Biology</source>
			<issn>1465-6906</issn>
			<pubdate>2005</pubdate>
			<volume>6</volume>
			<issue>5</issue>
			<fpage>R41</fpage>
			<url>http://genomebiology.com/2005/6/5/R41</url>
			<xrefbib>
				<pubidlist><pubid idtype="pmpid">15892869</pubid><pubid idtype="doi">10.1186/gb-2005-6-5-r41</pubid>
				</pubidlist></xrefbib>
		</bibl>
		<history>
			<rec>
				<date>
					<day>7</day>
					<month>3</month>
					<year>2005</year>
				</date>
			</rec>
			<acc>
				<date>
					<day>6</day>
					<month>4</month>
					<year>2005</year>
				</date>
			</acc>
			<pub>
				<date>
					<day>28</day>
					<month>4</month>
					<year>2005</year>
				</date>
			</pub>
		</history>
		<cpyrt>
			<year>2005</year>
			<collab>Dopazo and Dopazo; licensee BioMed Central Ltd.</collab>
			<note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
		</cpyrt>
		<shorttitle>
			<p>Genome-scale evidence for the nematodes-arthropods clade</p>
		</shorttitle>
		<shortabs>
			<p>The most extensive phylogenetic analysis carried out to date, including 11 complete genomes, is shown to support the Ecdysozoa hypothesis in the open-ended debate of the Coelomata-Ecdysozoa evolutionary problem.</p>
		</shortabs>
		<abs>
			<sec>
				<st>
					<p>Abstract</p>
				</st>
				<sec>
					<st>
						<p>Background</p>
					</st>
					<p>The issue of whether coelomates form a single clade, the Coelomata, or whether all animals that moult an exoskeleton (such as the coelomate arthropods and the pseudocoelomate nematodes) form a distinct clade, the Ecdysozoa, is the most puzzling issue in animal systematics and a major open-ended subject in evolutionary biology. Previous single-gene and genome-scale analyses designed to resolve the issue have produced contradictory results. Here we present the first genome-scale phylogenetic evidence that strongly supports the Ecdysozoa hypothesis.</p>
				</sec>
				<sec>
					<st>
						<p>Results</p>
					</st>
					<p>Through the most extensive phylogenetic analysis carried out to date, the complete genomes of 11 eukaryotic species have been analyzed in order to find homologous sequences derived from 18 human chromosomes. Phylogenetic analysis of datasets showing an increased adjustment to equal evolutionary rates between nematode and arthropod sequences produced a gradual change from support for Coelomata to support for Ecdysozoa. Transition between topologies occurred when fast-evolving sequences of <it>Caenorhabditis elegans </it>were removed. When chordate, nematode and arthropod sequences were constrained to fit equal evolutionary rates, the Ecdysozoa topology was statistically accepted whereas Coelomata was rejected.</p>
				</sec>
				<sec>
					<st>
						<p>Conclusions</p>
					</st>
					<p>The reliability of a monophyletic group clustering arthropods and nematodes was unequivocally accepted in datasets where traces of the long-branch attraction effect were removed. This is the first phylogenomic evidence to strongly support the 'moulting clade' hypothesis.</p>
				</sec>
			</sec>
		</abs>
	</fm>
	<meta>
		<classifications>
			<classification type="BMC" subtype="man_spc_id" id="30010008">Evolution</classification>
			<classification type="BMC" subtype="man_spc_id" id="30010010">Genome studies</classification>
		</classifications>
	</meta>
	<bdy>
		<sec>
			<st>
				<p>Background</p>
			</st>
			<p>Understanding the evolution of the great diversity of life is a major goal in biology. Despite decades of effort by systematists, evolutionary relationships between major groups of animals still remain unresolved. The inability to cluster taxa in monophyletic groups was originally due to the lack of morphological synapomorphies among phyla. An alternative solution came from embryology, and animal systematics relied on criteria based on increasing complexity of body plan <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. Thus, the traditional metazoan phylogeny clusters animals from the simplest basal forms with loose tissue organization (for example, sponges) to those having two germ layers (diploblastic animals, for example cnidarians), and those developing from three germ layers (triploblastic animals, such as the Bilateria - animals with bilateral symmetry). Bilateral animals were ordered into those lacking a coelom (the acoelomates, such as platyhelminths), those with a false coelom (the pseudocoelomates, such as nematodes), and, finally, those animals with a true coelom (the Coelomata, such as the arthropods and chordates). This comparative developmental theory of animal evolution dominated animal systematics for more than 50 years <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>.</p>
			<p>Subsequently, molecular systematic studies based on small subunit ribosomal RNA (18S rRNA) sequences began to undermine this scenario <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. Put briefly, the new animal phylogeny suggested that clades such as acoelomates and pseudocoelomates are artificial systematic groups. Moreover, although the coelomate designation still remains, this clade now contains two new lineages: the Lophotrochozoa and the Ecdysozoa <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. The 'Ecdysozoa hypothesis' postulated that all phyla composed of animals that grow by moulting a cuticular exoskeleton (such as arthropods and nematodes) originate from a common ancestor, thus forming a distinct clade. Thus, under the Ecdysozoa hypothesis arthropods are genetically more closely related to nematodes than to chordates. Under the 'Coelomata hypothesis' of animal evolution, however, arthropods are more closely related to chordates than to nematodes.</p>
			<p>At the heart of this systematic debate, a technical discussion emerged surrounding the long-branch attraction effect (LBAE), taxon sampling, and the number of characters used. Subsequent molecular and morphological studies have been carried out, but the controversy remains unresolved and is presented as a multifurcation <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. Although the use of different single-gene sequences supported the Ecdysozoa hypothesis <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp>, the analysis of dozens to hundreds of concatenated sequences supported the Coelomata clade <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr></abbrgrp>. Indeed, with an element of caution, we favored the Coelomata hypothesis in a previous whole-genome study designed to determine the number of characters needed to obtain a reliable topology <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. The gene-based Ecdysozoa versus genome-scale Coelomata alternative hypotheses were recently challenged by two phylogenomics studies that partly supported the Ecdysozoa clade <abbrgrp><abbr bid="B17">17</abbr></abbrgrp> and a paraphyletic Coelomata group <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>. Although it is generally accepted that phylogenetic analysis of whole genomes has begun to supplement (and in some cases improve on) phylogenetic studies previously carried out with one or a few genes <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>, all genome-wide phylogenetic studies have failed to support the proposed new animal phylogeny.</p>
			<p>Here we present the first phylogenomic evidence that strongly supports the Ecdysozoa hypothesis and at the same time demonstrates that the LBAE biases the position of <it>Caenorhabditis elegans </it>in the phylogenetic tree. We show that by using a large number of characters and choosing a phylogenetic weighted scheme of outgroups to test the constancy of evolutionary rates, the new animal phylogeny can be statistically supported. Moreover, we show that both the Coelomata and the Ecdysozoa hypotheses can be supported with the highest statistical confidence when genomic datasets are ordered according to a gradually increased adjustment to equal evolutionary rates between <it>C. elegans </it>and <it>Drosophila melanogaster </it>sequences. In between, neither Ecdysozoa nor Coelomata were sufficiently supported. To our knowledge, this is the most extensive phylogenomic analysis carried out to date in the number of characters and the number of eukaryotic species involved.</p>
		</sec>
		<sec>
			<st>
				<p>Results</p>
			</st>
			<sec>
				<st>
					<p>Dataset properties</p>
				</st>
				<p>Sequences homologous to human exon sequences were derived from filtering tblastn search results on 11 complete eukaryotic genomes. Because the most-criticized issue in resolving the Ecdysozoa-Coelomata problem seems to be the LBAE produced by the nematode species, we decided to rearrange homologous sequences in a series of nested datasets that gradually reduced LBAE. Aligned homologous sequences were arranged in eight datasets (<it>D</it><sub><it>i</it></sub>) and concatenated in their corresponding matrices (<it>M</it><sub><it>i</it></sub>) (see Materials and methods), such that as suffix <it>i </it>increases, datasets and matrices comprise a smaller number of homologous sequences showing more similar relative branch lengths (RBL) between <it>C. elegans </it>(<it>L</it><sub><it>Ce</it></sub>) and <it>D. melanogaster </it>(<it>L</it><sub><it>Dm</it></sub>) (Figure <figr fid="F1">1</figr>). RBLs are branch lengths expressed relative to the length of the human branch.</p>
				<fig id="F1">
					<title>
						<p>Figure 1</p>
					</title>
					<caption>
						<p>Description of the dataset</p>
					</caption>
					<text>
						<p>Description of the dataset. <it>D</it><sub><it>i </it></sub>datasets are arranged according to a gradual decrease in the parameter &#948;. &#948; controls the inclusion of each homologous exon sequence in the dataset by defining margins above and below (<it>y </it>= <it>x </it>&#177; &#948;) a diagonal line (<it>y </it>= <it>x</it>) that constrains clock-like behavior in the evolution of <it>C. elegans </it>and <it>D. melanogaster </it>sequences. <it>L</it><sub><it>Ce </it></sub>and <it>L</it><sub><it>Dm </it></sub>are the respective relative branch lengths of <it>C. elegans </it>and <it>D. melanogaster </it>using <it>H. sapiens </it>as reference. Comma-separated values represent the number of homologous sequences and characters aligned in the <it>M</it><sub><it>i </it></sub>concatenated matrix. <it>D</it><sub>1 </sub>contains all the sequences without any constraint of evolutionary rates. Dotted black and red lines represent mean <graphic file="gb-2005-6-5-r41-i1.gif"/>, <graphic file="gb-2005-6-5-r41-i2.gif"/> and median values, respectively.</p>
					</text>
					<graphic file="gb-2005-6-5-r41-1"/>
				</fig>
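				<p>As an illustration of how an exon could be assigned to a dataset, the following minimal Python sketch computes the three-taxon branch lengths from pairwise distances (see Materials and methods), converts them to RBLs relative to the human branch, and checks membership in the band <it>y </it>= <it>x </it>&#177; &#948; of Figure <figr fid="F1">1</figr>. The function names, the example distances and the reading of the band as |<it>L</it><sub><it>Ce </it></sub>- <it>L</it><sub><it>Dm</it></sub>| &#8804; &#948; are assumptions made for this sketch; it is not the code used in the study.</p>
				<p><![CDATA[
```python
def star_branch_lengths(d_ab, d_ac, d_bc):
    """Branch lengths of the unrooted three-taxon tree for taxa a, b and c.

    Uses l_a = (d_ab + d_ac - d_bc) / 2 and the two symmetric expressions.
    """
    l_a = (d_ab + d_ac - d_bc) / 2
    l_b = (d_ab + d_bc - d_ac) / 2
    l_c = (d_ac + d_bc - d_ab) / 2
    return l_a, l_b, l_c


def relative_branch_lengths(d_ce_dm, d_ce_hs, d_dm_hs):
    """Return (L_Ce, L_Dm): nematode and arthropod branches over the human branch."""
    l_ce, l_dm, l_hs = star_branch_lengths(d_ce_dm, d_ce_hs, d_dm_hs)
    if l_hs <= 0:
        raise ValueError("human branch length must be positive")
    return l_ce / l_hs, l_dm / l_hs


def within_band(rbl_ce, rbl_dm, delta):
    """Assumed inclusion rule: the point lies between y = x - delta and y = x + delta."""
    return abs(rbl_ce - rbl_dm) <= delta


# Hypothetical pairwise distances for one exon (not data from the study).
rbl_ce, rbl_dm = relative_branch_lengths(d_ce_dm=1.0, d_ce_hs=1.2, d_dm_hs=0.8)
print(within_band(rbl_ce, rbl_dm, delta=1.0))
```
]]></p>
				<p>Under this reading, lowering &#948; keeps only exons whose nematode and arthropod RBLs are nearly equal, which is what produces the nesting of the datasets.</p>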
				<p>To quantify the effect on the RBL of <it>C. elegans </it>of concatenating alternative homologous sequences, maximum likelihood (ML) estimates of branch length were obtained using the star-like unrooted tree transformation for each dataset (see Materials and methods). Figure <figr fid="F2">2a</figr> shows that the RBL of <it>C. elegans </it>over <it>D. melanogaster </it>decreased by approximately 30% continuously from dataset <it>D</it><sub>1 </sub>to <it>D</it><sub>8</sub>. To test whether the gradual decrease in <it>C. elegans </it>branch length was enough to produce statistical confidence in equal evolutionary rates between the nematode and the arthropod sequences, relative rate tests using two outgroup schemes were assayed on concatenated sequences (see Materials and methods). Figure <figr fid="F2">2b</figr> shows that using <it>Saccharomyces cerevisiae </it>as the unique outgroup species (OUG1), all the individual tests on the eight matrices failed to detect statistical deviations (at the 5% level family-wise) between sequences. Only when the phylogenetically weighted scheme of outgroup species (OUG2) was used did the relative rate test detect significant deviation of clock behavior from <it>D</it><sub>1 </sub>to <it>D</it><sub>5 </sub>datasets. We are therefore confident that the arthropod and nematode concatenated sequences of the <it>M</it><sub>6</sub>, <it>M</it><sub>7</sub>, and <it>M</it><sub>8 </sub>matrices meet the desired clock-like conditions to test the Coelomata and Ecdysozoa hypotheses and exclude any artifacts derived from a possible LBAE. This result supports previous work suggesting that the genetic distance between ingroup and outgroup modifies the power of the relative rate test <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>.</p>
				<fig id="F2">
					<title>
						<p>Figure 2</p>
					</title>
					<caption>
						<p>Relative rate test</p>
					</caption>
					<text>
						<p>Relative rate test. <b>(a) </b>Relative <it>C. elegans </it>branch lengths derived from each one of the eight <it>M</it><sub><it>i </it></sub>matrices. Maximum likelihood estimates are expressed as relative distance units of <it>D. melanogaster</it>. <b>(b) </b>Relative rate test probability values evaluated at the 5% level family-wise (red line 1.7%). OUG1, <it>S. cerevisiae</it>; OUG2, phylogenetically weighted scheme using <it>S. cerevisiae</it>, <it>A. thaliana</it>, <it>O. sativa </it>and <it>P. falciparum </it>as outgroup species.</p>
					</text>
					<graphic file="gb-2005-6-5-r41-2"/>
				</fig>
				<p>To test whether concatenated matrices carry sufficient phylogenetic signal, the ML mapping method was used. The compound posterior probability point (<it>P</it>) for all the possible quartets of each <it>M</it><sub><it>i </it></sub>matrix could be placed, with almost equivalent values (approximately 33%), inside the corner areas of the equilateral triangle probability surface (see <supplr sid="S1">Additional data file 1</supplr>). Thus, concatenated matrices derived from selecting a different number of homologous sequences contained sufficient phylogenetic signal to represent topologies as strictly bifurcating trees. Finally, using the Akaike information criterion (AIC) <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>, the best-fit model of sequence evolution for each dataset was selected from six alternatives (see Materials and methods). As none of the models are nested and all share the same number of parameters, the best model was the one with the greatest log likelihood. The WAG amino-acid replacement matrix <abbrgrp><abbr bid="B22">22</abbr></abbrgrp> adjusted for frequencies (+<it>F</it>), rate heterogeneity (+&#915;) and invariable sites (+<it>I</it>) was the best evolutionary model for all the datasets. Moreover, the fit of the models to the data followed the same ranking independently of the dataset (WAG <abbrgrp><abbr bid="B22">22</abbr></abbrgrp> &gt; VT <abbrgrp><abbr bid="B23">23</abbr></abbrgrp> &gt; BLOSUM62 <abbrgrp><abbr bid="B24">24</abbr></abbrgrp> &gt; JTT <abbrgrp><abbr bid="B25">25</abbr></abbrgrp> &gt; PAM <abbrgrp><abbr bid="B26">26</abbr></abbrgrp> &gt; mtREV24 <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>), suggesting that the best models were those that consider more distantly related amino-acid sequences.</p>
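				<p>The reason the model comparison reduces to a comparison of log likelihoods can be seen in a short Python sketch. With AIC = 2<it>k </it>- 2 ln <it>L </it>and the same number of parameters <it>k </it>for every model, the smallest AIC always belongs to the model with the largest ln <it>L</it>. The function names, the parameter count and the log-likelihood values below are invented for illustration only and are not results from this study.</p>
				<p><![CDATA[
```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def best_model(log_likelihoods, n_params):
    """Name of the model with the lowest AIC when all models share n_params."""
    return min(log_likelihoods, key=lambda name: aic(log_likelihoods[name], n_params))


# Made-up log likelihoods, purely illustrative.
example = {"WAG": -1000.0, "VT": -1010.0, "JTT": -1025.5}
print(best_model(example, n_params=20))
```
]]></p>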
			</sec>
			<sec>
				<st>
					<p>The clade Coelomata disappears under clock conditions</p>
				</st>
				<p>Distance and ML phylogenetic methods were used on all the datasets (see Materials and methods). Figure <figr fid="F3">3</figr> shows phylogenetic reconstructions and statistical support for the two extreme conditions of the nested datasets. Whereas the <it>M</it><sub>1 </sub>matrix supported the Coelomata tree with the highest statistical confidence, <it>M</it><sub>8 </sub>showed the same result for the Ecdysozoa tree. Thus, by decreasing the RBL of <it>C. elegans</it>, the statistical support switched from the Coelomata to the Ecdysozoa hypothesis. Figure <figr fid="F4">4</figr> shows that, whichever phylogenetic method was used, <it>C. elegans </it>bootstrap support between datasets and topologies changed in agreement with the gradual RBL decrement. Specifically, using <it>M</it><sub>1 </sub>and <it>M</it><sub>8 </sub>(the matrices showing the most extreme evolutionary rate conditions for <it>C. elegans </it>and <it>D. melanogaster </it>sequences - from a clock-absent to the most adjusted behavior), the statistical support moved from Coelomata to Ecdysozoa. The same occurred with <it>M</it><sub>2 </sub>and <it>M</it><sub>7</sub>. Alternatively, using <it>M</it><sub>3 </sub>and <it>M</it><sub>6</sub>, only one of the two distance and ML methods (Figure <figr fid="F4">4a,b</figr>) provided sufficient support (90% or more) to the hypothesis. Finally, using <it>M</it><sub>4 </sub>and <it>M</it><sub>5</sub>, only one distance method supported Coelomata and Ecdysozoa with confidence. Given that datasets differed principally in the RBL of <it>C. elegans </it>over <it>D. melanogaster</it>, the gradual change in topology strongly favors an LBAE between <it>C. elegans </it>and the more basal species. To test whether a paired-sites test <abbrgrp><abbr bid="B28">28</abbr></abbrgrp> supports the bootstrap conclusions, Shimodaira-Hasegawa (SH) and expected-likelihood weight (ELW) tests were evaluated on the datasets (see Materials and methods).</p>
				<fig id="F3">
					<title>
						<p>Figure 3</p>
					</title>
					<caption>
						<p>Phylogenetic trees</p>
					</caption>
					<text>
						<p>Phylogenetic trees. Trees derived from the <it>M</it><sub>1 </sub>and <it>M</it><sub>8 </sub>datasets support (a) the Coelomata and (b) the Ecdysozoa hypothesis, respectively. From left to right or top to bottom, values beside nodes show the maximum likelihood reliability values of the quartet-puzzling tree and bootstrap values using maximum likelihood, least squares, and neighbor-joining methods, respectively. Values in red show the support for (a) Coelomata and (b) Ecdysozoa nodes. Red branches display distances between <it>C. elegans </it>and <it>D. melanogaster</it>. Smaller trees are minimal representations of both hypotheses.</p>
					</text>
					<graphic file="gb-2005-6-5-r41-3"/>
				</fig>
				<fig id="F4">
					<title>
						<p>Figure 4</p>
					</title>
					<caption>
						<p>Bootstrap and reliability support for alternative topologies</p>
					</caption>
					<text>
						<p>Bootstrap and reliability support for alternative topologies. Bootstrap and reliability support (50% majority consensus rule) for Coelomata (C) and Ecdysozoa (E) hypotheses derived from each one of the eight <it>M</it><sub><it>i </it></sub>matrices. <b>(a) </b>Distance methods. LS, least squares; NJ, neighbor joining. <b>(b) </b>Maximum likelihood, using PHYLIP (ph) and PUZZLE (pz). Highly supported trees were considered those with values above 90% (dotted red line).</p>
					</text>
					<graphic file="gb-2005-6-5-r41-4"/>
				</fig>
				<p>Figure <figr fid="F5">5</figr> shows the assessment of paired-sites tests for the two competing trees on all the datasets. The topology supported by the paired-sites tests (<it>p </it>&gt; 0.05) changed almost gradually across the datasets. Figure <figr fid="F5">5a</figr> and <figr fid="F5">5b</figr> show that the SH test is more conservative than the ELW <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>. Using matrices <it>M</it><sub>1 </sub>and <it>M</it><sub>2</sub>, both tests strongly rejected the Ecdysozoa hypothesis, whereas <it>M</it><sub>6</sub>, <it>M</it><sub>7</sub>, and <it>M</it><sub>8 </sub>rejected the Coelomata tree. Interestingly, the intermediate datasets did not reject either topology with sufficient statistical evidence. We can conclude that by decreasing the RBL of <it>C. elegans </it>over <it>D. melanogaster </it>by around 13% (Figure <figr fid="F2">2a</figr>) the LBAE favoring the Coelomata hypothesis disappears and we can confirm that under strict conditions of clock-like behavior, the Coelomata hypothesis was strongly rejected by paired-sites tests and bootstrap support.</p>
				<fig id="F5">
					<title>
						<p>Figure 5</p>
					</title>
					<caption>
						<p>Paired-sites tests</p>
					</caption>
					<text>
						<p>Paired-sites tests. <it>p</it>-values inferred from paired-sites tests considering Coelomata (C) and Ecdysozoa (E) hypotheses at the 5% level (red line) for all the datasets. (<b>a</b>) Shimodaira-Hasegawa test (SH); (<b>b</b>) expected-likelihood weight method (ELW).</p>
					</text>
					<graphic file="gb-2005-6-5-r41-5"/>
				</fig>
				<p>To test whether the shortness of the evolutionary distances between <it>C. elegans </it>and <it>D. melanogaster </it>resulting from the above filtering method biased topology over the common ancestry of arthropods and nematodes, we searched for chordate, arthropod, and nematode sequences showing clock-like behavior between them. To increase the probability of finding sequences to fit the criteria, we focused on sequences from the most closely related chordate to the moulting species, that is, the ascidian <it>Ciona intestinalis</it>. Only 14 exon sequences met the above criteria. A relative rate test showed that the probability of a perfect clock-like behavior was <it>p </it>= 0.515 for <it>C. elegans </it>and <it>D. melanogaster</it>, <it>p </it>= 0.308 for <it>C. intestinalis </it>and <it>D. melanogaster </it>and <it>p </it>= 0.712 for <it>C. intestinalis </it>and <it>C. elegans</it>. The ML mapping method showed that the concatenation of all the 810 characters carried sufficient phylogenetic signal in the matrix to represent a strictly bifurcating tree (see Additional data file 2). Despite the reduced number of characters, phylogenetic analysis showed significant support for the Ecdysozoa hypothesis. Using distance and ML methods, bootstrap values reached 97%. Moreover, the Ecdysozoa hypothesis was accepted with a probability of <it>p </it>= 1.00 and <it>p </it>= 0.997 when SH and ELW paired-sites tests, respectively, were performed. Conversely, the Coelomata hypothesis was rejected at <it>p </it>= 0.006 and <it>p </it>= 0.0023, respectively.</p>
			</sec>
			<sec>
				<st>
					<p>The clade Coelomata disappears by removing fast-evolving sequences of <it>C. elegans</it></p>
				</st>
				<p>In order to discard a probable biased selection of exon sequences favoring the Ecdysozoa hypothesis, two additional matrices were built by removing from the original dataset (<it>D</it><sub>1</sub>) the exons in which the <it>C. elegans </it>sequences evolved at a faster rate. Figure <figr fid="F6">6</figr> shows that by removing the fastest 15% of total exon sequences the reliability of the Coelomata hypothesis is reduced from 100% to 78%. Moreover, when the fastest 30% of all exons were removed, the topology changes to Ecdysozoa with 90% confidence level. The change in topology in parallel with the reduction of the <it>C. elegans </it>branch length points to the LBAE as the main obstacle to obtaining the true phylogenetic relationship between chordates, arthropods and nematodes. We conclude that the Ecdysozoa hypothesis does not depend on adjusting a particular set of homologous exon sequences to clock-like behavior.</p>
				<fig id="F6">
					<title>
						<p>Figure 6</p>
					</title>
					<caption>
						<p>Removing fast-evolving sequences</p>
					</caption>
					<text>
						<p>Removing fast-evolving sequences. Exon sequences of <it>C. elegans </it>showing <it>L</it><sub><it>Ce </it></sub>&#8805; <graphic file="gb-2005-6-5-r41-i1.gif"/> = 4.06 represent 15% of the total exons. When these faster exons were removed (above blue line), support for the Coelomata topology was reduced from the original 100% to 85%. Furthermore, when 28% of the faster exons were deleted (red line), Ecdysozoa was recovered with 90% statistical support. This suggests that LBAE is the main problem in obtaining the Ecdysozoa tree. Blue line, <graphic file="gb-2005-6-5-r41-i1.gif"/> = 4.06; red line, <graphic file="gb-2005-6-5-r41-i2.gif"/> = 2.66.</p>
					</text>
					<graphic file="gb-2005-6-5-r41-6"/>
				</fig>
			</sec>
		</sec>
		<sec>
			<st>
				<p>Discussion</p>
			</st>
			<p>There are many reasons why the Coelomata-Ecdysozoa problem should be considered the most puzzling problem in animal systematics and a major open-ended subject in evolutionary biology. The monophyly of the Ecdysozoa group, strongly championed by the evo-devo community <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>, was originally deduced, and continually recovered, through the analysis of different single-gene sequences <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp>, sometimes in combination with morphological characters <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>. There is need for caution, however, as previous studies had shown that individual genes are not sufficient to estimate the correct genome phylogeny <abbrgrp><abbr bid="B19">19</abbr><abbr bid="B31">31</abbr></abbrgrp>. Furthermore, the reliability of some of the phylogenetic markers used to derive Ecdysozoa has been seriously questioned <abbrgrp><abbr bid="B32">32</abbr><abbr bid="B33">33</abbr></abbrgrp>. Those who consider the Ecdysozoa hypothesis more plausible insist that the Coelomata topology is an artifact of LBAE, derived from the fact that nematode genomes, particularly that of <it>C. elegans</it>, evolve at higher rates <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>, and are consequently displaced to a more basal position.</p>
			<p>On the other hand, as phylogenetic reconstruction assumes that sampled data are representative of the whole genome from which they are drawn <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>, there is increasing agreement to consider genome-scale analysis more accurate than single-gene analysis when deciding between conflicting topologies <abbrgrp><abbr bid="B19">19</abbr><abbr bid="B31">31</abbr></abbrgrp>. Conflict derives from the fact that all previous genome-wide phylogenetic attempts to test the hypothesis have failed to confirm the 'moulting group' - the Ecdysozoa - as a clade. All phylogenomic analyses carried out to date favor the Coelomata hypothesis with the highest statistical support <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr></abbrgrp>. Furthermore, the Coelomata tree has been shown to be robust to criticism deriving from LBAE <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr></abbrgrp> and nematode species inclusion <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. Those who consider the Coelomata hypothesis to be more appropriate insist that longer sequences, rather than extensive taxon sampling <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>, will more effectively improve the accuracy of phylogenetic inference <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr><abbr bid="B36">36</abbr><abbr bid="B37">37</abbr></abbrgrp>, and emphasize that an inevitable trade-off exists between the number of characters and the number of species used in the study <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>.</p>
			<p>We show here that, even when the fast-evolving nematode <it>C. elegans </it>is used, the Ecdysozoa can be recovered using genome-scale phylogenetic analysis. Our analysis has been performed over the largest number of eukaryotic genomes and over the largest number of amino-acid residues ever used to test the hypothesis. The major differences from previous genomic approaches are threefold. First, we used a large number of short conserved sequences (around 50 amino acids long) derived from human homologous exon sequences. Only exon sequences derived from eight genes, out of a total of around 100 analyzed by Blair <it>et al</it>. <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>, were used in our analysis. The remaining genes contained in the 18 human chromosomes did not pass the BLAST filters applied in the analysis. Second, we arranged the dataset such that the sequences, including those evolving faster or slower, were included if they met the condition of equal rate of change between two (<it>C. elegans </it>and <it>D. melanogaster</it>) or three species (<it>C. intestinalis</it>, <it>D. melanogaster </it>and <it>C. elegans</it>). Third, we used a large number of characters (amino-acid residues) and a weighted scheme of distant outgroup species to enhance the power of the relative rate test <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>.</p>
			<p>As discussed in our previous paper <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>, by including or excluding certain human homologous exon sequences, we reduced the problem of LBAE and added a probable bias favoring Coelomata. The present work confirms that this bias exists. The concatenation and the posterior phylogenetic analysis of the sequences shared by the eukaryotes used in this analysis provide a viable solution to the ancestor-descendant relationships of animal species once the LBAE is removed.</p>
		</sec>
		<sec>
			<st>
				<p>Conclusions</p>
			</st>
			<p>Acceptance of the new animal phylogeny and the Ecdysozoa hypothesis would provide a new scheme to understand the Cambrian explosion <abbrgrp><abbr bid="B38">38</abbr><abbr bid="B39">39</abbr></abbrgrp> and the origin of metazoan body plans <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B30">30</abbr></abbrgrp> and consequently would set a new phylogenetic framework for comparative genomics <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>. We have shown how phylogenetic reconstruction based on whole-genome sequences has the potential to solve one of the most controversial hypotheses in animal evolution: the reliability of the Ecdysozoa clade.</p>
		</sec>
		<sec>
			<st>
				<p>Materials and methods</p>
			</st>
			<sec>
				<st>
					<p>Dataset collection</p>
				</st>
				<p>Complete genome sequences from <it>Plasmodium falciparum </it><abbrgrp><abbr bid="B41">41</abbr></abbrgrp>, <it>Arabidopsis thaliana </it><abbrgrp><abbr bid="B42">42</abbr></abbrgrp>, <it>Oryza sativa </it><abbrgrp><abbr bid="B43">43</abbr></abbrgrp>, <it>Saccharomyces cerevisae </it><abbrgrp><abbr bid="B44">44</abbr></abbrgrp>, <it>Caenorhabditis elegans </it><abbrgrp><abbr bid="B45">45</abbr></abbrgrp>, <it>Anopheles gambiae </it><abbrgrp><abbr bid="B46">46</abbr></abbrgrp>, <it>Drosophila melanogaster </it><abbrgrp><abbr bid="B47">47</abbr></abbrgrp>, <it>Ciona intestinalis </it><abbrgrp><abbr bid="B48">48</abbr></abbrgrp>, <it>Fugu rubripes </it><abbrgrp><abbr bid="B49">49</abbr></abbrgrp>, <it>Mus musculus </it><abbrgrp><abbr bid="B50">50</abbr></abbrgrp> and <it>Homo sapiens </it><abbrgrp><abbr bid="B51">51</abbr></abbrgrp> were downloaded and formatted to run local BLAST <abbrgrp><abbr bid="B52">52</abbr></abbrgrp>. Amino-acid sequences corresponding to all the gene exons in a sample of 18 human chromosome including 6-18, 20-22, X and Y (approximately 14,000 genes and 140,000 exons), were obtained from the Ensembl database project <abbrgrp><abbr bid="B53">53</abbr></abbrgrp>. Human paralogous exons were excluded by running local blastp <abbrgrp><abbr bid="B52">52</abbr></abbrgrp> on a human exon database built <it>ad hoc</it>. Only the best of those sequences, with more than a single hit with a fraction of aligned and conserved amino-acid sequence &#8805; 95% and &#8805; 90% respectively, were retained to find homologous sequences in the other eukaryotic species (threshold values based on a previous human paralogous study <abbrgrp><abbr bid="B54">54</abbr></abbrgrp>). We used tblastn <abbrgrp><abbr bid="B52">52</abbr></abbrgrp> that searches a query amino-acid sequence on the six translation frames of the target sequence to search for homology in the complete genome databases of the species mentioned above. 
Exons shorter than 22 amino acids were removed from the analysis. Each best tblastn hit was filtered using a threshold e-value (&#8804; 1<it>e</it>-03) and a threshold proportion of the query over the subject sequence length (&#8805; 75%). Only those exons that passed the filter conditions in all species were selected as the final dataset of human exon homologous sequences. All the exon homologous sequences were aligned using Clustal W <abbrgrp><abbr bid="B55">55</abbr></abbrgrp> with default parameters. The total number of homologous sequences, derived from 18 human chromosomes, corresponds to 1,192 exons selected from 610 known genes, adding up to more than 55,500 amino-acid characters.</p>
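				<p>As an illustration only, the exon filtering criteria described above can be sketched in Python. The record type BestHit, the function names and the reading of the 75% threshold as the fraction of the query length covered by the alignment are assumptions made for this sketch; they are not taken from the original pipeline.</p>

```python
from dataclasses import dataclass
from typing import Dict, Iterable

MIN_EXON_LENGTH = 22   # amino acids; shorter exons are discarded
MAX_EVALUE = 1e-3      # e-value threshold for the best tblastn hit
MIN_COVERAGE = 0.75    # assumed meaning: aligned length / query length


@dataclass(frozen=True)
class BestHit:
    """Best tblastn hit of one human exon in one genome (hypothetical record)."""
    evalue: float
    aligned_length: int  # number of query residues covered by the alignment


def hit_passes(query_length: int, hit: BestHit) -> bool:
    """Apply the e-value and coverage thresholds to a single best hit."""
    coverage = hit.aligned_length / query_length
    return MAX_EVALUE >= hit.evalue and coverage >= MIN_COVERAGE


def exon_retained(query_length: int,
                  best_hits: Dict[str, BestHit],
                  species: Iterable[str]) -> bool:
    """Keep an exon only if it is long enough and passes in every species."""
    if MIN_EXON_LENGTH > query_length:
        return False
    return all(
        name in best_hits and hit_passes(query_length, best_hits[name])
        for name in species
    )
```

				<p>In this sketch an exon lacking a hit in any one species is rejected, which mirrors the requirement that an exon pass the filter conditions in all species.</p>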
				<p>To arrange homologous sequences in different datasets, pairwise distances between sequences were extracted using the PROTDIST program (Kimura option) of the PHYLIP package <abbrgrp><abbr bid="B56">56</abbr></abbrgrp>. Distances between <it>C. elegans</it>, <it>D. melanogaster </it>and <it>H. sapiens </it>were transformed into branch lengths in a star-like unrooted tree (<it>l</it><sub><it>a </it></sub>= (<it>d</it><sub><it>ab </it></sub>+ <it>d</it><sub><it>ac </it></sub>- <it>d</it><sub><it>bc</it></sub>)/2, where <it>l</it><sub><it>a </it></sub>is the length of the branch leading to <it>a </it>and <it>d</it><sub><it>ab</it></sub>, <it>d</it><sub><it>ac</it></sub>, <it>d</it><sub><it>bc </it></sub>are the distances between <it>a </it>and <it>b</it>, <it>a </it>and <it>c</it>, and <it>b </it>and <it>c</it>, respectively). It is important to emphasize that we do not assume that the phylogenetic relationships of <it>C. elegans</it>, <it>D. melanogaster </it>and <it>H. sapiens </it>form a star topology. We used this exact equation to determine the branch lengths of the three species because the only way to arrange three species in an unrooted phylogenetic tree is a star topology. We consider <it>C. elegans</it>, <it>D. melanogaster </it>and <it>H. sapiens </it>to be members of the ingroup and <it>P. falciparum</it>, <it>A. thaliana</it>, <it>O. sativa </it>and <it>S. cerevisiae </it>to be the outgroup species used to root the phylogenetic tree. Homologous exon sequences were arranged in eight datasets according to whether they fall within increasingly inclusive areas surrounding the straight line representing identical relative branch lengths (RBLs) of <it>C. elegans </it>(<it>L</it><sub><it>Ce </it></sub>= <it>l</it><sub><it>Ce</it></sub>/<it>l</it><sub><it>Hs</it></sub>) and <it>D. melanogaster </it>(<it>L</it><sub><it>Dm </it></sub>= <it>l</it><sub><it>Dm</it></sub>/<it>l</it><sub><it>Hs</it></sub>). 
The <it>D</it><sub><it>i </it></sub>dataset clusters all the homologous exon alignments where <it>L</it><sub><it>Dm </it></sub>- &#948;<sub><it>i </it></sub>&#8804; <it>L</it><sub><it>Ce </it></sub>&#8804; <it>L</it><sub><it>Dm </it></sub>+ &#948;<sub><it>i</it></sub>, where <it>i </it>is an integer ranging from 2 to 8 and &#948;<sub><it>i </it></sub>= 5.0, 3.0, 2.5, 2.0, 1.5, 1.0, 0.5. The <it>D</it><sub>1 </sub>dataset contains all the exon homologous sequences without the constraints of evolutionary rates. Exons with negative or undefined normalized distances (<it>l</it><sub><it>Hs </it></sub>= 0) were excluded from the analysis. All the aligned homologous exon sequences of the <it>D</it><sub><it>i </it></sub>dataset were concatenated in the <it>M</it><sub><it>i </it></sub>matrix. Three additional matrices were derived from <it>D</it><sub>1</sub>: two by removing exons with <it>L</it><sub><it>Ce </it></sub>&#8805; <graphic file="gb-2005-6-5-r41-i1.gif"/> and <it>L</it><sub><it>Ce </it></sub>&#8805; <graphic file="gb-2005-6-5-r41-i2.gif"/>, and the last one by adjusting the sequences of <it>C. intestinalis</it>, <it>D. melanogaster </it>and <it>C. elegans </it>to clock-like behavior.</p>
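				<p>The branch-length transformation and the assignment of exons to datasets can be sketched as follows. This is a minimal illustration, not the original code: the function names are hypothetical, the seven thresholds are assumed to be 5.0, 3.0, 2.5, 2.0, 1.5, 1.0 and 0.5 for datasets 2 to 8, and treating any negative branch length as grounds for exclusion is our reading of the exclusion rule.</p>

```python
from typing import List, Optional, Tuple

# Assumed thresholds for datasets D2..D8; D1 is unconstrained.
DELTAS = {2: 5.0, 3: 3.0, 4: 2.5, 5: 2.0, 6: 1.5, 7: 1.0, 8: 0.5}


def star_branch_lengths(d_ab: float, d_ac: float,
                        d_bc: float) -> Tuple[float, float, float]:
    """Branch lengths of the three-taxon star tree from pairwise distances."""
    l_a = (d_ab + d_ac - d_bc) / 2
    l_b = (d_ab + d_bc - d_ac) / 2
    l_c = (d_ac + d_bc - d_ab) / 2
    return l_a, l_b, l_c


def relative_branch_lengths(d_ce_dm: float, d_ce_hs: float,
                            d_dm_hs: float) -> Optional[Tuple[float, float]]:
    """Return (L_Ce, L_Dm), or None when the exon must be excluded."""
    l_ce, l_dm, l_hs = star_branch_lengths(d_ce_dm, d_ce_hs, d_dm_hs)
    # Assumption: a zero human branch (undefined ratio) or any negative
    # branch length excludes the exon.
    if 0 >= l_hs or 0 > l_ce or 0 > l_dm:
        return None
    return l_ce / l_hs, l_dm / l_hs


def datasets_for_exon(rbl_ce: float, rbl_dm: float) -> List[int]:
    """Indices i of every dataset D_i that contains the exon."""
    members = [1]  # D1 holds every retained exon
    for i, delta in DELTAS.items():
        # Equivalent to L_Dm - delta_i at most L_Ce at most L_Dm + delta_i.
        if delta >= abs(rbl_ce - rbl_dm):
            members.append(i)
    return members
```

				<p>Because the thresholds decrease with <it>i</it>, the datasets are nested: an exon that belongs to a given constrained dataset also belongs to every dataset with a larger threshold.</p>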
			</sec>
			<sec>
				<st>
					<p>Phylogenetic methods</p>
				</st>
				<p>The relative rate test was performed at the 5% statistical level by means of the RRTree program <abbrgrp><abbr bid="B57">57</abbr></abbrgrp> using outgroups with one (<it>S. cerevisiae</it>; OUG1) or more species (<it>S. cerevisiae</it>, <it>A. thaliana</it>, <it>O. sativa </it>and <it>P. falciparum</it>; OUG2). In the latter case, an explicit weighted phylogenetic scheme was chosen (1/2 <it>S. cerevisiae</it>, ((1/8 <it>A. thaliana</it>, 1/8 <it>O. sativa</it>), 1/4 <it>P. falciparum</it>)). Given that three ingroups were set for all analyses (the chordates <it>H. sapiens</it>, <it>M. musculus</it>, <it>F. rubripes</it>, and <it>C. intestinalis</it>; the arthropods <it>Anopheles gambiae </it>and <it>Drosophila melanogaster</it>; and the nematode <it>C. elegans</it>), the threshold value was corrected for multiple testing to 5/3 = 1.7%. TREE-PUZZLE <abbrgrp><abbr bid="B58">58</abbr></abbrgrp> was used to evaluate six alternative evolutionary models adjusted for frequencies (+<it>F</it>), site rate variation (+&#915; distribution with two rates) and a proportion of invariable sites (+<it>I</it>), to estimate the amount of evolutionary information of datasets by the likelihood-mapping method <abbrgrp><abbr bid="B59">59</abbr></abbrgrp>, to derive the maximum likelihood (ML) trees using the quartet-puzzling algorithm, to set the ML pairwise sequence distances, and to test alternative topologies using SH <abbrgrp><abbr bid="B60">60</abbr></abbrgrp> and ELW <abbrgrp><abbr bid="B29">29</abbr></abbrgrp> tests. The PROML (JTT+f) program of the PHYLIP package <abbrgrp><abbr bid="B56">56</abbr></abbrgrp> was used to estimate ML trees derived from the stepwise addition algorithm. 
Distance-based phylogenetic reconstruction was performed on 100 bootstrap replicates using the PHYLIP programs PROTDIST (JTT and Kimura options), NEIGHBOR (neighbor-joining (NJ) <abbrgrp><abbr bid="B61">61</abbr></abbrgrp>) and least squares (LS) <abbrgrp><abbr bid="B62">62</abbr></abbrgrp> algorithms, and CONSENSE (50% majority-consensus rule option).</p>
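				<p>The arithmetic behind the corrected significance threshold and the outgroup weighting scheme can be checked with a short sketch. Describing the correction as Bonferroni-style is our interpretation of dividing the 5% level by the three ingroups, and the names used below are hypothetical.</p>

```python
from fractions import Fraction

NOMINAL_LEVEL = Fraction(5, 100)  # 5% level quoted in the text
N_INGROUPS = 3                    # chordates, arthropods, nematode


def corrected_level(nominal: Fraction, n_tests: int) -> Fraction:
    """Divide the nominal level evenly among the tests (Bonferroni-style)."""
    return nominal / n_tests


# Weights of the OUG2 outgroup scheme as quoted in the text.
OUG2_WEIGHTS = {
    "S. cerevisiae": Fraction(1, 2),
    "A. thaliana": Fraction(1, 8),
    "O. sativa": Fraction(1, 8),
    "P. falciparum": Fraction(1, 4),
}


def weights_are_normalized(weights: dict) -> bool:
    """True when the outgroup weights add up to exactly one."""
    return sum(weights.values()) == 1
```

				<p>Exact fractions are used so that the corrected level, 5/3 of one percent, and the sum of the weights are not affected by floating-point rounding.</p>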
			</sec>
		</sec>
		<sec>
			<st>
				<p>Additional data files</p>
			</st>
			<p>The following additional data files are available with the online version of this paper. Additional data file <supplr sid="S1">1</supplr> contains a figure showing ML puzzle mapping of the <it>M</it><sub><it>i </it></sub>matrices. 
Additional data file <supplr sid="S2">2</supplr> contains a figure showing ML puzzle mapping of the matrix derived from chordate, arthropod and nematode sequences showing clock-like behavior. Additional data file <supplr sid="S3">3</supplr> contains the matrices.</p>
			<suppl id="S1">
				<title>
					<p>Additional File 1</p>
				</title>
				<caption>
					<p>ML puzzle mapping of the <it>M</it><sub><it>i </it></sub>matrices.</p>
				</caption>
				<text>
					<p><b>ML puzzle mapping of the <it>M</it><sub><it>i </it></sub>matrices. </b>Maximum likelihood mapping results for each of the <it>M</it><sub><it>i </it></sub>concatenated matrices. Panels run from left to right, from M1 and M2 in the first row to M7 and M8 in the fourth row.</p>
				</text>
				<file name="gb-2005-6-5-r41-S1.pdf">
					<p>Click here for file</p>
				</file>
			</suppl>
			<suppl id="S2">
				<title>
					<p>Additional File 2</p>
				</title>
				<caption>
					<p>ML puzzle mapping of the matrix derived from chordate, arthropod and nematode sequences showing clock-like behavior.</p>
				</caption>
				<text>
					<p><b>ML puzzle mapping of the matrix derived from chordate, arthropod and nematode sequences showing clock-like behavior.</b> ML mapping of the concatenated matrix derived by constraining the sequences to clock-like behavior.</p>
				</text>
				<file name="gb-2005-6-5-r41-S2.pdf">
					<p>Click here for file</p>
				</file>
			</suppl>
			<suppl id="S3">
				<title>
					<p>Additional File 3</p>
				</title>
				<caption>
					<p>Matrices</p>
				</caption>
				<text>
					<p><b>Matrices. </b>The full set of matrices (PHYLIP format) used in the phylogenetic analyses.</p>
				</text>
				<file name="gb-2005-6-5-r41-S3.txt">
					<p>Click here for file</p>
				</file>
			</suppl>
		</sec>
	</bdy>
	<bm>
		<ack>
			<sec>
				<st>
					<p>Acknowledgements</p>
				</st>
				<p>We especially thank Javier Santoyo and the Bioinformatics department members at the Centro de Investigaci&#243;n Pr&#237;ncipe Felipe. We thank J. Castresana, D. Posada and R. Zardoya for comments and suggestions, and M. Robinson-Rechavi for updating the code of the RRTree software. Special thanks go to Amanda Wren for her revision of the English. H.D. acknowledges the support of Fundaci&#243;n Carolina and Fundaci&#243;n la Caixa.</p>
			</sec>
		</ack>
		<refgrp>
			<bibl id="B1">
				<title>
					<p>Animal evolution. The end of the intermediate taxa?</p>
				</title>
				<aug>
					<au>
						<snm>Adoutte</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Balavoine</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Lartillot</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>de Rosa</snm>
						<fnm>R</fnm>
					</au>
				</aug>
				<source>Trends Genet</source>
				<pubdate>1999</pubdate>
				<volume>15</volume>
				<fpage>104</fpage>
				<lpage>108</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0168-9525(98)01671-0</pubid>
						<pubid idtype="pmpid" link="fulltext">10203807</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B2">
				<aug>
					<au>
						<snm>Raff</snm>
						<fnm>RR</fnm>
					</au>
				</aug>
				<source>The Shape of Life. Genes, Development and the Evolution of Animal Form</source>
				<publisher>Chicago: The University of Chicago Press</publisher>
				<pubdate>1996</pubdate>
			</bibl>
			<bibl id="B3">
				<title>
					<p>Evidence for a clade of nematodes, arthropods and other moulting animals.</p>
				</title>
				<aug>
					<au>
						<snm>Aguinaldo</snm>
						<fnm>AM</fnm>
					</au>
					<au>
						<snm>Turbeville</snm>
						<fnm>JM</fnm>
					</au>
					<au>
						<snm>Linford</snm>
						<fnm>LS</fnm>
					</au>
					<au>
						<snm>Rivera</snm>
						<fnm>MC</fnm>
					</au>
					<au>
						<snm>Garey</snm>
						<fnm>JR</fnm>
					</au>
					<au>
						<snm>Raff</snm>
						<fnm>RA</fnm>
					</au>
					<au>
						<snm>Lake</snm>
						<fnm>JA</fnm>
					</au>
				</aug>
				<source>Nature</source>
				<pubdate>1997</pubdate>
				<volume>387</volume>
				<fpage>489</fpage>
				<lpage>493</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/387489a0</pubid>
						<pubid idtype="pmpid">9168109</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B4">
				<title>
					<p>The origin and evolution of model organisms.</p>
				</title>
				<aug>
					<au>
						<snm>Hedges</snm>
						<fnm>SB</fnm>
					</au>
				</aug>
				<source>Nat Rev Genet</source>
				<pubdate>2002</pubdate>
				<volume>3</volume>
				<fpage>838</fpage>
				<lpage>849</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/nrg929</pubid>
						<pubid idtype="pmpid" link="fulltext">12415314</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B5">
				<title>
					<p>Testing the new animal phylogeny: first use of combined large-subunit and small-subunit rRNA gene sequences to classify the protostomes.</p>
				</title>
				<aug>
					<au>
						<snm>Mallatt</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Winchell</snm>
						<fnm>CJ</fnm>
					</au>
				</aug>
				<source>Mol Biol Evol</source>
				<pubdate>2002</pubdate>
				<volume>19</volume>
				<fpage>289</fpage>
				<lpage>301</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">11861888</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B6">
				<title>
					<p>A phylogenetic analysis of myosin heavy chain type II sequences corroborates that Acoela and Nemertodermatida are basal bilaterians.</p>
				</title>
				<aug>
					<au>
						<snm>Ruiz-Trillo</snm>
						<fnm>I</fnm>
					</au>
					<au>
						<snm>Paps</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Loukota</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Ribera</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Jondelius</snm>
						<fnm>U</fnm>
					</au>
					<au>
						<snm>Baguna</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Riutort</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>2002</pubdate>
				<volume>99</volume>
				<fpage>11246</fpage>
				<lpage>11251</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">123241</pubid>
						<pubid idtype="pmpid" link="fulltext">12177440</pubid>
						<pubid idtype="doi">10.1073/pnas.172390199</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B7">
				<title>
					<p>Animal phylogeny and the ancestry of bilaterians: inferences from morphology and 18S rDNA gene sequences.</p>
				</title>
				<aug>
					<au>
						<snm>Peterson</snm>
						<fnm>KJ</fnm>
					</au>
					<au>
						<snm>Eernisse</snm>
						<fnm>DJ</fnm>
					</au>
				</aug>
				<source>Evol Dev</source>
				<pubdate>2001</pubdate>
				<volume>3</volume>
				<fpage>170</fpage>
				<lpage>205</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1046/j.1525-142x.2001.003003170.x</pubid>
						<pubid idtype="pmpid" link="fulltext">11440251</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B8">
				<title>
					<p>The comparison of beta-thymosin homologues among metazoa supports an arthropod-nematode clade.</p>
				</title>
				<aug>
					<au>
						<snm>Manuel</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Kruse</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Muller</snm>
						<fnm>WE</fnm>
					</au>
					<au>
						<snm>Le Parco</snm>
						<fnm>Y</fnm>
					</au>
				</aug>
				<source>J Mol Evol</source>
				<pubdate>2000</pubdate>
				<volume>51</volume>
				<fpage>378</fpage>
				<lpage>381</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">11040289</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B9">
				<title>
					<p>Hox genes in brachiopods and priapulids and protostome evolution.</p>
				</title>
				<aug>
					<au>
						<snm>de Rosa</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Grenier</snm>
						<fnm>JK</fnm>
					</au>
					<au>
						<snm>Andreeva</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Cook</snm>
						<fnm>CE</fnm>
					</au>
					<au>
						<snm>Adoutte</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Akam</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Carroll</snm>
						<fnm>SB</fnm>
					</au>
					<au>
						<snm>Balavoine</snm>
						<fnm>G</fnm>
					</au>
				</aug>
				<source>Nature</source>
				<pubdate>1999</pubdate>
				<volume>399</volume>
				<fpage>772</fpage>
				<lpage>776</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/21631</pubid>
						<pubid idtype="pmpid" link="fulltext">10391241</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B10">
				<title>
					<p>Ecdysozoan phylogeny and Bayesian inference: first use of nearly complete 28S and 18S rRNA gene sequences to classify the arthropods and their kin.</p>
				</title>
				<aug>
					<au>
						<snm>Mallatt</snm>
						<fnm>JM</fnm>
					</au>
					<au>
						<snm>Garey</snm>
						<fnm>JR</fnm>
					</au>
					<au>
						<snm>Shultz</snm>
						<fnm>JW</fnm>
					</au>
				</aug>
				<source>Mol Phylogenet Evol</source>
				<pubdate>2004</pubdate>
				<volume>31</volume>
				<fpage>178</fpage>
				<lpage>191</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/j.ympev.2003.07.013</pubid>
						<pubid idtype="pmpid" link="fulltext">15019618</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B11">
				<title>
					<p>Bilaterian phylogeny based on analyses of a region of the sodium-potassium ATPase beta-subunit gene.</p>
				</title>
				<aug>
					<au>
						<snm>Anderson</snm>
						<fnm>FE</fnm>
					</au>
					<au>
						<snm>Cordoba</snm>
						<fnm>AJ</fnm>
					</au>
					<au>
						<snm>Thollesson</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>J Mol Evol</source>
				<pubdate>2004</pubdate>
				<volume>58</volume>
				<fpage>252</fpage>
				<lpage>268</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1007/s00239-003-2548-9</pubid>
						<pubid idtype="pmpid" link="fulltext">15045481</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B12">
				<title>
					<p>Large-scale taxonomic profiling of eukaryotic model organisms: a comparison of orthologous proteins encoded by the human, fly, nematode, and yeast genomes.</p>
				</title>
				<aug>
					<au>
						<snm>Mushegian</snm>
						<fnm>AR</fnm>
					</au>
					<au>
						<snm>Garey</snm>
						<fnm>JR</fnm>
					</au>
					<au>
						<snm>Martin</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Liu</snm>
						<fnm>LX</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>1998</pubdate>
				<volume>8</volume>
				<fpage>590</fpage>
				<lpage>598</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">9647634</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B13">
				<title>
					<p>Early evolution of the bilateria.</p>
				</title>
				<aug>
					<au>
						<snm>Hausdorf</snm>
						<fnm>B</fnm>
					</au>
				</aug>
				<source>Syst Biol</source>
				<pubdate>2000</pubdate>
				<volume>49</volume>
				<fpage>130</fpage>
				<lpage>142</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1080/10635150050207438</pubid>
						<pubid idtype="pmpid">12116476</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B14">
				<title>
					<p>The evolutionary position of nematodes.</p>
				</title>
				<aug>
					<au>
						<snm>Blair</snm>
						<fnm>JE</fnm>
					</au>
					<au>
						<snm>Ikeo</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Gojobori</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Hedges</snm>
						<fnm>SB</fnm>
					</au>
				</aug>
				<source>BMC Evol Biol</source>
				<pubdate>2002</pubdate>
				<volume>2</volume>
				<fpage>7</fpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">102755</pubid>
						<pubid idtype="pmpid" link="fulltext">11985779</pubid>
						<pubid idtype="doi">10.1186/1471-2148-2-7</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B15">
				<title>
					<p>Coelomata and not Ecdysozoa: evidence from genome-wide phylogenetic analysis.</p>
				</title>
				<aug>
					<au>
						<snm>Wolf</snm>
						<fnm>YI</fnm>
					</au>
					<au>
						<snm>Rogozin</snm>
						<fnm>IB</fnm>
					</au>
					<au>
						<snm>Koonin</snm>
						<fnm>EV</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>2004</pubdate>
				<volume>14</volume>
				<fpage>29</fpage>
				<lpage>36</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">314272</pubid>
						<pubid idtype="pmpid" link="fulltext">14707168</pubid>
						<pubid idtype="doi">10.1101/gr.1347404</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B16">
				<title>
					<p>Phylogenomics and the number of characters required for obtaining an accurate phylogeny of eukaryote model species.</p>
				</title>
				<aug>
					<au>
						<snm>Dopazo</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Santoyo</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Dopazo</snm>
						<fnm>J</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>2004</pubdate>
				<volume>20</volume>
				<issue>Suppl 1</issue>
				<fpage>I116</fpage>
				<lpage>I121</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/bth902</pubid>
						<pubid idtype="pmpid" link="fulltext">15262789</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B17">
				<title>
					<p>Systematic searches for molecular synapomorphies in model metazoan genomes give some support for Ecdysozoa after accounting for the idiosyncrasies of <it>Caenorhabditis elegans</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Copley</snm>
						<fnm>RR</fnm>
					</au>
					<au>
						<snm>Aloy</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Russell</snm>
						<fnm>RB</fnm>
					</au>
					<au>
						<snm>Telford</snm>
						<fnm>MJ</fnm>
					</au>
				</aug>
				<source>Evol Dev</source>
				<pubdate>2004</pubdate>
				<volume>6</volume>
				<fpage>164</fpage>
				<lpage>169</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1111/j.1525-142X.2004.04021.x</pubid>
						<pubid idtype="pmpid" link="fulltext">15099303</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B18">
				<title>
					<p>Phylogenomics of eukaryotes: the impact of missing data on large alignments.</p>
				</title>
				<aug>
					<au>
						<snm>Philippe</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Snell</snm>
						<fnm>EA</fnm>
					</au>
					<au>
						<snm>Bapteste</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Lopez</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Holland</snm>
						<fnm>PW</fnm>
					</au>
					<au>
						<snm>Casane</snm>
						<fnm>D</fnm>
					</au>
				</aug>
				<source>Mol Biol Evol</source>
				<pubdate>2004</pubdate>
				<volume>21</volume>
				<fpage>1740</fpage>
				<lpage>1752</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/molbev/msh182</pubid>
						<pubid idtype="pmpid" link="fulltext">15175415</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B19">
				<title>
					<p>Genome-scale approaches to resolving incongruence in molecular phylogenies.</p>
				</title>
				<aug>
					<au>
						<snm>Rokas</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Williams</snm>
						<fnm>BL</fnm>
					</au>
					<au>
						<snm>King</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Carroll</snm>
						<fnm>SB</fnm>
					</au>
				</aug>
				<source>Nature</source>
				<pubdate>2003</pubdate>
				<volume>425</volume>
				<fpage>798</fpage>
				<lpage>804</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/nature02053</pubid>
						<pubid idtype="pmpid" link="fulltext">14574403</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B20">
				<title>
					<p>The power of relative rates tests depends on the data.</p>
				</title>
				<aug>
					<au>
						<snm>Bromham</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Penny</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Rambaut</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Hendy</snm>
						<fnm>MD</fnm>
					</au>
				</aug>
				<source>J Mol Evol</source>
				<pubdate>2000</pubdate>
				<volume>50</volume>
				<fpage>296</fpage>
				<lpage>301</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">10754073</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B21">
				<title>
					<p>On information and sufficiency.</p>
				</title>
				<aug>
					<au>
						<snm>Kullback</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Leibler</snm>
						<fnm>RA</fnm>
					</au>
				</aug>
				<source>Ann Math Stat</source>
				<pubdate>1951</pubdate>
				<volume>22</volume>
				<fpage>79</fpage>
				<lpage>86</lpage>
			</bibl>
			<bibl id="B22">
				<title>
					<p>A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach.</p>
				</title>
				<aug>
					<au>
						<snm>Whelan</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Goldman</snm>
						<fnm>N</fnm>
					</au>
				</aug>
				<source>Mol Biol Evol</source>
				<pubdate>2001</pubdate>
				<volume>18</volume>
				<fpage>691</fpage>
				<lpage>699</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">11319253</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B23">
				<title>
					<p>Modeling amino acid replacement.</p>
				</title>
				<aug>
					<au>
						<snm>Muller</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Vingron</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>J Comput Biol</source>
				<pubdate>2000</pubdate>
				<volume>7</volume>
				<fpage>761</fpage>
				<lpage>776</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1089/10665270050514918</pubid>
						<pubid idtype="pmpid" link="fulltext">11382360</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B24">
				<title>
					<p>Amino acid substitution matrices from protein blocks.</p>
				</title>
				<aug>
					<au>
						<snm>Henikoff</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Henikoff</snm>
						<fnm>JG</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>1992</pubdate>
				<volume>89</volume>
				<fpage>10915</fpage>
				<lpage>10919</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">50453</pubid>
						<pubid idtype="pmpid" link="fulltext">1438297</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B25">
				<title>
					<p>The rapid generation of mutation data matrices from protein sequences.</p>
				</title>
				<aug>
					<au>
						<snm>Jones</snm>
						<fnm>DT</fnm>
					</au>
					<au>
						<snm>Taylor</snm>
						<fnm>WR</fnm>
					</au>
					<au>
						<snm>Thornton</snm>
						<fnm>JM</fnm>
					</au>
				</aug>
				<source>Comput Appl Biosci</source>
				<pubdate>1992</pubdate>
				<volume>8</volume>
				<fpage>275</fpage>
				<lpage>282</lpage>
				<xrefbib>
					<pubid idtype="pmpid">1633570</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B26">
				<title>
					<p>A model of evolutionary change in proteins.</p>
				</title>
				<aug>
					<au>
						<snm>Dayhoff</snm>
						<fnm>MO</fnm>
					</au>
					<au>
						<snm>Schwartz</snm>
						<fnm>RM</fnm>
					</au>
					<au>
						<snm>Orcutt</snm>
						<fnm>BC</fnm>
					</au>
				</aug>
				<source>Atlas of Protein Sequence and Structure</source>
				<publisher>Washington DC: National Biomedical Research Foundation</publisher>
				<editor>Dayhoff MO</editor>
				<pubdate>1978</pubdate>
				<volume>5</volume>
				<fpage>345</fpage>
				<lpage>358</lpage>
			</bibl>
			<bibl id="B27">
				<title>
					<p>Model of amino acid substitution in proteins encoded by mitochondrial DNA.</p>
				</title>
				<aug>
					<au>
						<snm>Adachi</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Hasegawa</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>J Mol Evol</source>
				<pubdate>1996</pubdate>
				<volume>42</volume>
				<fpage>459</fpage>
				<lpage>468</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">8642615</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B28">
				<aug>
					<au>
						<snm>Felsenstein</snm>
						<fnm>J</fnm>
					</au>
				</aug>
				<source>Inferring Phylogenies</source>
				<publisher>Sunderland, MA: Sinauer</publisher>
				<pubdate>2004</pubdate>
			</bibl>
			<bibl id="B29">
				<title>
					<p>Inferring confidence sets of possibly misspecified gene trees.</p>
				</title>
				<aug>
					<au>
						<snm>Strimmer</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Rambaut</snm>
						<fnm>A</fnm>
					</au>
				</aug>
				<source>Proc Biol Sci</source>
				<pubdate>2002</pubdate>
				<volume>269</volume>
				<fpage>137</fpage>
				<lpage>142</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmpid" link="fulltext">11798428</pubid>
						<pubid idtype="doi">10.1098/rspb.2001.1862</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B30">
				<aug>
					<au>
						<snm>Carroll</snm>
						<fnm>SB</fnm>
					</au>
					<au>
						<snm>Grenier</snm>
						<fnm>JK</fnm>
					</au>
					<au>
						<snm>Weatherbee</snm>
						<fnm>SD</fnm>
					</au>
				</aug>
				<source>From DNA to Diversity. Molecular Genetics and the Evolution of Animal Design</source>
				<publisher>Malden, MA: Blackwell Science</publisher>
				<pubdate>2001</pubdate>
			</bibl>
			<bibl id="B31">
				<title>
					<p>Sampling properties of DNA sequence data in phylogenetic analysis.</p>
				</title>
				<aug>
					<au>
						<snm>Cummings</snm>
						<fnm>MP</fnm>
					</au>
					<au>
						<snm>Otto</snm>
						<fnm>SP</fnm>
					</au>
					<au>
						<snm>Wakeley</snm>
						<fnm>J</fnm>
					</au>
				</aug>
				<source>Mol Biol Evol</source>
				<pubdate>1995</pubdate>
				<volume>12</volume>
				<fpage>814</fpage>
				<lpage>822</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">7476127</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B32">
				<title>
					<p>Ribosomal RNA trees misleading?</p>
				</title>
				<aug>
					<au>
						<snm>Hasegawa</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Hashimoto</snm>
						<fnm>T</fnm>
					</au>
				</aug>
				<source>Nature</source>
				<pubdate>1993</pubdate>
				<volume>361</volume>
				<fpage>23</fpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/361023b0</pubid>
						<pubid idtype="pmpid" link="fulltext">8421491</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B33">
				<title>
					<p>Limitations of metazoan 18S rRNA sequence data: implications for reconstructing a phylogeny of the animal kingdom and inferring the reality of the Cambrian explosion.</p>
				</title>
				<aug>
					<au>
						<snm>Abouheif</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Zardoya</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Meyer</snm>
						<fnm>A</fnm>
					</au>
				</aug>
				<source>J Mol Evol</source>
				<pubdate>1998</pubdate>
				<volume>47</volume>
				<fpage>394</fpage>
				<lpage>405</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">9767685</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B34">
				<title>
					<p>A method for determining the position and size of optimal sequence regions for phylogenetic analysis.</p>
				</title>
				<aug>
					<au>
						<snm>Martin</snm>
						<fnm>MJ</fnm>
					</au>
					<au>
						<snm>Gonzalez-Candelas</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Sobrino</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Dopazo</snm>
						<fnm>J</fnm>
					</au>
				</aug>
				<source>J Mol Evol</source>
				<pubdate>1995</pubdate>
				<volume>41</volume>
				<fpage>1128</fpage>
				<lpage>1138</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1007/BF00173194</pubid>
						<pubid idtype="pmpid">8587110</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B35">
				<title>
					<p>Is sparse taxon sampling a problem for phylogenetic inference?</p>
				</title>
				<aug>
					<au>
						<snm>Hillis</snm>
						<fnm>DM</fnm>
					</au>
					<au>
						<snm>Pollock</snm>
						<fnm>DD</fnm>
					</au>
					<au>
						<snm>McGuire</snm>
						<fnm>JA</fnm>
					</au>
					<au>
						<snm>Zwickl</snm>
						<fnm>DJ</fnm>
					</au>
				</aug>
				<source>Syst Biol</source>
				<pubdate>2003</pubdate>
				<volume>52</volume>
				<fpage>124</fpage>
				<lpage>126</lpage>
				<xrefbib>
					<pubid idtype="pmpid">12554446</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B36">
				<title>
					<p>Incomplete taxon sampling is not a problem for phylogenetic inference.</p>
				</title>
				<aug>
					<au>
						<snm>Rosenberg</snm>
						<fnm>MS</fnm>
					</au>
					<au>
						<snm>Kumar</snm>
						<fnm>S</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>2001</pubdate>
				<volume>98</volume>
				<fpage>10751</fpage>
				<lpage>10756</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">58547</pubid>
						<pubid idtype="pmpid" link="fulltext">11526218</pubid>
						<pubid idtype="doi">10.1073/pnas.191248498</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B37">
				<title>
					<p>Taxon sampling, bioinformatics, and phylogenomics.</p>
				</title>
				<aug>
					<au>
						<snm>Rosenberg</snm>
						<fnm>MS</fnm>
					</au>
					<au>
						<snm>Kumar</snm>
						<fnm>S</fnm>
					</au>
				</aug>
				<source>Syst Biol</source>
				<pubdate>2003</pubdate>
				<volume>52</volume>
				<fpage>119</fpage>
				<lpage>124</lpage>
				<xrefbib>
					<pubid idtype="pmpid">12554445</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B38">
				<title>
					<p>One or three Cambrian radiations?</p>
				</title>
				<aug>
					<au>
						<snm>Balavoine</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Adoutte</snm>
						<fnm>A</fnm>
					</au>
				</aug>
				<source>Science</source>
				<pubdate>1998</pubdate>
				<volume>280</volume>
				<fpage>397</fpage>
				<lpage>398</lpage>
				<xrefbib>
					<pubid idtype="doi">10.1126/science.280.5362.397</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B39">
				<title>
					<p>The Cambrian "explosion": slow-fuse or megatonnage?</p>
				</title>
				<aug>
					<au>
						<snm>Conway Morris</snm>
						<fnm>S</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>2000</pubdate>
				<volume>97</volume>
				<fpage>4426</fpage>
				<lpage>4429</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">34314</pubid>
						<pubid idtype="pmpid" link="fulltext">10781036</pubid>
						<pubid idtype="doi">10.1073/pnas.97.9.4426</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B40">
				<title>
					<p>Phylogenomics: intersection of evolution and genomics.</p>
				</title>
				<aug>
					<au>
						<snm>Eisen</snm>
						<fnm>JA</fnm>
					</au>
					<au>
						<snm>Fraser</snm>
						<fnm>CM</fnm>
					</au>
				</aug>
				<source>Science</source>
				<pubdate>2003</pubdate>
				<volume>300</volume>
				<fpage>1706</fpage>
				<lpage>1707</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.1086292</pubid>
						<pubid idtype="pmpid" link="fulltext">12805538</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B41">
				<title>
					<p>Genome sequence of the human malaria parasite <it>Plasmodium falciparum</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Gardner</snm>
						<fnm>MJ</fnm>
					</au>
					<au>
						<snm>Hall</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Fung</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>White</snm>
						<fnm>O</fnm>
					</au>
					<au>
						<snm>Berriman</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Hyman</snm>
						<fnm>RW</fnm>
					</au>
					<au>
						<snm>Carlton</snm>
						<fnm>JM</fnm>
					</au>
					<au>
						<snm>Pain</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Nelson</snm>
						<fnm>KE</fnm>
					</au>
					<au>
						<snm>Bowman</snm>
						<fnm>S</fnm>
					</au>
					<etal/>
				</aug>
				<source>Nature</source>
				<pubdate>2002</pubdate>
				<volume>419</volume>
				<fpage>498</fpage>
				<lpage>511</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/nature01097</pubid>
						<pubid idtype="pmpid" link="fulltext">12368864</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B42">
				<title>
					<p>Analysis of the genome sequence of the flowering plant <it>Arabidopsis thaliana</it>.</p>
				</title>
				<aug>
					<au>
						<cnm>Arabidopsis Genome Initiative</cnm>
					</au>
				</aug>
				<source>Nature</source>
				<pubdate>2000</pubdate>
				<volume>408</volume>
				<fpage>796</fpage>
				<lpage>815</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/35048692</pubid>
						<pubid idtype="pmpid" link="fulltext">11130711</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B43">
				<title>
					<p>A draft sequence of the rice genome (<it>Oryza sativa</it> L. ssp. <it>indica</it>).</p>
				</title>
				<aug>
					<au>
						<snm>Yu</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Hu</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Wang</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Wong</snm>
						<fnm>GK</fnm>
					</au>
					<au>
						<snm>Li</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Liu</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Deng</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Dai</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Zhou</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Zhang</snm>
						<fnm>X</fnm>
					</au>
					<etal/>
				</aug>
				<source>Science</source>
				<pubdate>2002</pubdate>
				<volume>296</volume>
				<fpage>79</fpage>
				<lpage>92</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.1068037</pubid>
						<pubid idtype="pmpid" link="fulltext">11935017</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B44">
				<title>
					<p>The yeast genome directory.</p>
				</title>
				<aug>
					<au>
						<snm>Goffeau</snm>
						<fnm>A</fnm>
					</au>
				</aug>
				<source>Nature</source>
				<pubdate>1997</pubdate>
				<volume>387</volume>
				<issue>Suppl</issue>
				<fpage>5</fpage>
			</bibl>
			<bibl id="B45">
				<title>
					<p>Genome sequence of the nematode <it>C. elegans</it>: a platform for investigating biology.</p>
				</title>
				<aug>
					<au>
						<cnm><it>C. elegans</it> Sequencing Consortium</cnm>
					</au>
				</aug>
				<source>Science</source>
				<pubdate>1998</pubdate>
				<volume>282</volume>
				<fpage>2012</fpage>
				<lpage>2018</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.282.5396.2012</pubid>
						<pubid idtype="pmpid" link="fulltext">9851916</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B46">
				<title>
					<p>The genome sequence of the malaria mosquito <it>Anopheles gambiae</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Holt</snm>
						<fnm>RA</fnm>
					</au>
					<au>
						<snm>Subramanian</snm>
						<fnm>GM</fnm>
					</au>
					<au>
						<snm>Halpern</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Sutton</snm>
						<fnm>GG</fnm>
					</au>
					<au>
						<snm>Charlab</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Nusskern</snm>
						<fnm>DR</fnm>
					</au>
					<au>
						<snm>Wincker</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Clark</snm>
						<fnm>AG</fnm>
					</au>
					<au>
						<snm>Ribeiro</snm>
						<fnm>JM</fnm>
					</au>
					<au>
						<snm>Wides</snm>
						<fnm>R</fnm>
					</au>
					<etal/>
				</aug>
				<source>Science</source>
				<pubdate>2002</pubdate>
				<volume>298</volume>
				<fpage>129</fpage>
				<lpage>149</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.1076181</pubid>
						<pubid idtype="pmpid" link="fulltext">12364791</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B47">
				<title>
					<p>The genome sequence of <it>Drosophila melanogaster</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Adams</snm>
						<fnm>MD</fnm>
					</au>
					<au>
						<snm>Celniker</snm>
						<fnm>SE</fnm>
					</au>
					<au>
						<snm>Holt</snm>
						<fnm>RA</fnm>
					</au>
					<au>
						<snm>Evans</snm>
						<fnm>CA</fnm>
					</au>
					<au>
						<snm>Gocayne</snm>
						<fnm>JD</fnm>
					</au>
					<au>
						<snm>Amanatides</snm>
						<fnm>PG</fnm>
					</au>
					<au>
						<snm>Scherer</snm>
						<fnm>SE</fnm>
					</au>
					<au>
						<snm>Li</snm>
						<fnm>PW</fnm>
					</au>
					<au>
						<snm>Hoskins</snm>
						<fnm>RA</fnm>
					</au>
					<au>
						<snm>Galle</snm>
						<fnm>RF</fnm>
					</au>
					<etal/>
				</aug>
				<source>Science</source>
				<pubdate>2000</pubdate>
				<volume>287</volume>
				<fpage>2185</fpage>
				<lpage>2195</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.287.5461.2185</pubid>
						<pubid idtype="pmpid" link="fulltext">10731132</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B48">
				<title>
					<p>The draft genome of <it>Ciona intestinalis</it>: insights into chordate and vertebrate origins.</p>
				</title>
				<aug>
					<au>
						<snm>Dehal</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Satou</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Campbell</snm>
						<fnm>RK</fnm>
					</au>
					<au>
						<snm>Chapman</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Degnan</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>De Tomaso</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Davidson</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Di Gregorio</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Gelpke</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Goodstein</snm>
						<fnm>DM</fnm>
					</au>
					<etal/>
				</aug>
				<source>Science</source>
				<pubdate>2002</pubdate>
				<volume>298</volume>
				<fpage>2157</fpage>
				<lpage>2167</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.1080049</pubid>
						<pubid idtype="pmpid" link="fulltext">12481130</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B49">
				<title>
					<p>Whole-genome shotgun assembly and analysis of the genome of <it>Fugu rubripes</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Aparicio</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Chapman</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Stupka</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Putnam</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Chia</snm>
						<fnm>JM</fnm>
					</au>
					<au>
						<snm>Dehal</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Christoffels</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Rash</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Hoon</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Smit</snm>
						<fnm>A</fnm>
					</au>
					<etal/>
				</aug>
				<source>Science</source>
				<pubdate>2002</pubdate>
				<volume>297</volume>
				<fpage>1301</fpage>
				<lpage>1310</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.1072104</pubid>
						<pubid idtype="pmpid" link="fulltext">12142439</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B50">
				<title>
					<p>Initial sequencing and comparative analysis of the mouse genome.</p>
				</title>
				<aug>
					<au>
						<snm>Waterston</snm>
						<fnm>RH</fnm>
					</au>
					<au>
						<snm>Lindblad-Toh</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Birney</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Rogers</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Abril</snm>
						<fnm>JF</fnm>
					</au>
					<au>
						<snm>Agarwal</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Agarwala</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Ainscough</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Alexandersson</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>An</snm>
						<fnm>P</fnm>
					</au>
					<etal/>
				</aug>
				<source>Nature</source>
				<pubdate>2002</pubdate>
				<volume>420</volume>
				<fpage>520</fpage>
				<lpage>562</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/nature01262</pubid>
						<pubid idtype="pmpid" link="fulltext">12466850</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B51">
				<title>
					<p>Initial sequencing and analysis of the human genome.</p>
				</title>
				<aug>
					<au>
						<snm>Lander</snm>
						<fnm>ES</fnm>
					</au>
					<au>
						<snm>Linton</snm>
						<fnm>LM</fnm>
					</au>
					<au>
						<snm>Birren</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Nusbaum</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Zody</snm>
						<fnm>MC</fnm>
					</au>
					<au>
						<snm>Baldwin</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Devon</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Dewar</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Doyle</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>FitzHugh</snm>
						<fnm>W</fnm>
					</au>
					<etal/>
				</aug>
				<source>Nature</source>
				<pubdate>2001</pubdate>
				<volume>409</volume>
				<fpage>860</fpage>
				<lpage>921</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/35057062</pubid>
						<pubid idtype="pmpid" link="fulltext">11237011</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B52">
				<title>
					<p>Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.</p>
				</title>
				<aug>
					<au>
						<snm>Altschul</snm>
						<fnm>SF</fnm>
					</au>
					<au>
						<snm>Madden</snm>
						<fnm>TL</fnm>
					</au>
					<au>
						<snm>Schaffer</snm>
						<fnm>AA</fnm>
					</au>
					<au>
						<snm>Zhang</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Zhang</snm>
						<fnm>Z</fnm>
					</au>
					<au>
						<snm>Miller</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Lipman</snm>
						<fnm>DJ</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>1997</pubdate>
				<volume>25</volume>
				<fpage>3389</fpage>
				<lpage>3402</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">146917</pubid>
						<pubid idtype="pmpid" link="fulltext">9254694</pubid>
						<pubid idtype="doi">10.1093/nar/25.17.3389</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B53">
				<title>
					<p>Ensembl 2004.</p>
				</title>
				<aug>
					<au>
						<snm>Birney</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Andrews</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Bevan</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Caccamo</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Cameron</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Chen</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Clarke</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Coates</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Cox</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Cuff</snm>
						<fnm>J</fnm>
					</au>
					<etal/>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2004</pubdate>
				<volume>32</volume>
				<issue>Database issue</issue>
				<fpage>D468</fpage>
				<lpage>D470</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">308772</pubid>
						<pubid idtype="pmpid" link="fulltext">14681459</pubid>
						<pubid idtype="doi">10.1093/nar/gkh038</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B54">
				<title>
					<p>Recent segmental duplications in the human genome.</p>
				</title>
				<aug>
					<au>
						<snm>Bailey</snm>
						<fnm>JA</fnm>
					</au>
					<au>
						<snm>Gu</snm>
						<fnm>Z</fnm>
					</au>
					<au>
						<snm>Clark</snm>
						<fnm>RA</fnm>
					</au>
					<au>
						<snm>Reinert</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Samonte</snm>
						<fnm>RV</fnm>
					</au>
					<au>
						<snm>Schwartz</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Adams</snm>
						<fnm>MD</fnm>
					</au>
					<au>
						<snm>Myers</snm>
						<fnm>EW</fnm>
					</au>
					<au>
						<snm>Li</snm>
						<fnm>PW</fnm>
					</au>
					<au>
						<snm>Eichler</snm>
						<fnm>EE</fnm>
					</au>
				</aug>
				<source>Science</source>
				<pubdate>2002</pubdate>
				<volume>297</volume>
				<fpage>1003</fpage>
				<lpage>1007</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.1072047</pubid>
						<pubid idtype="pmpid" link="fulltext">12169732</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B55">
				<title>
					<p>CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice.</p>
				</title>
				<aug>
					<au>
						<snm>Thompson</snm>
						<fnm>JD</fnm>
					</au>
					<au>
						<snm>Higgins</snm>
						<fnm>DG</fnm>
					</au>
					<au>
						<snm>Gibson</snm>
						<fnm>TJ</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>1994</pubdate>
				<volume>22</volume>
				<fpage>4673</fpage>
				<lpage>4680</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">308517</pubid>
						<pubid idtype="pmpid">7984417</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B56">
				<aug>
					<au>
						<snm>Felsenstein</snm>
						<fnm>J</fnm>
					</au>
				</aug>
				<source>PHYLIP (Phylogeny Inference Package) version 3.6a3</source>
				<publisher>Seattle, WA: Department of Genome Sciences, University of Washington</publisher>
				<pubdate>2002</pubdate>
			</bibl>
			<bibl id="B57">
				<title>
					<p>RRTree: relative-rate tests between groups of sequences on a phylogenetic tree.</p>
				</title>
				<aug>
					<au>
						<snm>Robinson-Rechavi</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Huchon</snm>
						<fnm>D</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>2000</pubdate>
				<volume>16</volume>
				<fpage>296</fpage>
				<lpage>297</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/16.3.296</pubid>
						<pubid idtype="pmpid" link="fulltext">10869026</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B58">
				<title>
					<p>TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing.</p>
				</title>
				<aug>
					<au>
						<snm>Schmidt</snm>
						<fnm>HA</fnm>
					</au>
					<au>
						<snm>Strimmer</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Vingron</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>von Haeseler</snm>
						<fnm>A</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>2002</pubdate>
				<volume>18</volume>
				<fpage>502</fpage>
				<lpage>504</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/18.3.502</pubid>
						<pubid idtype="pmpid" link="fulltext">11934758</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B59">
				<title>
					<p>Likelihood-mapping: a simple method to visualize phylogenetic content of a sequence alignment.</p>
				</title>
				<aug>
					<au>
						<snm>Strimmer</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>von Haeseler</snm>
						<fnm>A</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>1997</pubdate>
				<volume>94</volume>
				<fpage>6815</fpage>
				<lpage>6819</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">21241</pubid>
						<pubid idtype="pmpid" link="fulltext">9192648</pubid>
						<pubid idtype="doi">10.1073/pnas.94.13.6815</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B60">
				<title>
					<p>Multiple comparisons of log-likelihoods with applications to phylogenetic inference.</p>
				</title>
				<aug>
					<au>
						<snm>Shimodaira</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Hasegawa</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>Mol Biol Evol</source>
				<pubdate>1999</pubdate>
				<volume>16</volume>
				<fpage>1114</fpage>
				<lpage>1116</lpage>
			</bibl>
			<bibl id="B61">
				<title>
					<p>The neighbor-joining method: a new method for reconstructing phylogenetic trees.</p>
				</title>
				<aug>
					<au>
						<snm>Saitou</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Nei</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>Mol Biol Evol</source>
				<pubdate>1987</pubdate>
				<volume>4</volume>
				<fpage>406</fpage>
				<lpage>425</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">3447015</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B62">
				<title>
					<p>Construction of phylogenetic trees: a method based on mutation distances as estimated from cytochrome c sequences is of general applicability.</p>
				</title>
				<aug>
					<au>
						<snm>Fitch</snm>
						<fnm>WM</fnm>
					</au>
					<au>
						<snm>Margoliash</snm>
						<fnm>E</fnm>
					</au>
				</aug>
				<source>Science</source>
				<pubdate>1967</pubdate>
				<volume>155</volume>
				<fpage>279</fpage>
				<lpage>284</lpage>
				<xrefbib>
					<pubid idtype="pmpid">5334057</pubid>
				</xrefbib>
			</bibl>
		</refgrp>
	</bm>
</art>
