<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2148-6-106</ui>
   <ji>1471-2148</ji>
   <fm>
      <dochead>Research article</dochead>
      <bibl>
         <title>
            <p>Fast sequence evolution of <it>Hox </it>and <it>Hox</it>-derived genes in the genus <it>Drosophila</it></p>
         </title>
         <aug>
            <au id="A1">
               <snm>Casillas</snm>
               <fnm>S&#242;nia</fnm>
               <insr iid="I1"/>
               <email>Sonia.Casillas@uab.es</email>
            </au>
            <au id="A2">
               <snm>Negre</snm>
               <fnm>B&#225;rbara</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
               <email>bn219@cam.ac.uk</email>
            </au>
            <au id="A3">
               <snm>Barbadilla</snm>
               <fnm>Antonio</fnm>
               <insr iid="I1"/>
               <email>Antonio.Barbadilla@uab.es</email>
            </au>
            <au id="A4" ca="yes">
               <snm>Ruiz</snm>
               <fnm>Alfredo</fnm>
               <insr iid="I1"/>
               <email>Alfredo.Ruiz@uab.es</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Departament de Gen&#232;tica i de Microbiologia, Universitat Aut&#242;noma de Barcelona, 08193 Bellaterra (Barcelona), Spain</p>
            </ins>
            <ins id="I2">
               <p>Department of Zoology, University of Cambridge, Cambridge CB2 3EJ, UK</p>
            </ins>
         </insg>
         <source>BMC Evolutionary Biology</source>
         <issn>1471-2148</issn>
         <pubdate>2006</pubdate>
         <volume>6</volume>
         <issue>1</issue>
         <fpage>106</fpage>
         <url>http://www.biomedcentral.com/1471-2148/6/106</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">17163987</pubid>
               <pubid idtype="doi">10.1186/1471-2148-6-106</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>13</day>
               <month>6</month>
               <year>2006</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>12</day>
               <month>12</month>
               <year>2006</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>12</day>
               <month>12</month>
               <year>2006</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2006</year>
         <collab>Casillas et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Genes that are expressed early in development and have a complex expression pattern are expected to be under strong purifying selection and thus to evolve slowly. <it>Hox </it>genes fulfill these criteria and should therefore have a low evolutionary rate. However, some observations point to a completely different scenario. <it>Hox </it>genes are usually highly conserved inside the homeobox, but very variable outside it.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>We have measured the rates of nucleotide divergence and indel fixation of three <it>Hox </it>genes, <it>labial </it>(<it>lab</it>), <it>proboscipedia </it>(<it>pb</it>) and <it>abdominal-A </it>(<it>abd-A</it>), and compared them with those of three genes derived by duplication from <it>Hox3</it>, <it>bicoid </it>(<it>bcd</it>), <it>zerkn&#252;llt </it>(<it>zen</it>) and <it>zerkn&#252;llt-related </it>(<it>zen2</it>), and 15 non-<it>Hox </it>genes in sets of orthologous sequences of three species of the genus <it>Drosophila</it>. These rates were compared to test the hypothesis that <it>Hox </it>genes evolve slowly. Our results show that the evolutionary rate of <it>Hox </it>genes is higher than that of non-<it>Hox </it>genes when both amino acid differences and indels are taken into account: 43.39% of the amino acid sequence is altered in <it>Hox </it>genes, versus 30.97% in non-<it>Hox </it>genes and 64.73% in <it>Hox</it>-derived genes. Microsatellites scattered along the coding sequence of <it>Hox </it>genes partially, but not fully, explain their fast sequence evolution.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>These results show that <it>Hox </it>genes have higher evolutionary dynamics than other developmental genes, and emphasize the need to take indels into account, in addition to nucleotide substitutions, to estimate evolutionary rates accurately.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="bmc" subtype="user_supplied_xml" id="endnote"/>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p><it>Hox </it>genes are homeobox-containing genes involved in the specification of regional identities along the anteroposterior body axis and, thus, play a fundamental role in animal development <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. They encode transcription factors that regulate the expression of other genes downstream in the regulatory cascade of development and have been found in all metazoans, including flies, worms, tunicates, lampreys, fish and tetrapods. A particular feature of these genes is that they are usually clustered together in complexes and arranged on the chromosome in the same order as they are expressed along the anteroposterior body axis of the embryo <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr></abbrgrp>. Ten genes arranged in a single complex comprised the ancestral <it>Hox </it>gene complex of arthropods (HOM-C) <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr></abbrgrp>. However, at least three different HOM-C splits have occurred during the evolution of Diptera <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr></abbrgrp>, and several non-homeotic genes and other genes derived from ancestral <it>Hox </it>genes are interspersed among the <it>Drosophila Hox </it>genes.</p>
         <p>The stability of <it>Hox </it>gene number and the conservation of <it>Hox </it>ortholog sequences prompted the notion that <it>Hox </it>proteins have not significantly diverged in function. However, it is now known that several arthropod <it>Hox </it>proteins have changed in sequence and/or function, including those encoded by <it>Hox3 </it><abbrgrp><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>, <it>fushi tarazu </it>(<it>ftz</it>) <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>, <it>Ultrabithorax </it>(<it>Ubx</it>) <abbrgrp><abbr bid="B15">15</abbr></abbrgrp> and <it>Antennapedia </it>(<it>Antp</it>) <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. In winged insects, including <it>Drosophila</it>, <it>Hox3 </it>and <it>ftz </it>lost their homeotic function, that is, their ability to transform the characteristics of one body part into those of another body part <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr></abbrgrp>, and their expression domains are no longer arranged along the anteroposterior axis of the embryo. Therefore, only eight <it>Hox </it>genes remain in these species <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. <it>Hox3 </it>gained a novel extraembryonic function, and underwent two consecutive duplications that gave rise to <it>bicoid </it>(<it>bcd</it>), <it>zerkn&#252;llt </it>(<it>zen</it>) and <it>zerkn&#252;llt-related </it>(<it>zen2</it>). The first duplication took place in the cyclorrhaphan fly lineage and gave rise to <it>zen </it>and <it>bcd </it><abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>. Afterwards, but before the <it>Drosophila </it>radiation, <it>zen </it>went through a second duplication that gave birth to <it>zen2 </it><abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. 
Seemingly, <it>bcd </it>and <it>zen </it>have specialized and perform separate functions in the establishment of the embryo's body plan: the maternal gene <it>bcd </it>codes for an important morphogen that establishes anteroposterior polarity <abbrgrp><abbr bid="B20">20</abbr></abbrgrp> and <it>zen </it>is a zygotic gene involved in dorsoventral differentiation <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. <it>zen2 </it>has the same expression pattern as <it>zen</it>, although its function is unknown. Despite its high sequence divergence across species, <it>zen2 </it>has been maintained for more than 60 Myr <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>.</p>
         <p><it>Hox </it>proteins contain a highly conserved domain of 60 amino acids (coded by the homeobox) that binds DNA through a '<it>helix-turn-helix</it>' structure. This motif is very similar in terms of sequence and structure to that of many DNA binding proteins. Functional comparisons of <it>Hox </it>orthologs have largely focused on their highly conserved homeodomain sequences and have demonstrated their functional interchangeability between species <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr></abbrgrp>. <it>Hox</it>-derived genes, although having lost their homeotic function, still retain the homeobox.</p>
         <p>It has been shown that housekeeping genes, which are expressed in all cells and at all times, are under strong purifying selection and thus evolve slowly (e.g. histones, or genes involved in the cell cycle) <abbrgrp><abbr bid="B27">27</abbr><abbr bid="B28">28</abbr></abbrgrp>. <it>Hox </it>genes, by contrast, are expressed early in development and have a complex regulated expression pattern. Mutations in such genes will on average have more deleterious fitness consequences than mutations occurring in genes expressed later on, because they may have cascading consequences for the later steps in development and thus may broadly alter the adult phenotype <abbrgrp><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr><abbr bid="B31">31</abbr></abbrgrp>. Therefore, we also expect <it>Hox </it>genes to be highly constrained and thus to evolve slowly. In fact, Davis, Brandman, and Petrov <abbrgrp><abbr bid="B29">29</abbr></abbrgrp> found a highly significant relationship between the developmental timing of gene expression and the nonsynonymous evolutionary rate: genes expressed early in development are likely to have a slower rate of evolution at the protein level than those expressed later. Surprisingly, the strongest negative relationship between expression and evolutionary rate occurred only after the main burst of expression of segment polarity and <it>Hox </it>genes in embryonic development, so these genes could be evolving differently from other developmental genes. However, only one segment polarity gene, <it>wingless </it>(<it>wg</it>), and two <it>Hox </it>genes, <it>Antp </it>and <it>abdominal-A </it>(<it>abd-A</it>), were analyzed.</p>
         <p>Furthermore, Marais <it>et al</it>. <abbrgrp><abbr bid="B32">32</abbr></abbrgrp> found a negative correlation between evolutionary rate at the protein level (as measured by the number of nonsynonymous substitutions per nonsynonymous site, <it>d</it><sub><it>N</it></sub>) and intron size in <it>Drosophila</it>, likely due to a higher abundance of <it>cis</it>-regulatory elements in introns (especially first introns) in genes under strong selective constraints. We know from a previous study that the <it>Hox </it>genes used in this study contain a long intron replete with regulatory elements <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. Therefore, we would expect these genes to be strongly constrained.</p>
         <p>However, other studies seem to point to a completely different scenario. Developmental biologists noticed a long time ago that a large portion of the sequence of <it>Hox </it>proteins diverges so fast that it is difficult to align homologues from different arthropod classes <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>. In fact, nucleotide sequences outside the homeobox in <it>labial </it>(<it>lab</it>) and <it>Ubx </it>have been reported to diverge significantly <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B15">15</abbr></abbrgrp>. These sequence differences may be neutral with respect to protein function or, more intriguingly, they could be involved in the functional divergence of <it>Hox </it>proteins and the evolutionary diversification of animals <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. Moreover, Karlin and Burge <abbrgrp><abbr bid="B34">34</abbr></abbrgrp> have shown that many essential developmental genes, including <it>Hox </it>genes, contain long microsatellites within their coding sequence (e.g. trinucleotide repeats that do not disrupt the open reading frame). The vast majority of these genes function in development and/or transcription regulation, and are expressed in the nervous system. Because replication slippage, the particular mutation mechanism acting on these repetitive sequences <abbrgrp><abbr bid="B35">35</abbr><abbr bid="B36">36</abbr></abbrgrp>, is frequent, microsatellites are subject to frequent insertions and deletions. Thus, these repetitive sequences could be responsible for a higher than expected evolutionary rate of <it>Hox </it>genes. However, despite these previous contributions, no quantification of the rates of nucleotide and indel evolution has been reported so far for a set of <it>Hox </it>genes.</p>
         <p>On the other hand, the origin by duplication and the functional evolution of <it>Hox</it>-derived genes suggest that they might be evolving fast at the sequence level as well. Duplicated genes are known to undergo a period of accelerated evolution during which one copy may degenerate into a pseudogene (pseudogenization), each daughter gene may adopt part of the functions of the parental gene (subfunctionalization), or they may acquire new functions (neofunctionalization) <abbrgrp><abbr bid="B37">37</abbr><abbr bid="B38">38</abbr><abbr bid="B39">39</abbr><abbr bid="B40">40</abbr></abbrgrp>. The only divergence estimate reported in a <it>Hox</it>-derived gene was calculated between two close species (<it>D. melanogaster </it>and <it>D. simulans</it>) in <it>bcd </it><abbrgrp><abbr bid="B41">41</abbr></abbrgrp>. A recent study found an increased sequence polymorphism in <it>bcd </it>in comparison to <it>zen</it>, which was ascribed to a relaxation of selective constraint on this maternal gene resulting from sex-limited expression <abbrgrp><abbr bid="B42">42</abbr></abbrgrp>. Therefore, <it>bcd </it>is expected to evolve faster than <it>zen </it>under this model. The evolutionary rates of <it>zen </it>and <it>zen2</it>, however, have not been reported so far.</p>
         <p>We have measured the rates of nucleotide substitution and indel fixation of three <it>Hox </it>genes, <it>lab</it>, <it>proboscipedia </it>(<it>pb</it>) and <it>abd-A</it>, and compared them with those of <it>bcd</it>, <it>zen </it>and <it>zen2</it>, which were derived by duplication from <it>Hox3</it>, and a sample of 15 non-<it>Hox </it>genes, in the genus <it>Drosophila</it>. These rates were compared to test the hypothesis that <it>Hox </it>genes, like other genes that have complex expression patterns and are essential in early development, evolve slowly. We have also evaluated the contribution of the homeobox and the repetitive regions within <it>Hox </it>and <it>Hox</it>-derived genes to the evolutionary rates.</p>
         <p>The sequences compared comprise all the complete genes available in <it>D. buzzatii </it>(representative of the Drosophila subgenus), and their orthologs in <it>D. melanogaster </it>and <it>D. pseudoobscura </it>(both species in the Sophophora subgenus). <it>D. buzzatii </it>belongs to the <it>repleta </it>species group, a group comprising ~100 species that has been widely used as a model in studies of genome evolution, ecological adaptation and speciation. Negre <it>et al</it>. <abbrgrp><abbr bid="B19">19</abbr></abbrgrp> have recently compared the genomic organization of the HOM-C complex in <it>D. buzzatii </it>to that of <it>D. melanogaster </it>and <it>D. pseudoobscura</it>, and studied the functional consequences of two HOM-C splits present in this species. When our study began, this was the largest set of orthologous <it>Hox </it>genes in species from both subgenera of the <it>Drosophila </it>genus, and this allowed the exploration of evolutionary rates throughout the <it>Drosophila </it>phylogeny. Due to the high divergence of <it>Hox </it>genes <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>, the inclusion of more distant species outside the <it>Drosophila </it>genus (such as mosquito or honeybee) would probably not be appropriate for the estimation of genetic distances. Moreover, these species do not contain the <it>Hox</it>-derived genes studied here.</p>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <sec>
            <st>
               <p>Nucleotide evolution of <it>Hox</it>, <it>Hox</it>-derived and non-<it>Hox </it>genes</p>
            </st>
            <p>Nucleotide substitution parameters were calculated for the coding nucleotide alignments independently for each gene [see <supplr sid="S1">Additional file 1</supplr>]. We then tested for differences between the three groups of genes (<it>Hox</it>, <it>Hox</it>-derived and non-<it>Hox</it>) (top section of Table <tblr tid="T1">1</tblr>) [see <supplr sid="S2">Additional file 2</supplr>]. Our results showed that <it>Hox</it>-derived genes are evolving much faster and with less functional constraint than <it>Hox </it>and non-<it>Hox </it>genes. Differences among the three groups are significant for the number of nonsynonymous substitutions per nonsynonymous site, <it>d</it><sub><it>N </it></sub>(P = 0.022), and the level of functional constraint, &#969; (P &lt; 0.001) (see Methods). <it>zen2 </it>is mainly responsible for the high values of nucleotide substitutions (both synonymous and nonsynonymous) in its group [see <supplr sid="S1">Additional file 1</supplr>]. By contrast, <it>Hox </it>and non-<it>Hox </it>genes have a similar number of nucleotide substitutions, <it>t </it>(P > 0.1). However, the level of functional constraint is even higher (lower &#969;) in non-<it>Hox </it>genes than in <it>Hox </it>genes (&#969; = 0.04156 versus &#969; = 0.06094, respectively), although the difference is only marginally significant (P = 0.063). Therefore, <it>Hox </it>genes do not seem to be evolving more slowly than other non-homeotic genes, despite their essential function in early development.</p>
            <suppl id="S1">
               <title>
                  <p>Additional File 1</p>
               </title>
               <text>
                  <p>Parameters of gene structure, base composition and nucleotide evolution for each gene.</p>
               </text>
               <file name="1471-2148-6-106-S1.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Mean nucleotide substitution parameters and ANOVAs for the three groups of genes.</p>
               </caption>
               <tblbdy cols="6">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <b>
                              <it>t</it>
                           </b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>
                              <it>d</it>
                              <sub>
                                 <it>N</it>
                              </sub>
                           </b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>
                              <it>d</it>
                              <sub>
                                 <it>S</it>
                              </sub>
                           </b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>&#969;</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Complete coding sequences</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Hox</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2.10917</p>
                     </c>
                     <c ca="center">
                        <p>0.15964</p>
                     </c>
                     <c ca="center">
                        <p>2.59066</p>
                     </c>
                     <c ca="center">
                        <p>0.06094</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p><it>Hox</it>-derived</p>
                     </c>
                     <c ca="center">
                        <p>3.86336</p>
                     </c>
                     <c ca="center">
                        <p>0.39380</p>
                     </c>
                     <c ca="center">
                        <p>4.27598</p>
                     </c>
                     <c ca="center">
                        <p>0.09226</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Non-<it>Hox</it></p>
                     </c>
                     <c ca="center">
                        <p>2.91160</p>
                     </c>
                     <c ca="center">
                        <p>0.15802</p>
                     </c>
                     <c ca="center">
                        <p>3.80668</p>
                     </c>
                     <c ca="center">
                        <p>0.04156</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>ANOVA</p>
                     </c>
                     <c ca="center">
                        <p>n.s.</p>
                     </c>
                     <c ca="center">
                        <p>*</p>
                     </c>
                     <c ca="center">
                        <p>n.s.</p>
                     </c>
                     <c ca="center">
                        <p>***</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Coding sequences excluding the homeobox</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Hox</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2.27653</p>
                     </c>
                     <c ca="center">
                        <p>0.18257</p>
                     </c>
                     <c ca="center">
                        <p>2.65921</p>
                     </c>
                     <c ca="center">
                        <p>0.06673</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p><it>Hox</it>-derived</p>
                     </c>
                     <c ca="center">
                        <p>5.04914</p>
                     </c>
                     <c ca="center">
                        <p>0.54809</p>
                     </c>
                     <c ca="center">
                        <p>5.26666</p>
                     </c>
                     <c ca="center">
                        <p>0.11320</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Non-<it>Hox</it></p>
                     </c>
                     <c ca="center">
                        <p>2.91160</p>
                     </c>
                     <c ca="center">
                        <p>0.15802</p>
                     </c>
                     <c ca="center">
                        <p>3.80668</p>
                     </c>
                     <c ca="center">
                        <p>0.04156</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>ANOVA</p>
                     </c>
                     <c ca="center">
                        <p>n.s.</p>
                     </c>
                     <c ca="center">
                        <p>**</p>
                     </c>
                     <c ca="center">
                        <p>n.s.</p>
                     </c>
                     <c ca="center">
                        <p>***</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Coding sequences excluding repetitive regions</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Hox</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1.81997</p>
                     </c>
                     <c ca="center">
                        <p>0.12399</p>
                     </c>
                     <c ca="center">
                        <p>2.35029</p>
                     </c>
                     <c ca="center">
                        <p>0.05310</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p><it>Hox</it>-derived</p>
                     </c>
                     <c ca="center">
                        <p>3.71981</p>
                     </c>
                     <c ca="center">
                        <p>0.37759</p>
                     </c>
                     <c ca="center">
                        <p>4.14242</p>
                     </c>
                     <c ca="center">
                        <p>0.09042</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Non-<it>Hox</it></p>
                     </c>
                     <c ca="center">
                        <p>2.85593</p>
                     </c>
                     <c ca="center">
                        <p>0.15444</p>
                     </c>
                     <c ca="center">
                        <p>3.76458</p>
                     </c>
                     <c ca="center">
                        <p>0.04035</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>ANOVA</p>
                     </c>
                     <c ca="center">
                        <p>n.s.</p>
                     </c>
                     <c ca="center">
                        <p>*</p>
                     </c>
                     <c ca="center">
                        <p>n.s.</p>
                     </c>
                     <c ca="center">
                        <p>***</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Coding sequences excluding the homeobox and repetitive regions</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Hox</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1.94286</p>
                     </c>
                     <c ca="center">
                        <p>0.14684</p>
                     </c>
                     <c ca="center">
                        <p>2.33783</p>
                     </c>
                     <c ca="center">
                        <p>0.06146</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p><it>Hox</it>-derived</p>
                     </c>
                     <c ca="center">
                        <p>4.88928</p>
                     </c>
                     <c ca="center">
                        <p>0.53011</p>
                     </c>
                     <c ca="center">
                        <p>5.12014</p>
                     </c>
                     <c ca="center">
                        <p>0.11245</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Non-<it>Hox</it></p>
                     </c>
                     <c ca="center">
                        <p>2.85593</p>
                     </c>
                     <c ca="center">
                        <p>0.15444</p>
                     </c>
                     <c ca="center">
                        <p>3.76458</p>
                     </c>
                     <c ca="center">
                        <p>0.04035</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>ANOVA</p>
                     </c>
                     <c ca="center">
                        <p>n.s.</p>
                     </c>
                     <c ca="center">
                        <p>**</p>
                     </c>
                     <c ca="center">
                        <p>n.s.</p>
                     </c>
                     <c ca="center">
                        <p>***</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>n.s. (P > 0.05), * (P &lt; 0.05), ** (P &lt; 0.01), *** (P &lt; 0.001).</p>
               </tblfn>
            </tbl>
            <suppl id="S2">
               <title>
                  <p>Additional File 2</p>
               </title>
               <text>
                  <p>ANOVA and contrast analyses for all group comparisons.</p>
               </text>
               <file name="1471-2148-6-106-S2.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>We then plotted <it>d</it><sub><it>N </it></sub>and &#969; in sliding windows along the coding sequences of <it>Hox </it>and <it>Hox</it>-derived genes to determine whether these parameters are homogeneous along the sequence. Figure <figr fid="F1">1</figr> shows that, in all genes except <it>zen2</it>, both <it>d</it><sub><it>N </it></sub>and &#969; decrease substantially near the homeobox. <it>zen2 </it>contains a rapidly evolving homeobox with high &#969; values. In contrast, peaks of <it>d</it><sub><it>N </it></sub>tend to lie within repetitive regions (data not shown).</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Distribution of <it>d</it><sub><it>N </it></sub>and &#969; in sliding windows along the coding sequence of genes</p>
               </caption>
               <text>
                  <p><b>Distribution of <it>d</it><sub><it>N </it></sub>and &#969; in sliding windows along the coding sequence of genes</b>. Distribution of <it>d</it><sub><it>N </it></sub>(broken line) and &#969; (solid line) in sliding windows of 240 nucleotides. (a) <it>abd-A</it>, (b) <it>lab</it>, (c) <it>pb</it>, (d) <it>bcd</it>, (e) <it>zen </it>and (f) <it>zen2</it>. In each case, the position of the homeobox is represented by a yellow box within the X axis.</p>
               </text>
               <graphic file="1471-2148-6-106-1"/>
            </fig>
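            <p>The sliding-window scan summarized in Figure <figr fid="F1">1</figr> can be illustrated with a minimal Python sketch. It is not the procedure used to produce the figure: the per-codon input tuples, the Jukes-Cantor correction, the one-codon step and the function names are all assumptions made for illustration. Only the window width, 240 nucleotides (80 codons), is taken from the figure legend.</p>

```python
import math

def jukes_cantor(p):
    """Jukes-Cantor distance for a proportion of differences p (None if undefined)."""
    if p == 0:
        return 0.0
    if p >= 0.75:
        return None  # correction is undefined at saturation
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def sliding_omega(codons, window_codons=80, step_codons=1):
    """Scan a coding alignment in windows of `window_codons` codons.

    `codons` is a hypothetical per-codon summary: a list of tuples
    (nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites).
    Returns a list of (start_codon, dN, omega); omega is None when dS is
    zero or undefined in that window.
    """
    results = []
    for start in range(0, len(codons) - window_codons + 1, step_codons):
        window = codons[start:start + window_codons]
        nonsyn_diffs = sum(c[0] for c in window)
        nonsyn_sites = sum(c[1] for c in window)
        syn_diffs = sum(c[2] for c in window)
        syn_sites = sum(c[3] for c in window)
        d_n = jukes_cantor(nonsyn_diffs / nonsyn_sites) if nonsyn_sites > 0 else None
        d_s = jukes_cantor(syn_diffs / syn_sites) if syn_sites > 0 else None
        # omega = dN/dS; left undefined when dS is missing or zero
        omega = d_n / d_s if (d_n is not None and d_s) else None
        results.append((start, d_n, omega))
    return results
```

            <p>Plotting the second and third elements of each returned tuple against the window start would give curves of the kind shown in the figure; windows overlapping a conserved domain would be expected to show low values of both.</p>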
            <p>To control for a possible effect of the homeobox and the repetitive regions (see Methods) on the overall nucleotide evolution of these <it>Hox </it>and <it>Hox</it>-derived genes, we tested again for differences among the three groups of genes after excluding these regions. Removing the homeobox from <it>Hox </it>and <it>Hox</it>-derived coding sequences (second section of Table <tblr tid="T1">1</tblr>) increased the number of nucleotide substitutions in these two groups and further decreased their level of functional constraint. Again, differences among groups were significant for <it>d</it><sub><it>N </it></sub>(P = 0.005) and &#969; (P &lt; 0.001), and the same trend as in the previous analysis with complete coding sequences was observed. In contrast, removing repetitive regions (third section of Table <tblr tid="T1">1</tblr>) decreased the number of nucleotide substitutions, especially in <it>Hox </it>genes, all of which contain this type of region. Therefore, the elimination of repetitive regions slightly increases the difference between <it>Hox </it>and non-<it>Hox </it>genes in terms of nucleotide substitutions, and reduces the difference in functional constraint. Once more, differences among groups were significant for <it>d</it><sub><it>N </it></sub>(P = 0.030) and &#969; (P = 0.001). Excluding both the homeobox and the repetitive regions (bottom section of Table <tblr tid="T1">1</tblr>) gave intermediate results. We therefore conclude that: (1) <it>Hox </it>and non-<it>Hox </it>genes are evolving similarly in terms of nucleotide substitutions, (2) <it>Hox</it>-derived genes are evolving much faster and with less functional constraint than the other two groups of genes, and (3) neither the homeobox nor the repetitive regions alter the estimates significantly, and thus are not entirely responsible for the two previous conclusions.</p>
            <p>An excess of nonsynonymous over synonymous substitutions is a robust indicator of positive selection at the molecular level. Therefore, we searched for values of the nonsynonymous/synonymous rate ratio (<it>d</it><sub><it>N</it></sub><it>/d</it><sub><it>S </it></sub>= &#969;) greater than 1 to investigate whether Darwinian selection has been acting on any of the coding sequences analyzed in this study. However, we found no evidence of positive selection in any coding sequence or in any region within one.</p>
         </sec>
         <sec>
            <st>
               <p>Amino acid and structural changes at the protein level</p>
            </st>
            <p>We used the protein alignments to calculate the proportion of amino acid differences and indels. In the first case (Table <tblr tid="T2">2</tblr>, Figure <figr fid="F2">2</figr>), differences among the three groups &#8211; <it>Hox</it>, <it>Hox</it>-derived and non-<it>Hox </it>&#8211; were not significant (P = 0.101). However, the proportion of amino acid differences was substantially higher for <it>Hox</it>-derived genes (40.43%) than for <it>Hox </it>and non-<it>Hox </it>genes (22.80% and 23.77%, respectively). This result is in full agreement with our previous estimates of <it>d</it><sub><it>N </it></sub>(Table <tblr tid="T1">1</tblr>), which showed high values of this parameter for <it>Hox</it>-derived genes, but very similar values for <it>Hox </it>and non-<it>Hox </it>genes.</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Proportion of amino acid differences and indels in the set of genes analyzed in this study</p>
               </caption>
               <text>
                  <p>Proportion of amino acid differences and indels in the set of genes analyzed in this study.</p>
               </text>
               <graphic file="1471-2148-6-106-2"/>
            </fig>
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Percentage of amino acid differences in the alignment (&#177; SD) in the three groups of proteins.</p>
               </caption>
               <tblbdy cols="5">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <b>TOTAL</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>UNIQUE</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>REPETITIVE</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>T-test<sup>&#167;</sup></b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>Hox</it>
                           </b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>22.80 &#177; <it>10.44</it></p>
                     </c>
                     <c ca="center">
                        <p>18.22 &#177; <it>10.50</it></p>
                     </c>
                     <c ca="center">
                        <p>37.11 &#177; <it>12.33</it></p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b><it>Hox</it>-derived</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>40.43 &#177; <it>18.26</it></p>
                     </c>
                     <c ca="center">
                        <p>39.00 &#177; <it>19.64</it></p>
                     </c>
                     <c ca="center">
                        <p>62.97 &#177; <it>24.08</it></p>
                     </c>
                     <c ca="center">
                        <p>***</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>Non-<it>Hox</it></b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>23.77 &#177; <it>10.81</it></p>
                     </c>
                     <c ca="center">
                        <p>23.38 &#177; <it>10.93</it></p>
                     </c>
                     <c ca="center">
                        <p>55.46 &#177; <it>31.35</it></p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>ANOVA</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>n.s.</p>
                     </c>
                     <c ca="center">
                        <p>n.s.</p>
                     </c>
                     <c ca="center">
                        <p>n.s.</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>n.s. (P > 0.05), * (P &lt; 0.05), ** (P &lt; 0.01), *** (P &lt; 0.001).</p>
                  <p><sup>&#167; </sup>T-test for paired samples (unique <it>vs</it>. repetitive) on proteins having both types of regions [ABD-A, LAB, PB, BCD, ZEN, Ccp84Ac, CG13617, CG14290 and LAP (product of <it>CG2520</it>)].</p>
               </tblfn>
            </tbl>
            <p>Second, we analyzed the proportion of indels in the alignments (Table <tblr tid="T3">3</tblr>, Figure <figr fid="F2">2</figr>). In this case, differences among the three groups of genes were highly significant (P &lt; 0.001). Surprisingly, the differences were due to the low indel proportion in non-<it>Hox </it>genes (8.73%) compared with the high values for <it>Hox </it>and <it>Hox</it>-derived genes (25.77% and 37.53%, respectively). Furthermore, we tested for differences in indel length using a nested ANOVA. The results indicated that, although the variation in indel length between genes within groups is significant (P = 0.021), the difference between groups is even more significant (P = 0.001). Mean indel length for <it>Hox</it>, <it>Hox</it>-derived and non-<it>Hox </it>genes is 4.22, 5.99 and 3.55 amino acids, respectively. Non-<it>Hox </it>genes not only have shorter indels on average, but their longest indel is also only 23 amino acids, compared with 43 and 40 amino acids for <it>Hox </it>and <it>Hox</it>-derived genes, respectively. In all groups, the indel length distribution follows a negative exponential curve: short indels are common and their abundance declines as length increases (data not shown).</p>
            <tbl id="T3">
               <title>
                  <p>Table 3</p>
               </title>
               <caption>
                  <p>Percentage of indels in the alignment (&#177; SD) in the three groups of proteins.</p>
               </caption>
               <tblbdy cols="5">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>
                           <b>TOTAL</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>UNIQUE</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>REPETITIVE</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>T-test<sup>&#167;</sup></b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>Hox</it>
                           </b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>25.77 &#177; <it>4.31</it></p>
                     </c>
                     <c ca="center">
                        <p>16.21 &#177; <it>8.40</it></p>
                     </c>
                     <c ca="center">
                        <p>44.82 &#177; <it>2.38</it></p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b><it>Hox</it>-derived</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>37.53 &#177; <it>9.63</it></p>
                     </c>
                     <c ca="center">
                        <p>34.88 &#177; <it>12.40</it></p>
                     </c>
                     <c ca="center">
                        <p>75.64 &#177; <it>34.45</it></p>
                     </c>
                     <c ca="center">
                        <p>**</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>Non-<it>Hox</it></b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>8.73 &#177; <it>10.24</it></p>
                     </c>
                     <c ca="center">
                        <p>8.46 &#177; <it>10.28</it></p>
                     </c>
                     <c ca="center">
                        <p>23.79 &#177; <it>25.66</it></p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>ANOVA</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>***</p>
                     </c>
                     <c ca="center">
                        <p>**</p>
                     </c>
                     <c ca="center">
                        <p>n.s.</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>n.s. (P > 0.05), * (P &lt; 0.05), ** (P &lt; 0.01), *** (P &lt; 0.001).</p>
                  <p><sup>&#167; </sup>T-test for paired samples (unique <it>vs</it>. repetitive) on proteins having both types of regions [ABD-A, LAB, PB, BCD, ZEN, Ccp84Ac, CG13617, CG14290 and LAP (product of <it>CG2520</it>)].</p>
               </tblfn>
            </tbl>
            <p>Finally, we tested whether the proportions of amino acid differences and indels are correlated. The Pearson correlation indicated that these two variables are positively but not significantly correlated (r<sub>Pearson </sub>= 0.307, P = 0.175). Therefore, genes with a high proportion of indels do not necessarily have a high proportion of amino acid substitutions. This probably points to different causal mechanisms for amino acid substitutions and indels.</p>
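            <p>For readers who wish to reproduce this kind of test on their own per-gene values, the Pearson coefficient and its associated t statistic can be computed as in the following Python sketch. The function names are illustrative, and the per-gene percentages behind the reported coefficient are not reproduced here.</p>

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def t_statistic(r, n):
    """t statistic, with n - 2 degrees of freedom, for the null hypothesis r = 0."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))
```

            <p>Here each element of <it>xs</it> would be the percentage of amino acid differences of one gene and the matching element of <it>ys</it> its percentage of indels; the P value follows from comparing the t statistic with a Student distribution with n - 2 degrees of freedom.</p>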
         </sec>
         <sec>
            <st>
               <p>Effect of long repetitive tracts on the percentages of amino acid differences and indels of <it>Hox </it>and <it>Hox</it>-derived proteins</p>
            </st>
            <p>Most <it>Hox </it>and <it>Hox</it>-derived proteins contain large repetitive regions present throughout the protein except the region near the homeobox and other highly conserved regions (see for instance the amino acid sequence of ABD-A in Figure <figr fid="F3">3</figr>). Predominant repetitions are poly-glutamine (poly-Q), poly-alanine (poly-A) and serine-rich regions (S-rich). These repetitive regions seem to include most of the indels and amino acid differences, and therefore they might be responsible for the surprisingly high evolutionary rate of <it>Hox </it>and <it>Hox</it>-derived proteins.</p>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Alignment of a <it>Hox </it>protein (ABD-A) showing multiple long repeats spacing functional domains</p>
               </caption>
               <text>
                  <p><b>Alignment of a <it>Hox </it>protein (ABD-A) showing multiple long repeats spacing functional domains</b>. Functional domains are represented by red boxes, and repeats by blue boxes as follows: repetitive regions annotated in UniProt are represented by solid boxes, simple repeats by dashed boxes and complex repeats by dotted light boxes (see Methods). Notation: Dbuz = <it>D. buzzatii</it>; Dmel = <it>D. melanogaster</it>; Dpse = <it>D. pseudoobscura</it>.</p>
               </text>
               <graphic file="1471-2148-6-106-3"/>
            </fig>
            <p>To test this hypothesis, we repeated the analyses of amino acid differences and indels inside and outside these repetitive regions (see Methods), and compared the two kinds of sequence (repetitive and unique). In the case of amino acid differences (Table <tblr tid="T2">2</tblr>), the percentage of aligned, non-conserved amino acids is higher in repetitive regions than in unique sequence in all three groups. The T-test for paired samples (unique versus repetitive) on proteins having both types of regions showed significant differences between unique and repetitive sequences (P = 0.001), the mean for repetitive sequences being more than twice that for unique sequences (51.01% versus 23.19%, respectively). Despite this higher percentage of amino acid differences in repetitive than in unique sequence, the three groups of genes behave similarly in both types of regions (note that the ranking of the groups is the same in unique and in repetitive regions).</p>
            <p>Finally, we wanted to determine whether repetitive regions accumulate more indels than unique sequence (Table <tblr tid="T3">3</tblr>). The results show that in all three groups the percentage of indels in repetitive regions is much higher than that in unique sequence. These differences are significant (P = 0.006) according to a T-test for paired samples, with an average value of 42.32% in repetitive regions versus 15.53% in unique sequence. Nevertheless, the ANOVA computed after removing repetitive regions remained highly significant (P = 0.003). Thus, repetitive regions are not entirely responsible for the high percentage of indels in <it>Hox </it>and <it>Hox</it>-derived proteins: these genes tend to accumulate indels even outside repetitive regions, which does not seem to be tolerated in non-<it>Hox </it>genes.</p>
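            <p>The paired comparison of unique and repetitive sequence used in Tables <tblr tid="T2">2</tblr> and <tblr tid="T3">3</tblr> can be sketched as follows. This is a minimal Python illustration of a paired T-test, not the statistical package used for the analyses; the function name and argument layout are assumptions.</p>

```python
import math

def paired_t(unique, repetitive):
    """Paired t statistic for per-protein percentages.

    `unique[i]` and `repetitive[i]` must refer to the same protein.
    Returns (t, degrees_of_freedom); a positive t means higher values
    in repetitive regions.
    """
    if len(unique) != len(repetitive):
        raise ValueError("both lists must describe the same proteins")
    diffs = [r - u for u, r in zip(unique, repetitive)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # sample variance of the within-protein differences
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    std_error = math.sqrt(var_diff / n)
    return mean_diff / std_error, n - 1
```

            <p>With the nine proteins listed in the table footnotes as having both types of regions, such a test has eight degrees of freedom. Pairing by protein removes the large between-protein variation from the comparison, which is why it is preferred here to an unpaired test.</p>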
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <sec>
            <st>
               <p>Evolutionary rates of <it>Hox </it>genes</p>
            </st>
            <p>This study shows that <it>Hox </it>genes seem to evolve differently from other essential genes that are expressed in early development, have complex expression patterns or contain long introns rich in <it>cis</it>-regulatory elements. Neither the number of nonsynonymous substitutions nor the degree of functional constraint differs significantly between <it>Hox </it>and non-<it>Hox </it>genes, and this remains true even when the most peculiar regions (the homeobox and the repetitive regions) are excluded (Table <tblr tid="T1">1</tblr>). Therefore, <it>Hox </it>genes do not seem to be evolving more slowly than non-homeotic genes, despite their essential function in early development and even though they have been shown to be functionally interchangeable between species in some cases <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr></abbrgrp>.</p>
            <p>Differences in the evolutionary rate among the three groups of genes (<it>Hox</it>, <it>Hox</it>-derived and non-<it>Hox</it>) could be mediated by some properties of genes that are correlated with the number of nucleotide substitutions (<it>t</it>). One possibility is that <it>Hox </it>and <it>Hox</it>-derived genes experience similar background rates of mutation that are different from those of non-<it>Hox </it>genes. We can use the number of synonymous substitutions per synonymous site (<it>d</it><sub><it>S</it></sub>) as a measure of the mutation rate of a gene. This variable is not significantly different among the three groups of genes (P = 0.530), and thus we can consider that mutation rate is constant across groups [see <supplr sid="S2">Additional file 2</supplr>]. Another possibility is that genes within a group may have correlated levels of synonymous codon bias. Given that genes with higher codon bias tend to evolve more slowly <abbrgrp><abbr bid="B28">28</abbr><abbr bid="B43">43</abbr></abbrgrp>, codon bias may contribute to spurious differences in the rates of protein evolution among groups. We have measured codon bias for each gene using the Effective Number of Codons, <it>N</it><sub><it>C </it></sub><abbrgrp><abbr bid="B44">44</abbr></abbrgrp>. There are no significant differences in the codon bias among groups, and the average <it>N</it><sub><it>C </it></sub>value for non-<it>Hox </it>genes is the lowest among the three groups (the highest codon bias) [see Additional files <supplr sid="S1">1</supplr> and <supplr sid="S2">2</supplr>].</p>
            <p>Some <it>Hox </it>and <it>Hox</it>-derived genes considered here have been included in previous studies <abbrgrp><abbr bid="B29">29</abbr><abbr bid="B41">41</abbr></abbrgrp>. Davis <it>et al</it>. <abbrgrp><abbr bid="B29">29</abbr></abbrgrp> showed that the strongest negative relationship between expression profile and evolutionary rate occurs at a late stage in embryonic development, soon after the main burst of expression of segment polarity and <it>Hox </it>genes. However, they also showed that the most constrained transcription factors and signal transducers, the functional class that contains many developmentally essential genes, are expressed at precisely the same time as the segment polarity and <it>Hox </it>genes. One of the two <it>Hox </it>genes included in their study has also been analyzed here (<it>abd-A</it>), and it happens to be the gene with the lowest number of nonsynonymous substitutions and the most constrained one in our sample of <it>Hox </it>genes. On the other hand, <it>bcd</it>, although it is one of the first genes acting in <it>Drosophila </it>development, was reported in the same study as an exceptional case of a gene that acts in the earliest stages of development but evolves surprisingly fast <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>.</p>
            <p>Furthermore, <it>Hox </it>genes depart from the negative correlation found in previous studies between evolutionary rate at the protein level and intron size, number of conserved noncoding sequences within introns, or regulatory complexity <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. In this respect, all <it>Hox </it>genes used in this study have a total intron size >10 kb [see <supplr sid="S3">Additional file 3</supplr>], which corresponds to the longest intron size category used in <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. Therefore, <it>Hox </it>genes would be expected to evolve slowly because they contain long intronic sequences. Both <it>Hox</it>-derived and non-<it>Hox </it>genes have shorter introns than <it>Hox </it>genes [see <supplr sid="S3">Additional file 3</supplr>], and thus would be expected to evolve faster.</p>
            <suppl id="S3">
               <title>
                  <p>Additional File 3</p>
               </title>
               <text>
                  <p>Genes from <it>D. buzzatii</it>, <it>D. melanogaster </it>and <it>D. pseudoobscura </it>used in the analyses with their accession number in Genbank or Flybase and their location on the chromosome.</p>
               </text>
               <file name="1471-2148-6-106-S3.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
         </sec>
         <sec>
            <st>
               <p>Amino acid differences and indels</p>
            </st>
            <p>The percentages of amino acid differences and of indels in <it>Hox </it>proteins also depart from the initial expectations. While the percentage of amino acid differences is not significantly different among the three groups compared (Table <tblr tid="T2">2</tblr>), the percentages of indels in <it>Hox </it>and <it>Hox</it>-derived proteins are much higher than that in non-<it>Hox </it>proteins (Table <tblr tid="T3">3</tblr>). Therefore, <it>Hox </it>proteins are as divergent as non-<it>Hox </it>proteins in terms of amino acid changes, but they are much more divergent in terms of indels. A visual inspection of the alignments suggested a possible explanation for these results (Figure <figr fid="F3">3</figr>). <it>Hox </it>and some <it>Hox</it>-derived proteins contain large repetitive regions, mostly homopeptides, present all along the protein except in the region near the homeodomain and in other highly conserved regions. It is within these repetitive regions that most indels and amino acid differences seem to accumulate, in some cases resulting in poor alignment, and therefore they could be responsible for the surprisingly high amino acid and indel evolution of <it>Hox </it>and <it>Hox</it>-derived proteins.</p>
            <p>Although repetitive regions have been shown to be richer in amino acid differences and indels than unique sequence, they do not fully explain the high variation found in <it>Hox </it>and <it>Hox</it>-derived proteins. Even excluding repetitive regions, <it>Hox </it>and <it>Hox</it>-derived genes contain many more indels than non-<it>Hox </it>genes, although the percentage of amino acid substitutions is not significantly different between <it>Hox </it>and non-<it>Hox </it>genes. Therefore, taking amino acid differences and indels together, we can state that the overall rate of evolution of <it>Hox </it>and <it>Hox</it>-derived genes is faster than that of non-<it>Hox </it>genes. The percentage of the alignment that has changed is 43.39% in <it>Hox </it>proteins, 64.73% in <it>Hox</it>-derived proteins and 30.97% in non-<it>Hox </it>proteins (the percentage of amino acid differences was recalculated over the total number of sites, both gapped and non-gapped, before being added to the percentage of indels, so that the two percentages are comparable). Finally, the lack of a significant correlation between the proportions of indels and of amino acid differences in the set of genes used in this study suggests that different evolutionary mechanisms govern the two types of change.</p>
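            <p>The rescaling mentioned in the parenthesis above can be illustrated with a short Python sketch. It assumes that the percentage of amino acid differences is measured over non-gapped alignment columns only and that the percentage of indels is measured over all columns; the function name is hypothetical, and the inputs are the group means of Tables <tblr tid="T2">2</tblr> and <tblr tid="T3">3</tblr>, not the per-gene values.</p>

```python
def combined_change(pct_aa_diff, pct_indel):
    """Percentage of all alignment sites (gapped and non-gapped) that changed.

    Assumes pct_aa_diff is relative to non-gapped columns and pct_indel
    is relative to all columns of the alignment.
    """
    non_gapped_fraction = 1.0 - pct_indel / 100.0
    # rescale amino acid differences to the full alignment, then add indels
    return pct_aa_diff * non_gapped_fraction + pct_indel

# Group means (TOTAL column) from Tables 2 and 3: (amino acid differences, indels)
GROUP_MEANS = {
    "Hox": (22.80, 25.77),
    "Hox-derived": (40.43, 37.53),
    "Non-Hox": (23.77, 8.73),
}

for group, (aa_diff, indel) in GROUP_MEANS.items():
    print(group, round(combined_change(aa_diff, indel), 2))
```

            <p>Applied to the group means, this rescaling gives values within about two percentage points of those quoted above. Exact agreement is not expected if the published percentages were computed gene by gene before averaging.</p>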
         </sec>
         <sec>
            <st>
               <p>Homopeptides and other repetitions in <it>Hox </it>and <it>Hox</it>-derived proteins</p>
            </st>
            <p>Multiple long homopeptides are found in 7% of <it>Drosophila </it>proteins, most of which are essential developmental proteins expressed in the nervous system and involved in transcriptional regulation <abbrgrp><abbr bid="B34">34</abbr><abbr bid="B45">45</abbr></abbrgrp>. What is the role of these homopeptides? They could be tolerated, non-essential insertions that may play a role as transcriptional activity modulators. Some examples have been described in <it>Hox </it>and <it>Hox</it>-derived proteins <abbrgrp><abbr bid="B15">15</abbr></abbrgrp> that illustrate the acquisition of new functions in the insect lineage while maintaining their homeotic role. In these examples, selection against coding changes might have been relaxed because of functional redundancy among <it>Hox </it>paralogs. These sequence differences could be involved in the functional divergence of <it>Hox </it>proteins and the evolutionary diversification of animals <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>.</p>
            <p>The large effects of <it>Hox </it>genes on morphology suggest that they regulate, directly or indirectly, a large number of genes. It would be expected that such pleiotropic proteins would be constrained in their sequence variation and, hence, their contribution to morphological variation. However, it has been shown that microsatellite sequences in developmental genes are a source of variation in natural populations, affecting visible traits by expanding or contracting at very high rates <abbrgrp><abbr bid="B46">46</abbr></abbrgrp>. One intrinsic characteristic of microsatellites is their hypervariability, resulting from a balance between slippage events and point mutations <abbrgrp><abbr bid="B35">35</abbr><abbr bid="B36">36</abbr></abbrgrp>. Their mutation rate has been estimated to be 1.5 &#215; 10<sup>-6 </sup>per locus per generation in the case of trinucleotide repeats in <it>D. melanogaster </it><abbrgrp><abbr bid="B47">47</abbr></abbrgrp>, and is even greater in the case of dinucleotides. These values contrast with the general mutation rate of ~10<sup>-8 </sup>per site per generation of base pair substitutions <abbrgrp><abbr bid="B48">48</abbr></abbrgrp>. These repeats typically generate regions in the alignment with high variability in sequence and length, and that are difficult to align.</p>
            <p>A potential role for homopeptides is to serve as spacer elements between functional domains, providing flexibility to the three-dimensional conformation and fine-tuning the orientation of the protein's domains in its interactions with DNA and other proteins. If so, changes in the nucleotide distances between target binding sites might be accompanied by complementary changes in the sequences that space the binding domains of transcription factors (mostly homopeptides). This would produce coordinated evolution between transcription factors and their target binding sites. Excessive expansions of homopeptides, however, have often been associated with disease in humans <abbrgrp><abbr bid="B49">49</abbr><abbr bid="B50">50</abbr><abbr bid="B51">51</abbr><abbr bid="B52">52</abbr></abbrgrp>. Remarkably, essential developmental proteins such as homeotic proteins, which apparently need such homopeptides to function correctly, are thereby exposed to the rapid and apparently unpredictable evolution of these repeats, and sacrifice the sequence conservation that would otherwise be expected of proteins of this type.</p>
            <p>Among non-<it>Hox </it>genes, the cluster of cuticular genes (<it>Ccp84Ac</it>, <it>Ccp84Ae</it>, <it>Ccp84Af </it>and <it>Ccp84Ag</it>) behaves similarly to <it>Hox </it>and <it>Hox</it>-derived genes and accounts for the vast majority of indels in its group (Figure <figr fid="F2">2</figr>). These short proteins share a conserved C-terminal section <abbrgrp><abbr bid="B53">53</abbr></abbrgrp> and include a 35&#8211;36 amino acid motif known as the R&amp;R consensus, present in many insect cuticle proteins, an extended form of which has been shown to bind chitin (chitin-bind 4; PF00379) <abbrgrp><abbr bid="B54">54</abbr></abbrgrp>. Outside these conserved domains, cuticular proteins share hydrophobic regions dominated by tetrapeptide repeats (A-A-P-A/V), which are presumed to be functionally important <abbrgrp><abbr bid="B55">55</abbr><abbr bid="B56">56</abbr></abbrgrp> and are responsible for the high percentage of indels found in these proteins. These repeats are usually complex repeats that are neither annotated in UniProt nor detected as runs of identical amino acid repetitions (see Methods), and thus contribute to the percentage of indels in unique sequence in non-<it>Hox </it>genes (Table <tblr tid="T3">3</tblr>). When complex repeats were annotated and considered as repetitive sequence (see Methods), the percentage of indels in the unique portion of all classes of genes decreased substantially, but especially in non-<it>Hox </it>genes [see <supplr sid="S4">Additional file 4</supplr>]. The elimination of complex repeats in cuticular genes was crucial in this reduction, and further increased the differences among groups.</p>
            <suppl id="S4">
               <title>
                  <p>Additional File 4</p>
               </title>
               <text>
                  <p>Set of tables of the main text, obtained according to three different annotation criteria to define repetitive sequences.</p>
               </text>
               <file name="1471-2148-6-106-S4.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>Therefore, our results show that long repetitive sequences are not enough to explain all the differences found between <it>Hox </it>or <it>Hox</it>-derived genes and non-<it>Hox </it>genes. <it>Hox </it>and <it>Hox</it>-derived genes have a tendency to accumulate indels outside these repetitive regions that is not observed in non-<it>Hox </it>genes. We propose that spontaneous deletions between short repeated sequences could be the mechanism responsible for this difference <abbrgrp><abbr bid="B57">57</abbr></abbrgrp>. Such deletions have been described in phages <abbrgrp><abbr bid="B58">58</abbr><abbr bid="B59">59</abbr></abbrgrp>, <it>Escherichia coli </it><abbrgrp><abbr bid="B60">60</abbr><abbr bid="B61">61</abbr><abbr bid="B62">62</abbr><abbr bid="B63">63</abbr><abbr bid="B64">64</abbr><abbr bid="B65">65</abbr></abbrgrp> and humans <abbrgrp><abbr bid="B66">66</abbr><abbr bid="B67">67</abbr></abbrgrp>, and occur predominantly between short stretches of sequence similarity of as few as 5&#8211;8 base pairs <abbrgrp><abbr bid="B68">68</abbr></abbrgrp>. Two different models can explain the generation of spontaneous deletions: slipped mispairing during DNA synthesis, and recombination events mediated by enzymes that recognize these sequence similarities. In either case, the repetitive and compositionally biased nature of several regions within <it>Hox </it>and <it>Hox</it>-derived sequences might explain the higher incidence of indels in these two groups. This would also explain the large differences in protein lengths among species that have been observed in some <it>Hox </it>proteins <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>. This higher probability of mutation would presumably be accompanied by a higher tolerance of <it>Hox </it>and <it>Hox</it>-derived proteins to indels outside their binding domains.</p>
            <p>For a correct interpretation of our results, the set of non-<it>Hox </it>genes should be an unbiased sample of genes, both in terms of protein expression and structure. We have gathered this information from the literature, and verified that our non-<it>Hox </it>sample comprises a diverse group of genes that are expressed throughout the fly life cycle (from young embryo to adult) and contains a wide variety of protein domains [see <supplr sid="S5">Additional file 5</supplr>]. Therefore, we assume that, although small, it represents an unbiased sample of all non-<it>Hox </it>genes in the genome, and that the results presented here are reliable.</p>
            <suppl id="S5">
               <title>
                  <p>Additional File 5</p>
               </title>
               <text>
                  <p>Structure and expression of non-<it>Hox </it>proteins.</p>
               </text>
               <file name="1471-2148-6-106-S5.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
         </sec>
         <sec>
            <st>
               <p>The fate of <it>Hox</it>-derived genes after their origination by duplication</p>
            </st>
            <p>The three <it>Hox</it>-derived genes used in this study (<it>bcd</it>, <it>zen </it>and <it>zen2</it>) originated from two consecutive duplications of the ancestral <it>Hox3 </it>gene. Seemingly, <it>bcd </it>and <it>zen </it>have specialized and perform separate functions in the establishment of the embryo's body plan <abbrgrp><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>. This is supported by our data, as these two genes have a moderate evolutionary rate but low level of functional constraint (high <it>d</it><sub><it>N</it></sub>/<it>d</it><sub><it>S </it></sub>rate ratio). However, the finding of Barker <it>et al</it>. <abbrgrp><abbr bid="B42">42</abbr></abbrgrp> that genes with a maternal effect experience relaxed selective constraint resulting from sex-limited expression is not supported by our data. Our results show that <it>bcd </it>and <it>zen </it>are evolving at very similar rates in the <it>Drosophila </it>lineage, and <it>bcd </it>is even more constrained than <it>zen </it>[see <supplr sid="S1">Additional file 1</supplr>].</p>
            <p>The function of <it>zen2 </it>is unclear. It has the same expression pattern as <it>zen </it>and, despite its high divergence across species, it has been maintained for more than 60 Myr <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. Conservation of two paralogous genes maintaining the same function is unlikely, and could only be explained under some peculiar conditions (e.g. two strongly expressed genes whose products are in high demand <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>). This gene could be undergoing pseudogenization, a possibility supported by the fact that the evolutionary rate of <it>zen2 </it>is more than twice that of <it>bcd </it>and <it>zen</it>, and that it also has the highest percentage of the alignment represented by indels. If so, we would expect to see a relaxation of the functional constraint. However, the relatively high level of functional constraint of <it>zen2 </it>(&#969; = 0.09144) instead points to a process of neofunctionalization, even though positive selection was not detected. The fact that this gene does not show an explicit pattern of variation of &#969; along its sequence (Figure <figr fid="F1">1</figr>) further supports the progressive loss of its original homeotic function and the acquisition of new functions.</p>
            <p>Compared to the other two groups (<it>Hox </it>and non-<it>Hox </it>genes), <it>Hox</it>-derived genes are evolving significantly faster and with less functional constraint. They are also the group with the highest proportion of amino acid differences and indels. These results reflect their relatively recent origin by duplication, which was followed by extensive changes in their role during the development of insects.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>Many studies so far have largely focused on <it>Hox </it>gene homeobox sequences, and have demonstrated that they are highly conserved across species. However, <it>Hox </it>genes and, in general, all transcription factors share a particular structure in which different highly conserved modules are interspersed with long repetitive regions, mostly microsatellites. Our results show that both <it>Hox </it>and <it>Hox</it>-derived genes have an overall high rate of evolution, especially in terms of indels. Moreover, although repetitive regions are richer in both amino acid differences and indels than the rest of the coding sequence, they do not seem to fully explain the differences in evolutionary rates found between <it>Hox </it>or <it>Hox</it>-derived genes and non-<it>Hox </it>genes. Therefore, by using complete gene sequences rather than their conserved modules, we observe that the <it>Hox </it>gene evolutionary rate is as high as that of non-<it>Hox </it>genes in terms of nucleotide evolution, and even higher in terms of indels. <it>Hox</it>-derived genes constitute the group with the highest evolutionary rate by all criteria. These results emphasize the need to take into account indels in addition to nucleotide substitutions in order to estimate evolutionary rates accurately. This study is the first quantification of the rates of nucleotide and indel evolution in these groups of genes, and shows that <it>Hox </it>and <it>Hox</it>-derived genes have higher evolutionary dynamics than other developmental genes.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>Genes analyzed and their classification</p>
            </st>
            <p>All the completely sequenced genes in <it>D. buzzatii </it>with a clear ortholog in <it>D. melanogaster </it>and <it>D. pseudoobscura </it>(23) were included in our analysis: <it>abd-A</it>, <it>lab</it>, <it>pb</it>, <it>bcd</it>, <it>zen</it>, <it>zen2</it>, <it>Dbuz\Ccp3 </it>(ortholog of <it>Dmel\Ccp84Ac</it>), <it>Dbuz\Ccp6 </it>(ortholog of <it>Dmel\Ccp84Ae</it>), <it>Dbuz\Ccp7 </it>(ortholog of <it>Dmel\Ccp84Af</it>), <it>Dbuz\Ccp8 </it>(ortholog of <it>Dmel\Ccp84Ag</it>), <it>CG1288</it>, <it>CG14290</it>, <it>CG14609</it>, <it>CG14899</it>, <it>CG17836</it>, <it>CG2520 </it>and <it>CG31363 </it>from Negre <it>et al</it>. <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>; <it>Adh-related </it>(<it>Adhr</it>) from Betran and Ashburner <abbrgrp><abbr bid="B69">69</abbr></abbrgrp>; <it>&#945;-Esterase-2 </it>(<it>&#945;-Est2</it>) and <it>&#945;-Esterase-3 </it>(<it>&#945;-Est3</it>) from Robin <it>et al</it>. <abbrgrp><abbr bid="B70">70</abbr></abbrgrp>; <it>CG13617 </it>from Puig, Caceres, and Ruiz <abbrgrp><abbr bid="B71">71</abbr></abbrgrp>; and <it>Larval serum protein 1 &#946; </it>(<it>Lsp1&#946;</it>) and <it>Lsp1&#947; </it>from Gonzalez, Casals and Ruiz <abbrgrp><abbr bid="B72">72</abbr></abbrgrp>. Sequences of <it>D. melanogaster </it>orthologs were collected from Flybase <abbrgrp><abbr bid="B73">73</abbr><abbr bid="B74">74</abbr></abbrgrp>, and those of <it>D. pseudoobscura </it>were annotated on the scaffolds from the whole genome shotgun sequencing project <abbrgrp><abbr bid="B75">75</abbr><abbr bid="B76">76</abbr></abbrgrp>. We identified the <it>D. pseudoobscura </it>orthologs by using the alignment of this species with the <it>D. melanogaster </it>genome generated by the Berkeley Genome Pipeline <abbrgrp><abbr bid="B77">77</abbr></abbrgrp>, and annotated the target sequences with the aid of ARTEMIS v. 7 <abbrgrp><abbr bid="B78">78</abbr></abbrgrp> and BIOEDIT v. 7.0.4.1 <abbrgrp><abbr bid="B79">79</abbr></abbrgrp>. 
A complete list of all genes, accession numbers (from Genbank or Flybase) and chromosomal locations is provided [see <supplr sid="S3">Additional file 3</supplr>]. The longest transcript of each gene was used for the analyses. Genes were classified into three categories: 1) <it>Hox </it>genes (<it>abd-A</it>, <it>lab </it>and <it>pb</it>); 2) <it>Hox</it>-derived genes (<it>bcd</it>, <it>zen </it>and <it>zen2</it>); and 3) non-<it>Hox </it>genes (the remaining 17 genes). Results for each group were obtained by averaging over all the genes within the group.</p>
         </sec>
         <sec>
            <st>
               <p>Sequence annotation and alignment</p>
            </st>
            <p>A set of Perl scripts, together with modules from PDA v. 1.4 <abbrgrp><abbr bid="B80">80</abbr></abbrgrp> and BIOPERL v. 1.2.3 <abbrgrp><abbr bid="B81">81</abbr></abbrgrp>, were used to automatically check sequence annotations, extract the coding sequences (CDSs) of the selected transcripts and calculate basic gene structure and base composition parameters (gene and protein lengths; codon bias measured by the Effective Number of Codons (<it>N</it><sub><it>C</it></sub>); and G+C content in second, third and all codon positions) [see <supplr sid="S1">Additional file 1</supplr>]. Differences among the three groups of genes were tested with one-way ANOVAs and pairwise contrast tests <abbrgrp><abbr bid="B82">82</abbr></abbrgrp>, assuming homogeneity of variances for those variables that gave non-significant P values for the Levene test <abbrgrp><abbr bid="B83">83</abbr></abbrgrp> [see <supplr sid="S2">Additional file 2</supplr>]. Orthologous coding sequences in <it>D. buzzatii</it>, <it>D. melanogaster </it>and <it>D. pseudoobscura </it>were aligned according to their translation to protein using RevTrans 1.3 Server <abbrgrp><abbr bid="B84">84</abbr></abbrgrp> with some manual editing using BIOEDIT v. 7.0.4.1 <abbrgrp><abbr bid="B79">79</abbr></abbrgrp>. Two non-<it>Hox </it>genes of the initial sample (<it>CG1288 </it>and <it>CG17836</it>) showed a doubtful alignment, containing many gaps and few residue matches, and thus were excluded from the analyses to avoid unreliable estimates. A total of 15 non-<it>Hox </it>genes were therefore used in this study.</p>
         </sec>
         <sec>
            <st>
               <p>Estimation of evolutionary rates</p>
            </st>
            <p>The numbers of synonymous and nonsynonymous substitutions per site (<it>d</it><sub><it>S </it></sub>and <it>d</it><sub><it>N</it></sub>, respectively) were estimated on the nucleotide alignments of each gene using maximum likelihood methods with the program <it>codeml </it>of the PAML v. 3.14 package <abbrgrp><abbr bid="B85">85</abbr></abbrgrp> [see <supplr sid="S1">Additional file 1</supplr>]. We used an unrooted tree and the codon equilibrium frequencies (<it>&#960;</it><sub><it>i</it></sub>) estimated from the nucleotide frequencies of the three codon sites (F3X4 option of <it>codeml</it>). Differences among the three groups of genes were tested using one-way ANOVAs and pairwise contrast tests as before. Furthermore, we visualized differences along the genes by plotting <it>d</it><sub><it>N </it></sub>and &#969; in sliding windows of 240 nucleotides and a step size of three nucleotides (one codon).</p>
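            <p>As a rough illustration of the windowing scheme described above, the following Python sketch enumerates the coordinates of 240-nucleotide windows advanced one codon at a time. The function name <it>sliding_windows</it>, the 0-based half-open coordinates and the decision to keep only full-length windows are assumptions made for this example; the text does not specify how window ends or gapped columns were treated, and the estimation of <it>d</it><sub><it>N </it></sub>and &#969; within each window is not reproduced here.</p>
            <p><![CDATA[
```python
def sliding_windows(alignment_length, window=240, step=3):
    """Return (start, end) coordinates of each full window.

    Coordinates are 0-based and half-open. Keeping only full-length
    windows is an assumption of this sketch, not a documented choice.
    """
    last_start = alignment_length - window
    return [(start, start + window) for start in range(0, last_start + 1, step)]


# Hypothetical alignment length, chosen only to show the output shape.
windows = sliding_windows(300)
print(len(windows), windows[0], windows[-1])
```
]]></p>
            <p>For a 300-nucleotide alignment this yields 21 windows, from (0, 240) to (60, 300); an alignment shorter than one window yields none.</p>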
         </sec>
         <sec>
            <st>
               <p>Measurement of amino acid differences and indels</p>
            </st>
            <p>We measured the proportion of amino acid differences and indels in the protein alignments (translated from the previous nucleotide alignments) using in-house Perl scripts. The methodology was based on measuring the number of non-conserved positions due to either amino acid differences (point changes) or indels (structural changes) in the protein multiple alignments; the minimum indel length is therefore one amino acid, corresponding to three nucleotides in the nucleotide sequence. In this way we can estimate the percentage of the protein that has changed in our set of species. We consider this a simple, if somewhat rough, measure of the degree of constraint relaxation in proteins.</p>
            <p>Specifically, the number of amino acid differences was computed as the number of non-gapped positions with non-identical amino acids in the three species. All percentages are given in relation to the total number of aligned amino acids (non-gapped positions). Similarly, the number of indels was computed as the number of different indels (gaps affecting different positions) in the complete alignment (gapped and non-gapped sites). Therefore, an indel shared by two species was considered a single indel, while overlapping gaps were considered separately. Indel lengths were taken into account to calculate the percentage of the alignment affected by indels. In this case, all percentages are given in relation to the total length of the alignment (gapped and non-gapped positions).</p>
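            <p>The counting rules above can be sketched in Python as follows. This is not the in-house Perl code, which is not shown in the text; the function names <it>gap_runs</it> and <it>alignment_stats</it> and the toy alignment are hypothetical. The sketch also assumes that the percentage of the alignment affected by indels is the number of columns containing at least one gap divided by the total alignment length, which is one possible reading of the description.</p>
            <p><![CDATA[
```python
import re


def gap_runs(seq):
    """Return the (start, end) half-open intervals of gap runs in one aligned sequence."""
    return {(m.start(), m.end()) for m in re.finditer(r"-+", seq)}


def alignment_stats(seqs):
    """Summarize amino acid differences and indels in a protein alignment.

    Assumptions of this sketch: '-' is the gap character and all rows
    have the same length.
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must have equal length")
    columns = list(zip(*seqs))
    ungapped = [col for col in columns if "-" not in col]
    # A difference is a non-gapped column whose residues are not all identical.
    n_diff = sum(1 for col in ungapped if len(set(col)) != 1)
    # Identical gap intervals in different species collapse into one indel;
    # overlapping but non-identical intervals are counted separately.
    indels = set().union(*(gap_runs(s) for s in seqs))
    # Assumption: percentage affected by indels = gapped columns / alignment length.
    n_gapped = length - len(ungapped)
    return {
        "n_differences": n_diff,
        "pct_differences": 100.0 * n_diff / len(ungapped) if ungapped else 0.0,
        "n_indels": len(indels),
        "pct_indels": 100.0 * n_gapped / length,
    }


# Hypothetical toy alignment of three species, not data from the study.
toy = ["MKQ--LSA", "MKQ--LTA", "MRQQQL-A"]
print(alignment_stats(toy))
```
]]></p>
            <p>In the toy alignment, the two-residue gap shared by the first two sequences is counted once and the single-residue gap in the third sequence is counted separately, giving two indels and one amino acid difference among five non-gapped positions.</p>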
            <p>We used one-way ANOVAs to test for differences between <it>Hox</it>, <it>Hox</it>-derived and non-<it>Hox </it>proteins in both parameters: the proportion of amino acid differences and the proportion of indels. We also used the Pearson correlation coefficient to test for a correlation between the two measures (i.e. to test whether proteins with a high proportion of amino acid differences also have a high proportion of indels), and a nested ANOVA <abbrgrp><abbr bid="B82">82</abbr></abbrgrp> to test for differences in indel length among the three groups, taking into account the variation within groups.</p>
         </sec>
         <sec>
            <st>
               <p>Contribution of the homeobox and the repetitive regions to the evolutionary rates</p>
            </st>
            <p>To test the effect of the homeobox and the repetitive regions on our estimates of nucleotide substitutions, we repeated the previous analyses excluding one or both types of sequence. Repetitive regions were identified in three different ways. First, we searched in the UniProt Knowledgebase Release 8.6 (Swiss-Prot Release 50.6 + TrEMBL Release 33.6) <abbrgrp><abbr bid="B86">86</abbr></abbrgrp> for annotated compositionally biased regions (defined in the feature table as COMPBIAS) in the protein sequences encoded by <it>Hox</it>, <it>Hox</it>-derived and non-<it>Hox </it>genes [see <supplr sid="S3">Additional file 3</supplr>]. In the case of <it>Hox </it>genes, all three genes in the group contained at least one annotated repetitive region, while for <it>Hox</it>-derived and non-<it>Hox </it>genes only one entry of each group (<it>bcd </it>and <it>CG2520</it>, respectively) contained annotated repetitive regions. Note that only repeats in <it>D. melanogaster </it>are identified by this method. Second, we identified simple repeats as runs of 5 or more identical amino acids (e.g. QQQQQ), or at least 4 identical repetitions of 2 or more amino acids (e.g. GVGVGVGV), in any of the three species. With this second approach, we extended the number of proteins with repetitive sequences in both the <it>Hox</it>-derived and non-<it>Hox </it>groups. Finally, we attempted to annotate complex repeats visually, defining them as imperfect runs of amino acid repetitions or compositionally biased regions in the protein (e.g. regions with a high content of Q, S, A, P, H, G, V, etc.). Data were analyzed using a combination of the three approaches as follows: (1) using UniProt only; (2) using UniProt + Simple repeats; and (3) using UniProt + Simple repeats + Complex repeats. 
Because the identification of complex repeats is somewhat subjective, we present in the main text the results obtained by identifying repeats using the second combination (UniProt + Simple repeats). However, results do not differ significantly among the three combinations [see <supplr sid="S4">Additional file 4</supplr>].</p>
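            <p>The definition of simple repeats given above (runs of 5 or more identical amino acids, or at least 4 identical repetitions of a unit of 2 or more amino acids) lends itself to a short regular-expression sketch in Python. The function name <it>simple_repeats</it>, the restriction to upper-case one-letter codes and the merging of overlapping hits are assumptions of this example rather than a description of the scripts actually used; complex repeats, which were annotated visually, are not covered.</p>
            <p><![CDATA[
```python
import re

# Five or more identical residues in a row, e.g. QQQQQ.
HOMOPEPTIDE = re.compile(r"([A-Z])\1{4,}")
# A unit of two or more residues present in four or more tandem copies, e.g. GVGVGVGV.
UNIT_REPEAT = re.compile(r"([A-Z]{2,}?)\1{3,}")


def simple_repeats(protein):
    """Return merged (start, end) half-open spans of simple repeats.

    Assumes an ungapped protein sequence in upper-case one-letter code.
    """
    spans = sorted(
        m.span()
        for pattern in (HOMOPEPTIDE, UNIT_REPEAT)
        for m in pattern.finditer(protein)
    )
    merged = []
    for start, end in spans:
        # Merge hits that overlap or touch the previous span.
        if merged and merged[-1][1] >= start:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical sequence, not taken from the proteins analyzed here.
print(simple_repeats("MSGVGVGVGVAAQQQQQQK"))
```
]]></p>
            <p>For the hypothetical sequence shown, the sketch reports the GV repeat at positions 2 to 10 and the glutamine run at positions 12 to 18; a run of only four identical residues, or three copies of a two-residue unit, is not reported.</p>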
            <p>We also calculated the proportion of amino acid differences and indels in repetitive and non-repetitive (unique) sequence in the three groups, and tested for differences between these two types of regions using a <it>t</it>-test for paired samples <abbrgrp><abbr bid="B82">82</abbr></abbrgrp> on those proteins having both types of regions.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Abbreviations</p>
         </st>
         <p><it>t </it>= number of nucleotide substitutions per codon; <it>d</it><sub><it>S </it></sub>= number of synonymous substitutions per synonymous site; <it>d</it><sub><it>N </it></sub>= number of nonsynonymous substitutions per nonsynonymous site; &#969; = <it>d</it><sub><it>N</it></sub>/<it>d</it><sub><it>S </it></sub>ratio that measures the level of functional constraint; <it>&#954; </it>= transition/transversion rate ratio; <it>N</it><sub><it>C </it></sub>= Effective Number of Codons.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>SC carried out the analyses and drafted the manuscript. BN participated in obtaining the data and in the design of the analyses. AB participated in the statistical analysis. AR conceived the study, and participated in its design and coordination. All authors read and approved the final manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>The authors would like to thank Natalia Petit for helpful discussions, and Kristen Panfilio and two anonymous referees for valuable comments on this manuscript. This work has been supported by grant BMC2002-01708 from the Direcci&#243;n General de Ense&#241;anza Superior e Investigaci&#243;n Cient&#237;fica (MEC, Spain) awarded to AR, a doctoral FPI fellowship from the Ministerio de Ciencia y Tecnolog&#237;a (BES-2003-0416) awarded to SC and a doctoral FI/DGR fellowship from the Generalitat de Catalunya awarded to BN.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>From DNA to diversity: Molecular Genetics and Evolution of Animal Design</p>
            </title>
            <aug>
               <au>
                  <snm>Carroll</snm>
                  <fnm>SB</fnm>
               </au>
               <au>
                  <snm>Grenier</snm>
                  <fnm>JK</fnm>
               </au>
               <au>
                  <snm>Weatherbee</snm>
                  <fnm>SD</fnm>
               </au>
            </aug>
            <publisher>Blackwell</publisher>
            <edition>2nd ed.</edition>
            <pubdate>2005</pubdate>
         </bibl>
         <bibl id="B2">
            <title>
               <p>A gene complex controlling segmentation in Drosophila</p>
            </title>
            <aug>
               <au>
                  <snm>Lewis</snm>
                  <fnm>EB</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>1978</pubdate>
            <volume>276</volume>
            <issue>5688</issue>
            <fpage>565</fpage>
            <lpage>570</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/276565a0</pubid>
                  <pubid idtype="pmpid">103000</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Cytogenetic Analysis of Chromosome 3 in Drosophila melanogaster: the Homeotic Gene Complex in Polytene Chromosome Interval 84A-B</p>
            </title>
            <aug>
               <au>
                  <snm>Kaufman</snm>
                  <fnm>TC</fnm>
               </au>
               <au>
                  <snm>Lewis</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Wakimoto</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>1980</pubdate>
            <volume>94</volume>
            <issue>1</issue>
            <fpage>115</fpage>
            <lpage>133</lpage>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Hox genes and the phylogeny of the arthropods</p>
            </title>
            <aug>
               <au>
                  <snm>Cook</snm>
                  <fnm>CE</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Telford</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Bastianello</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Akam</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Curr Biol</source>
            <pubdate>2001</pubdate>
            <volume>11</volume>
            <issue>10</issue>
            <fpage>759</fpage>
            <lpage>763</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0960-9822(01)00222-6</pubid>
                  <pubid idtype="pmpid" link="fulltext">11378385</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Hox genes and the evolution of the arthropod body plan</p>
            </title>
            <aug>
               <au>
                  <snm>Hughes</snm>
                  <fnm>CL</fnm>
               </au>
               <au>
                  <snm>Kaufman</snm>
                  <fnm>TC</fnm>
               </au>
            </aug>
            <source>Evol Dev</source>
            <pubdate>2002</pubdate>
            <volume>4</volume>
            <issue>6</issue>
            <fpage>459</fpage>
            <lpage>499</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1525-142X.2002.02034.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">12492146</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Expression patterns of the rogue Hox genes Hox3/zen and fushi tarazu in the apterygote insect Thermobia domestica</p>
            </title>
            <aug>
               <au>
                  <snm>Hughes</snm>
                  <fnm>CL</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>PZ</fnm>
               </au>
               <au>
                  <snm>Kaufman</snm>
                  <fnm>TC</fnm>
               </au>
            </aug>
            <source>Evol Dev</source>
            <pubdate>2004</pubdate>
            <volume>6</volume>
            <issue>6</issue>
            <fpage>393</fpage>
            <lpage>401</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1111/j.1525-142X.2004.04048.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">15509221</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Evolution of the homeobox complex in the Diptera</p>
            </title>
            <aug>
               <au>
                  <snm>Lewis</snm>
                  <fnm>EB</fnm>
               </au>
               <au>
                  <snm>Pfeiffer</snm>
                  <fnm>BD</fnm>
               </au>
               <au>
                  <snm>Mathog</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Celniker</snm>
                  <fnm>SE</fnm>
               </au>
            </aug>
            <source>Curr Biol</source>
            <pubdate>2003</pubdate>
            <volume>13</volume>
            <issue>15</issue>
            <fpage>R587</fpage>
            <lpage>R588</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0960-9822(03)00520-7</pubid>
                  <pubid idtype="pmpid" link="fulltext">12906807</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>A new split of the Hox gene complex in Drosophila: relocation and evolution of the gene labial</p>
            </title>
            <aug>
               <au>
                  <snm>Negre</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Ranz</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Casals</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Caceres</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Ruiz</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2003</pubdate>
            <volume>20</volume>
            <issue>12</issue>
            <fpage>2042</fpage>
            <lpage>2054</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/molbev/msg238</pubid>
                  <pubid idtype="pmpid" link="fulltext">12949134</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Splits in fruitfly Hox gene complexes</p>
            </title>
            <aug>
               <au>
                  <snm>Von Allmen</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Hogga</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Spierer</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Karch</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Bender</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Gyurkovics</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Lewis</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>1996</pubdate>
            <volume>380</volume>
            <issue>6570</issue>
            <fpage>116</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/380116a0</pubid>
                  <pubid idtype="pmpid" link="fulltext">8600383</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>HOM-C evolution in Drosophila: is there a need for Hox gene clustering?</p>
            </title>
            <aug>
               <au>
                  <snm>Negre</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Ruiz</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2007</pubdate>
            <inpress/>
         </bibl>
         <bibl id="B11">
            <title>
               <p>[Extreme divergence of a homeotic gene: the bicoid case]</p>
            </title>
            <aug>
               <au>
                  <snm>Bonneton</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>Med Sci (Paris)</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <fpage>1265</fpage>
            <lpage>1270</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">14691752</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>The anterior determinant bicoid of Drosophila is a derived Hox class 3 gene</p>
            </title>
            <aug>
               <au>
                  <snm>Stauber</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Jackle</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Schmidt-Ott</snm>
                  <fnm>U</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci U S A</source>
            <pubdate>1999</pubdate>
            <volume>96</volume>
            <issue>7</issue>
            <fpage>3786</fpage>
            <lpage>3789</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">22372</pubid>
                  <pubid idtype="pmpid" link="fulltext">10097115</pubid>
                  <pubid idtype="doi">10.1073/pnas.96.7.3786</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>A single Hox3 gene with composite bicoid and zerknullt expression characteristics in non-Cyclorrhaphan flies</p>
            </title>
            <aug>
               <au>
                  <snm>Stauber</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Prell</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Schmidt-Ott</snm>
                  <fnm>U</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci U S A</source>
            <pubdate>2002</pubdate>
            <volume>99</volume>
            <issue>1</issue>
            <fpage>274</fpage>
            <lpage>279</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">117551</pubid>
                  <pubid idtype="pmpid" link="fulltext">11773616</pubid>
                  <pubid idtype="doi">10.1073/pnas.012292899</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Evidence for the derivation of the Drosophila fushi tarazu gene from a Hox gene orthologous to lophotrochozoan Lox5</p>
            </title>
            <aug>
               <au>
                  <snm>Telford</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>Curr Biol</source>
            <pubdate>2000</pubdate>
            <volume>10</volume>
            <issue>6</issue>
            <fpage>349</fpage>
            <lpage>352</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0960-9822(00)00387-0</pubid>
                  <pubid idtype="pmpid" link="fulltext">10744975</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Evolution of a transcriptional repression domain in an insect Hox protein</p>
            </title>
            <aug>
               <au>
                  <snm>Galant</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Carroll</snm>
                  <fnm>SB</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2002</pubdate>
            <volume>415</volume>
            <issue>6874</issue>
            <fpage>910</fpage>
            <lpage>913</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature717</pubid>
                  <pubid idtype="pmpid" link="fulltext">11859369</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Evolving role of Antennapedia protein in arthropod limb patterning</p>
            </title>
            <aug>
               <au>
                  <snm>Shiga</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Yasumoto</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Yamagata</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Hayashi</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Development</source>
            <pubdate>2002</pubdate>
            <volume>129</volume>
            <issue>15</issue>
            <fpage>3555</fpage>
            <lpage>3561</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12117806</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Drosophila fushi tarazu: a gene on the border of homeotic function</p>
            </title>
            <aug>
               <au>
                  <snm>Lohr</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Yussa</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Pick</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Curr Biol</source>
            <pubdate>2001</pubdate>
            <volume>11</volume>
            <issue>18</issue>
            <fpage>1403</fpage>
            <lpage>1412</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0960-9822(01)00443-2</pubid>
                  <pubid idtype="pmpid" link="fulltext">11566098</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>The evolving role of Hox genes in arthropods</p>
            </title>
            <aug>
               <au>
                  <snm>Akam</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Averof</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Castelli-Gair</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Dawes</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Falciani</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Ferrier</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Dev Suppl</source>
            <pubdate>1994</pubdate>
            <fpage>209</fpage>
            <lpage>215</lpage>
            <xrefbib>
               <pubid idtype="pmpid">7579521</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Conservation of regulatory sequences and gene expression patterns in the disintegrating Drosophila Hox gene complex</p>
            </title>
            <aug>
               <au>
                  <snm>Negre</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Casillas</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Suzanne</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sanchez-Herrero</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Akam</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Nefedov</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Barbadilla</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>de Jong</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Ruiz</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2005</pubdate>
            <volume>15</volume>
            <issue>5</issue>
            <fpage>692</fpage>
            <lpage>700</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1088297</pubid>
                  <pubid idtype="pmpid" link="fulltext">15867430</pubid>
                  <pubid idtype="doi">10.1101/gr.3468605</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>The role of localization of bicoid RNA in organizing the anterior pattern of the Drosophila embryo</p>
            </title>
            <aug>
               <au>
                  <snm>Berleth</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Burri</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Thoma</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Bopp</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Richstein</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Frigerio</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Noll</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Nusslein-Volhard</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>EMBO J</source>
            <pubdate>1988</pubdate>
            <volume>7</volume>
            <issue>6</issue>
            <fpage>1749</fpage>
            <lpage>1756</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">457163</pubid>
                  <pubid idtype="pmpid">2901954</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Molecular characterization of the zerknullt region of the Antennapedia gene complex in Drosophila</p>
            </title>
            <aug>
               <au>
                  <snm>Rushlow</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Doyle</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Hoey</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Levine</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Genes Dev</source>
            <pubdate>1987</pubdate>
            <volume>1</volume>
            <issue>10</issue>
            <fpage>1268</fpage>
            <lpage>1279</lpage>
            <xrefbib>
               <pubid idtype="pmpid">2892759</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Human Hox-4.2 and Drosophila deformed encode similar regulatory specificities in Drosophila embryos and larvae</p>
            </title>
            <aug>
               <au>
                  <snm>McGinnis</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Kuziora</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>McGinnis</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>1990</pubdate>
            <volume>63</volume>
            <issue>5</issue>
            <fpage>969</fpage>
            <lpage>976</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0092-8674(90)90500-E</pubid>
                  <pubid idtype="pmpid" link="fulltext">1979526</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>The mouse Hox-1.3 gene is functionally equivalent to the Drosophila Sex combs reduced gene</p>
            </title>
            <aug>
               <au>
                  <snm>Zhao</snm>
                  <fnm>JJ</fnm>
               </au>
               <au>
                  <snm>Lazzarini</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Pick</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Genes Dev</source>
            <pubdate>1993</pubdate>
            <volume>7</volume>
            <issue>3</issue>
            <fpage>343</fpage>
            <lpage>354</lpage>
            <xrefbib>
               <pubid idtype="pmpid">8095481</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Conservation of a functional hierarchy between mammalian and insect Hox/HOM genes</p>
            </title>
            <aug>
               <au>
                  <snm>Bachiller</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Macias</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Duboule</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Morata</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>EMBO J</source>
            <pubdate>1994</pubdate>
            <volume>13</volume>
            <issue>8</issue>
            <fpage>1930</fpage>
            <lpage>1941</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">395034</pubid>
                  <pubid idtype="pmpid">7909514</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Functional equivalence and rescue among group 11 Hox gene products in vertebral patterning</p>
            </title>
            <aug>
               <au>
                  <snm>Zakany</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Gerard</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Favier</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Potter</snm>
                  <fnm>SS</fnm>
               </au>
               <au>
                  <snm>Duboule</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Dev Biol</source>
            <pubdate>1996</pubdate>
            <volume>176</volume>
            <issue>2</issue>
            <fpage>325</fpage>
            <lpage>328</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/dbio.1996.0137</pubid>
                  <pubid idtype="pmpid" link="fulltext">8660870</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Maintenance of functional equivalence during paralogous Hox gene evolution</p>
            </title>
            <aug>
               <au>
                  <snm>Greer</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Puetz</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Thomas</snm>
                  <fnm>KR</fnm>
               </au>
               <au>
                  <snm>Capecchi</snm>
                  <fnm>MR</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2000</pubdate>
            <volume>403</volume>
            <issue>6770</issue>
            <fpage>661</fpage>
            <lpage>665</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35001077</pubid>
                  <pubid idtype="pmpid" link="fulltext">10688203</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Why highly expressed proteins evolve slowly</p>
            </title>
            <aug>
               <au>
                  <snm>Drummond</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Bloom</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Adami</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Wilke</snm>
                  <fnm>CO</fnm>
               </au>
               <au>
                  <snm>Arnold</snm>
                  <fnm>FH</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci U S A</source>
            <pubdate>2005</pubdate>
            <volume>102</volume>
            <issue>40</issue>
            <fpage>14338</fpage>
            <lpage>14343</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1242296</pubid>
                  <pubid idtype="pmpid" link="fulltext">16176987</pubid>
                  <pubid idtype="doi">10.1073/pnas.0504070102</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Highly expressed genes in yeast evolve slowly</p>
            </title>
            <aug>
               <au>
                  <snm>Pal</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Papp</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Hurst</snm>
                  <fnm>LD</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>2001</pubdate>
            <volume>158</volume>
            <issue>2</issue>
            <fpage>927</fpage>
            <lpage>931</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1461684</pubid>
                  <pubid idtype="pmpid" link="fulltext">11430355</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Protein evolution in the context of Drosophila development</p>
            </title>
            <aug>
               <au>
                  <snm>Davis</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Brandman</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Petrov</snm>
                  <fnm>DA</fnm>
               </au>
            </aug>
            <source>J Mol Evol</source>
            <pubdate>2005</pubdate>
            <volume>60</volume>
            <issue>6</issue>
            <fpage>774</fpage>
            <lpage>785</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s00239-004-0241-2</pubid>
                  <pubid idtype="pmpid" link="fulltext">15909223</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Rates of DNA evolution in Drosophila depend on function and developmental stage of expression</p>
            </title>
            <aug>
               <au>
                  <snm>Powell</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Caccone</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Gleason</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Nigro</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>1993</pubdate>
            <volume>133</volume>
            <issue>2</issue>
            <fpage>291</fpage>
            <lpage>298</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1205319</pubid>
                  <pubid idtype="pmpid">8094697</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Order in living organisms: A systems analysis of evolution</p>
            </title>
            <aug>
               <au>
                  <snm>Riedl</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <publisher>New York, Wiley</publisher>
            <pubdate>1978</pubdate>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Intron size and exon evolution in Drosophila</p>
            </title>
            <aug>
               <au>
                  <snm>Marais</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Nouvellet</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Keightley</snm>
                  <fnm>PD</fnm>
               </au>
               <au>
                  <snm>Charlesworth</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>2005</pubdate>
            <volume>170</volume>
            <issue>1</issue>
            <fpage>481</fpage>
            <lpage>485</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1449718</pubid>
                  <pubid idtype="pmpid" link="fulltext">15781704</pubid>
                  <pubid idtype="doi">10.1534/genetics.104.037333</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Arthropod Hox genes: insights on the evolutionary forces that shape gene functions</p>
            </title>
            <aug>
               <au>
                  <snm>Averof</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Curr Opin Genet Dev</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <issue>4</issue>
            <fpage>386</fpage>
            <lpage>392</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0959-437X(02)00314-3</pubid>
                  <pubid idtype="pmpid" link="fulltext">12100881</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>Trinucleotide repeats and long homopeptides in genes and proteins associated with nervous system disease and development</p>
            </title>
            <aug>
               <au>
                  <snm>Karlin</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Burge</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci U S A</source>
            <pubdate>1996</pubdate>
            <volume>93</volume>
            <issue>4</issue>
            <fpage>1560</fpage>
            <lpage>1565</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">39980</pubid>
                  <pubid idtype="pmpid" link="fulltext">8643671</pubid>
                  <pubid idtype="doi">10.1073/pnas.93.4.1560</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Microsatellites: simple sequences with complex evolution</p>
            </title>
            <aug>
               <au>
                  <snm>Ellegren</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Nat Rev Genet</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <issue>6</issue>
            <fpage>435</fpage>
            <lpage>445</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nrg1348</pubid>
                  <pubid idtype="pmpid" link="fulltext">15153996</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>Equilibrium distributions of microsatellite repeat length resulting from a balance between slippage events and point mutations</p>
            </title>
            <aug>
               <au>
                  <snm>Kruglyak</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Durrett</snm>
                  <fnm>RT</fnm>
               </au>
               <au>
                  <snm>Schug</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Aquadro</snm>
                  <fnm>CF</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci U S A</source>
            <pubdate>1998</pubdate>
            <volume>95</volume>
            <issue>18</issue>
            <fpage>10774</fpage>
            <lpage>10778</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">27971</pubid>
                  <pubid idtype="pmpid" link="fulltext">9724780</pubid>
                  <pubid idtype="doi">10.1073/pnas.95.18.10774</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>The probability of duplicate gene preservation by subfunctionalization</p>
            </title>
            <aug>
               <au>
                  <snm>Lynch</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Force</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>2000</pubdate>
            <volume>154</volume>
            <issue>1</issue>
            <fpage>459</fpage>
            <lpage>473</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1460895</pubid>
                  <pubid idtype="pmpid" link="fulltext">10629003</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>The evolutionary fate and consequences of duplicate genes</p>
            </title>
            <aug>
               <au>
                  <snm>Lynch</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Conery</snm>
                  <fnm>JS</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2000</pubdate>
            <volume>290</volume>
            <issue>5494</issue>
            <fpage>1151</fpage>
            <lpage>1155</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.290.5494.1151</pubid>
                  <pubid idtype="pmpid" link="fulltext">11073452</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>The origin of new genes: glimpses from the young and old</p>
            </title>
            <aug>
               <au>
                  <snm>Long</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Betran</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Thornton</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>Nat Rev Genet</source>
            <pubdate>2003</pubdate>
            <volume>4</volume>
            <issue>11</issue>
            <fpage>865</fpage>
            <lpage>875</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nrg1204</pubid>
                  <pubid idtype="pmpid" link="fulltext">14634634</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>Evolution by gene duplication: an update</p>
            </title>
            <aug>
               <au>
                  <snm>Zhang</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Trends Ecol Evol</source>
            <pubdate>2003</pubdate>
            <volume>18</volume>
            <issue>6</issue>
            <fpage>292</fpage>
            <lpage>298</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1016/S0169-5347(03)00033-8</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <title>
               <p>DNA sequence variation at a duplicated gene: excess of replacement polymorphism and extensive haplotype structure in the Drosophila melanogaster bicoid region</p>
            </title>
            <aug>
               <au>
                  <snm>Baines</snm>
                  <fnm>JF</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Das</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Stephan</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2002</pubdate>
            <volume>19</volume>
            <issue>7</issue>
            <fpage>989</fpage>
            <lpage>998</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12082119</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B42">
            <title>
               <p>Maternal Expression Relaxes Constraint on Innovation of the Anterior Determinant, bicoid</p>
            </title>
            <aug>
               <au>
                  <snm>Barker</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Demuth</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Wade</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>PLoS Genet</source>
            <pubdate>2005</pubdate>
            <volume>1</volume>
            <issue>5</issue>
            <fpage>e57</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1283158</pubid>
                  <pubid idtype="pmpid" link="fulltext">16299585</pubid>
                  <pubid idtype="doi">10.1371/journal.pgen.0010057</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B43">
            <title>
               <p>Gene expression and molecular evolution</p>
            </title>
            <aug>
               <au>
                  <snm>Akashi</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Curr Opin Genet Dev</source>
            <pubdate>2001</pubdate>
            <volume>11</volume>
            <issue>6</issue>
            <fpage>660</fpage>
            <lpage>666</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0959-437X(00)00250-1</pubid>
                  <pubid idtype="pmpid" link="fulltext">11682310</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B44">
            <title>
               <p>The 'effective number of codons' used in a gene</p>
            </title>
            <aug>
               <au>
                  <snm>Wright</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>Gene</source>
            <pubdate>1990</pubdate>
            <volume>87</volume>
            <issue>1</issue>
            <fpage>23</fpage>
            <lpage>29</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0378-1119(90)90491-9</pubid>
                  <pubid idtype="pmpid" link="fulltext">2110097</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B45">
            <title>
               <p>Amino acid runs in eukaryotic proteomes and disease associations</p>
            </title>
            <aug>
               <au>
                  <snm>Karlin</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Brocchieri</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Bergman</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Mrazek</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Gentles</snm>
                  <fnm>AJ</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci U S A</source>
            <pubdate>2002</pubdate>
            <volume>99</volume>
            <issue>1</issue>
            <fpage>333</fpage>
            <lpage>338</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">117561</pubid>
                  <pubid idtype="pmpid" link="fulltext">11782551</pubid>
                  <pubid idtype="doi">10.1073/pnas.012608599</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B46">
            <title>
               <p>Molecular origins of rapid and continuous morphological evolution</p>
            </title>
            <aug>
               <au>
                  <snm>Fondon</snm>
                  <fnm>JW</fnm>
                  <suf>3rd</suf>
               </au>
               <au>
                  <snm>Garner</snm>
                  <fnm>HR</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci U S A</source>
            <pubdate>2004</pubdate>
            <volume>101</volume>
            <issue>52</issue>
            <fpage>18058</fpage>
            <lpage>18063</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">539791</pubid>
                  <pubid idtype="pmpid" link="fulltext">15596718</pubid>
                  <pubid idtype="doi">10.1073/pnas.0408118101</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B47">
            <title>
               <p>The mutation rates of di-, tri- and tetranucleotide repeats in Drosophila melanogaster</p>
            </title>
            <aug>
               <au>
                  <snm>Schug</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Hutter</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Wetterstrand</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Gaudette</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Mackay</snm>
                  <fnm>TF</fnm>
               </au>
               <au>
                  <snm>Aquadro</snm>
                  <fnm>CF</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>1998</pubdate>
            <volume>15</volume>
            <issue>12</issue>
            <fpage>1751</fpage>
            <lpage>1760</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9866209</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B48">
            <title>
               <p>Molecular Evolution</p>
            </title>
            <aug>
               <au>
                  <snm>Li</snm>
                  <fnm>WH</fnm>
               </au>
            </aug>
            <publisher>Sunderland, Massachusetts: Sinauer Associates, Inc.</publisher>
            <pubdate>1997</pubdate>
         </bibl>
         <bibl id="B49">
            <title>
               <p>A role for selection in regulating the evolutionary emergence of disease-causing and other coding CAG repeats in humans and mice</p>
            </title>
            <aug>
               <au>
                  <snm>Hancock</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Worthey</snm>
                  <fnm>EA</fnm>
               </au>
               <au>
                  <snm>Santibanez-Koref</snm>
                  <fnm>MF</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2001</pubdate>
            <volume>18</volume>
            <issue>6</issue>
            <fpage>1014</fpage>
            <lpage>1023</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11371590</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B50">
            <title>
               <p>Associations between human disease genes and overlapping gene groups and multiple amino acid runs</p>
            </title>
            <aug>
               <au>
                  <snm>Karlin</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Gentles</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Cleary</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci U S A</source>
            <pubdate>2002</pubdate>
            <volume>99</volume>
            <issue>26</issue>
            <fpage>17008</fpage>
            <lpage>17013</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">139260</pubid>
                  <pubid idtype="pmpid" link="fulltext">12473749</pubid>
                  <pubid idtype="doi">10.1073/pnas.262658799</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B51">
            <title>
               <p>The other trinucleotide repeat: polyalanine expansion disorders</p>
            </title>
            <aug>
               <au>
                  <snm>Albrecht</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Mundlos</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Curr Opin Genet Dev</source>
            <pubdate>2005</pubdate>
            <volume>15</volume>
            <issue>3</issue>
            <fpage>285</fpage>
            <lpage>293</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.gde.2005.04.003</pubid>
                  <pubid idtype="pmpid" link="fulltext">15917204</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B52">
            <title>
               <p>Alanine tracts: the expanding story of human illness and trinucleotide repeats</p>
            </title>
            <aug>
               <au>
                  <snm>Brown</snm>
                  <fnm>LY</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>SA</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2004</pubdate>
            <volume>20</volume>
            <issue>1</issue>
            <fpage>51</fpage>
            <lpage>58</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.tig.2003.11.002</pubid>
                  <pubid idtype="pmpid" link="fulltext">14698619</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B53">
            <title>
               <p>Structure and expression of a Manduca sexta larval cuticle gene homologous to Drosophila cuticle genes</p>
            </title>
            <aug>
               <au>
                  <snm>Rebers</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Riddiford</snm>
                  <fnm>LM</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1988</pubdate>
            <volume>203</volume>
            <issue>2</issue>
            <fpage>411</fpage>
            <lpage>423</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0022-2836(88)90009-5</pubid>
                  <pubid idtype="pmpid" link="fulltext">2462055</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B54">
            <title>
               <p>A conserved domain in arthropod cuticular proteins binds chitin</p>
            </title>
            <aug>
               <au>
                  <snm>Rebers</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Willis</snm>
                  <fnm>JH</fnm>
               </au>
            </aug>
            <source>Insect Biochem Mol Biol</source>
            <pubdate>2001</pubdate>
            <volume>31</volume>
            <issue>11</issue>
            <fpage>1083</fpage>
            <lpage>1093</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0965-1748(01)00056-X</pubid>
                  <pubid idtype="pmpid" link="fulltext">11520687</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B55">
            <title>
               <p>Determination of the covalent structure of an N- and C-terminally blocked glycoprotein from endocuticle of Locusta migratoria. Combined use of plasma desorption mass spectrometry and Edman degradation to study post-translationally modified proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Talbo</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Hojrup</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Rahbek-Nielsen</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Andersen</snm>
                  <fnm>SO</fnm>
               </au>
               <au>
                  <snm>Roepstorff</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Eur J Biochem</source>
            <pubdate>1991</pubdate>
            <volume>195</volume>
            <issue>2</issue>
            <fpage>495</fpage>
            <lpage>504</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1111/j.1432-1033.1991.tb15730.x</pubid>
                  <pubid idtype="pmpid">1997327</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B56">
            <title>
               <p>Sequence studies of proteins from larval and pupal cuticle of the yellow meal worm, Tenebrio molitor</p>
            </title>
            <aug>
               <au>
                  <snm>Andersen</snm>
                  <fnm>SO</fnm>
               </au>
               <au>
                  <snm>Rafn</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Roepstorff</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Insect Biochem Mol Biol</source>
            <pubdate>1997</pubdate>
            <volume>27</volume>
            <issue>2</issue>
            <fpage>121</fpage>
            <lpage>131</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0965-1748(96)00076-8</pubid>
                  <pubid idtype="pmpid" link="fulltext">9066122</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B57">
            <title>
               <p>On the formation of spontaneous deletions: the importance of short sequence homologies in the generation of large deletions</p>
            </title>
            <aug>
               <au>
                  <snm>Albertini</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Hofer</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Calos</snm>
                  <fnm>MP</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>JH</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>1982</pubdate>
            <volume>29</volume>
            <issue>2</issue>
            <fpage>319</fpage>
            <lpage>328</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0092-8674(82)90148-9</pubid>
                  <pubid idtype="pmpid" link="fulltext">6288254</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B58">
            <title>
               <p>Genetic and physical mapping in the early region of bacteriophage T7 DNA</p>
            </title>
            <aug>
               <au>
                  <snm>Studier</snm>
                  <fnm>FW</fnm>
               </au>
               <au>
                  <snm>Rosenberg</snm>
                  <fnm>AH</fnm>
               </au>
               <au>
                  <snm>Simon</snm>
                  <fnm>MN</fnm>
               </au>
               <au>
                  <snm>Dunn</snm>
                  <fnm>JJ</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1979</pubdate>
            <volume>135</volume>
            <issue>4</issue>
            <fpage>917</fpage>
            <lpage>937</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0022-2836(79)90520-5</pubid>
                  <pubid idtype="pmpid" link="fulltext">231684</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B59">
            <title>
               <p>rII cistrons of bacteriophage T4. DNA sequence around the intercistronic divide and positions of genetic landmarks</p>
            </title>
            <aug>
               <au>
                  <snm>Pribnow</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Sigurdson</snm>
                  <fnm>DC</fnm>
               </au>
               <au>
                  <snm>Gold</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Singer</snm>
                  <fnm>BS</fnm>
               </au>
               <au>
                  <snm>Napoli</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Brosius</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Dull</snm>
                  <fnm>TJ</fnm>
               </au>
               <au>
                  <snm>Noller</snm>
                  <fnm>HF</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1981</pubdate>
            <volume>149</volume>
            <issue>3</issue>
            <fpage>337</fpage>
            <lpage>376</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0022-2836(81)90477-0</pubid>
                  <pubid idtype="pmpid" link="fulltext">6273585</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B60">
            <title>
               <p>beta-Galactosidase chimeras: primary structure of a lac repressor-beta-galactosidase protein</p>
            </title>
            <aug>
               <au>
                  <snm>Brake</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Fowler</snm>
                  <fnm>AV</fnm>
               </au>
               <au>
                  <snm>Zabin</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Kania</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Muller-Hill</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci U S A</source>
            <pubdate>1978</pubdate>
            <volume>75</volume>
            <issue>10</issue>
            <fpage>4824</fpage>
            <lpage>4827</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">336213</pubid>
                  <pubid idtype="pmpid">105358</pubid>
                  <pubid idtype="doi">10.1073/pnas.75.10.4824</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B61">
            <title>
               <p>Deletion mutants of Xenopus laevis 5S ribosomal DNA</p>
            </title>
            <aug>
               <au>
                  <snm>Fedoroff</snm>
                  <fnm>NV</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>1979</pubdate>
            <volume>16</volume>
            <issue>3</issue>
            <fpage>551</fpage>
            <lpage>563</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0092-8674(79)90029-1</pubid>
                  <pubid idtype="pmpid" link="fulltext">455442</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B62">
            <title>
               <p>IS2-61 and IS2-611 arise by illegitimate recombination from IS2-6</p>
            </title>
            <aug>
               <au>
                  <snm>Ghosal</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Saedler</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Mol Gen Genet</source>
            <pubdate>1979</pubdate>
            <volume>176</volume>
            <issue>2</issue>
            <fpage>233</fpage>
            <lpage>238</lpage>
            <url>http://www.springerlink.com/content/u340577860368426/</url>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid">393955</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B63">
            <title>
               <p>DNA sequence of the promoter region for the alpha ribosomal protein operon in Escherichia coli</p>
            </title>
            <aug>
               <au>
                  <snm>Post</snm>
                  <fnm>LE</fnm>
               </au>
               <au>
                  <snm>Arfsten</snm>
                  <fnm>AE</fnm>
               </au>
               <au>
                  <snm>Davis</snm>
                  <fnm>GR</fnm>
               </au>
               <au>
                  <snm>Nomura</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>1980</pubdate>
            <volume>255</volume>
            <issue>10</issue>
            <fpage>4653</fpage>
            <lpage>4659</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">6154696</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B64">
            <title>
               <p>Nearly precise excision: a new type of DNA alteration associated with the translocatable element Tn10</p>
            </title>
            <aug>
               <au>
                  <snm>Ross</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Swan</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Kleckner</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>1979</pubdate>
            <volume>16</volume>
            <issue>4</issue>
            <fpage>733</fpage>
            <lpage>738</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0092-8674(79)90089-8</pubid>
                  <pubid idtype="pmpid" link="fulltext">455447</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B65">
            <title>
               <p>Deletions of distal sequence after termination of transcription at the end of the tryptophan operon in E. coli</p>
            </title>
            <aug>
               <au>
                  <snm>Wu</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Chapman</snm>
                  <fnm>AB</fnm>
               </au>
               <au>
                  <snm>Platt</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Guarente</snm>
                  <fnm>LP</fnm>
               </au>
               <au>
                  <snm>Beckwith</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>1980</pubdate>
            <volume>19</volume>
            <issue>4</issue>
            <fpage>829</fpage>
            <lpage>836</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0092-8674(80)90073-2</pubid>
                  <pubid idtype="pmpid">6991123</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B66">
            <title>
               <p>The structure and evolution of the human beta-globin gene family</p>
            </title>
            <aug>
               <au>
                  <snm>Efstratiadis</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Posakony</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>Maniatis</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Lawn</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>O'Connell</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Spritz</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>DeRiel</snm>
                  <fnm>JK</fnm>
               </au>
               <au>
                  <snm>Forget</snm>
                  <fnm>BG</fnm>
               </au>
               <au>
                  <snm>Weissman</snm>
                  <fnm>SM</fnm>
               </au>
               <au>
                  <snm>Slightom</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Blechl</snm>
                  <fnm>AE</fnm>
               </au>
               <au>
                  <snm>Smithies</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Baralle</snm>
                  <fnm>FE</fnm>
               </au>
               <au>
                  <snm>Shoulders</snm>
                  <fnm>CC</fnm>
               </au>
               <au>
                  <snm>Proudfoot</snm>
                  <fnm>NJ</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>1980</pubdate>
            <volume>21</volume>
            <issue>3</issue>
            <fpage>653</fpage>
            <lpage>668</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0092-8674(80)90429-8</pubid>
                  <pubid idtype="pmpid" link="fulltext">6985477</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B67">
            <title>
               <p>Human beta-globin messenger RNA. III. Nucleotide sequences derived from complementary DNA</p>
            </title>
            <aug>
               <au>
                  <snm>Marotta</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Wilson</snm>
                  <fnm>JT</fnm>
               </au>
               <au>
                  <snm>Forget</snm>
                  <fnm>BG</fnm>
               </au>
               <au>
                  <snm>Weissman</snm>
                  <fnm>SM</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>1977</pubdate>
            <volume>252</volume>
            <issue>14</issue>
            <fpage>5040</fpage>
            <lpage>5053</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">68958</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B68">
            <title>
               <p>Sequence of the lacI gene</p>
            </title>
            <aug>
               <au>
                  <snm>Farabaugh</snm>
                  <fnm>PJ</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>1978</pubdate>
            <volume>274</volume>
            <issue>5673</issue>
            <fpage>765</fpage>
            <lpage>769</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/274765a0</pubid>
                  <pubid idtype="pmpid">355891</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B69">
            <title>
               <p>Duplication, dicistronic transcription, and subsequent evolution of the Alcohol dehydrogenase and Alcohol dehydrogenase-related genes in Drosophila</p>
            </title>
            <aug>
               <au>
                  <snm>Betran</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Ashburner</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2000</pubdate>
            <volume>17</volume>
            <issue>9</issue>
            <fpage>1344</fpage>
            <lpage>1352</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10958851</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B70">
            <title>
               <p>The evolution of an alpha-esterase pseudogene inactivated in the Drosophila melanogaster lineage</p>
            </title>
            <aug>
               <au>
                  <snm>Robin</snm>
                  <fnm>GC</fnm>
               </au>
               <au>
                  <snm>Russell</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Cutler</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Oakeshott</snm>
                  <fnm>JG</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2000</pubdate>
            <volume>17</volume>
            <issue>4</issue>
            <fpage>563</fpage>
            <lpage>575</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10742048</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B71">
            <title>
               <p>Silencing of a gene adjacent to the breakpoint of a widespread Drosophila inversion by a transposon-induced antisense RNA</p>
            </title>
            <aug>
               <au>
                  <snm>Puig</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Caceres</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Ruiz</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci U S A</source>
            <pubdate>2004</pubdate>
            <volume>101</volume>
            <issue>24</issue>
            <fpage>9013</fpage>
            <lpage>9018</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">428464</pubid>
                  <pubid idtype="pmpid" link="fulltext">15184654</pubid>
                  <pubid idtype="doi">10.1073/pnas.0403090101</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B72">
            <title>
               <p>Duplicative and conservative transpositions of larval serum protein 1 genes in the genus Drosophila</p>
            </title>
            <aug>
               <au>
                  <snm>Gonzalez</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Casals</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Ruiz</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>2004</pubdate>
            <volume>168</volume>
            <issue>1</issue>
            <fpage>253</fpage>
            <lpage>264</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1448094</pubid>
                  <pubid idtype="pmpid" link="fulltext">15454541</pubid>
                  <pubid idtype="doi">10.1534/genetics.103.025916</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B73">
            <title>
               <p>FlyBase: genes and gene models</p>
            </title>
            <aug>
               <au>
                  <snm>Drysdale</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Crosby</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <cnm>FlyBase Consortium</cnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <fpage>D390</fpage>
            <lpage>D395</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">540000</pubid>
                  <pubid idtype="pmpid" link="fulltext">15608223</pubid>
                  <pubid idtype="doi">10.1093/nar/gki046</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B74">
            <title>
               <p>Finishing a whole-genome shotgun: release 3 of the Drosophila melanogaster euchromatic genome sequence</p>
            </title>
            <aug>
               <au>
                  <snm>Celniker</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Wheeler</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Kronmiller</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Carlson</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>Halpern</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Patel</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Adams</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Champe</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Dugan</snm>
                  <fnm>SP</fnm>
               </au>
               <au>
                  <snm>Frise</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Hodgson</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>George</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Hoskins</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Laverty</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Muzny</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Nelson</snm>
                  <fnm>CR</fnm>
               </au>
               <au>
                  <snm>Pacleb</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Park</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Pfeiffer</snm>
                  <fnm>BD</fnm>
               </au>
               <au>
                  <snm>Richards</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Sodergren</snm>
                  <fnm>EJ</fnm>
               </au>
               <au>
                  <snm>Svirskas</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Tabor</snm>
                  <fnm>PE</fnm>
               </au>
               <au>
                  <snm>Wan</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Stapleton</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sutton</snm>
                  <fnm>GG</fnm>
               </au>
               <au>
                  <snm>Venter</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Weinstock</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Scherer</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Myers</snm>
                  <fnm>EW</fnm>
               </au>
               <au>
                  <snm>Gibbs</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>GM</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2002</pubdate>
            <volume>3</volume>
            <issue>12</issue>
            <fpage>RESEARCH0079</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">151181</pubid>
                  <pubid idtype="pmpid" link="fulltext">12537568</pubid>
                  <pubid idtype="doi">10.1186/gb-2002-3-12-research0079</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B75">
            <title>
               <p>Comparative genome sequencing of Drosophila pseudoobscura: chromosomal, gene, and cis-element evolution</p>
            </title>
            <aug>
               <au>
                  <snm>Richards</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Bettencourt</snm>
                  <fnm>BR</fnm>
               </au>
               <au>
                  <snm>Hradecky</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Letovsky</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Nielsen</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Thornton</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Hubisz</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Meisel</snm>
                  <fnm>RP</fnm>
               </au>
               <au>
                  <snm>Couronne</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Hua</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Bussemaker</snm>
                  <fnm>HJ</fnm>
               </au>
               <au>
                  <snm>van Batenburg</snm>
                  <fnm>MF</fnm>
               </au>
               <au>
                  <snm>Howells</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Scherer</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Sodergren</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Matthews</snm>
                  <fnm>BB</fnm>
               </au>
               <au>
                  <snm>Crosby</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Schroeder</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Ortiz-Barrientos</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Rives</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Metzker</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Muzny</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Scott</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Steffen</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Wheeler</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Worley</snm>
                  <fnm>KC</fnm>
               </au>
               <au>
                  <snm>Havlak</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Durbin</snm>
                  <fnm>KJ</fnm>
               </au>
               <au>
                  <snm>Egan</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Gill</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Hume</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Morgan</snm>
                  <fnm>MB</fnm>
               </au>
               <au>
                  <snm>Miner</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Hamilton</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Huang</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Waldron</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Verduzco</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Clerc-Blankenburg</snm>
                  <fnm>KP</fnm>
               </au>
               <au>
                  <snm>Dubchak</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Noor</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Anderson</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>KP</fnm>
               </au>
               <au>
                  <snm>Clark</snm>
                  <fnm>AG</fnm>
               </au>
               <au>
                  <snm>Schaeffer</snm>
                  <fnm>SW</fnm>
               </au>
               <au>
                  <snm>Gelbart</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Weinstock</snm>
                  <fnm>GM</fnm>
               </au>
               <au>
                  <snm>Gibbs</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2005</pubdate>
            <volume>15</volume>
            <issue>1</issue>
            <fpage>1</fpage>
            <lpage>18</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">540289</pubid>
                  <pubid idtype="pmpid" link="fulltext">15632085</pubid>
                  <pubid idtype="doi">10.1101/gr.3059305</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B76">
            <title>
               <p>GenBank</p>
            </title>
            <aug>
               <au>
                  <snm>Benson</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Karsch-Mizrachi</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Lipman</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Ostell</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Wheeler</snm>
                  <fnm>DL</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <volume>34</volume>
            <fpage>D16</fpage>
            <lpage>D20</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1347519</pubid>
                  <pubid idtype="pmpid" link="fulltext">16381837</pubid>
                  <pubid idtype="doi">10.1093/nar/gkj157</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B77">
            <title>
               <p>VISTA: visualizing global DNA sequence alignments of arbitrary length</p>
            </title>
            <aug>
               <au>
                  <snm>Mayor</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Brudno</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Schwartz</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Poliakov</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Frazer</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Pachter</snm>
                  <fnm>LS</fnm>
               </au>
               <au>
                  <snm>Dubchak</snm>
                  <fnm>I</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2000</pubdate>
            <volume>16</volume>
            <issue>11</issue>
            <fpage>1046</fpage>
            <lpage>1047</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/16.11.1046</pubid>
                  <pubid idtype="pmpid" link="fulltext">11159318</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B78">
            <title>
               <p>Viewing and annotating sequence data with Artemis</p>
            </title>
            <aug>
               <au>
                  <snm>Berriman</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Rutherford</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>Brief Bioinform</source>
            <pubdate>2003</pubdate>
            <volume>4</volume>
            <issue>2</issue>
            <fpage>124</fpage>
            <lpage>132</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bib/4.2.124</pubid>
                  <pubid idtype="pmpid" link="fulltext">12846394</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B79">
            <title>
               <p>BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT</p>
            </title>
            <aug>
               <au>
                  <snm>Hall</snm>
                  <fnm>TA</fnm>
               </au>
            </aug>
            <source>Nucl Acids Symp Ser</source>
            <pubdate>1999</pubdate>
            <volume>41</volume>
            <fpage>95</fpage>
            <lpage>98</lpage>
         </bibl>
         <bibl id="B80">
            <title>
               <p>PDA: a pipeline to explore and estimate polymorphism in large DNA databases</p>
            </title>
            <aug>
               <au>
                  <snm>Casillas</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Barbadilla</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2004</pubdate>
            <volume>32</volume>
            <issue>Web Server issue</issue>
            <fpage>W166</fpage>
            <lpage>W169</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">441566</pubid>
                  <pubid idtype="pmpid" link="fulltext">15215372</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B81">
            <title>
               <p>The Bioperl toolkit: Perl modules for the life sciences</p>
            </title>
            <aug>
               <au>
                  <snm>Stajich</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Block</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Boulez</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Brenner</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Chervitz</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Dagdigian</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Fuellen</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Gilbert</snm>
                  <fnm>JG</fnm>
               </au>
               <au>
                  <snm>Korf</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Lapp</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Lehvaslaiho</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Matsalla</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Mungall</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Osborne</snm>
                  <fnm>BI</fnm>
               </au>
               <au>
                  <snm>Pocock</snm>
                  <fnm>MR</fnm>
               </au>
               <au>
                  <snm>Schattner</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Senger</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Stein</snm>
                  <fnm>LD</fnm>
               </au>
               <au>
                  <snm>Stupka</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Wilkinson</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Birney</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <issue>10</issue>
            <fpage>1611</fpage>
            <lpage>1618</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">187536</pubid>
                  <pubid idtype="pmpid" link="fulltext">12368254</pubid>
                  <pubid idtype="doi">10.1101/gr.361602</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B82">
            <title>
               <p>Biometry: The principles and practice of statistics in biological research</p>
            </title>
            <aug>
               <au>
                  <snm>Sokal</snm>
                  <fnm>RR</fnm>
               </au>
               <au>
                  <snm>Rohlf</snm>
                  <fnm>FJ</fnm>
               </au>
            </aug>
            <publisher>New York, W.H. Freeman and Co.</publisher>
            <pubdate>1995</pubdate>
         </bibl>
         <bibl id="B83">
            <title>
               <p>Contributions to Probability and Statistics: Essays in Honor of Harold Hotelling</p>
            </title>
            <aug>
               <au>
                  <snm>Levene</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <publisher>Stanford University Press</publisher>
            <editor>Olkin I, et al.</editor>
            <pubdate>1960</pubdate>
            <volume>I</volume>
            <fpage>278</fpage>
            <lpage>292</lpage>
         </bibl>
         <bibl id="B84">
            <title>
               <p>RevTrans: Multiple alignment of coding DNA from aligned amino acid sequences</p>
            </title>
            <aug>
               <au>
                  <snm>Wernersson</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Pedersen</snm>
                  <fnm>AG</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <issue>13</issue>
            <fpage>3537</fpage>
            <lpage>3539</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">169015</pubid>
                  <pubid idtype="pmpid" link="fulltext">12824361</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg609</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B85">
            <title>
               <p>PAML: a program package for phylogenetic analysis by maximum likelihood</p>
            </title>
            <aug>
               <au>
                  <snm>Yang</snm>
                  <fnm>Z</fnm>
               </au>
            </aug>
            <source>Comput Appl Biosci</source>
            <pubdate>1997</pubdate>
            <volume>13</volume>
            <issue>5</issue>
            <fpage>555</fpage>
            <lpage>556</lpage>
            <url>http://bioinformatics.oxfordjournals.org/cgi/reprint/13/5/555</url>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid">9367129</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B86">
            <title>
               <p>The Universal Protein Resource (UniProt)</p>
            </title>
            <aug>
               <au>
                  <snm>Bairoch</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Apweiler</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Wu</snm>
                  <fnm>CH</fnm>
               </au>
               <au>
                  <snm>Barker</snm>
                  <fnm>WC</fnm>
               </au>
               <au>
                  <snm>Boeckmann</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Ferro</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Gasteiger</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Huang</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Lopez</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Magrane</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Martin</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Natale</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>O'Donovan</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Redaschi</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Yeh</snm>
                  <fnm>LS</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <issue>Database issue</issue>
            <fpage>D154</fpage>
            <lpage>D159</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">540024</pubid>
                  <pubid idtype="pmpid" link="fulltext">15608167</pubid>
                  <pubid idtype="doi">10.1093/nar/gki070</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
