<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2164-9-127</ui>
   <ji>1471-2164</ji>
   <fm>
      <dochead>Research article</dochead>
      <bibl>
         <title>
            <p>Evolution of <it>hedgehog </it>and <it>hedgehog</it>-related genes, their origin from Hog proteins in ancestral eukaryotes and discovery of a novel Hint motif</p>
         </title>
         <aug>
            <au id="A1" ca="yes">
               <snm>B&#252;rglin</snm>
               <mi>R</mi>
               <fnm>Thomas</fnm>
               <insr iid="I1"/>
               <email>thomas.burglin@biosci.ki.se</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Dept. of Biosciences and Nutrition, Karolinska Institutet &amp; School of Life Sciences, S&#246;dert&#246;rns H&#246;gskola, Alfred Nobels All&#233; 7, SE-141 89 Huddinge, Sweden</p>
            </ins>
         </insg>
         <source>BMC Genomics</source>
         <issn>1471-2164</issn>
         <pubdate>2008</pubdate>
         <volume>9</volume>
         <issue>1</issue>
         <fpage>127</fpage>
         <url>http://www.biomedcentral.com/1471-2164/9/127</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">18334026</pubid>
               <pubid idtype="doi">10.1186/1471-2164-9-127</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>16</day>
               <month>3</month>
               <year>2007</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>11</day>
               <month>3</month>
               <year>2008</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>11</day>
               <month>3</month>
               <year>2008</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2008</year>
         <collab>B&#252;rglin; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>The Hedgehog (Hh) signaling pathway plays important roles in human and animal development as well as in carcinogenesis. Hh molecules have been found in both protostomes and deuterostomes, but curiously the nematode <it>Caenorhabditis elegans</it> lacks a bona-fide Hh. Instead, it has a series of Hh-related proteins that share the Hint/Hog domain with Hh but have distinct N-termini.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>We performed extensive genome searches of the cnidarian <it>Nematostella vectensis</it> and several nematodes to gain further insights into Hh evolution. We found six genes in <it>N. vectensis</it> with a relationship to Hh: two Hh genes, one gene with a Hh N-terminal domain fused to a von Willebrand factor type A domain (VWA), and three genes containing Hint/Hog domains with distinct novel N-termini. In the nematode <it>Brugia malayi</it> we find the same types of <it>hh</it>-related genes as in <it>C. elegans</it>. In the more distantly related Enoplea nematodes <it>Xiphinema</it> and <it>Trichinella spiralis</it> we find a bona-fide Hh. In addition, <it>T. spiralis</it> also has a <it>quahog</it> gene like <it>C. elegans</it>, and there are several additional <it>hh</it>-related genes, some of which have secreted N-terminal domains of only 15 to 25 residues. Examination of other Hh pathway components revealed that <it>T. spiralis</it> - like <it>C. elegans</it> - lacks some of these components. Extending our search to all eukaryotes, we recovered genes containing a Hog domain similar to Hh from many different groups of protists. In addition, we identified a novel Hint gene family present in many eukaryote groups that encodes a VWA domain fused to a distinct Hint domain we call Vint. Further members of a poorly characterized Hint family were also retrieved from bacteria.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>In Cnidaria and nematodes the evolution of <it>hh</it> genes occurred in parallel to the evolution of other genes that contain a Hog domain but have different N-termini. The fact that Hog genes comprising a secreted N-terminus and a Hog domain are also found in many protists suggests that this gene family must have arisen in very early eukaryotic evolution, and eventually gave rise to <it>hh</it> and <it>hh</it>-related genes in animals. The results indicate a hitherto unsuspected ability of Hog domain encoding genes to evolve new N-termini. In one instance in Cnidaria, the Hh N-terminal signaling domain is associated with a VWA domain and lacks a Hog domain, suggesting a modular mode of evolution also for the N-terminal domain. The Hog domain proteins, the inteins and VWA-Vint proteins represent three different families of Hint domain proteins that evolved in parallel in eukaryotes.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="bmc" subtype="user_supplied_xml" id="endnote"/>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>The Hedgehog (Hh) signaling pathway has been shown to be of fundamental importance for patterning and cell proliferation in animal development (for review see <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp>). Mutations in this pathway cause congenital defects and several types of cancer such as basal cell carcinoma and medulloblastoma <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr></abbrgrp>. A key molecule of the pathway is Hh, a secreted ligand that can act as a morphogen. <it>Drosophila melanogaster </it>has a single <it>hedgehog </it>(<it>hh</it>) gene, while mammalian genomes contain three paralogous genes, Sonic Hh (Shh), Desert Hh (Dhh), and Indian Hh (Ihh) <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. In zebrafish, five <it>hh </it>genes are present due to an extra round of genome duplication during evolution of ray-finned fish <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp>. The Hh protein is synthesized as a precursor composed of two domains, the N-terminal signaling domain and the C-terminal autoprocessing domain. A substantial part of the autoprocessing domain shares sequence similarity with self-splicing inteins and therefore this domain has been named Hint <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. C-terminal to the Hint domain is a sterol recognition region (SRR). A crucial function of the autoprocessing domain is to add a cholesterol moiety to the N-terminal signaling domain, which is required for the proper function of the N-terminal ligand <abbrgrp><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr></abbrgrp>. In the nematode <it>Caenorhabditis elegans </it>no bona-fide <it>hh </it>is present, i.e. there is no gene that encodes both the N-terminal signaling domain and the C-terminal Hint domain. 
Instead, ten genes encoding the C-terminal autoprocessing domain are found that have N-terminal regions very distinct from Hh. Furthermore, a large number of additional genes are found that encode only these new N-terminal domains and lack the C-terminal autoprocessing domain. Overall, these genes can be grouped into four families that have been named <it>quahog </it>(<it>qua</it>), <it>warthog </it>(<it>wrt</it>), <it>groundhog </it>(<it>grd</it>) and <it>ground-like </it>(<it>grl</it>) and are collectively referred to as <it>hh</it>-related genes <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr></abbrgrp>. At present it is not clear whether the C-terminal domains of the <it>C. elegans </it>Hh-related proteins can add a cholesterol moiety to the N-terminus in a manner analogous to Hh, since there are sequence differences in the SRR equivalent region. Therefore, this region of the Hh-related proteins was named ARR (adduct recognition region) <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>; here we refer to the combined Hint/SRR or Hint/ARR region as the Hog domain for simplicity, as others have done as well <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>.</p>
         <p>The N-terminal domains of the <it>C. elegans hh</it>-related genes were not found in vertebrates and flies using blast searches, giving rise to the notion that these genes were perhaps derived from <it>hh </it>in early nematode evolution <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr></abbrgrp>. Recently, a Hog domain-containing protein, Hoglet, was discovered in the choanoflagellate <it>Monosiga ovata</it>, but its N-terminal region is distinct from Hh and other Hh-related proteins, instead sharing sequence similarity with cellulose-binding domains (CBD) <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>. Choanoflagellates are unicellular protists most closely related to multicellular animals <abbrgrp><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr></abbrgrp> and therefore Hoglet might represent an ancestral precursor form of Hh. A Hh protein was also described from the cnidarian <it>Nematostella vectensis </it><abbrgrp><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr></abbrgrp>, indicating that Hh already existed before the rise of bilaterian animals. An EST with sequence similarity to Hh was also recovered from the sponge <it>Oscarella carmela </it><abbrgrp><abbr bid="B27">27</abbr></abbrgrp>, indicating that the "Hedge" domain originated before the advent of Eumetazoa. In order to understand the origin and evolution of the <it>C. elegans hh</it>-related genes, we had already performed cursory searches of the genome of the parasitic nematode <it>Brugia malayi </it>and found that it also contains several <it>hh</it>-related genes <abbrgrp><abbr bid="B18">18</abbr><abbr bid="B17">17</abbr></abbrgrp>. Here we performed comprehensive searches of the genomes of the cnidarian <it>N. vectensis </it><abbrgrp><abbr bid="B28">28</abbr></abbrgrp>, the nematodes <it>B. malayi </it>and <it>Trichinella spiralis </it>as well as the NCBI protein, DNA and EST databases to find additional <it>hh </it>and <it>hh</it>-related genes that may shed light on the evolution of these genes. In these searches we found a previously described gene from the fungus <it>Glomus mosseae </it>that shares sequence similarity with Hh through the Hog domain <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>, but has not been considered in recent evolutionary analyses <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>. Furthermore, we found a number of additional genes with similarity to the Hog domain in Alveolata, moss, red algae, and other protists, indicating that the Hog domain originated already in lower eukaryotes. As stated above, the Hog domain shares sequence similarity with self-splicing inteins, which have been found in Archaea, Bacteria, as well as fungi, algae and a few protists <abbrgrp><abbr bid="B30">30</abbr><abbr bid="B31">31</abbr><abbr bid="B32">32</abbr></abbrgrp>. Recently, two other types of Hint-related domains have been described, primarily from bacteria, that have been named bacterial intein-like proteins (BIL) type A and B <abbrgrp><abbr bid="B21">21</abbr><abbr bid="B33">33</abbr></abbrgrp>. Several conserved sequence motifs within the Hint domain have been described for inteins that have been named motifs A, B, E and F <abbrgrp><abbr bid="B34">34</abbr><abbr bid="B35">35</abbr><abbr bid="B36">36</abbr><abbr bid="B37">37</abbr></abbrgrp>. Our searches also revealed ORFs in <it>Tetrahymena</it>, fungi and several other protist branches that have similarity to the Hint domain via motifs A and B, but cannot be classified as inteins, Hog, or BIL domains.</p>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <sec>
            <st>
               <p>Retrieval and analysis of sequences</p>
            </st>
            <p>We have previously characterized one <it>qua</it>, one <it>hog</it>-only, ten <it>wrt</it>, 17 <it>grd</it>, and 32 <it>grl </it>ORFs from <it>C. elegans</it>, three of which are pseudogenes <abbrgrp><abbr bid="B19">19</abbr><abbr bid="B38">38</abbr></abbrgrp>. Furthermore, we have identified 49 <it>hh</it>-related genes in the related nematode <it>Caenorhabditis briggsae </it><abbrgrp><abbr bid="B38">38</abbr></abbrgrp>. We correct this number to 48 <it>hh</it>-related genes here, because <it>C. briggsae wrt-8 </it>is the same locus as <it>wrt-4</it>. To retrieve sequences from other species we used selected Hh, WRT, QUA, GRD and GRL protein sequences as queries for tblastn and blastp searches at Stellabase, the DOE Joint Genome Institute, The Genome Sequencing Center at the Washington University School of Medicine, The Institute for Genomic Research (TIGR), and NCBI (see Methods). The recovered sequences were aligned to sequences that we had assembled previously <abbrgrp><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr><abbr bid="B38">38</abbr></abbrgrp>. When obvious discrepancies in conserved regions were found in the newly retrieved ORFs, genomic sequences were inspected for additional or extraneous exons or alternative splice sites, and ESTs were examined for frameshifts. ORFs were corrected to optimize matches to existing motifs, and extraneous N-terminal residues were truncated when methionine residues followed by good N-terminal signal peptides for secretion were found. One caveat is that our ORF predictions from genomic sequences are still limited due to the partial nature of the various contig assemblies. In some instances an ORF runs into an unsequenced region (e.g., <it>B. malayi wrt-4</it>). In the case of ESTs it was often possible to assemble several ESTs into contigs, but in most instances ORFs derived from ESTs lack the N-terminus, the C-terminus, or both. 
Considering also that the various genome projects are in different states of completion, the nomenclature given here to the ORFs should be considered preliminary. After correction of the ORFs, multiple sequence alignments of the different protein domains were made and used for phylogenetic analyses using Neighbor Joining and Maximum Likelihood. We also prepared protein sequence logos of the Hog domains of Hh and nematode Hh-related genes to aid with the analysis of more divergent Hog domains (Figure <figr fid="F1">1</figr>, Additional files <supplr sid="S1">1</supplr>, <supplr sid="S2">2</supplr>, <supplr sid="S3">3</supplr>). We extended the motif nomenclature of inteins by introducing motifs J, K, and L (Figure <figr fid="F1">1</figr>, Additional file <supplr sid="S2">2</supplr>). Motif J corresponds to motif G in inteins <abbrgrp><abbr bid="B34">34</abbr><abbr bid="B35">35</abbr><abbr bid="B36">36</abbr><abbr bid="B37">37</abbr></abbrgrp>; however, because this motif is so distinct in Hog domains, we have ventured to give it its own name here. Motifs K and L are located in the SRR and are primarily found in Hog domains of Hh proteins (Figures <figr fid="F2">2</figr>, <figr fid="F3">3</figr>, <figr fid="F4">4</figr>, Additional file <supplr sid="S1">1</supplr>). In the Hog domains of nematode Hh-related proteins, these two regions show a number of differences compared to the Hh proteins (Figures <figr fid="F2">2</figr>, <figr fid="F3">3</figr>, <figr fid="F4">4</figr>, Additional files <supplr sid="S1">1</supplr>, <supplr sid="S2">2</supplr>, <supplr sid="S3">3</supplr>), and, as will be shown below, motifs K and L provide useful diagnostic functions for evaluating Hog domains.</p>
            <suppl id="S1">
               <title>
                  <p>Additional file 1</p>
               </title>
               <text>
                  <p>Multiple sequence alignment of Hog domains used for the protein sequence logos. Multiple sequence alignment in this and subsequent figures was carried out first with MUSCLE, and the result was then imported into Clustal_X. Color coding was modified from default Clustal_X color coding by marking all cysteine residues in yellow, small hydrophobic residues in light blue and large hydrophobic residues in cyan blue. The conserved motifs, as well as the C-terminal SRR or ARR region, are indicated in the alignment. The two conserved cysteine residues found in the Hog domain of nematode Hh-related proteins are indicated with red arrows.</p>
               </text>
               <file name="1471-2164-9-127-S1.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S2">
               <title>
                  <p>Additional file 2</p>
               </title>
               <text>
                  <p>Full image of the protein sequence logo of aligned Hog domains shown in Figure <figr fid="F1">1</figr>. The color scheme is similar to the one used in the multiple sequence alignments (N,Q,S,T: green; C: yellow; P: pink; G: orange; K,R: red; A,I,L,M,V: blue; F,W,Y: cyan blue; H: purple; D,E: magenta; gaps: white). The extent of the Hint domain and the SRR region are indicated above the logo with a red line. Red boxes underneath the logo indicate the different motifs A, B, F, J, K, L.</p>
               </text>
               <file name="1471-2164-9-127-S2.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S3">
               <title>
                  <p>Additional file 3</p>
               </title>
               <text>
                  <p>Protein sequence logo of nematode Hog domains. Protein sequence logo generated from nematode Hog domains shown in the multiple sequence alignment of Additional file 1. This logo is in register with the Hh Hog domain logo of Additional file 2. The two conserved cysteine residues specific to nematode Hh-related proteins are indicated with red arrows.</p>
               </text>
               <file name="1471-2164-9-127-S3.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Protein sequence logo of Hh Hog domains</p>
               </caption>
               <text>
                  <p><b>Protein sequence logo of Hh Hog domains</b>. Central section of the protein sequence logo that was generated from aligned Hog domains of diverse Hh proteins using LogoBar. For the full image see Additional file <supplr sid="S2">2</supplr>. The color scheme is similar to the one used in the multiple sequence alignments (N,Q,S,T: green; C: yellow; P: pink; G: orange; K,R: red; A,I,L,M,V: blue; F,W,Y: cyan blue; H: purple; D,E: magenta; gaps: white). The extent of the Hint domain and the SRR region are indicated above the logo with a red line. Red boxes underneath the logo indicate the different motifs A, B, F, J, K, L.</p>
               </text>
               <graphic file="1471-2164-9-127-1"/>
            </fig>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Multiple sequence alignment of Hog domains, part 1</p>
               </caption>
               <text>
                  <p><b>Multiple sequence alignment of Hog domains, part 1</b>. Multiple sequence alignment in this and other figures was carried out first with MUSCLE, and the result was then imported into Clustal_X. Manual adjustments to the alignment were carried out using SeaView. Color coding was modified from default Clustal_X color coding by marking all cysteine residues in yellow, small hydrophobic residues in light blue and large hydrophobic residues in cyan blue. The Hint domain, as well as the C-terminal SRR or ARR regions, are indicated above the alignment. Motifs A, B, F, J, K, and L are indicated with red rectangles underneath the alignment. Species abbreviations are shown in Table 3. Note that not all sequences in this alignment are complete.</p>
               </text>
               <graphic file="1471-2164-9-127-2"/>
            </fig>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Multiple sequence alignment of Hog domains, part 2</p>
               </caption>
               <text>
                  <p><b>Multiple sequence alignment of Hog domains, part 2</b>. Continuation of the multiple sequence alignment of Figure 2.</p>
               </text>
               <graphic file="1471-2164-9-127-3"/>
            </fig>
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>Multiple sequence alignment of Hog domains, part 3</p>
               </caption>
               <text>
                  <p><b>Multiple sequence alignment of Hog domains, part 3</b>. Continuation of the multiple sequence alignment of Figure 3.</p>
               </text>
               <graphic file="1471-2164-9-127-4"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p><it>hh</it>-related genes in <it>B. malayi </it>and other Chromadorea</p>
            </st>
            <p>The nematode <it>B. malayi </it>is a parasite that is one of the more distant relatives of <it>C. elegans </it>within the order Rhabditida <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>. We have previously described a <it>quahog </it>gene, <it>qua-1</it>, in <it>B. malayi </it><abbrgrp><abbr bid="B19">19</abbr></abbrgrp>, and obtained partial sequences from ESTs for two <it>wrt</it>, one <it>grd </it>and two <it>grl </it>genes <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. Here we retrieved a total of four <it>wrt </it>genes, one with a Hog domain (Bm <it>wrt-6</it>), two without a Hog domain (Bm <it>wrt-10</it>, Bm <it>wrt-5/3</it>) and one whose C-terminus is presently unknown (Bm <it>wrt-4</it>) (Table <tblr tid="T1">1</tblr>, Figures <figr fid="F2">2</figr>, <figr fid="F3">3</figr>, <figr fid="F4">4</figr>, Additional file <supplr sid="S4">4</supplr>). Based on phylogenetic analyses of both the Hog domain and the Wart domain (Figures <figr fid="F5">5</figr>, <figr fid="F6">6</figr>, <figr fid="F7">7</figr>, Additional files <supplr sid="S5">5</supplr>, <supplr sid="S6">6</supplr>, <supplr sid="S7">7</supplr>), Bm <it>wrt-10 </it>is a clear orthologue of Ce <it>wrt-10</it>, <it>wrt-5/3 </it>is a co-orthologue of Ce <it>wrt-5 </it>and <it>wrt-3</it>, and &#8211; based primarily on the Hog domain &#8211; Bm <it>wrt-6 </it>is an orthologue of Ce <it>wrt-6</it>. The <it>wrt-6 </it>ORF encodes a full Wart domain; however, the previously identified <it>wrt-6 </it>EST <abbrgrp><abbr bid="B17">17</abbr></abbrgrp> lacks the C-terminal half of the Wart domain. Comparison with the genomic sequence revealed that this EST spans exons 1 and 2 and the first 10 nucleotides of exon 3, and then continues into exons 10, 11, and 12, which contain the Hog domain (data not shown). The point of discrepancy in exon 3 is not at a splice site; therefore, this unusual EST might represent a cloning artifact. 
Bm <it>wrt-4 </it>cannot be assigned unequivocally as an orthologue of Cb <it>wrt-2 </it>or Cb <it>wrt-4</it>, but it appears to group with them. Ce <it>wrt-10 </it>lies next to Ce <it>wrt-1 </it>on the chromosome <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>; however, the Bm <it>wrt-10 </it>contig is too small to determine whether another <it>wrt </it>gene resides next to it.</p>
            <suppl id="S4">
               <title>
                  <p>Additional file 4</p>
               </title>
               <text>
                  <p>Multiple sequence alignment of Wart domains. Wart domains were aligned and visualized in Clustal_X as described in Figure <figr fid="F2">2</figr>.</p>
               </text>
               <file name="1471-2164-9-127-S4.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S5">
               <title>
                  <p>Additional file 5</p>
               </title>
               <text>
                  <p>Phylogenetic tree analysis of Hog domains using Neighbor joining. Neighbor joining tree without protist sequences. The Hog domain of the fungal gene GmGIN1 was used as outgroup.</p>
               </text>
               <file name="1471-2164-9-127-S5.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S6">
               <title>
                  <p>Additional file 6</p>
               </title>
               <text>
                  <p>Phylogenetic tree analysis of Hog domains using Maximum likelihood. Maximum likelihood tree of the same sequences as in Additional file 5 with GmGIN1 as outgroup.</p>
               </text>
               <file name="1471-2164-9-127-S6.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S7">
               <title>
                  <p>Additional file 7</p>
               </title>
               <text>
                  <p>Neighbor joining tree of Hog sequences which were truncated at the N-terminus. Hog sequences were truncated at the N-terminus to have the same size as the Pt wrt sequence fragment. This analysis shows that Pt wrt clusters with the <it>wrt </it>genes (arrow). GmGIN1 was used as outgroup. Note: Apart from Figures <figr fid="F5">5</figr> and <figr fid="F6">6</figr>, and Additional files 5-7, further phylogenetic analyses were carried out that are not shown here. For example, the intein from vacuolar ATPase from <it>C. tropicalis </it>was used as outgroup <abbrgrp><abbr bid="B22">22</abbr></abbrgrp> and gave comparable results to the tree analyses shown here.</p>
               </text>
               <file name="1471-2164-9-127-S7.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Number of <it>hh </it>and <it>hh</it>-related genes found in different species.</p>
               </caption>
               <tblbdy cols="9">
                  <r>
                     <c ca="left">
                        <p>Gene structure</p>
                     </c>
                     <c ca="left">
                        <p>Nv</p>
                     </c>
                     <c ca="left">
                        <p>XC</p>
                     </c>
                     <c ca="left">
                        <p>Ts</p>
                     </c>
                     <c ca="left">
                        <p>Bm</p>
                     </c>
                     <c ca="left">
                        <p>Ce</p>
                     </c>
                     <c ca="left">
                        <p>Cb</p>
                     </c>
                     <c ca="left">
                        <p>Dm</p>
                     </c>
                     <c ca="left">
                        <p>Mm</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="9">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Hedgehog</p>
                     </c>
                     <c ca="left">
                        <p>2</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>3</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Hedge-VWA</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Wart-only</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>2</p>
                     </c>
                     <c ca="left">
                        <p>5</p>
                     </c>
                     <c ca="left">
                        <p>5</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Warthog</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>1 + 1?</p>
                     </c>
                     <c ca="left">
                        <p>5</p>
                     </c>
                     <c ca="left">
                        <p>3</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Ground-only</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>13 (1P)</p>
                     </c>
                     <c ca="left">
                        <p>10</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Groundhog</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>3</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Ground-like</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>13</p>
                     </c>
                     <c ca="left">
                        <p>30 (2P)</p>
                     </c>
                     <c ca="left">
                        <p>27</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Quahog</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>3?</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Hog only</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Y0-hog</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Y1-hog</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Y2-hog</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Enop-hog</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>2</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>T-hog</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Short-hog</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>2</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Unknown hog</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="left">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>9</p>
                     </c>
                     <c ca="left">
                        <p>5</p>
                     </c>
                     <c ca="left">
                        <p>19</p>
                     </c>
                     <c ca="left">
                        <p>58 (3P)</p>
                     </c>
                     <c ca="left">
                        <p>48</p>
                     </c>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>3</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>The left column indicates the gene structure, with Hog referring to the combined Hint/SRR or Hint/ARR domain. Known pseudogenes are indicated in brackets.</p>
               </tblfn>
            </tbl>
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>Phylogenetic tree of Hog domains</p>
               </caption>
               <text>
                  <p><b>Phylogenetic tree of Hog domains</b>. Phylogenetic trees were built from aligned Hog domains (Figure 2 &#8211; 4). The Neighbor joining tree was created using the default settings of Clustal_X. Bootstrap values of 1000 trials are indicated in the figure. In this and subsequent phylogenetic tree figures Enoplea sequences are highlighted in light green, Cnidarian sequences in yellow, Choanoflagellate sequences in light red and fungal sequences in blue. The Hh sequences are marked with Hh and the nematode Hh-related sequences are marked with NemaHog. The root was placed between the red algae/plant sequences and the remaining sequences. Some incomplete sequences were omitted in this tree. Additional phylogenetic analyses were also carried out, for example by omitting the protist sequences and using the fungal sequence GmGIN1 as outgroup (Additional files <supplr sid="S5">5</supplr>, <supplr sid="S6">6</supplr>, <supplr sid="S7">7</supplr>). Overall, the results were very similar.</p>
               </text>
               <graphic file="1471-2164-9-127-5"/>
            </fig>
            <fig id="F6">
               <title>
                  <p>Figure 6</p>
               </title>
               <caption>
                  <p>Maximum likelihood phylogenetic tree of Hog domains</p>
               </caption>
               <text>
                  <p><b>Maximum likelihood phylogenetic tree of Hog domains</b>. A Maximum likelihood phylogenetic tree was constructed using the same data as in Figure 5. Phyml default values were used, and bootstrap values for 100 trials are shown.</p>
               </text>
               <graphic file="1471-2164-9-127-6"/>
            </fig>
            <fig id="F7">
               <title>
                  <p>Figure 7</p>
               </title>
               <caption>
                  <p>Phylogenetic tree of Wart domains</p>
               </caption>
               <text>
                  <p><b>Phylogenetic tree of Wart domains</b>. A multiple sequence alignment of Wart domains (see Additional file <supplr sid="S4">4</supplr>) was used to generate a Neighbor joining tree with the default settings of Clustal_X. <it>B. malayi </it>sequences are highlighted in light blue. This tree is unrooted. Results of 1000 bootstrap trials are shown.</p>
               </text>
               <graphic file="1471-2164-9-127-7"/>
            </fig>
            <p>One Ground domain gene was recovered from <it>B. malayi</it>, Bm <it>grd-5</it>, that is co-orthologous to Ce <it>grd-5 </it>and <it>grd-10 </it>(Figure <figr fid="F8">8</figr>, Additional file <supplr sid="S8">8</supplr>). We have previously identified a few <it>grl </it>genes from <it>B. malayi </it>in searches of ESTs <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. Here, thirteen <it>grl </it>genes were recovered from <it>B. malayi</it>; however, only a few could be identified as orthologues of <it>C. elegans </it>genes, i.e. <it>grl-4</it>, <it>grl-16</it>, and perhaps <it>grl-7 </it>and <it>grl-17 </it>(Figure <figr fid="F8">8</figr>, Additional file <supplr sid="S8">8</supplr>). Other <it>grl </it>genes have clearly duplicated within the <it>B. malayi </it>branch, e.g. Bm <it>grl-x1</it>, <it>grl-x2 </it>and <it>grl-3</it>, which are more similar to each other than to other genes.</p>
            <suppl id="S8">
               <title>
                  <p>Additional file 8</p>
               </title>
               <text>
                  <p>Multiple sequence alignment of Ground and Ground-like domains. Alignment of nematode Ground and Ground-like domains.</p>
               </text>
               <file name="1471-2164-9-127-S8.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <fig id="F8">
               <title>
                  <p>Figure 8</p>
               </title>
               <caption>
                  <p>Phylogenetic tree of Ground and Ground-like domains</p>
               </caption>
               <text>
                  <p><b>Phylogenetic tree of Ground and Ground-like domains</b>. A multiple sequence alignment of Ground and Ground-like domains (see Additional file 8) was used to generate a Neighbor joining tree with the default settings of Clustal_X. For <it>grd-1</it>, <it>grd-2 </it>and <it>grd-11 </it>the four Ground domains were extracted manually prior to alignment; the R1 to R4 postscripts indicate the repeat number. <it>B. malayi </it>sequences are highlighted in light blue. This tree is unrooted. Results of 1000 bootstrap trials are shown.</p>
               </text>
               <graphic file="1471-2164-9-127-8"/>
            </fig>
            <p>A few ESTs were retrieved from other Chromadorea nematodes: In <it>Meloidogyne incognita </it>we found a gene with similarity to <it>wrt-6 </it>(Figure <figr fid="F2">2</figr>, <figr fid="F3">3</figr>, <figr fid="F4">4</figr>, <figr fid="F5">5</figr>, <figr fid="F6">6</figr>), and one <it>grl </it>gene, Msp3, which is expressed in the esophageal gland cells <abbrgrp><abbr bid="B40">40</abbr></abbrgrp> (Additional file <supplr sid="S8">8</supplr>). In <it>Parastrongyloides trichosuri </it>a gene with similarity to <it>wrt </it>genes was found (Additional file <supplr sid="S7">7</supplr>).</p>
         </sec>
         <sec>
            <st>
               <p><it>hh </it>and <it>hh</it>-related genes in Enoplea nematodes: <it>Xiphinema sp</it>. and <it>Trichinella spiralis</it></p>
            </st>
            <p><it>C. elegans </it>and <it>B. malayi </it>belong to the class of Chromadorea. Our database searches now also revealed Hog-containing genes from the distantly related class of Enoplea nematodes, i.e. <it>Xiphinema </it>index CSEQDL01, and <it>T. spiralis</it>, both members of the Dorylaimia <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>. From <it>Xiphinema </it>we retrieved ESTs for nine distinct genes, and from <it>T. spiralis </it>five (Table <tblr tid="T1">1</tblr>), one of which (Ts Xhog1) is also supported by ESTs. All five <it>T. spiralis </it>ORFs have a signal peptide sequence for secretion, and although many of the Xiphinema ESTs are incomplete, in several instances methionine residues followed by good signal peptides could be found at the 5' end of the ESTs (XC Thog, Shog1, Shog2) (Additional file <supplr sid="S9">9</supplr>). One gene from Xiphinema (XC hh) and one gene from <it>T. spiralis </it>(Ts hh) are clearly <it>hh </it>genes (Figure <figr fid="F2">2</figr>, <figr fid="F3">3</figr>, <figr fid="F4">4</figr>, <figr fid="F9">9</figr>, Additional file <supplr sid="S10">10</supplr>), since they both have a Hedge domain and a Hog domain. One gene from <it>T. spiralis </it>has a Qua domain upstream of the Hog domain (Ts qua-1). While its Qua domain is rather divergent, the cysteine residues are all conserved (Additional file <supplr sid="S11">11</supplr>). Ts Xhog3 has a rather short region upstream of the Hog domain, which cannot be extended, because it is delimited by an upstream cyclin gene, for which ESTs are available (data not shown). After cleavage of the signal peptide and subsequent autoprocessing through the Hog domain the predicted N-terminal peptide of Ts Xhog3 would only be 34 residues long. In Xiphinema the three ORFs with a signal peptide (XC Shog1, Shog2, and Thog) have rather short predicted N-terminal sequences as well. 
In the case of XC Shog1 it is only 15 residues long, in the case of XC Shog2 it is 25 residues, and in the case of XC Thog it is 79 residues long with an unusual stretch of about 70 residues almost entirely composed of threonine and serine residues. Two <it>T. spiralis </it>genes reside next to each other on the chromosome (Ts Xhog1 and Xhog2). They share sequence similarity upstream of the Hog domain with six conserved cysteine residues. In addition, XC Xhog5 also has sequence similarity to the upstream regions of Ts Xhog1 and Xhog2 (Figure <figr fid="F10">10</figr>).</p>
            <suppl id="S9">
               <title>
                  <p>Additional file 9</p>
               </title>
               <text>
                  <p>Sequences used in the analysis. List of sequences, accession numbers, notes, predicted signal peptide cleavage sites and protein sequences used in this analysis.</p>
               </text>
               <file name="1471-2164-9-127-S9.html">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S10">
               <title>
                  <p>Additional file 10</p>
               </title>
               <text>
                  <p>Multiple sequence alignment of "Hedge" domain containing proteins and Hedgehog proteins. Note that Os hhlike and NV 200640 do not line up in the Hog domain region.</p>
               </text>
               <file name="1471-2164-9-127-S10.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S11">
               <title>
                  <p>Additional file 11</p>
               </title>
               <text>
                  <p>Multiple sequence alignment of Quahog proteins. Alignment of nematode Quahog proteins.</p>
               </text>
               <file name="1471-2164-9-127-S11.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <fig id="F9">
               <title>
                  <p>Figure 9</p>
               </title>
               <caption>
                  <p>Phylogenetic tree of Hedge domains</p>
               </caption>
               <text>
                  <p><b>Phylogenetic tree of Hedge domains</b>. A multiple sequence alignment of Hedge and Hedgehog proteins (see Additional file <supplr sid="S10">10</supplr>) was used to generate a Neighbor joining tree with the default settings of Clustal_X. This tree is unrooted. Results of 1000 bootstrap trials are shown.</p>
               </text>
               <graphic file="1471-2164-9-127-9"/>
            </fig>
            <fig id="F10">
               <title>
                  <p>Figure 10</p>
               </title>
               <caption>
                  <p>Multiple sequence alignment of Enoplea Hog proteins with a new upstream motif</p>
               </caption>
               <text>
                  <p><b>Multiple sequence alignment of Enoplea Hog proteins with a new upstream motif</b>. Multiple sequence alignment of Enoplea Ts Xhog1, Ts Xhog2, and XC Xhog5 reveals new conserved regions upstream of the Hog domain.</p>
               </text>
               <graphic file="1471-2164-9-127-10"/>
            </fig>
            <p>Apart from the similarities in the regions N-terminal to the Hog domain indicated above, the remaining N-terminal sequences show no obvious similarities to each other or to any other proteins. Only the threonine-rich stretch is reminiscent of the 200-residue-long threonine stretch in the N-terminal region of the choanoflagellate Hoglet protein <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>. However, this may be a case of convergent evolution. No Wart, Ground, or Ground-like domains could be detected in the genome of <it>T. spiralis </it>or in EST database searches restricted to Enoplea.</p>
            <p>Based on phylogenetic analyses of the Hog domains, XC Xhog1, Xhog2, Xhog3, and Ts Qua-1 form a clade with the Quahog proteins (Figure <figr fid="F5">5</figr>, <figr fid="F6">6</figr>, Additional files <supplr sid="S5">5</supplr>, <supplr sid="S6">6</supplr>, <supplr sid="S7">7</supplr>). While N-terminal sequences for XC Xhog1, 2 and 3 are lacking, they could be bona-fide Quahog proteins. A second, distinct clade is formed by Ts Xhog1, Xhog2, and XC Shog1, Xhog4, Xhog5 and Thog, indicating that they are derived from a common ancestor (Figure <figr fid="F5">5</figr>, <figr fid="F6">6</figr>, Additional files <supplr sid="S5">5</supplr>, <supplr sid="S6">6</supplr>, <supplr sid="S7">7</supplr>). In three cases (Ts Xhog1, Xhog2 and XC Xhog5), a common upstream sequence (Enop) has been identified (Figure <figr fid="F10">10</figr>), which seems to be specific to Enoplea nematodes, suggesting that at least in the cases of XC Shog1 and Thog the N-terminal regions have diverged relatively recently.</p>
            <p>Almost all nematode Hh-related proteins form a distinct clade, the only exceptions being the Hh proteins, and Ts Xhog3 and XC S2hog, which are both very divergent and do not fall into the Hh clade of genes either (Figure <figr fid="F5">5</figr>, <figr fid="F6">6</figr>, Additional files <supplr sid="S5">5</supplr>, <supplr sid="S6">6</supplr>, <supplr sid="S7">7</supplr>). Two features distinguish the Hog domains of the nematode Hh-related proteins from those of the Hh proteins (Figure <figr fid="F1">1</figr>, <figr fid="F2">2</figr>, <figr fid="F3">3</figr>, <figr fid="F4">4</figr>, Additional files <supplr sid="S1">1</supplr>, <supplr sid="S2">2</supplr>, <supplr sid="S3">3</supplr>). 1) The regions corresponding to motifs K and L have characteristic differences in their conserved residues in nematode Hh-related proteins. 2) Two conserved cysteine residues are found in the central region of the Hog domain. When these two residues are mapped onto the X-ray structure of the C-terminal autoprocessing domain of Drosophila Hh <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>, it emerges that they lie adjacent to each other and therefore could form a disulfide bond. This feature might stabilize this type of Hog domain in an extracellular environment, and this extra stability might provide some new functionality. It is, however, not unique to nematode Hog domains. Zebrafish ihha and ihhb and fugu dhh (fhh) also have this extra cysteine pair, which must represent convergent evolution. It is worth pointing out that Ts Hh lacks the two cysteine residues and has motifs K and L as expected from a bona-fide Hh molecule. However, the quite divergent Ts Xhog3 protein, which lacks a Hedge domain, also lacks the cysteine residues and has motifs K and L.</p>
         </sec>
         <sec>
            <st>
               <p><it>hh </it>and <it>hh</it>-related genes in Cnidaria</p>
            </st>
            <p>tblastn searches of the <it>N. vectensis </it>predicted ORFs returned 10 hits. Several turned out to be differently predicted ORF variants most likely derived from the same locus, since corresponding genomic sequences for some of these loci displayed >99% identity. In the end six distinct ORFs were retrieved that all had a good signal sequence for secretion. For four of the ORFs ESTs were found that at least partially support the predictions (Table <tblr tid="T1">1</tblr>, Additional file <supplr sid="S9">9</supplr>). In the case of Nv 239508 the corresponding genomic region seems to have undergone a recent duplication as two virtually identical Hog domains are present there (Additional file <supplr sid="S12">12</supplr>). In addition to the <it>N. vectensis </it>sequences, ESTs for two genes from <it>Acropora millepora </it>and one gene from <it>Hydra magnipapillata </it>were identified (Additional file <supplr sid="S9">9</supplr>). The EST from <it>H. magnipapillata </it>could be extended using blastn searches of the NCBI trace archives, which also revealed a second, closely related paralogous gene (Additional file <supplr sid="S9">9</supplr>). Two genes from <it>N. vectensis </it>and one from <it>A. millepora </it>are bona-fide <it>hh </it>genes, because they encode both a Hedge domain and a Hog domain (Figure <figr fid="F2">2</figr>, <figr fid="F3">3</figr>, <figr fid="F4">4</figr>, Additional file <supplr sid="S10">10</supplr>). Two other ORFs, Nv 120428 and Acm DY579185, share conserved sequences upstream of the Hog domain with at least 3 conserved cysteine residues (Figure <figr fid="F11">11</figr>). The N-termini of Nv 140260 and 239508 do not show any similarity with known motifs, and the processed N-terminal peptide of Nv 140260 is only 86 residues long. Similarly, the upstream regions of Hm CO905822 and its close paralog do not show any similarity to the upstream regions of other cnidarian Hog proteins.</p>
            <suppl id="S12">
               <title>
                  <p>Additional file 12</p>
               </title>
               <text>
                  <p>Structure of the Nematostella vectensis genomic assembly around Nv 239508. Current assembly of the genomic region around Nv 239508. Colored arrows indicate duplicated regions. N gap indicates two regions with unknown sequence. The green area shows the ESTs found mapping to this region. The CAGN20453 corresponds to Nv 239508. The yellow area shows regions of sequence similarity, i.e. hydrolase domain, Hog domain, and Reverse transcriptase. CAGN20453 is not sequenced fully, but the 3' read has been mapped to the right side, since the 3' untranslated region matches better to the second repeat of the duplication due to some indel differences. However, as will be noted, the final resulting transcript (shown at bottom) would be rather unusual, as it would splice over another gene, i.e. the hydrolase, which is also supported by an EST. Hence, the genomic organization and gene structure in this region could be subject to change, especially given the unsequenced areas.</p>
               </text>
               <file name="1471-2164-9-127-S12.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <fig id="F11">
               <title>
                  <p>Figure 11</p>
               </title>
               <caption>
                  <p>Multiple sequence alignment of two cnidarian Hog proteins with a new upstream motif</p>
               </caption>
               <text>
                  <p><b>Multiple sequence alignment of two cnidarian Hog proteins with a new upstream motif</b>. Pairwise sequence alignment of cnidarian Nv 120428 and Acm DY579185.</p>
               </text>
               <graphic file="1471-2164-9-127-11"/>
            </fig>
            <p>Last but not least, Nv 200640 is predicted to be 3592 amino acids long and is highly unusual. It is similar to the Hh proteins through the N-terminal Hedge domain (blast expected probability: 1e-18 to Ciona Hh), but no Hog domain follows (Additional files <supplr sid="S10">10</supplr>, <supplr sid="S13">13</supplr>). The Hedge domain is encoded by two exons, and after an intron of 600 bp many additional exons continue the ORF of the JGI prediction, but nowhere in this genomic region resides a Hog domain. Analysis of the ORF using the SMART server revealed that these extra exons encode multiple motifs with significant sequence similarity to other proteins (Additional file <supplr sid="S13">13</supplr>). The first motif, encoded by exons 3 and 4, contains a von Willebrand factor (vWF) type A domain (VWA). For example, the VWA domain of chicken collagen, type XIV, alpha 1 (undulin) is retrieved with a blastp probability of 8e-28. After the VWA domain, 21 CA (Cadherin repeat) domains follow; they occur as repeats in extracellular regions and are thought to mediate cell-cell contact when bound to calcium. These are followed by two Immunoglobulin C-2 Type domains, two EGF repeats, a transmembrane region, and finally an SH2 domain.</p>
            <suppl id="S13">
               <title>
                  <p>Additional file 13</p>
               </title>
               <text>
                  <p>Predicted protein structure of Nv 200640. Protein motif prediction of the SMART server was used to analyse the ORF Nv 200640, and the different types of conserved motifs found are indicated.</p>
               </text>
               <file name="1471-2164-9-127-S13.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>The phylogenetic analysis of the cnidarian Hog domains reveals that they cluster primarily with the Hh Hog domains (Figure <figr fid="F5">5</figr>, <figr fid="F6">6</figr>, Additional files <supplr sid="S5">5</supplr>, <supplr sid="S6">6</supplr>, <supplr sid="S7">7</supplr>), albeit mostly with insignificant bootstrap values. The Hog domain of Nv 241466 Hh has the best similarity to the Hh Hog domains, and clusters with the deuterostome Hh proteins. Nv 140260 and Nv 239508 are most similar to each other, suggesting a likely duplication event within the cnidarian lineage. Nv 120428 and Acm DY579185 may also be related to these two proteins via their Hog domain (Figure <figr fid="F5">5</figr>, <figr fid="F6">6</figr>, Additional files <supplr sid="S5">5</supplr>, <supplr sid="S6">6</supplr>, <supplr sid="S7">7</supplr>), but the bootstrap values are not significant. The Nv 95413 Hh protein is rather divergent, and the Hydra sequence Hm CO905822 is also very divergent and does not form a clade with any of the <it>N. vectensis </it>sequences. Therefore, it is not possible to determine whether all the cnidarian Hog genes originated from a single ancestral gene in the cnidarian lineage, or whether <it>hh </it>and other Hog genes were already present before the split of Cnidaria and Bilateria. The Hedge domains of the three <it>N. vectensis </it>ORFs are more divergent than the bilaterian Hedge domains (Figure <figr fid="F9">9</figr>, Additional file <supplr sid="S10">10</supplr>). The Hedge domain of Nv 241466 Hh is most similar to bilaterian Hh proteins, with a best blast probability of 2e-52 to a fish Hedge domain. Nv 95413 Hh is more divergent, with a best blast probability of 5e-36, and Nv 200640 is the most divergent Hedge domain, with a probability of 1e-18 to a Ciona Hh.</p>
         </sec>
         <sec>
            <st>
               <p>Hog genes in lower eukaryotes</p>
            </st>
            <p>In order to detect Hh sequences from lower eukaryotes, tblastn searches were performed using the organism restriction "eukaryotes NOT bilateria". This recovered a number of genomic and EST matches from lower animals, fungi, plants and protists (Figure <figr fid="F2">2</figr>, <figr fid="F3">3</figr>, <figr fid="F4">4</figr>, Additional files 9, 14). One EST was recovered from the sponge <it>Oscarella carmela</it>, which was previously described <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. Analysis of this sequence shows that, while it does have a Hedge domain, the downstream sequence does not contain the start of a Hog domain in any frame (Additional file <supplr sid="S10">10</supplr>). No sequence similarity to a VWA domain is detected in that fragment either. Nevertheless, it indicates that, as in the case of Nv 200640, this gene may not contain a Hog domain.</p>
            <p>A match was detected to the gene GmGIN1 from the fungus <it>Glomus mosseae</it>, which belongs to the Glomeromycota, a sister group of ascomycetes and basidiomycetes, and which had already been described as having similarity to Hh <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>. The Hog domain has a blast probability of 7e-18 to the best matching Hh Hog domain, which is much better than the blast probability of choanoflagellate Hoglet to the best matching Hh Hog domains (4e-10). Furthermore, there are good matches to motifs J and K, as well as a region with similarity to motif L. Therefore, GmGIN1 contains a bona-fide Hog domain (Figure <figr fid="F2">2</figr>, <figr fid="F3">3</figr>, <figr fid="F4">4</figr>, Additional file <supplr sid="S14">14</supplr>). The upstream domain of GmGIN1 shares similarity with Ras-like GTPases, e.g. the Arabidopsis protein AIG1 (avrRpt2-induced gene 1) and the animal IAN/IMAP subfamily <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>. However, this ORF lacks a signal peptide and may therefore not be secreted.</p>
            <suppl id="S14">
               <title>
                  <p>Additional file 14</p>
               </title>
               <text>
                  <p>Multiple sequence alignment of non-metazoan Hog proteins. Alignment of non-metazoan Hog proteins including also the ones which are only based on EST fragments.</p>
               </text>
               <file name="1471-2164-9-127-S14.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>A number of matches were found in Alveolata, i.e. in the dinoflagellates <it>Alexandrium tamarense</it>, <it>Amphidinium carterae</it>, and <it>Karlodinium micrum </it>(blast expected probability of aKm Hog: 9e-17 to the best matching Hh Hog domain; note: blast probabilities below also refer to Hh Hog domains) and the apicomplexans <it>Cryptosporidium muris </it>and <it>Cryptosporidium parvum </it>(blast prob. of aCp Hog: 4e-08). Their Hog domains contain motifs J and K, although in a few cases the cysteine has been replaced with a serine in motif J (Figure <figr fid="F2">2</figr>, <figr fid="F3">3</figr>, <figr fid="F4">4</figr>, Additional file <supplr sid="S14">14</supplr>). The aCm and aCp sequences are most likely full length: they have signal peptide sequences for secretion and share a conserved upstream region of about 100 residues that contains two conserved cysteine residues (Figure <figr fid="F12">12</figr>). No sequence similarity of this upstream region to other known domains was found.</p>
            <fig id="F12">
               <title>
                  <p>Figure 12</p>
               </title>
               <caption>
                  <p>Pairwise sequence alignment of Alveolata aCm and aCp Hog</p>
               </caption>
               <text>
                  <p>Pairwise sequence alignment of Alveolata aCm and aCp Hog.</p>
               </text>
               <graphic file="1471-2164-9-127-12"/>
            </fig>
            <p>Further Hog sequences were found in red algae and mosses (Figure <figr fid="F2">2</figr>, <figr fid="F3">3</figr>, <figr fid="F4">4</figr>, Additional file <supplr sid="S14">14</supplr>): In the mosses <it>Selaginella moellendorffii </it>(blast prob.: 1e-09) and <it>Physcomitrella patens </it>one sequence each with a Hog domain; in the red algae <it>Chondrus crispus </it>two sequences (blast prob. of rCc Hog: 1e-10); in <it>Griffithsia japonica </it>one sequence (blast prob.: 3e-08); in <it>Porphyra haitanensis </it>(blast prob. of rPh Hog: 4e-07) and <it>Porphyra yezoensis </it>two Hog domain ORFs each; and in <it>Gracilaria changii </it>six ORF fragments (blast prob. of rGc Hog1: 3e-12). Those moss and red algae Hog domains that are not truncated have motifs J and K, although the cysteine residue in motif J has been changed to serine, threonine, or aspartate. Alignment of rPy and rPh Hog2 revealed conserved sequences upstream of the Hog domain; however, these two sequences are relatively closely related, so this conservation is not surprising (Figure <figr fid="F13">13</figr>). Blast searches with this upstream region did not reveal matches in any other organisms. Similarly, alignment of the moss sequences also revealed a conserved upstream region that was not found in other organisms (Figure <figr fid="F14">14</figr>). The <it>P. patens </it>sequence is presumably full length, since it was predicted from genomic sequence, and it has a good signal peptide. One EST sequence supposedly stems from rice (XX 104K18); however, it has a much better match to Hh Hog domains (blast prob.: 3e-24) than other non-metazoan Hog domains, and we could not find any match to rice genomic sequences. Therefore, this sequence may come from a contaminating organism and is designated as species XX here.</p>
            <fig id="F13">
               <title>
                  <p>Figure 13</p>
               </title>
               <caption>
                  <p>Pairwise sequence alignment of red algae rPy and rPh Hog2</p>
               </caption>
               <text>
                  <p>Pairwise sequence alignment of red algae rPy and rPh Hog2.</p>
               </text>
               <graphic file="1471-2164-9-127-13"/>
            </fig>
            <fig id="F14">
               <title>
                  <p>Figure 14</p>
               </title>
               <caption>
                  <p>Multiple sequence alignment of moss pPp Hog, pSl Hog and pSm Hog</p>
               </caption>
               <text>
                  <p>Multiple sequence alignment of moss pPp Hog, pSl Hog and pSm Hog.</p>
               </text>
               <graphic file="1471-2164-9-127-14"/>
            </fig>
            <p>Additional Hog-like sequences were recovered from the cercozoan <it>Bigelowiella natans </it>(blast prob.: 9e-10), from the cryptophyte <it>Guillardia theta </it>(blast prob. of crGt Hog1: 6e-11 to Hog of Mo hoglet), and from the jakobid <it>Jakoba libera </it>(blast prob. jJl Hog1: 2e-05) (Additional file <supplr sid="S14">14</supplr>). These sequences have motif J, although the cysteine has been replaced, and in those cases, where the C-terminal region is complete, it is clear that motif K is not conserved. Sequence alignment of the <it>J. libera </it>Hog sequences revealed conserved upstream sequences with some conserved cysteine residues (Figure <figr fid="F15">15</figr>). Numerous ESTs cover jJl Hog1 and therefore its ORF could be complete. If this is the case, the putative start methionine has a good signal sequence for secretion, and therefore jJl Hog1 has the same global structural features as the animal Hh and Hh-related proteins, i.e. a secreted N-terminal domain followed by a Hog domain. Finally, three sequences were recovered from the haptophyte <it>Pleurochrysis haptonemofera</it>. Sequence alignment revealed sequence conservation upstream of the Hog domain with conserved cysteine residues. However, it is noteworthy that the Hog domain is much better conserved than the upstream region, indicating that the upstream region can evolve more rapidly (Figure <figr fid="F16">16</figr>).</p>
            <fig id="F15">
               <title>
                  <p>Figure 15</p>
               </title>
               <caption>
                  <p>Multiple sequence alignment of jakobid jJl Hog1, Hog2, and Hog3</p>
               </caption>
               <text>
                  <p>Multiple sequence alignment of jakobid jJl Hog1, Hog2, and Hog3.</p>
               </text>
               <graphic file="1471-2164-9-127-15"/>
            </fig>
            <fig id="F16">
               <title>
                  <p>Figure 16</p>
               </title>
               <caption>
                  <p>Multiple sequence alignment of haptophyte hPh Hog1, Hog2, and Hog3</p>
               </caption>
               <text>
                  <p>Multiple sequence alignment of haptophyte hPh Hog1, Hog2, and Hog3.</p>
               </text>
               <graphic file="1471-2164-9-127-16"/>
            </fig>
            <p>Overall, these results show that Hog domains occur in many different branches of the major groups of eukaryotes. However, multiple losses seem to have occurred, since in many branches we did not detect Hog domains, for example, in <it>Arabidopsis thaliana </it>and other higher plants, or in the currently sequenced ascomycetes and basidiomycetes, or in other sequenced organisms such as Dictyostelium.</p>
         </sec>
         <sec>
            <st>
               <p>Other genes of the Hh pathway in Enoplea and <it>N. vectensis</it></p>
            </st>
            <p><it>C. elegans </it>not only lacks a bona-fide Hh molecule, but several other components of the Hh signaling pathway have been lost as well. In particular, orthologs of the components that act in recipient cells, i.e. <it>smoothened</it>, <it>fused</it>, <it>suppressor of fused </it>(<it>sufu</it>), and <it>costa</it>, are missing <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>. On the other hand, a homolog of the transcription factor Cubitus interruptus (Ci/Gli) is present, although it has been adapted for sex determination. Multiple homologs of the Hh receptor Patched have also been found in <it>C. elegans </it><abbrgrp><abbr bid="B18">18</abbr></abbrgrp>, as well as the related molecule Dispatched, required for secretion of Hh. Patched, Dispatched, Smoothened, Ci/Gli and Hip have already been found in <it>N. vectensis </it><abbrgrp><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr></abbrgrp>. We were particularly interested in finding the components lacking in <it>C. elegans </it>in the relatively well sequenced genomes of <it>N. vectensis </it>and <it>T. spiralis</it>. Using reciprocal blast searches, we have attempted to identify these components of the pathway in Nematostella and Enoplea (Table <tblr tid="T2">2</tblr>). In Xiphinema we only detected an EST for <it>patched</it>, but this is not surprising given the limitations of the current dataset. In <it>T. spiralis </it>we detected <it>patched</it>, <it>dispatched</it>, <it>dally-like </it>and Ci/Gli, but found no evidence for <it>Ihog</it>, <it>smoothened</it>, <it>costa</it>, <it>fused</it>, and <it>sufu</it>. This is actually identical to the situation in <it>C. elegans</it>. Presently about 56.8 Mb of an estimated genome size of about 65 Mb has been sequenced for <it>T. spiralis </it><abbrgrp><abbr bid="B41">41</abbr></abbrgrp>. 
If we assume that only about 80% has been sequenced, the probability of finding only the genes listed in Table <tblr tid="T2">2</tblr>, but missing the others is 0.013%. If the sequence coverage is higher, this probability would be even lower. Therefore, we have to assume that in <it>T. spiralis</it>, even though it has a bona-fide <it>hh </it>gene, the Hh signaling pathway is compromised in a similar way as in <it>C. elegans</it>.</p>
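            <p>The figure of 0.013% can be reproduced with a short calculation. The sketch below is illustrative only and rests on assumptions that are not spelled out in the text: each gene is treated as falling independently into the sequenced fraction of the genome with a probability equal to the coverage, the four detected components (<it>patched</it>, <it>dispatched</it>, <it>dally-like </it>and Ci/Gli) are counted as found, and the five undetected ones (<it>Ihog</it>, <it>smoothened</it>, <it>costa</it>, <it>fused </it>and <it>sufu</it>) as missed. The function name is hypothetical.</p>

```python
def detection_pattern_probability(coverage, n_found, n_missing):
    """Probability that n_found specific genes all lie in the sequenced
    fraction while n_missing specific genes all lie outside it.

    Assumption (not from the article): genes are hit independently,
    each with probability equal to the sequence coverage.
    """
    return coverage ** n_found * (1.0 - coverage) ** n_missing


# 80% coverage, 4 components detected, 5 components not detected
p_80 = detection_pattern_probability(0.80, 4, 5)
print(f"{p_80 * 100:.3f}%")  # prints 0.013%

# Coverage implied by 56.8 Mb sequenced out of an estimated 65 Mb
p_higher = detection_pattern_probability(56.8 / 65.0, 4, 5)
print(f"{p_higher * 100:.4f}%")
```

            <p>Under these assumptions the probability falls as coverage rises above 80%, in line with the statement that higher coverage makes a chance failure to detect all five genes even less likely.</p>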
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Components of the Hh signaling pathway in <it>N. vectensis </it>and <it>Xiphinema </it>sp. The absence of a gene from the table does not mean that it is absent from the genome; it may simply not have been sequenced yet. Numbers indicate the protein prediction in JGI (Nv) or the accession number (XC). For more information on pathway components and <it>C. elegans </it>genes see [18]. Best blast scores are given for the Nv predictions in parentheses.</p>
               </caption>
               <tblbdy cols="5">
                  <r>
                     <c ca="left">
                        <p>Gene</p>
                     </c>
                     <c ca="left">
                        <p>Nv</p>
                     </c>
                     <c ca="left">
                        <p>XC</p>
                     </c>
                     <c ca="left">
                        <p>Ts</p>
                     </c>
                     <c ca="left">
                        <p>Ce</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>dispatched</p>
                     </c>
                     <c ca="left">
                        <p>2), 88278 (e-100)</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>yes (2 copies)</p>
                     </c>
                     <c ca="left">
                        <p><it>che-14</it>, <it>ptd-2</it></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Ihog</p>
                     </c>
                     <c ca="left">
                        <p>-***</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>no</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>dally-like</p>
                     </c>
                     <c ca="left">
                        <p>247677 (4e-71)</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>yes</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>gpn-1</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Patched</p>
                     </c>
                     <c ca="left">
                        <p>1), 84424 (0.0)</p>
                     </c>
                     <c ca="left">
                        <p>CV511563</p>
                     </c>
                     <c ca="left">
                        <p>yes</p>
                     </c>
                     <c ca="left">
                        <p><it>ptc-1</it>, <it>ptc-3</it></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>smoothened</p>
                     </c>
                     <c ca="left">
                        <p>2), 208236 (e-123), 92220 (4e-84)</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-*</p>
                     </c>
                     <c ca="left">
                        <p>no</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Costa</p>
                     </c>
                     <c ca="left">
                        <p>79512 (e-135) #</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>no</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Fused</p>
                     </c>
                     <c ca="left">
                        <p>136852 (4e-63)</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-**</p>
                     </c>
                     <c ca="left">
                        <p>no</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Sufu</p>
                     </c>
                     <c ca="left">
                        <p>246114 (2e-89)</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>no</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>cubitus interruptus (Ci/Gli)</p>
                     </c>
                     <c ca="left">
                        <p>2), 116463 (3e-85)</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>yes</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>tra-1</it>
                        </p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p># This match is to human KIF27. Costa itself is rather divergent and may not be a bona-fide ortholog of KIF27, and there is functional divergence between mammals and Drosophila in this aspect of the pathway [42].</p>
                  <p>* best reciprocal match found is to Drosophila frizzled dFz2</p>
                  <p>** best reciprocal match found is to ULK3 kinase</p>
                  <p>*** best reciprocal match of Nv185528 is to fish protogenin 5e-75</p>
                  <p>1) mentioned in [25]</p>
                  <p>2) mentioned in [26]</p>
               </tblfn>
            </tbl>
            <p>In <it>N. vectensis </it>we found good orthologues for <it>dispatched</it>, <it>dally-like</it>, <it>patched</it>, <it>smoothened</it>, <it>fused</it>, <it>sufu </it>and <it>Ci </it>(Table <tblr tid="T2">2</tblr>). No obvious homolog was found for Ihog. In the case of Drosophila <it>costa</it>, good matches to its human homologs were found, and Drosophila <it>costa </it>is rather divergent. Recently it has been shown that the mammalian homologues of <it>fused </it>and <it>costa </it>do not play the same key role in the pathway as in flies, instead <it>sufu </it>plays a major role <abbrgrp><abbr bid="B42">42</abbr><abbr bid="B43">43</abbr></abbrgrp>. Overall, it looks like most of the key players of the Hh pathway are present in <it>N. vectensis </it>so that it is clear that the pathway was already well established before the split of Cnidaria and Bilateria.</p>
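            <p>The reciprocal blast criterion behind Table <tblr tid="T2">2</tblr> can be illustrated with a minimal sketch. It assumes that the best hit of each search has already been extracted into a simple lookup table. All identifiers below are hypothetical placeholders, not sequences from this study, and the function is not the procedure actually used here. A candidate is accepted only if its own best match in the reverse search is the reference gene that recovered it; a candidate whose best reverse match is a different gene (compare the footnotes marked * and ** in Table <tblr tid="T2">2</tblr>) is rejected.</p>

```python
def reciprocal_best_hits(forward_best, reverse_best):
    """Return {reference: candidate} pairs that are each other's best hit.

    forward_best maps a reference gene to its best hit in the target genome.
    reverse_best maps that candidate to its best hit back in the reference set.
    """
    confirmed = {}
    for reference, candidate in forward_best.items():
        # Keep the pair only if the reverse search returns the same reference
        if reverse_best.get(candidate) == reference:
            confirmed[reference] = candidate
    return confirmed


# Hypothetical placeholder identifiers, for illustration only
forward_best = {"ref_patched": "cand_1", "ref_smoothened": "cand_2"}
reverse_best = {"cand_1": "ref_patched", "cand_2": "ref_frizzled"}

print(reciprocal_best_hits(forward_best, reverse_best))
# prints {'ref_patched': 'cand_1'}
```

            <p>In this toy example the second candidate is discarded because its best reverse match is a different gene, so it would not be counted as an ortholog.</p>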
         </sec>
         <sec>
            <st>
               <p>Genes with novel Hint-like (Vint) domains</p>
            </st>
            <p>During the tblastn searches, ESTs and ORFs from non-Hog genes such as inteins were discovered, usually in the non-significant zone at the bottom of the results lists. One group of genes attracted our attention, because upon closer inspection it became apparent that these genes had an amino-terminal VWA domain followed by a region that has good similarity to the first part of the Hint domain, i.e. in particular motifs A and B (Figure <figr fid="F17">17</figr>, <figr fid="F18">18</figr>, <figr fid="F19">19</figr>, Additional file <supplr sid="S15">15</supplr>). This observation was intriguing given that in Nematostella Nv 200640 a Hedge domain is followed by a VWA domain. Further blast searches revealed the presence of these VWA-Hint proteins in Tetrahymena, several fungal species, the heterolobosean <it>Naegleria gruberi</it>, the parabasalid <it>Tritrichomonas foetus</it>, dinoflagellates, the slime mold <it>Physarum polycephalum</it>, rice and the choanoflagellate <it>Monosiga brevicollis</it>. Additional matches in other species, for example, pine tree, were found in the EST database, but not included here, because the fragmentary nature of the sequence information made it impossible to determine whether the VWA domain resides in the same transcript as the Hint domain (data not shown). No match could be found for the cDNA sequence from rice (pOs AK110392) in the genomic sequence, but ESTs recovered from other plants support the notion that VWA-Hint proteins exist in plants.</p>
            <suppl id="S15">
               <title>
                  <p>Additional file 15</p>
               </title>
               <text>
                  <p>Multiple sequence alignment of the Hint region of VWA-Vint proteins with Hog domains of Hh proteins.</p>
               </text>
               <file name="1471-2164-9-127-S15.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <fig id="F17">
               <title>
                  <p>Figure 17</p>
               </title>
               <caption>
                  <p>Multiple sequence alignment of VWA domain &#8211; Hint-like domain proteins, part 1</p>
               </caption>
               <text>
                  <p><b>Multiple sequence alignment of VWA domain &#8211; Hint-like domain proteins, part 1</b>. Proteins containing a VWA domain fused to a Hint-like domain were discovered in Tetrahymena, several fungal species, as well as several other eukaryote branches, including choanoflagellates. The VWA domain and the Hint-like domain (Vint) with motifs A and B of the Hint domain are marked in the alignment. A new domain between the VWA and Vint domain is marked with Vwaint. Four proteins also have a Ubox upstream of the VWA domain. An alignment of selected Vint domains to Hh Hog domains is presented in additional file <supplr sid="S15">15</supplr>. <it>A. thaliana </it>At5g60710 is not a Vint protein, but one of the best matching VWA domain-containing proteins. While it lacks the Vint domain, it does have some weak similarity to the Vwaint domain, and upstream of the VWA domain is a Ring finger, which shares similarity with the Ubox motif. It would be worthwhile to investigate this similarity with a detailed evolutionary analysis in the future.</p>
               </text>
               <graphic file="1471-2164-9-127-17"/>
            </fig>
            <fig id="F18">
               <title>
                  <p>Figure 18</p>
               </title>
               <caption>
                  <p>Multiple sequence alignment of VWA domain &#8211; Hint-like domain proteins, part 2</p>
               </caption>
               <text>
                  <p><b>Multiple sequence alignment of VWA domain &#8211; Hint-like domain proteins, part 2</b>. Continuation of the multiple sequence alignment of Figure 17.</p>
               </text>
               <graphic file="1471-2164-9-127-18"/>
            </fig>
            <fig id="F19">
               <title>
                  <p>Figure 19</p>
               </title>
               <caption>
                  <p>Multiple sequence alignment of VWA domain &#8211; Hint-like domain proteins, part 3</p>
               </caption>
               <text>
                  <p><b>Multiple sequence alignment of VWA domain &#8211; Hint-like domain proteins, part 3</b>. Continuation of the multiple sequence alignment of Figure 18.</p>
               </text>
               <graphic file="1471-2164-9-127-19"/>
            </fig>
            <p>The VWA-Hint proteins do not seem to have a signal peptide for secretion. The VWA domain is located at the N-terminus of the proteins, although in four cases a Ubox precedes the VWA domain (Figure <figr fid="F17">17</figr>, <figr fid="F18">18</figr>, <figr fid="F19">19</figr>). A region of around 300 residues separates the VWA domain from the Hint domain. This region has several small patches of conservation and one large region that we propose to call the Vwaint domain. At the C-terminus a Hint-like domain follows, which is of similar size to a Hog domain. However, the best conserved features are only motifs A and B, i.e. the N-terminal region of the Hint-like domain. One region shares a little similarity with motif F of inteins and BIL-Bs, but motifs J, K and L are lacking (Figure <figr fid="F17">17</figr>, <figr fid="F18">18</figr>, <figr fid="F19">19</figr>). The Hint-like domain is also rather different from inteins or Hog domains; the best blast matches of aTt 00471620 are to honeybee Hh with a probability of 0.013. Therefore, these sequences cannot be classified as intein, Hog or BIL domains, and we refer to these genes as Vint genes. Vint genes are apparently so widespread in eukaryotes that we have to assume that a common ancestor was present in early eukaryotes. However, Vint genes seem to be lacking in Arabidopsis, many fungi (for example, <it>Saccharomyces cerevisiae</it>), and in Metazoa. Multiple independent losses in different lineages seem the most likely explanation for this absence.</p>
            <p>Our searches also revealed a group of proteins from bacteria that had a Hint-like domain at their C-terminus and shared some weak sequence similarity in their N-terminal region (Additional files <supplr sid="S16">16</supplr>, <supplr sid="S17">17</supplr>). At least some of these proteins are predicted to have signal peptides for secretion, and the upstream region has two cysteine residues conserved between all sequences. The Hint-like domains of these bacterial proteins are also quite divergent from inteins, Hog and BIL domains, and represent yet another subgroup. This subgroup has previously also been detected by Dassa and Pietrokovski <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. The new members we retrieved here support the notion that this is yet another new type of Hint protein.</p>
            <suppl id="S16">
               <title>
                  <p>Additional file 16</p>
               </title>
               <text>
                  <p>Multiple sequence alignment of bacterial proteins with a novel type of Hint-like region. Full length multiple sequence alignment of bacterial proteins. Motifs A and B are marked, as well as a small region with some similarity to the beginning of motif F in BIL-As. Note: the conserved upstream regions may be secreted, since some of the sequences have bacterial signal peptides for secretion. Species and the accession number for the sequences are: bSa_STIAU_1829: Stigmatella aurantiaca DW4/3-1 (ZP_01466308); bMx_MXAN_6253: Myxococcus xanthus DK 1622 (YP_634382); bRM_MED297_11140: Reinekea sp. MED297 (ZP_01113290); bPl_plu1731: Photorhabdus luminescens subsp. laumondii TTO1 (NP_929012); bSp_Draf4685: Serratia proteamaculans 568 (ZP_01534811); bYi_YintA_01002283: Yersinia intermedia ATCC 29909 (ZP_00833384).</p>
               </text>
               <file name="1471-2164-9-127-S16.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S17">
               <title>
                  <p>Additional file 17</p>
               </title>
               <text>
                  <p>Multiple sequence alignment of the Hint region of the bacterial proteins with a novel Hint-like domain with Hog domains. Multiple sequence alignment of the bacterial Hint domains from Additional file 16 with Hog domains from animals. Note the roughly similar length.</p>
               </text>
               <file name="1471-2164-9-127-S17.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <sec>
            <st>
               <p>Hh and hh-related proteins in nematodes</p>
            </st>
            <p>Hh genes are present in deuterostomes as well as in several different protostome phyla such as molluscs, annelids, and arthropods (Figure <figr fid="F20">20</figr>). However, in nematodes the situation is more complex. In <it>C. elegans</it>, <it>C. briggsae </it>and <it>B. malayi </it>we find no <it>hh </it>gene but instead many <it>hh</it>-related genes. We recovered 19 <it>hh</it>-related genes from the nematode <it>B. malayi</it>. Based on empirical evidence from other gene families we estimate that the genome of <it>B. malayi </it>is around 80% complete (K. Mukherjee and T. B., unpublished). Therefore, some additional <it>hh</it>-related genes might still be forthcoming. But the present survey shows that members of the <it>qua</it>, <it>wrt</it>, <it>grd </it>and <it>grl </it>gene families are all present in <it>B. malayi</it>. Only a representative of a <it>grd </it>gene with a Ground domain has so far not been found. The phylogenetic analyses show that while there are some instances of direct orthology between <it>B. malayi </it>and Caenorhabditis genes, in many instances, in particular for the <it>grl </it>genes, the relationship is not clear and in fact suggests that independent diversification occurred in these two Chromadorea branches. This shows that these gene families have been actively evolving in nematodes.</p>
            <fig id="F20">
               <title>
                  <p>Figure 20</p>
               </title>
               <caption>
                  <p>Summary of the evolution of <it>hh </it>and <it>hh</it>-related genes</p>
               </caption>
               <text>
                  <p><b>Summary of the evolution of <it>hh </it>and <it>hh</it>-related genes</b>. For detailed discussion of the evolution of the Hog proteins see text. The right side shows the different types of ORFs found in different organisms. The sizes are not to scale. The "Hedge" domain is marked in green, the Qua domain in orange, and the Hog domain in black, with yellow bars representing the conserved cysteine residues. T stands for poly-threonine repeats. Red 'X's mark branches where a gene loss occurred.</p>
               </text>
               <graphic file="1471-2164-9-127-20"/>
            </fig>
            <p>In the more distantly related Enoplea nematodes <it>Xiphinema sp</it>. and <it>T. spiralis </it>a strikingly different pictures emerges. In both species we find both a <it>hh </it>gene as well as several <it>hh</it>-related genes. In <it>T. spiralis </it>we also find a <it>quahog </it>gene, and &#8211; based on the phylogenetic analyses &#8211; some of the Xiphinema genes could also be <it>quahog </it>genes. Two <it>T. spiralis </it>and one Xiphinema protein share a new motif (Enop motif) upstream of the Hog domain that appears to be specific to Enoplea nematodes. However, there are also a number of instances of N-terminal sequences that are very short. Several of these proteins cluster with the "Enop" proteins in the phylogenetic analyses, suggesting that they diverged from a common ancestor. However, two proteins with short N-terminal regions (Ts Xhog3 and XC Shog2) are rather divergent and do not reliably fall within the clade of nematode-specific Hog proteins ("Nema-Hog" proteins) in phylogenetic analyses. In particular Ts Xhog3 lacks the conserved cysteine pair usually found in Nema-Hog domains, and it shares motifs K and L with Hh Hog domains, indicating it could be derived from a Hh protein. Therefore, while these genes could have diverged from <it>hh </it>or Nema-Hog genes, it may also be possible that the represent ancestral genes that were lost in Chromadorea. In conclusion, we think that there were probably at least three different types of Hog genes in the common ancestor of Enoplea and Chromadorea, one <it>hh </it>gene, one <it>quahog </it>gene and one gene which give rise to the <it>wrt/grd </it>branch in Chromadorea and the Ts Xhog1/2 branch in Enoplea. But possibly up to five Hog genes could have existed in the common ancestor. The proliferation into further distinct groups such as <it>wrt</it>, <it>ground </it>and <it>ground-like </it>appears to have happened later in a branch specific manner.</p>
            <p>Many different N-termini now exist in Nema-Hog proteins. Two possible mechanisms can explain this diversity: either acquisition of new N-terminal domains, or divergent evolution of existing N-terminal domains. A relatively good case can be made that all Wart, Ground and Ground-like domains arose from a single common ancestor based on weak sequence similarities between the motifs <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. This relationship is also supported by the phylogenetic analyses of the Hog domains. Therefore, multiple losses of the Hog domain must have occurred secondarily within the <it>wrt </it>and <it>ground </it>families. The presence of the rather short N-termini in Enoplea suggests that these regions have evolved and diverged through mutations, rather than by acquisition of a new domain. The threonine-rich stretch in XC Thog is very likely the result of polymerase slippage, though it is striking that this feature has also evolved separately in the choanoflagellate Hoglet protein <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>. It is also worth mentioning that some of the Caenorhabditis N-terminal domains have repetitive regions outside of the conserved Ground and Ground-like domains, mainly proline, glycine and serine. For example, Ce <it>grl-23 </it>has a 176-residue stretch upstream of the Ground-like domain containing 125 glycine residues. In conclusion, most of the observed variability in the N-terminal domains of nematode Hh-related proteins is probably the result of sequence divergence from a progenitor, rather than acquisition of new domains. However, loss of N-terminal domains, as in the case of <it>C. elegans </it>Hog-1, as well as loss of Hog domains did occur.</p>
            <p>A surprising observation is the fact that <it>T. spiralis </it>has a <it>hh </it>gene, but apparently lacks several components of the Hh pathway, such as Smoothened. Particularly noteworthy is that the components that appear to be missing are the same as in <it>C. elegans</it>. This would suggest that the signaling pathway was modified by loss already before the split of Enoplea and Chromadorea, even though <it>hh </it>was maintained in Enoplea. While one could imagine that Hh could be maintained in an animal parasite such as <it>T. spiralis </it>to affect host cells, this is very unlikely in the case of the plant nematode Xiphinema. It implies that Hh has an important function even in the absence of Smoothened, and it refutes the hypothesis that the Nema-Hog genes evolved directly from <it>hh </it>concomitantly with the other changes in the Hh pathway.</p>
         </sec>
         <sec>
            <st>
               <p>Hog proteins in Cnidaria</p>
            </st>
            <p>In Cnidaria we also encounter a complex situation with both Hh and Hh-related proteins. In both <it>N. vectensis </it>and <it>A. millepora </it>we find bona fide <it>hh </it>genes that have a Hedge and a Hog domain. Another gene is well conserved between <it>N. vectensis </it>and <it>A. millepora </it>and has a distinct, novel secreted N-terminal domain. Two further Hh-related proteins in <it>N. vectensis </it>have yet other, distinct N-termini. The upstream regions of the two closely related genes retrieved from Hydra do not share any similarity with those in Nematostella, indicating divergent evolution. No sequence similarity to these new N-terminal motifs has been found outside Cnidaria.</p>
            <p>The <it>hh</it>-related genes from Cnidaria are, however, distinct from those in nematodes, since the phylogenetic analyses of the Hog domains do not show them to be closely related. Therefore, we would like to suggest that &#8211; as in the case of the nematode <it>hh</it>-related genes &#8211; the Cnidarian N-terminal domains have evolved from common ancestors by divergent evolution rather than by domain acquisition.</p>
            <p>The case of Nv 200640 is perhaps a special exception. In this protein we find an N-terminal Hedge domain fused to a large extracellular protein that contains a VWA domain as well as CA and EGF repeats, but it clearly lacks a Hog domain. The VWA domain is a 200-residue domain first identified in von Willebrand Factor <abbrgrp><abbr bid="B44">44</abbr><abbr bid="B45">45</abbr></abbrgrp>. VWA domains are found both in extracellular and intracellular proteins, such as non-fibrillar collagens, plasma proteins such as complement factors and integrins, and they mediate adhesion via metal ion-dependent adhesion sites. Likewise, the CA repeats also mediate adhesion in a Ca2+-dependent fashion. Therefore, the Nv 200640 protein is probably involved in cell adhesion. This shows that the Hedge domain can also evolve in a modular fashion and separately from the Hog domain. The EST recovered from sponges also has a Hedge domain that lacks the immediately following Hog domain, and may likewise represent a protein lacking a Hog domain.</p>
         </sec>
         <sec>
            <st>
               <p>Hog proteins in lower eukaryotes</p>
            </st>
            <p>We have recovered a substantial number of Hog domain proteins from many diverse groups of eukaryotes, mostly protists, such as red algae, moss, alveolates (ciliates, dinoflagellates, apicomplexans), cryptophytes, jakobids, haptophytes, cercozoa and Glomeromycota fungi. While some of these Hog sequences are quite divergent, they are invariably most closely related to Hog domain proteins from animals, and not to inteins, such as those found in fungi, or to BIL or Vint domains. Given the widespread occurrence in many of the major groups of eukaryotes <abbrgrp><abbr bid="B46">46</abbr></abbrgrp>, we must conclude that Hog proteins were present already in the earliest eukaryotes. We find diverse N-termini associated with the Hog domain that are conserved only to a limited extent within groups (a case in point being the various conserved N-termini in nematodes). Many of these N-termini of limited conservation have conserved cysteine residues, and in cases where one can be quite confident of the start methionine, they start with a good signal peptide for secretion. Only in the case of the fungal protein GmGIN1 and the choanoflagellate Hoglet are distinct other N-terminal domains fused to the Hog domain. Therefore, we postulate that an ancestral Ur-Hog gene existed, with a secreted N-terminal domain and an autoprocessing Hog domain, that may have added a sterol or similar moiety to its secreted N-terminus. This gene evolved in concert with eukaryote evolution and was lost in several branches. In animals, the question arises about the origin of the Hedge domain. Both in sponge and in Nematostella we find a Hedge gene that lacks a Hog domain. Perhaps such a gene merged with a Hog domain in early metazoans. However, the reverse process is also possible: the Hedge domain evolved as an N-terminal variant of a Hog protein in early metazoans, and in the two Hedge genes in sponge and Nematostella the Hog domain was lost later.
In both Cnidaria and nematodes we find <it>hh </it>as well as <it>hh</it>-related genes. Did the <it>hh</it>-related genes evolve twice independently from a <it>hh </it>precursor in each lineage? This is certainly the most parsimonious hypothesis. Nonetheless, in an alternative scenario, a <it>hh </it>and a <it>hh</it>-related gene could have been present in the common ancestor of Eumetazoa, and the <it>hh</it>-related gene would have given rise to the cnidarian and nematode <it>hh</it>-related genes. For this hypothesis to be true, we would have to postulate three separate losses of <it>hh</it>-related genes: in deuterostomes, in Lophotrochozoa, and in arthropods. While this seems rather unlikely, we do observe many losses of Hog genes in various branches of eukaryotes, as well as loss of only the Hog domain in a number of nematode genes, so that such a series of losses may not be totally impossible.</p>
         </sec>
         <sec>
            <st>
               <p>Novel Hint genes</p>
            </st>
            <p>Our searches revealed new genes with Hint motifs merged to VWA domains. Given that a Hedge domain was found fused to a VWA domain in Nematostella we investigated this further and recovered a novel gene family. The well-conserved gene structure consists of a VWA followed by a new domain, termed Vwaint, followed by the "Vint"-type Hint domain. Unlike the Hog proteins, these proteins are most likely not secreted and instead are processed inside the cell. The Vint genes are present in many eukaryotic groups, but must have been lost multiple times, in particular in multicellular eukaryotes. Multiple loss seems to be a common theme also in Hog proteins and especially inteins <abbrgrp><abbr bid="B21">21</abbr><abbr bid="B32">32</abbr></abbrgrp>. Inteins may be subject to special selective pressure for loss <abbrgrp><abbr bid="B21">21</abbr><abbr bid="B32">32</abbr></abbrgrp>, and this pressure may also extend to Hog and Vint proteins. However, gene loss is not uncommon. The <it>N. vectensis </it>genome contains a remarkable complexity of highly conserved gene families <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>, and several instances of later gene loss in the protostome or deuterostome lineage, for example in the homeobox gene family, have been found <abbrgrp><abbr bid="B47">47</abbr><abbr bid="B48">48</abbr></abbrgrp>, indicating gene loss later in evolution is feasible.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>We find that the evolution of Hh is more complex than anticipated, and that this gene family is not simply derived from an intein in early metazoan evolution. In both Cnidaria and nematodes, parallel evolution between <it>hh </it>and <it>hh</it>-related genes occurred. Given that the nematode-specific Hog domain (Nema-Hog) with its distinct features was already present in the progenitor of two very different nematode branches, it is possible that both Hh and some other Hog domain protein were already present in protostomes before the emergence of nematodes, and that the latter was lost in other lineages such as arthropods. The finding of multiple Hog domain proteins in Cnidaria raises the possibility that multiple distinct types of Hog domain proteins also existed in ancestral Eumetazoa. Snell et al. (2006) suggested that a precursor of a Hedge domain fused to a Hog domain in early metazoan evolution. However, our discovery that an Ur-Hog gene probably existed in the progenitor of eukaryotes makes it feasible that Hh evolved from an ancestral Hog gene without domain shuffling. In eukaryotes, we now know that at least three different types of Hint domains evolved in parallel: Hog, Vint, and inteins. At present we do not know the origin of the Hog and Vint domains, but perhaps new Hint domains from bacteria, such as described here and by Dassa and Pietrokovski <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>, will shed light on that issue in the future.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <p>Procedures for retrieving and analyzing sequences have been detailed in Hao et al. 2006 and Mukherjee and B&#252;rglin 2007 <abbrgrp><abbr bid="B38">38</abbr><abbr bid="B48">48</abbr></abbrgrp>. Briefly, <it>B. malayi </it>sequences were searched at TIGR <abbrgrp><abbr bid="B49">49</abbr></abbrgrp>. Preliminary sequence data for <it>B. malayi </it>is deposited regularly into the GSS division of GenBank. This sequencing effort is part of the International Brugia Genome Sequencing Project and is supported by an award from the National Institute of Allergy and Infectious Diseases, National Institutes of Health. ESTs, in particular nematode ESTs, were searched at NCBI <abbrgrp><abbr bid="B50">50</abbr></abbrgrp>. The nematode ESTs are generated by the Washington University Parasitic Nematode EST sequencing project <abbrgrp><abbr bid="B51">51</abbr></abbrgrp>. Many of the protist ESTs were generated by the Protist EST program <abbrgrp><abbr bid="B52">52</abbr></abbrgrp>. The <it>T. spiralis </it>genome was searched using the GSC blast server at The Genome Sequencing Center of the Washington University School of Medicine <abbrgrp><abbr bid="B53">53</abbr></abbrgrp>. <it>N. vectensis </it>sequences were searched at Stellabase <abbrgrp><abbr bid="B54">54</abbr><abbr bid="B28">28</abbr></abbrgrp>, and at the DOE Joint Genome Institute (JGI) <abbrgrp><abbr bid="B55">55</abbr></abbrgrp>. Additional genome sequences such as for <it>Naegleria gruberi</it>, <it>Physcomitrella patens </it>and <it>Monosiga brevicollis </it>were searched at the JGI <abbrgrp><abbr bid="B55">55</abbr></abbrgrp>. Zebrafish sequences were retrieved from ZFIN <abbrgrp><abbr bid="B56">56</abbr><abbr bid="B57">57</abbr></abbrgrp>. The intein database was checked at New England Biolabs InBase <abbrgrp><abbr bid="B30">30</abbr><abbr bid="B58">58</abbr></abbrgrp>. 
Manual sequence corrections were performed with the help of FGENESH and FGENESH+ at Softberry <abbrgrp><abbr bid="B59">59</abbr></abbrgrp> and PPCMatrix <abbrgrp><abbr bid="B60">60</abbr></abbrgrp>. ESTs representing the same locus were assembled using the CAP3 server at Iowa State University <abbrgrp><abbr bid="B61">61</abbr></abbrgrp>.</p>
         <p>Sequences were added to an existing database of Hh and Hh-related proteins <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>, and are shown in Additional file <supplr sid="S9">9</supplr>. Protist sequences were arbitrarily named Hog, Hog2, etc. (Additional file <supplr sid="S9">9</supplr>). For identification and tagging of sequences in the figures, the species names were reduced to two- and three-letter codes and prefixed to sequence names (Table <tblr tid="T3">3</tblr>). Multiple sequence alignment and phylogenetic analyses using Neighbor joining were carried out using Clustal_X <abbrgrp><abbr bid="B62">62</abbr></abbrgrp> and MUSCLE <abbrgrp><abbr bid="B63">63</abbr><abbr bid="B64">64</abbr></abbrgrp>. Manual correction of alignments was carried out using SeaView <abbrgrp><abbr bid="B65">65</abbr></abbrgrp>. For Maximum likelihood analysis PHYML was employed <abbrgrp><abbr bid="B66">66</abbr></abbrgrp>. Signal peptide prediction was carried out at the SignalP 3.0 server <abbrgrp><abbr bid="B67">67</abbr><abbr bid="B68">68</abbr></abbrgrp>. Protein sequence logos were generated using LogoBar <abbrgrp><abbr bid="B69">69</abbr><abbr bid="B70">70</abbr></abbrgrp>. Some protein motifs were also identified using the SMART server <abbrgrp><abbr bid="B71">71</abbr></abbrgrp>.</p>
         <tbl id="T3">
            <title>
               <p>Table 3</p>
            </title>
            <caption>
               <p>Species abbreviations. Fungi are prefixed with 'f', red algae with 'r', plants with 'p', Alveolata (ciliates, dinoflagellates, Apicomplexa) with 'a', jakobids with 'j', Cercozoa with 'c', Cryptophyta with 'cr', excavates with 'e', haptophytes with 'h', heterolobosea with 'l', and slime molds with 's'.</p>
            </caption>
            <tblbdy cols="2">
               <r>
                  <c ca="left">
                     <p>Codes</p>
                  </c>
                  <c ca="left">
                     <p>Species names</p>
                  </c>
               </r>
               <r>
                  <c cspan="2">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Acm</p>
                  </c>
                  <c ca="left">
                     <p><it>Acropora millepora </it>(Cnidaria)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Ag</p>
                  </c>
                  <c ca="left">
                     <p><it>Anopheles gambiae </it>(malaria mosquito)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>At</p>
                  </c>
                  <c ca="left">
                     <p><it>Achaearanea tepidariorum </it>(common house spider)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Bf</p>
                  </c>
                  <c ca="left">
                     <p><it>Branchiostoma floridae </it>(Florida lancelet, Amphioxus)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Bm</p>
                  </c>
                  <c ca="left">
                     <p><it>Brugia malayi </it>(nematode, Chromadorea)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Cap</p>
                  </c>
                  <c ca="left">
                     <p><it>Capitella </it>sp. I ECS-2004 (polychaete)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Cb</p>
                  </c>
                  <c ca="left">
                     <p><it>Caenorhabditis briggsae </it>(nematode, Chromadorea)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Ce</p>
                  </c>
                  <c ca="left">
                     <p><it>Caenorhabditis elegans </it>(nematode, Chromadorea)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Cr</p>
                  </c>
                  <c ca="left">
                     <p><it>Caenorhabditis remanei </it>(nematode, Chromadorea)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Dm</p>
                  </c>
                  <c ca="left">
                     <p><it>Drosophila melanogaster </it>(fruitfly)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Dh</p>
                  </c>
                  <c ca="left">
                     <p>
                        <it>Drosophila hydei</it>
                     </p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Dr</p>
                  </c>
                  <c ca="left">
                     <p><it>Danio rerio </it>(zebrafish)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Gb</p>
                  </c>
                  <c ca="left">
                     <p><it>Gryllus bimaculatus </it>(two-spotted cricket)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Lv</p>
                  </c>
                  <c ca="left">
                     <p><it>Lytechinus variegatus </it>(green sea urchin)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Hm</p>
                  </c>
                  <c ca="left">
                     <p><it>Hydra magnipapillata </it>(Cnidaria)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Mb</p>
                  </c>
                  <c ca="left">
                     <p><it>Monosiga brevicollis </it>(choanoflagellate)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Mi</p>
                  </c>
                  <c ca="left">
                     <p><it>Meloidogyne incognita </it>(southern root-knot nematode, Chromadorea)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Mm</p>
                  </c>
                  <c ca="left">
                     <p><it>Mus musculus </it>(mouse)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Mo</p>
                  </c>
                  <c ca="left">
                     <p><it>Monosiga ovata </it>(choanoflagellate)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Nv</p>
                  </c>
                  <c ca="left">
                     <p><it>Nematostella vectensis </it>(Cnidaria, starlet sea anemone)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Ob</p>
                  </c>
                  <c ca="left">
                     <p><it>Octopus bimaculoides </it>(mollusc)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Oc</p>
                  </c>
                  <c ca="left">
                     <p><it>Oscarella carmela </it>(sponge)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Pt</p>
                  </c>
                  <c ca="left">
                     <p><it>Parastrongyloides trichosuri </it>(nematode, Chromadorea)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Pv</p>
                  </c>
                  <c ca="left">
                     <p><it>Patella vulgata </it>(common limpet, mollusc)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Sp</p>
                  </c>
                  <c ca="left">
                     <p><it>Strongylocentrotus purpuratus </it>(sea urchin)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Tr</p>
                  </c>
                  <c ca="left">
                     <p><it>Takifugu rubripes </it>(fugu)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Ts</p>
                  </c>
                  <c ca="left">
                     <p><it>Trichinella spiralis </it>(nematode, Enoplea)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>XC</p>
                  </c>
                  <c ca="left">
                     <p><it>Xiphinema </it>index CSEQDL01 (nematode, Enoplea)</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>aAc</p>
                  </c>
                  <c ca="left">
                     <p><it>Amphidinium carterae </it>(dinoflagellate, Alveolata)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>aAt</p>
                  </c>
                  <c ca="left">
                     <p><it>Alexandrium tamarense </it>(dinoflagellate, Alveolata)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>aCm</p>
                  </c>
                  <c ca="left">
                     <p><it>Cryptosporidium muris </it>(Apicomplexa, Alveolata)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>aCp</p>
                  </c>
                  <c ca="left">
                     <p><it>Cryptosporidium parvum </it>(Apicomplexa, Alveolata)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>aKb</p>
                  </c>
                  <c ca="left">
                     <p><it>Karenia brevis </it>(dinoflagellate, Alveolata)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>aKm</p>
                  </c>
                  <c ca="left">
                     <p><it>Karlodinium micrum </it>(dinoflagellate, Alveolata)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>aTt</p>
                  </c>
                  <c ca="left">
                     <p><it>Tetrahymena thermophila </it>(ciliate, Alveolata)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>cBn</p>
                  </c>
                  <c ca="left">
                     <p><it>Bigelowiella natans </it>(Cercozoa)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>crGt</p>
                  </c>
                  <c ca="left">
                     <p><it>Guillardia theta </it>(Cryptophyta)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>eTf</p>
                  </c>
                  <c ca="left">
                     <p><it>Tritrichomonas foetus </it>(Parabasalidea, excavates)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>fAc</p>
                  </c>
                  <c ca="left">
                     <p><it>Ajellomyces capsulatus </it>(ascomycetes, fungus)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>fCg</p>
                  </c>
                  <c ca="left">
                     <p><it>Chaetomium globosum </it>(ascomycetes, fungus)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>fCt</p>
                  </c>
                  <c ca="left">
                     <p><it>Candida tropicalis </it>(ascomycetes, fungus)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>fGm</p>
                  </c>
                  <c ca="left">
                     <p><it>Glomus mosseae </it>(Glomeromycota, fungus)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>fGz</p>
                  </c>
                  <c ca="left">
                     <p><it>Gibberella zeae </it>(ascomycetes, fungus)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>fMg</p>
                  </c>
                  <c ca="left">
                     <p><it>Magnaporthe grisea </it>(ascomycetes, rice blast fungus)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>fNc</p>
                  </c>
                  <c ca="left">
                     <p><it>Neurospora crassa </it>(ascomycetes, fungus)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>hPh</p>
                  </c>
                  <c ca="left">
                     <p><it>Pleurochrysis haptonemofera </it>(haptophytes)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>jJl</p>
                  </c>
                  <c ca="left">
                     <p><it>Jakoba libera </it>(jakobids)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>lNg</p>
                  </c>
                  <c ca="left">
                     <p><it>Naegleria gruberi </it>(heterolobosea)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>pAt</p>
                  </c>
                  <c ca="left">
                     <p><it>Arabidopsis thaliana </it>(plants)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>pOs</p>
                  </c>
                  <c ca="left">
                     <p><it>Oryza sativa </it>(rice, plants)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>pPp</p>
                  </c>
                  <c ca="left">
                     <p><it>Physcomitrella patens </it>(moss, plants)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>pSl</p>
                  </c>
                  <c ca="left">
                     <p><it>Selaginella lepidophylla </it>(club moss, plants)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>pSm</p>
                  </c>
                  <c ca="left">
                     <p><it>Selaginella moellendorffii </it>(club moss, plants)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>RCc</p>
                  </c>
                  <c ca="left">
                     <p><it>Chondrus crispus </it>(carragheen, red algae)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>RGc</p>
                  </c>
                  <c ca="left">
                     <p><it>Gracilaria changii </it>(red algae)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>RGj</p>
                  </c>
                  <c ca="left">
                     <p><it>Griffithsia japonica </it>(red algae)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>RPh</p>
                  </c>
                  <c ca="left">
                     <p><it>Porphyra haitanensis </it>(red algae)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>RPy</p>
                  </c>
                  <c ca="left">
                     <p><it>Porphyra yezoensis </it>(red algae)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>SPp</p>
                  </c>
                  <c ca="left">
                     <p><it>Physarum polycephalum </it>(slime mold, amoebozoa)</p>
                  </c>
               </r>
            </tblbdy>
         </tbl>
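         <p>The tagging convention described above (a species code from Table 3 prefixed to the sequence name, as in "Ts Xhog3" or "Nv 200640") can be illustrated with a minimal Python sketch. This is not part of the original analysis pipeline: the function name split_tag, the dictionary SPECIES_CODES, and the assumption that code and sequence name are separated by a single space are hypothetical and serve only to illustrate the convention. Only a subset of the codes in Table 3 is included.</p>

```python
# Illustrative subset of the species codes listed in Table 3.
# The dictionary and function names are hypothetical, not from the paper.
SPECIES_CODES = {
    "Ts": "Trichinella spiralis",
    "XC": "Xiphinema index",
    "Ce": "Caenorhabditis elegans",
    "Nv": "Nematostella vectensis",
    "Mb": "Monosiga brevicollis",
    "fGm": "Glomus mosseae",
    "aTt": "Tetrahymena thermophila",
}


def split_tag(tag):
    """Split a tag such as 'Ts Xhog3' into (species name, sequence name).

    Assumes the species code and the sequence name are separated by a
    single space; this separator is an assumption of the sketch.
    """
    code, _, name = tag.partition(" ")
    if not name:
        raise ValueError("expected a tag of the form 'CODE NAME': " + repr(tag))
    if code not in SPECIES_CODES:
        raise KeyError("unknown species code: " + repr(code))
    return SPECIES_CODES[code], name


if __name__ == "__main__":
    for tag in ("Ts Xhog3", "XC Shog2", "Nv 200640"):
        species, name = split_tag(tag)
        print(tag, "=", name, "from", species)
```

         <p>Looking up the code in a table rather than inferring it from the tag keeps the mapping explicit, which matters because the codes vary in length (for example "Ts", "fGm" and "crGt").</p>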
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>All research was carried out by TRB and the manuscript was written by TRB.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>I would like to thank Shmuel Pietrokovski for helpful discussions and for sharing information. Preliminary sequence data for <it>B. malayi </it>is deposited regularly into the GSS division of GenBank. This sequencing effort is part of the International Brugia Genome Sequencing Project and is supported by an award from the National Institute of Allergy and Infectious Diseases, National Institutes of Health. The Nematostella sequence data as well as other genome data such as <it>Naegleria gruberi</it>, <it>Physcomitrella patens</it>, and <it>Monosiga brevicollis </it>were produced by the US Department of Energy Joint Genome Institute <abbrgrp><abbr bid="B55">55</abbr></abbrgrp>. The Trichinella data were produced by the Genome Sequencing Center at Washington University School of Medicine in St. Louis and can be obtained from their Web site <abbrgrp><abbr bid="B41">41</abbr></abbrgrp>. This research was supported by grants from the Swedish Foundation for Strategic Research (SSF) and the Karolinska Institutet.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>The hedgehog signaling network</p>
            </title>
            <aug>
               <au>
                  <snm>Cohen</snm>
                  <fnm>MM</fnm>
                  <suf>Jr.</suf>
               </au>
            </aug>
            <source>Am J Med Genet A</source>
            <pubdate>2003</pubdate>
            <volume>123</volume>
            <issue>1</issue>
            <fpage>5</fpage>
            <lpage>28</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/ajmg.a.20495</pubid>
                  <pubid idtype="pmpid" link="fulltext">14556242</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Communicating with Hedgehogs</p>
            </title>
            <aug>
               <au>
                  <snm>Hooper</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Scott</snm>
                  <fnm>MP</fnm>
               </au>
            </aug>
            <source>Nature reviews</source>
            <pubdate>2005</pubdate>
            <volume>6</volume>
            <issue>4</issue>
            <fpage>306</fpage>
            <lpage>317</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nrm1622</pubid>
                  <pubid idtype="pmpid" link="fulltext">15803137</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Signaling from Smo to Ci/Gli: conservation and divergence of Hedgehog pathways from Drosophila to vertebrates</p>
            </title>
            <aug>
               <au>
                  <snm>Huangfu</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Anderson</snm>
                  <fnm>KV</fnm>
               </au>
            </aug>
            <source>Development (Cambridge, England)</source>
            <pubdate>2006</pubdate>
            <volume>133</volume>
            <issue>1</issue>
            <fpage>3</fpage>
            <lpage>14</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16339192</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Shifting paradigms in Hedgehog signaling</p>
            </title>
            <aug>
               <au>
                  <snm>Wang</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>McMahon</snm>
                  <fnm>AP</fnm>
               </au>
               <au>
                  <snm>Allen</snm>
                  <fnm>BL</fnm>
               </au>
            </aug>
            <source>Current opinion in cell biology</source>
            <pubdate>2007</pubdate>
            <volume>19</volume>
            <issue>2</issue>
            <fpage>159</fpage>
            <lpage>165</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.ceb.2007.02.005</pubid>
                  <pubid idtype="pmpid" link="fulltext">17303409</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Tissue repair and stem cell renewal in carcinogenesis</p>
            </title>
            <aug>
               <au>
                  <snm>Beachy</snm>
                  <fnm>PA</fnm>
               </au>
               <au>
                  <snm>Karhadkar</snm>
                  <fnm>SS</fnm>
               </au>
               <au>
                  <snm>Berman</snm>
                  <fnm>DM</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2004</pubdate>
            <volume>432</volume>
            <issue>7015</issue>
            <fpage>324</fpage>
            <lpage>331</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature03100</pubid>
                  <pubid idtype="pmpid" link="fulltext">15549094</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Hedgehog Signaling: From the Drosophila cuticle to anti-cancer drugs</p>
            </title>
            <aug>
               <au>
                  <snm>Briscoe</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Th&#233;rond</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Dev Cell</source>
            <pubdate>2005</pubdate>
            <volume>8</volume>
            <fpage>143</fpage>
            <lpage>151</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.devcel.2005.01.008</pubid>
                  <pubid idtype="pmpid">15736317</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Developmental roles and clinical significance of hedgehog signaling</p>
            </title>
            <aug>
               <au>
                  <snm>McMahon</snm>
                  <fnm>AP</fnm>
               </au>
               <au>
                  <snm>Ingham</snm>
                  <fnm>PW</fnm>
               </au>
               <au>
                  <snm>Tabin</snm>
                  <fnm>CJ</fnm>
               </au>
            </aug>
            <source>Current topics in developmental biology</source>
            <pubdate>2003</pubdate>
            <volume>53</volume>
            <fpage>1</fpage>
            <lpage>114</lpage>
            <xrefbib>
               <pubid idtype="pmpid">12509125</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Targeting the Hedgehog pathway in cancer</p>
            </title>
            <aug>
               <au>
                  <snm>Rubin</snm>
                  <fnm>LL</fnm>
               </au>
               <au>
                  <snm>de Sauvage</snm>
                  <fnm>FJ</fnm>
               </au>
            </aug>
            <source>Nature reviews</source>
            <pubdate>2006</pubdate>
            <volume>5</volume>
            <issue>12</issue>
            <fpage>1026</fpage>
            <lpage>1033</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nrd2086</pubid>
                  <pubid idtype="pmpid" link="fulltext">17139287</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Evolution and orthology of hedgehog genes</p>
            </title>
            <aug>
               <au>
                  <snm>Zardoya</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Abouheif</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Meyer</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>1996</pubdate>
            <volume>12</volume>
            <issue>12</issue>
            <fpage>496</fpage>
            <lpage>497</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(96)20014-9</pubid>
                  <pubid idtype="pmpid" link="fulltext">9257526</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Characterization of two new zebrafish members of the hedgehog family: atypical expression of a zebrafish indian hedgehog gene in skeletal elements of both endochondral and dermal origins</p>
            </title>
            <aug>
               <au>
                  <snm>Avaron</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Hoffman</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Guay</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Akimenko</snm>
                  <fnm>MA</fnm>
               </au>
            </aug>
            <source>Dev Dyn</source>
            <pubdate>2006</pubdate>
            <volume>235</volume>
            <issue>2</issue>
            <fpage>478</fpage>
            <lpage>489</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/dvdy.20619</pubid>
                  <pubid idtype="pmpid" link="fulltext">16292774</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Gene and genome duplications in vertebrates: the one-to-four (-to-eight in fish) rule and the evolution of novel gene functions</p>
            </title>
            <aug>
               <au>
                  <snm>Meyer</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Schartl</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Current opinion in cell biology</source>
            <pubdate>1999</pubdate>
            <volume>11</volume>
            <issue>6</issue>
            <fpage>699</fpage>
            <lpage>704</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0955-0674(99)00039-3</pubid>
                  <pubid idtype="pmpid" link="fulltext">10600714</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Crystal structure of a Hedgehog autoprocessing domain: Homology between Hedgehog and self-splicing proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Hall</snm>
                  <fnm>TMT</fnm>
               </au>
               <au>
                  <snm>Porter</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Young</snm>
                  <fnm>KE</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
               <au>
                  <snm>Beachy</snm>
                  <fnm>PA</fnm>
               </au>
               <au>
                  <snm>Leahy</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>1997</pubdate>
            <volume>91</volume>
            <fpage>85</fpage>
            <lpage>97</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0092-8674(01)80011-8</pubid>
                  <pubid idtype="pmpid" link="fulltext">9335337</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Cholesterol modification of Hedgehog signaling proteins in animal development</p>
            </title>
            <aug>
               <au>
                  <snm>Porter</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Young</snm>
                  <fnm>KE</fnm>
               </au>
               <au>
                  <snm>Beachy</snm>
                  <fnm>PA</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1996</pubdate>
            <volume>274</volume>
            <fpage>255</fpage>
            <lpage>259</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.274.5285.255</pubid>
                  <pubid idtype="pmpid" link="fulltext">8824192</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Hedgehog: an unusual signal transducer</p>
            </title>
            <aug>
               <au>
                  <snm>Bijlsma</snm>
                  <fnm>MF</fnm>
               </au>
               <au>
                  <snm>Spek</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Peppelenbosch</snm>
                  <fnm>MP</fnm>
               </au>
            </aug>
            <source>Bioessays</source>
            <pubdate>2004</pubdate>
            <volume>26</volume>
            <issue>4</issue>
            <fpage>387</fpage>
            <lpage>394</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/bies.20007</pubid>
                  <pubid idtype="pmpid" link="fulltext">15057936</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Cholesterol modification of hedgehog is required for trafficking and movement, revealing an asymmetric cellular response to hedgehog</p>
            </title>
            <aug>
               <au>
                  <snm>Gallet</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Rodriguez</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Ruel</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Th&#233;rond</snm>
                  <fnm>PP</fnm>
               </au>
            </aug>
            <source>Dev Cell</source>
            <pubdate>2003</pubdate>
            <volume>4</volume>
            <issue>2</issue>
            <fpage>191</fpage>
            <lpage>204</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S1534-5807(03)00031-5</pubid>
                  <pubid idtype="pmpid" link="fulltext">12586063</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Cholesterol modification is necessary for controlled planar long-range activity of Hedgehog in Drosophila epithelia</p>
            </title>
            <aug>
               <au>
                  <snm>Gallet</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Ruel</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Staccini-Lavenant</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Th&#233;rond</snm>
                  <fnm>PP</fnm>
               </au>
            </aug>
            <source>Development (Cambridge, England)</source>
            <pubdate>2006</pubdate>
            <volume>133</volume>
            <issue>3</issue>
            <fpage>407</fpage>
            <lpage>418</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16396912</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Caenorhabditis elegans has scores of hedgehog-related genes: sequence and expression analysis</p>
            </title>
            <aug>
               <au>
                  <snm>Asp&#246;ck</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Kagoshima</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Niklaus</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>B&#252;rglin</snm>
                  <fnm>TR</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>1999</pubdate>
            <volume>9</volume>
            <issue>10</issue>
            <fpage>909</fpage>
            <lpage>923</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.9.10.909</pubid>
                  <pubid idtype="pmpid" link="fulltext">10523520</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Homologs of the Hh signalling network in C. elegans</p>
            </title>
            <aug>
               <au>
                  <snm>B&#252;rglin</snm>
                  <fnm>TR</fnm>
               </au>
               <au>
                  <snm>Kuwabara</snm>
                  <fnm>PE</fnm>
               </au>
            </aug>
            <source>WormBook</source>
            <pubdate>2006</pubdate>
            <fpage>1</fpage>
            <lpage>14</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">18050469</pubid>
                  <pubid idtype="doi">10.1895/wormbook.1.76.1</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>The hedgehog-related gene qua-1 is required for molting in Caenorhabditis elegans</p>
            </title>
            <aug>
               <au>
                  <snm>Hao</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Mukherjee</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Liegeois</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Baillie</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Labouesse</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>B&#252;rglin</snm>
                  <fnm>TR</fnm>
               </au>
            </aug>
            <source>Dev Dyn</source>
            <pubdate>2006</pubdate>
            <volume>235</volume>
            <issue>6</issue>
            <fpage>1469</fpage>
            <lpage>1481</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/dvdy.20721</pubid>
                  <pubid idtype="pmpid" link="fulltext">16502424</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Multiple roles of cholesterol in hedgehog protein biogenesis and signaling</p>
            </title>
            <aug>
               <au>
                  <snm>Beachy</snm>
                  <fnm>PA</fnm>
               </au>
               <au>
                  <snm>Cooper</snm>
                  <fnm>MK</fnm>
               </au>
               <au>
                  <snm>Young</snm>
                  <fnm>KE</fnm>
               </au>
               <au>
                  <snm>von Kessler</snm>
                  <fnm>DP</fnm>
               </au>
               <au>
                  <snm>Park</snm>
                  <fnm>WJ</fnm>
               </au>
               <au>
                  <snm>Hall</snm>
                  <fnm>TMT</fnm>
               </au>
               <au>
                  <snm>Leahy</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Porter</snm>
                  <fnm>JA</fnm>
               </au>
            </aug>
            <source>Cold Spring Harb Symp Quant Biol</source>
            <pubdate>1997</pubdate>
            <volume>62</volume>
            <fpage>191</fpage>
            <lpage>204</lpage>
            <xrefbib>
               <pubid idtype="pmpid">9598352</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Origin and evolution of inteins and other Hint domains</p>
            </title>
            <aug>
               <au>
                  <snm>Dassa</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Pietrokovski</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Homing Endonucleases and Inteins</source>
            <publisher>Springer</publisher>
            <editor>Belfort M, Stoddard BL, Wood DW, Derbyshire V</editor>
            <series>
               <title>
                  <p>Nucleic Acids and Molecular Biology</p>
               </title>
            </series>
            <pubdate>2005</pubdate>
         </bibl>
         <bibl id="B22">
            <title>
               <p>An unusual choanoflagellate protein released by Hedgehog autocatalytic processing</p>
            </title>
            <aug>
               <au>
                  <snm>Snell</snm>
                  <fnm>EA</fnm>
               </au>
               <au>
                  <snm>Brooke</snm>
                  <fnm>NM</fnm>
               </au>
               <au>
                  <snm>Taylor</snm>
                  <fnm>WR</fnm>
               </au>
               <au>
                  <snm>Casane</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Philippe</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Holland</snm>
                  <fnm>PW</fnm>
               </au>
            </aug>
            <source>Proc Biol Sci</source>
            <pubdate>2006</pubdate>
            <volume>273</volume>
            <issue>1585</issue>
            <fpage>401</fpage>
            <lpage>407</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1560198</pubid>
                  <pubid idtype="pmpid" link="fulltext">16615205</pubid>
                  <pubid idtype="doi">10.1098/rspb.2005.3263</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Phylogenomics of eukaryotes: impact of missing data on large alignments</p>
            </title>
            <aug>
               <au>
                  <snm>Philippe</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Snell</snm>
                  <fnm>EA</fnm>
               </au>
               <au>
                  <snm>Bapteste</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Lopez</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Holland</snm>
                  <fnm>PW</fnm>
               </au>
               <au>
                  <snm>Casane</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2004</pubdate>
            <volume>21</volume>
            <issue>9</issue>
            <fpage>1740</fpage>
            <lpage>1752</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/molbev/msh182</pubid>
                  <pubid idtype="pmpid" link="fulltext">15175415</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Reconstructing the early evolution of Fungi using a six-gene phylogeny</p>
            </title>
            <aug>
               <au>
                  <snm>James</snm>
                  <fnm>TY</fnm>
               </au>
               <au>
                  <snm>Kauff</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Schoch</snm>
                  <fnm>CL</fnm>
               </au>
               <au>
                  <snm>Matheny</snm>
                  <fnm>PB</fnm>
               </au>
               <au>
                  <snm>Hofstetter</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Cox</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Celio</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Gueidan</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Fraker</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Miadlikowska</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Lumbsch</snm>
                  <fnm>HT</fnm>
               </au>
               <au>
                  <snm>Rauhut</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Reeb</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Arnold</snm>
                  <fnm>AE</fnm>
               </au>
               <au>
                  <snm>Amtoft</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Stajich</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Hosaka</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Sung</snm>
                  <fnm>GH</fnm>
               </au>
               <au>
                  <snm>Johnson</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>O'Rourke</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Crockett</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Binder</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Curtis</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Slot</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Powell</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Taylor</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>McLaughlin</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Spatafora</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>Vilgalys</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2006</pubdate>
            <volume>443</volume>
            <issue>7113</issue>
            <fpage>818</fpage>
            <lpage>822</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature05110</pubid>
                  <pubid idtype="pmpid" link="fulltext">17051209</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Maintenance of ancestral complexity and non-metazoan genes in two basal cnidarians</p>
            </title>
            <aug>
               <au>
                  <snm>Technau</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Rudd</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Maxwell</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Gordon</snm>
                  <fnm>PM</fnm>
               </au>
               <au>
                  <snm>Saina</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Grasso</snm>
                  <fnm>LC</fnm>
               </au>
               <au>
                  <snm>Hayward</snm>
                  <fnm>DC</fnm>
               </au>
               <au>
                  <snm>Sensen</snm>
                  <fnm>CW</fnm>
               </au>
               <au>
                  <snm>Saint</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Holstein</snm>
                  <fnm>TW</fnm>
               </au>
               <au>
                  <snm>Ball</snm>
                  <fnm>EE</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2005</pubdate>
            <volume>21</volume>
            <issue>12</issue>
            <fpage>633</fpage>
            <lpage>639</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.tig.2005.09.007</pubid>
                  <pubid idtype="pmpid" link="fulltext">16226338</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Genomics and expression profiles of the Hedgehog and Notch signaling pathways in sea urchin development</p>
            </title>
            <aug>
               <au>
                  <snm>Walton</snm>
                  <fnm>KD</fnm>
               </au>
               <au>
                  <snm>Croce</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Glenn</snm>
                  <fnm>TD</fnm>
               </au>
               <au>
                  <snm>Wu</snm>
                  <fnm>SY</fnm>
               </au>
               <au>
                  <snm>McClay</snm>
                  <fnm>DR</fnm>
               </au>
            </aug>
            <source>Developmental biology</source>
            <pubdate>2006</pubdate>
            <volume>300</volume>
            <issue>1</issue>
            <fpage>153</fpage>
            <lpage>164</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1880897</pubid>
                  <pubid idtype="pmpid" link="fulltext">17067570</pubid>
                  <pubid idtype="doi">10.1016/j.ydbio.2006.08.064</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Early evolution of animal cell signaling and adhesion genes</p>
            </title>
            <aug>
               <au>
                  <snm>Nichols</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Dirks</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Pearse</snm>
                  <fnm>JS</fnm>
               </au>
               <au>
                  <snm>King</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>Proceedings of the National Academy of Sciences of the United States of America</source>
            <pubdate>2006</pubdate>
            <volume>103</volume>
            <issue>33</issue>
            <fpage>12451</fpage>
            <lpage>12456</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1567900</pubid>
                  <pubid idtype="pmpid" link="fulltext">16891419</pubid>
                  <pubid idtype="doi">10.1073/pnas.0604065103</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>StellaBase: the Nematostella vectensis Genomics Database</p>
            </title>
            <aug>
               <au>
                  <snm>Sullivan</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Ryan</snm>
                  <fnm>JF</fnm>
               </au>
               <au>
                  <snm>Watson</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Webb</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Mullikin</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Rokhsar</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Finnerty</snm>
                  <fnm>JR</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <volume>34</volume>
            <issue>Database issue</issue>
            <fpage>D495</fpage>
            <lpage>D499</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1347383</pubid>
                  <pubid idtype="pmpid" link="fulltext">16381919</pubid>
                  <pubid idtype="doi">10.1093/nar/gkj020</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Early developmentally regulated genes in the arbuscular mycorrhizal fungus Glomus mosseae: identification of GmGIN1, a novel gene with homology to the C-terminus of metazoan hedgehog proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Requena</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Mann</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Hampp</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Franken</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Plant Soil</source>
            <pubdate>2002</pubdate>
            <volume>244</volume>
            <fpage>129</fpage>
            <lpage>139</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1023/A:1020249932310</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>InBase: the Intein Database</p>
            </title>
            <aug>
               <au>
                  <snm>Perler</snm>
                  <fnm>FB</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2002</pubdate>
            <volume>30</volume>
            <issue>1</issue>
            <fpage>383</fpage>
            <lpage>384</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">99080</pubid>
                  <pubid idtype="pmpid" link="fulltext">11752343</pubid>
                  <pubid idtype="doi">10.1093/nar/30.1.383</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>The nuclear-encoded inteins of fungi</p>
            </title>
            <aug>
               <au>
                  <snm>Poulter</snm>
                  <fnm>RT</fnm>
               </au>
               <au>
                  <snm>Goodwin</snm>
                  <fnm>TJ</fnm>
               </au>
               <au>
                  <snm>Butler</snm>
                  <fnm>MI</fnm>
               </au>
            </aug>
            <source>Fungal Genet Biol</source>
            <pubdate>2007</pubdate>
            <volume>44</volume>
            <issue>3</issue>
            <fpage>153</fpage>
            <lpage>179</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.fgb.2006.07.012</pubid>
                  <pubid idtype="pmpid" link="fulltext">17046294</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Intein spread and extinction in evolution</p>
            </title>
            <aug>
               <au>
                  <snm>Pietrokovski</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2001</pubdate>
            <volume>17</volume>
            <issue>8</issue>
            <fpage>465</fpage>
            <lpage>472</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(01)02365-4</pubid>
                  <pubid idtype="pmpid" link="fulltext">11485819</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Distribution and function of new bacterial intein-like protein domains</p>
            </title>
            <aug>
               <au>
                  <snm>Amitai</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Belenkiy</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Dassa</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Shainskaya</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Pietrokovski</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Molecular Microbiology</source>
            <pubdate>2003</pubdate>
            <volume>47</volume>
            <issue>1</issue>
            <fpage>61</fpage>
            <lpage>73</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1365-2958.2003.03283.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">12492854</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>Conserved sequence features of inteins (protein introns) and their use in identifying new inteins and related proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Pietrokovski</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Protein Science</source>
            <pubdate>1994</pubdate>
            <volume>3</volume>
            <fpage>2340</fpage>
            <lpage>2350</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2142770</pubid>
                  <pubid idtype="pmpid" link="fulltext">7756989</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Statistical modeling, phylogenetic analysis and structure prediction of a protein splicing domain common to Inteins and Hedgehog proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Dalgaard</snm>
                  <fnm>JZ</fnm>
               </au>
               <au>
                  <snm>Moser</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Hughey</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Mian</snm>
                  <fnm>IS</fnm>
               </au>
            </aug>
            <source>Journal of Computational Biology</source>
            <pubdate>1997</pubdate>
            <volume>4</volume>
            <issue>2</issue>
            <fpage>193</fpage>
            <lpage>214</lpage>
            <xrefbib>
               <pubid idtype="pmpid">9228618</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>Compilation and analysis of intein sequences</p>
            </title>
            <aug>
               <au>
                  <snm>Perler</snm>
                  <fnm>FB</fnm>
               </au>
               <au>
                  <snm>Olsen</snm>
                  <fnm>GJ</fnm>
               </au>
               <au>
                  <snm>Adam</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1997</pubdate>
            <volume>25</volume>
            <issue>6</issue>
            <fpage>1087</fpage>
            <lpage>1093</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">146560</pubid>
                  <pubid idtype="pmpid" link="fulltext">9092614</pubid>
                  <pubid idtype="doi">10.1093/nar/25.6.1087</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>Protein splicing in cis and in trans</p>
            </title>
            <aug>
               <au>
                  <snm>Saleh</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Perler</snm>
                  <fnm>FB</fnm>
               </au>
            </aug>
            <source>Chemical record</source>
            <pubdate>2006</pubdate>
            <volume>6</volume>
            <issue>4</issue>
            <fpage>183</fpage>
            <lpage>193</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/tcr.20082</pubid>
                  <pubid idtype="pmpid" link="fulltext">16900466</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>Comprehensive analysis of gene expression patterns of hedgehog-related genes</p>
            </title>
            <aug>
               <au>
                  <snm>Hao</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Johnsen</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Lauter</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Baillie</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>B&#252;rglin</snm>
                  <fnm>TR</fnm>
               </au>
            </aug>
            <source>BMC Genomics</source>
            <pubdate>2006</pubdate>
            <volume>7</volume>
            <fpage>280</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1636047</pubid>
                  <pubid idtype="pmpid" link="fulltext">17076889</pubid>
                  <pubid idtype="doi">10.1186/1471-2164-7-280</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>A quick tour of nematode diversity and the backbone of nematode phylogeny</p>
            </title>
            <aug>
               <au>
                  <snm>De Ley</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>WormBook</source>
            <publisher>WormBook</publisher>
            <editor>The C. elegans Research Community</editor>
            <pubdate>2006</pubdate>
            <url>http://www.wormbook.org</url>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1895/wormbook.1.41.1</pubid>
                  <pubid idtype="pmpid" link="fulltext">18050465</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>A profile of putative parasitism genes expressed in the esophageal gland cells of the root-knot nematode Meloidogyne incognita</p>
            </title>
            <aug>
               <au>
                  <snm>Huang</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Gao</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Maier</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Allen</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Davis</snm>
                  <fnm>EL</fnm>
               </au>
               <au>
                  <snm>Baum</snm>
                  <fnm>TJ</fnm>
               </au>
               <au>
                  <snm>Hussey</snm>
                  <fnm>RS</fnm>
               </au>
            </aug>
            <source>Mol Plant Microbe Interact</source>
            <pubdate>2003</pubdate>
            <volume>16</volume>
            <issue>5</issue>
            <fpage>376</fpage>
            <lpage>381</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1094/MPMI.2003.16.5.376</pubid>
                  <pubid idtype="pmpid" link="fulltext">12744507</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <title>
               <p>Genome Sequencing Center, Washington University School of Medicine</p>
            </title>
            <url>http://genome.wustl.edu/</url>
         </bibl>
         <bibl id="B42">
            <title>
               <p>Divergence of hedgehog signal transduction mechanism between Drosophila and mammals</p>
            </title>
            <aug>
               <au>
                  <snm>Varjosalo</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>SP</fnm>
               </au>
               <au>
                  <snm>Taipale</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Dev Cell</source>
            <pubdate>2006</pubdate>
            <volume>10</volume>
            <issue>2</issue>
            <fpage>177</fpage>
            <lpage>186</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.devcel.2005.12.014</pubid>
                  <pubid idtype="pmpid" link="fulltext">16459297</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B43">
            <title>
               <p>Genetic elimination of Suppressor of fused reveals an essential repressor function in the mammalian Hedgehog signaling pathway</p>
            </title>
            <aug>
               <au>
                  <snm>Svard</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Heby-Henricson</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Persson-Lek</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Rozell</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Lauth</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Bergstrom</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Ericson</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Toftgard</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Teglund</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Dev Cell</source>
            <pubdate>2006</pubdate>
            <volume>10</volume>
            <issue>2</issue>
            <fpage>187</fpage>
            <lpage>197</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.devcel.2005.12.013</pubid>
                  <pubid idtype="pmpid" link="fulltext">16459298</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B44">
            <title>
               <p>Type A modules: interacting domains found in several non-fibrillar collagens and in other extracellular matrix proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Colombatti</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Bonaldo</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Doliana</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Matrix</source>
            <pubdate>1993</pubdate>
            <volume>13</volume>
            <issue>4</issue>
            <fpage>297</fpage>
            <lpage>306</lpage>
            <xrefbib>
               <pubid idtype="pmpid">8412987</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B45">
            <title>
               <p>The secondary structure of the von Willebrand factor type A domain in factor B of human complement by Fourier transform infrared spectroscopy. Its occurrence in collagen types VI, VII, XII and XIV, the integrins and other proteins by averaged structure predictions</p>
            </title>
            <aug>
               <au>
                  <snm>Perkins</snm>
                  <fnm>SJ</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>KF</fnm>
               </au>
               <au>
                  <snm>Williams</snm>
                  <fnm>SC</fnm>
               </au>
               <au>
                  <snm>Haris</snm>
                  <fnm>PI</fnm>
               </au>
               <au>
                  <snm>Chapman</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Sim</snm>
                  <fnm>RB</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1994</pubdate>
            <volume>238</volume>
            <issue>1</issue>
            <fpage>104</fpage>
            <lpage>119</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.1994.1271</pubid>
                  <pubid idtype="pmpid" link="fulltext">8145250</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B46">
            <title>
               <p>The deep roots of eukaryotes</p>
            </title>
            <aug>
               <au>
                  <snm>Baldauf</snm>
                  <fnm>SL</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2003</pubdate>
            <volume>300</volume>
            <issue>5626</issue>
            <fpage>1703</fpage>
            <lpage>1706</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1085544</pubid>
                  <pubid idtype="pmpid" link="fulltext">12805537</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B47">
            <title>
               <p>The cnidarian-bilaterian ancestor possessed at least 56 homeoboxes. Evidence from the starlet sea anemone, Nematostella vectensis</p>
            </title>
            <aug>
               <au>
                  <snm>Ryan</snm>
                  <fnm>JF</fnm>
               </au>
               <au>
                  <snm>Burton</snm>
                  <fnm>PM</fnm>
               </au>
               <au>
                  <snm>Mazza</snm>
                  <fnm>ME</fnm>
               </au>
               <au>
                  <snm>Kwong</snm>
                  <fnm>GK</fnm>
               </au>
               <au>
                  <snm>Mullikin</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Finnerty</snm>
                  <fnm>JR</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2006</pubdate>
            <volume>7</volume>
            <issue>7</issue>
            <fpage>R64</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1779571</pubid>
                  <pubid idtype="pmpid" link="fulltext">16867185</pubid>
                  <pubid idtype="doi">10.1186/gb-2006-7-7-r64</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B48">
            <title>
               <p>Comprehensive Analysis of Animal TALE Homeobox Genes: New Conserved Motifs and Cases of Accelerated Evolution</p>
            </title>
            <aug>
               <au>
                  <snm>Mukherjee</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>B&#252;rglin</snm>
                  <fnm>TR</fnm>
               </au>
            </aug>
            <source>Journal of Molecular Evolution</source>
            <pubdate>2007</pubdate>
            <volume>65</volume>
            <issue>2</issue>
            <fpage>137</fpage>
            <lpage>153</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s00239-006-0023-0</pubid>
                  <pubid idtype="pmpid" link="fulltext">17665086</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B49">
            <title>
               <p>J. Craig Venter Institute</p>
            </title>
            <url>http://www.tigr.org</url>
         </bibl>
         <bibl id="B50">
            <title>
               <p>BLAST: Basic Local Alignment and Search Tool</p>
            </title>
            <url>http://www.ncbi.nlm.nih.gov/blast/</url>
         </bibl>
         <bibl id="B51">
            <title>
               <p>Nematode.net: a tool for navigating sequences from parasitic and free-living nematodes</p>
            </title>
            <aug>
               <au>
                  <snm>Wylie</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Martin</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Dante</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Mitreva</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Clifton</snm>
                  <fnm>SW</fnm>
               </au>
               <au>
                  <snm>Chinwalla</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Waterston</snm>
                  <fnm>RH</fnm>
               </au>
               <au>
                  <snm>Wilson</snm>
                  <fnm>RK</fnm>
               </au>
               <au>
                  <snm>McCarter</snm>
                  <fnm>JP</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2004</pubdate>
            <volume>32</volume>
            <issue>Database issue</issue>
            <fpage>D423</fpage>
            <lpage>D426</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">308745</pubid>
                  <pubid idtype="pmpid" link="fulltext">14681448</pubid>
                  <pubid idtype="doi">10.1093/nar/gkh010</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B52">
            <title>
               <p>TBestDB: a taxonomically broad database of expressed sequence tags (ESTs)</p>
            </title>
            <aug>
               <au>
                  <snm>O'Brien</snm>
                  <fnm>EA</fnm>
               </au>
               <au>
                  <snm>Koski</snm>
                  <fnm>LB</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Gray</snm>
                  <fnm>MW</fnm>
               </au>
               <au>
                  <snm>Burger</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Lang</snm>
                  <fnm>BF</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2007</pubdate>
            <volume>35</volume>
            <issue>Database issue</issue>
            <fpage>D445</fpage>
            <lpage>D451</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1899108</pubid>
                  <pubid idtype="pmpid" link="fulltext">17202165</pubid>
                  <pubid idtype="doi">10.1093/nar/gkl770</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B53">
            <title>
               <p>GSC: BLAST Server</p>
            </title>
            <url>http://genome.wustl.edu/tools/blast/</url>
         </bibl>
         <bibl id="B54">
            <title>
               <p>StellaBase: Nematostella vectensis Database</p>
            </title>
            <url>http://evodevo.bu.edu/stellabase/</url>
         </bibl>
         <bibl id="B55">
            <title>
               <p>DOE Joint Genome Institute</p>
            </title>
            <url>http://www.jgi.doe.gov/</url>
         </bibl>
         <bibl id="B56">
            <title>
               <p>ZFIN: The Zebrafish Model Organism Database</p>
            </title>
            <url>http://zfin.org/</url>
         </bibl>
         <bibl id="B57">
            <title>
               <p>The Zebrafish Information Network: the zebrafish model organism database</p>
            </title>
            <aug>
               <au>
                  <snm>Sprague</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Bayraktaroglu</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Clements</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Conlin</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Fashena</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Frazer</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Haendel</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Howe</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Mani</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Ramachandran</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Schaper</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Segerdell</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Song</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Sprunger</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Taylor</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Van Slyke</snm>
                  <fnm>CE</fnm>
               </au>
               <au>
                  <snm>Westerfield</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <volume>34</volume>
            <issue>Database issue</issue>
            <fpage>D581</fpage>
            <lpage>D585</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1347449</pubid>
                  <pubid idtype="pmpid" link="fulltext">16381936</pubid>
                  <pubid idtype="doi">10.1093/nar/gkj086</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B58">
            <title>
               <p>NEB Intein Database</p>
            </title>
            <url>http://www.neb.com/neb/inteins.html</url>
         </bibl>
         <bibl id="B59">
            <title>
               <p>SoftBerry</p>
            </title>
            <url>http://www.softberry.com</url>
         </bibl>
         <bibl id="B60">
            <title>
               <p>PPCMatrix: a PowerPC dotmatrix program to compare large genomic sequences against protein sequences</p>
            </title>
            <aug>
               <au>
                  <snm>B&#252;rglin</snm>
                  <fnm>TR</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>1998</pubdate>
            <volume>14</volume>
            <issue>8</issue>
            <fpage>751</fpage>
            <lpage>752</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9789102</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B61">
            <title>
               <p>Sequence Assembly at Iowa State University</p>
            </title>
            <url>http://deepc2.psi.iastate.edu/aat/cap/cap.html</url>
         </bibl>
         <bibl id="B62">
            <title>
               <p>The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools</p>
            </title>
            <aug>
               <au>
                  <snm>Thompson</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Gibson</snm>
                  <fnm>TJ</fnm>
               </au>
               <au>
                  <snm>Plewniak</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Jeanmougin</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Higgins</snm>
                  <fnm>DG</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1997</pubdate>
            <volume>25</volume>
            <issue>24</issue>
            <fpage>4876</fpage>
            <lpage>4882</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">147148</pubid>
                  <pubid idtype="pmpid" link="fulltext">9396791</pubid>
                  <pubid idtype="doi">10.1093/nar/25.24.4876</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B63">
            <title>
               <p>MUSCLE</p>
            </title>
            <url>http://phylogenomics.berkeley.edu/cgi-bin/muscle/input_muscle.py</url>
         </bibl>
         <bibl id="B64">
            <title>
               <p>MUSCLE: a multiple sequence alignment method with reduced time and space complexity</p>
            </title>
            <aug>
               <au>
                  <snm>Edgar</snm>
                  <fnm>RC</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <issue>1</issue>
            <fpage>113</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">517706</pubid>
                  <pubid idtype="pmpid" link="fulltext">15318951</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-5-113</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B65">
            <title>
               <p>SEAVIEW and PHYLO_WIN: two graphic tools for sequence alignment and molecular phylogeny</p>
            </title>
            <aug>
               <au>
                  <snm>Galtier</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Gouy</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Gautier</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Comput Appl Biosci</source>
            <pubdate>1996</pubdate>
            <volume>12</volume>
            <issue>6</issue>
            <fpage>543</fpage>
            <lpage>548</lpage>
            <xrefbib>
               <pubid idtype="pmpid">9021275</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B66">
            <title>
               <p>A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood</p>
            </title>
            <aug>
               <au>
                  <snm>Guindon</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Gascuel</snm>
                  <fnm>O</fnm>
               </au>
            </aug>
            <source>Syst Biol</source>
            <pubdate>2003</pubdate>
            <volume>52</volume>
            <issue>5</issue>
            <fpage>696</fpage>
            <lpage>704</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1080/10635150390235520</pubid>
                  <pubid idtype="pmpid">14530136</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B67">
            <title>
               <p>SignalP 3.0 Server</p>
            </title>
            <url>http://www.cbs.dtu.dk/services/SignalP/</url>
         </bibl>
         <bibl id="B68">
            <title>
               <p>Improved prediction of signal peptides: SignalP 3.0</p>
            </title>
            <aug>
               <au>
                  <snm>Bendtsen</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Nielsen</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>von Heijne</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Brunak</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2004</pubdate>
            <volume>340</volume>
            <issue>4</issue>
            <fpage>783</fpage>
            <lpage>795</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.jmb.2004.05.028</pubid>
                  <pubid idtype="pmpid" link="fulltext">15223320</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B69">
            <title>
               <p>LogoBar - Java application for protein sequence Logos</p>
            </title>
            <url>http://www.biosci.ki.se/groups/tbu/logobar/</url>
         </bibl>
         <bibl id="B70">
            <title>
               <p>LogoBar: bar graph visualization of protein logos with gaps</p>
            </title>
            <aug>
               <au>
                  <snm>P&#233;rez-Bercoff</snm>
                  <fnm>&#197;</fnm>
               </au>
               <au>
                  <snm>Koch</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>B&#252;rglin</snm>
                  <fnm>TR</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2006</pubdate>
            <volume>22</volume>
            <issue>1</issue>
            <fpage>112</fpage>
            <lpage>114</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bti761</pubid>
                  <pubid idtype="pmpid" link="fulltext">16269415</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B71">
            <title>
               <p>SMART 4.0: towards genomic data integration</p>
            </title>
            <aug>
               <au>
                  <snm>Letunic</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Copley</snm>
                  <fnm>RR</fnm>
               </au>
               <au>
                  <snm>Schmidt</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ciccarelli</snm>
                  <fnm>FD</fnm>
               </au>
               <au>
                  <snm>Doerks</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Schultz</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Ponting</snm>
                  <fnm>CP</fnm>
               </au>
               <au>
                  <snm>Bork</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2004</pubdate>
            <volume>32</volume>
            <issue>Database issue</issue>
            <fpage>D142</fpage>
            <lpage>D144</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">308822</pubid>
                  <pubid idtype="pmpid" link="fulltext">14681379</pubid>
                  <pubid idtype="doi">10.1093/nar/gkh088</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
