<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2148-8-316</ui>
   <ji>1471-2148</ji>
   <fm>
      <dochead>Research article</dochead>
      <bibl>
         <title>
            <p>Insights into the evolutionary origins of clostridial neurotoxins from analysis of the <it>Clostridium botulinum </it>strain A neurotoxin gene cluster</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Doxey</snm>
               <mi>C</mi>
               <fnm>Andrew</fnm>
               <insr iid="I1"/>
               <email>acdoxey@uwaterloo.ca</email>
            </au>
            <au id="A2">
               <snm>Lynch</snm>
               <mi>DJ</mi>
               <fnm>Michael</fnm>
               <insr iid="I1"/>
               <email>mdjlynch@sciborg.uwaterloo.ca</email>
            </au>
            <au id="A3">
               <snm>M&#252;ller</snm>
               <mi>M</mi>
               <fnm>Kirsten</fnm>
               <insr iid="I1"/>
               <email>kmmuller@sciborg.uwaterloo.ca</email>
            </au>
            <au id="A4">
               <snm>Meiering</snm>
               <mi>M</mi>
               <fnm>Elizabeth</fnm>
               <insr iid="I2"/>
               <email>meiering@uwaterloo.ca</email>
            </au>
            <au id="A5" ca="yes">
               <snm>McConkey</snm>
               <mi>J</mi>
               <fnm>Brendan</fnm>
               <insr iid="I1"/>
               <email>mcconkey@uwaterloo.ca</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Biology, University of Waterloo, 200 University Avenue West, Waterloo, Ontario, N2L 3G1, Canada</p>
            </ins>
            <ins id="I2">
               <p>Guelph-Waterloo Centre for Graduate Studies in Chemistry and Biochemistry, University of Waterloo, 200 University Avenue West, Waterloo, Ontario, N2L 3G1, Canada</p>
            </ins>
         </insg>
         <source>BMC Evolutionary Biology</source>
         <issn>1471-2148</issn>
         <pubdate>2008</pubdate>
         <volume>8</volume>
         <issue>1</issue>
         <fpage>316</fpage>
         <url>http://www.biomedcentral.com/1471-2148/8/316</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">19014598</pubid>
               <pubid idtype="doi">10.1186/1471-2148-8-316</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>21</day>
               <month>4</month>
               <year>2008</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>14</day>
               <month>11</month>
               <year>2008</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>14</day>
               <month>11</month>
               <year>2008</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2008</year>
         <collab>Doxey et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Clostridial neurotoxins (CNTs) are the most deadly toxins known and causal agents of botulism and tetanus neuroparalytic diseases. Despite considerable progress in understanding CNT structure and function, the evolutionary origins of CNTs remain a mystery as they are unique to <it>Clostridium </it>and possess a sequence and structural architecture distinct from other protein families. Uncovering the origins of CNTs would be a significant contribution to our understanding of how pathogens evolve and generate novel toxin families.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>The <it>C. botulinum </it>strain A genome was examined for potential homologues of CNTs. A key link was identified between the neurotoxin and the flagellin gene (CBO0798) located immediately upstream of the BoNT/A neurotoxin gene cluster. This flagellin sequence displayed the strongest sequence similarity to the neurotoxin and NTNH homologue out of all proteins encoded within <it>C. botulinum </it>strain A. The CBO0798 gene contains a unique hypervariable region, which in closely related flagellins encodes a collagenase-like domain. Remarkably, these collagenase-containing flagellins were found to possess the characteristic HEXXH zinc-protease motif responsible for the neurotoxin's endopeptidase activity. Additional links to collagenase-related sequences and functions were detected by further analysis of CNTs and surrounding genes, including sequence similarities to collagen-adhesion domains and collagenases. Furthermore, the neurotoxin's HCRn domain was found to exhibit both structural and sequence similarity to eukaryotic collagen jelly-roll domains.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>Multiple lines of evidence suggest that the neurotoxin and adjacent genes evolved from an ancestral collagenase-like gene cluster, linking CNTs to another major family of clostridial proteolytic toxins. Duplication, reshuffling and assembly of neighboring genes within the BoNT/A neurotoxin gene cluster may have led to the neurotoxin's unique architecture. This work provides new insights into the evolution of <it>C. botulinum </it>neurotoxins and the evolutionary mechanisms underlying the origins of virulent genes.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Clostridial neurotoxins (CNTs) are the most poisonous biological toxins known and molecular agents of botulism and tetanus neuroparalytic diseases <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. Due to their extreme toxicity and potential threat as bioterrorism agents, they are listed as Category A agents by the Centers for Disease Control and Prevention along with other deadly agents such as anthrax. Elucidating the mechanisms by which CNTs evolved is therefore of significant importance to our understanding of pathogen evolution and emerging diseases.</p>
         <p>While considerable progress has been made in understanding CNT structure and function <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr></abbrgrp>, like many toxins and virulence factors, the evolutionary origins of CNTs are unclear. CNTs are produced by four phylogenetically distinct groups (I-IV) of <it>C. botulinum</it>, and also by strains of <it>C. tetani</it>, <it>C. baratii</it>, and <it>C. butyricum </it><abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. As demonstrated by the scattered phyletic distribution of neurotoxin-producing clostridia <abbrgrp><abbr bid="B10">10</abbr></abbrgrp> and the patterns of sequence similarity between different neurotoxin gene clusters <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>, CNT genes appear to have undergone significant lateral transfer between different species of <it>Clostridium</it>. The occurrence of lateral transfer is also supported by the discovery of plasmid-encoded neurotoxin genes in numerous <it>C. botulinum </it>strains <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>, as well as the existence of putative insertion sequences flanking the neurotoxin gene cluster <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>.</p>
         <p>While CNTs have undergone frequent lateral transfer between species of <it>Clostridium</it>, no CNT homologues have been identified outside of the <it>Clostridium </it>genus. CNTs form an isolated protein family according to SCOP <abbrgrp><abbr bid="B14">14</abbr></abbrgrp> and PFAM <abbrgrp><abbr bid="B15">15</abbr></abbrgrp> and have a unique structural architecture that complicates the identification of related proteins and potential ancestors. Although CNT domains have little detectable sequence similarity to proteins outside of the CNT family, they do share some structural and functional similarities with other domain families. The beta-trefoil, a three-fold symmetrical structure that forms the C-terminal receptor binding domain (HCRc) and associated hemagglutinin components, is common to interleukins, ricin-like lectins, and fibroblast growth factors <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. The adjacent HCRn domain, also involved in receptor binding, forms a jelly-roll like structure similar to laminin globular G domains <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. The central translocase adopts a long alpha-helical structure containing alpha-helical bundles that resemble those found in translocase-like domains of other toxins <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. Lastly, the N-terminal catalytic domain has been grouped under the zincin-like group of metalloproteases by SCOP and under the Peptidase MA clan by the MEROPS database <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>. It contains a HEXXH zinc-binding motif found in other zinc endopeptidases, but has only weak structural similarity to other members of the Peptidase MA clan <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>.</p>
         <p>Diversity of domain and fold composition and extreme sequence divergence are common features of bacterial toxins <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. Rapid sequence evolution in toxin genes is largely a consequence of the evolutionary 'arms race' between pathogen and host <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. Therefore, it is important to consider that evolutionarily related toxins may only share weak sequence similarity and may have undergone considerable structural rearrangements. Here, we present evidence supporting the hypothesis that CNTs were formed within an ancestral <it>Clostridium </it>species by duplications and rearrangements of neighboring genes within the neurotoxin gene cluster, and identify the likely evolutionary precursors of CNTs and surrounding genes. Multiple links to collagenase-related sequences and functions are detected through an analysis of the nearby flagellin and hemagglutinin genes (strain A) in addition to the CNT domains. The detected links provide novel insights into the evolutionary origins and ancestral function of the neurotoxin gene cluster.</p>
      </sec>
      <sec>
         <st>
            <p>Results and Discussion</p>
         </st>
         <sec>
            <st>
               <p>Ancient gene duplications within the BoNT/A neurotoxin gene cluster</p>
            </st>
            <p>A comprehensive analysis of pairwise sequence similarities was performed for all proteins encoded within the <it>C. botulinum </it>(strain Hall A, ATCC 3502) genome <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>, in an attempt to identify distant homologues of CNTs and possible sequence remnants of the evolutionary process by which CNTs originated. This initial analysis was limited to a single genome for a more sensitive detection of pairwise homologies using a restricted database; however, subsequent searches were also performed using all available clostridial genomes. For the 3615 proteins encoded within <it>C. botulinum </it>(strain Hall A, ATCC 3502) <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>, a 'heat map' of pairwise sequence similarity was constructed (see Methods) (Figure <figr fid="F1">1</figr>). For each pairwise alignment, the <it>E</it>-value and percentile rank relative to all other pairwise alignments were calculated using SSEARCH <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. When compared by percentile rank, the neurotoxin gene cluster stood out as a "hot spot" of local pairwise sequence similarities. The neurotoxin gene cluster can be seen as a distinct cluster of high-scoring pairs in the centre of the heat map region in Figure <figr fid="F1">1A</figr>. Based on both the percentile ranks and <it>E</it>-values for the pairwise alignments corresponding to these genes (Figure <figr fid="F1">1B</figr>), there are clear sequence similarities between multiple sequences within this region, including BoNT/A, non-toxic non-hemagglutinin (NTNH), the adjacent hemagglutinin (HA) components and the adjacent CBO0798 gene encoding a flagellin protein (NCBI accession YP_001253335). BoNT/A and NTNH produced the top-scoring alignments with each other out of 3615 proteins in <it>C. botulinum </it>strain A (<it>E </it>= 1e-22, 9e-24), an expected result given previously identified sequence similarities between BoNTs and NTNH <abbrgrp><abbr bid="B22">22</abbr></abbrgrp> as well as their virtually identical domain architecture as identified by the NCBI's conserved domain database annotation (e.g., for NCBI IDs ABP48106 and BAA90660). Surprisingly, the next highest match in both cases corresponds to the CBO0798 flagellin gene located immediately upstream of the neurotoxin gene cluster (Figure <figr fid="F2">2A</figr>). The associated <it>E</it>-values were 0.041 and 0.42 for BoNT/A and NTNH, respectively (Figure <figr fid="F1">1B</figr>). CBO0798 aligned with NTNH and BoNT/A in two different CNT regions (I and II) (Figure <figr fid="F2">2B</figr>, Additional Files <supplr sid="S1">1</supplr>, <supplr sid="S2">2</supplr>). Additional searches using the sequences of CNTs from other strains also identified CBO0798 as the most consistent top ranked hit out of all <it>C. botulinum </it>strain A proteins, with sequence identities between CNTs and CBO0798 ranging from 20&#8211;24%, and the strongest alignments involving region II of CNTs (Additional File <supplr sid="S1">1</supplr>).</p>
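            <p>The percentile-rank comparison described above can be illustrated with a minimal sketch. It assumes that the rank of a hit is the percentage of alignment scores for the same query that fall strictly below it; this definition, the function name and the toy scores are illustrative assumptions and are not taken from SSEARCH or from the analysis reported here.</p>
            <p><![CDATA[
```python
from bisect import bisect_left


def percentile_ranks(scores):
    """Map each subject ID to the percentage of scores strictly below its own score.

    `scores` holds the alignment scores of one query against every subject.
    """
    ordered = sorted(scores.values())
    total = len(ordered)
    return {name: 100.0 * bisect_left(ordered, value) / total
            for name, value in scores.items()}


# Hypothetical scores for a single query; the names and values are made up.
toy_scores = {
    "protein_a": 310.0,
    "protein_b": 95.0,
    "protein_c": 60.0,
    "protein_d": 41.0,
    "protein_e": 38.0,
}
ranks = percentile_ranks(toy_scores)
```
]]></p>
            <p>Ranking each query against its own score distribution makes weak hits comparable across queries of different length and composition, which is one plausible reason to inspect ranks alongside <it>E</it>-values.</p>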
            <suppl id="S1">
               <title>
                  <p>Additional file 1</p>
               </title>
               <text>
                  <p><b>Detected sequence similarities between CBO0798 (flagellin) and CNT sequences.</b> SSEARCH was used to screen the <it>C. botulinum </it>A protein database (3615 sequences) plus the target CNTs using default parameters. <it>E</it>-values were calculated within SSEARCH using randomly reshuffled copies of the library sequences, as described in Methods. CBO0798's rank relative to all 3615 <it>C. botulinum </it>proteins, associated E-value, and the alignment regions are reported for nine search cases (BoNT/A-G, TeNT, and NTNHA). The flagellin aligned to two separate regions in the CNTs, suggesting an ancestral duplication.</p>
               </text>
               <file name="1471-2148-8-316-S1.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S2">
               <title>
                  <p>Additional file 2</p>
               </title>
               <text>
                  <p><b>Key alignments between CBO0798 and CNTs produced by SSEARCH and PSI-BLAST. </b>Smith-Waterman alignments between BoNT/A-CBO0798 and NTNH-CBO0798 were performed using SSEARCH (default parameters used with -z 11 flag). The listed <it>E</it>-values are based on a single pairwise alignment of both sequences rather than a database search. The alignment between CBO0798 and BoNT/E from <it>C. butyricum </it>is the result of a PSI-BLAST search of CBO0798 (restricted to <it>Clostridia</it>) using default parameters with composition-based statistics.</p>
               </text>
               <file name="1471-2148-8-316-S2.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Protein sequence similarity heat map surrounding the BoNT/A neurotoxin gene cluster</p>
               </caption>
               <text>
                  <p><b>Protein sequence similarity heat map surrounding the BoNT/A neurotoxin gene cluster</b>. Sequence similarity scores, <it>E</it>-values and percentile ranks were calculated for all pairwise combinations of putative proteins encoded in the <it>C. botulinum </it>strain A genome. A) A heat map of the percentile ranks for pairwise alignments involving 100 genes surrounding the neurotoxin gene cluster (described in Methods). B) Similarity ranks and <it>E</it>-values (in brackets) for pairwise protein sequence alignments in the neurotoxin gene cluster, corresponding to BoNT/A, NTNH, CBO0798, associated hemagglutinin components and other neighboring genes. <it>E</it>-values &lt; 1 are in boldface.</p>
               </text>
               <graphic file="1471-2148-8-316-1"/>
            </fig>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Genomic location of flagellin CBO0798 and regions of sequence similarity with CNTs</p>
               </caption>
               <text>
                  <p><b>Genomic location of flagellin CBO0798 and regions of sequence similarity with CNTs</b>. A) Genomic context of the neurotoxin gene cluster for <it>C. botulinum </it>A. str. Hall. B) Domain structure of CNTs and regions of detected similarity with NTNH (region I) and BoNT/A (region II) according to SSEARCH. A consensus CNT secondary structure based on a multiple sequence alignment is indicated below the schematic, with black lines representing alpha helices and grey lines representing beta sheets. C) The structure of BoNT/A (PDB ID <ext-link ext-link-type="pdb" ext-link-id="3BTA">3BTA</ext-link>) with region II highlighted in red. D) A Smith-Waterman alignment of region II from <it>C. butyricum </it>BoNT/E and CBO0798.</p>
               </text>
               <graphic file="1471-2148-8-316-2"/>
            </fig>
            <p>In addition to the detected similarities between CBO0798-BoNT/A and NTNH-BoNT/A, sequence similarities were also detected between the beta-trefoil hemagglutinin components (HA33 and HA17). HA33 and HA17 were identified as reciprocal top ranked matches (<it>E </it>= 0.041, 0.2), and a weak similarity was detected between HA33 and the C-terminal (beta-trefoil) regions of NTNH (ranked 10th, <it>E </it>= 2.4). Sequence similarity was also found between the hemagglutinin components HA70 and residues 39&#8211;474 of NTNH (ranked #2 out of all pairwise alignments with HA70 as the query, <it>E </it>= 0.72). Though the <it>E</it>-values calculated above are not all statistically significant, the high-ranking scores relative to the 3615 <it>C. botulinum </it>proteins suggest that multiple genes within the BoNT/A neurotoxin gene cluster are likely distant homologues that have undergone extensive sequence divergence.</p>
         </sec>
         <sec>
            <st>
               <p>Sequence similarity to the upstream flagellin gene</p>
            </st>
            <p>To identify other clostridial sequences homologous to CBO0798, a PSI-BLAST <abbrgrp><abbr bid="B23">23</abbr></abbrgrp> search was conducted starting with the CBO0798 sequence (default parameters, results restricted to <it>Clostridia</it>). All homologues identified in the first iteration were members of the flagellin family. The second iteration identified additional flagellins, followed by the type E botulinum toxin (BoNT/E) from <it>C. butyricum </it>with an <it>E</it>-value of 0.05 [23% sequence identity over residues 88&#8211;406 of flagellin and 727&#8211;1045 (region II) of BoNT/E, see Additional File <supplr sid="S2">2</supplr> for the alignment]. To check for the influence of composition on the alignment, two permutation reshuffling tests were performed, which calculate the probability that random sequences of the same composition could result in similar alignment scores. The permutation reshuffling tests detected significant sequence similarity between the two proteins with (p = 0.0024) and without (p = 0.011) statistically overrepresented amino acids included (see Methods).</p>
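            <p>The logic of a permutation reshuffling test can be sketched as follows. This is a minimal illustration only: the scoring scheme (match, mismatch and linear gap values) is an arbitrary assumption rather than the substitution matrix or parameters used in this study, the function names are hypothetical, and the sketch does not reproduce the separate treatment of overrepresented amino acids.</p>
            <p><![CDATA[
```python
import random


def sw_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score (Smith-Waterman with a linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            cell = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            cur.append(cell)
            best = max(best, cell)
        prev = cur
    return best


def shuffle_p_value(query, subject, trials=200, seed=0):
    """Estimate how often a shuffled subject scores at least as well as the real one.

    Shuffling preserves amino acid composition but destroys residue order, so the
    p-value reflects similarity beyond what composition alone would produce.
    """
    rng = random.Random(seed)
    observed = sw_score(query, subject)
    residues = list(subject)
    hits = 0
    for _ in range(trials):
        rng.shuffle(residues)
        if sw_score(query, "".join(residues)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate above zero for a finite sample.
    return observed, (hits + 1) / (trials + 1)


# Toy input: two identical made-up sequences, so the observed score is maximal.
observed, p = shuffle_p_value("MKHELIHAGW" * 3, "MKHELIHAGW" * 3)
```
]]></p>
            <p>With a finite number of shuffles the smallest reportable p-value is 1/(trials + 1), so the number of trials bounds the resolution of the test.</p>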
            <p>According to the sequence alignments produced by PSI-BLAST and SSEARCH, the region of CNTs with the strongest detected similarity to CBO0798 (region II) includes most of the translocase domain as well as the HCRn domain (Figure <figr fid="F2">2B&#8211;D</figr>). Region I was also detected by SSEARCH (Additional File <supplr sid="S1">1</supplr>), spanning the peptidase and 'belt' region, though without definitive statistical significance (<it>E </it>= 0.41). The translocase, an extended alpha-helical domain, has a general structural similarity to the central helical regions of known flagellin structures (see PDB IDs <ext-link ext-link-type="pdb" ext-link-id="1io1">1io1</ext-link>, <ext-link ext-link-type="pdb" ext-link-id="2zbi">2zbi</ext-link>, <ext-link ext-link-type="pdb" ext-link-id="2d4x">2d4x</ext-link>). The beta-rich domains of flagellin are highly variable, however, and it is this variable region of flagellin that shares similarity with the HCRn domain of CNTs. As a structure is not available for the variable region of CBO0798, 3D-PSSM <abbrgrp><abbr bid="B24">24</abbr></abbrgrp> was used to predict the fold of CBO0798's central region. The structure of the CNT HCRn jelly-roll domain was the top ranked structural match for this region (<it>E </it>= 0.34), additionally supporting homology between the two proteins.</p>
            <p>CBO0798 is annotated in the NCBI database as a member of the flagellar hook associated protein 3 (FlgL) family. This flagellin gene has been mentioned in previous CNT studies due to its close proximity to the neurotoxin gene cluster <abbrgrp><abbr bid="B13">13</abbr></abbrgrp> and its existence in numerous <it>C. botulinum </it>type A strains and associated plasmids <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. Flagellins are also known to have key roles in the virulence of bacterial pathogens <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>, have been shown by mass spectrometry studies to interact with CNT components <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>, and possess previously unreported common structural features with CNTs (i.e., both contain a central region composed of extended alpha-helices followed by beta-rich domains <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B28">28</abbr></abbrgrp>). These additional functional and structural links further support a potential evolutionary relationship between CBO0798 and CNTs.</p>
         </sec>
         <sec>
            <st>
               <p>Collagenase-like domains in the flagellin hypervariable region</p>
            </st>
            <p>Comparative sequence analysis of CBO0798 was performed by aligning CBO0798 to other flagellins from <it>Clostridium </it>species (Additional File <supplr sid="S3">3</supplr>). According to the alignment, CBO0798 has a highly divergent central region containing a unique insert (residues ~135&#8211;360), and this insert region comprises a large portion of CBO0798's alignments with CNTs. The existence of a unique central region within CBO0798 is not surprising, since flagellins are known to contain conserved regions at the N- and C-terminus but have a hypervariable central region that is structurally exposed on the flagellar surface <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>. As the structurally exposed region of the flagellar filament, the hypervariable region can interact with the host cell and is thus critical to flagellin-mediated virulence <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>. Interestingly, it is the variable region of CBO0798 that is central to the CBO0798-CNT alignments and that was predicted by 3D-PSSM to possess a jelly-roll fold similar to HCRn.</p>
            <suppl id="S3">
               <title>
                  <p>Additional file 3</p>
               </title>
               <text>
                  <p><b>Multiple alignment of clostridial flagellins.</b> The N- and C-terminal domains are indicated, and the intermediate section represents the flagellin hypervariable region. The collagenase-like insert identified within the hypervariable region of FliA(H) is boxed in red. CBO0798 is underlined in black. Additional clostridial flagellins containing large hypervariable region inserts are grouped with CBO0798 and FliA(H) at the beginning of the alignment.</p>
               </text>
               <file name="1471-2148-8-316-S3.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>To characterize the origins of the insert, we examined similarly located inserts identified within the hypervariable region of a small number of additional flagellins from <it>Clostridium </it>species (Additional File <supplr sid="S3">3</supplr>). While the sequences within the hypervariable region are highly divergent from one another as expected, one insert in particular [the insert of FliA(H) from <it>C. haemolyticum</it>] was found to be both the largest insert and the only insert region with detected homology to other proteins using PSI-BLAST. FliA(H) is a relatively close homologue of CBO0798, as FliA(H) was the only flagellin detected using CBO0798's C-terminal region (residues 114&#8211;452) as a BLAST query sequence (<it>E </it>= 0.076). A PSI-BLAST search revealed that the hypervariable region of FliA(H) possesses significant similarity to microbial collagenases (<it>E </it>= 8e-04, iteration 2) and to the hypervariable regions of several flagellins from non-clostridial species (Figure <figr fid="F3">3</figr>). Remarkably, both the detected microbial collagenases and collagenase-like regions within the identified flagellins contain a HEXXH motif, the critical catalytic residues responsible for the CNT's zinc-endopeptidase activity. The alignment of CBO0798 with collagenase-containing flagellins and the alignment of the HEXXH-containing segments from these flagellins, BoNT/B, and a representative microbial collagenase are shown in Figure <figr fid="F3">3</figr>.</p>
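            <p>Locating candidate HEXXH motifs in a protein sequence is a simple pattern search, sketched below. The function name and the toy sequence are hypothetical, and a pattern match by itself says nothing about zinc binding or catalytic activity; it only flags segments worth aligning and inspecting.</p>
            <p><![CDATA[
```python
import re

# H, E, any two residues, H. The lookahead lets overlapping matches be reported.
HEXXH = re.compile(r"(?=(HE..H))")


def find_hexxh(sequence):
    """Return (1-based start position, matched segment) for each HEXXH occurrence."""
    return [(m.start() + 1, m.group(1)) for m in HEXXH.finditer(sequence.upper())]


# Made-up sequence used purely for illustration.
toy_sequence = "MKTAHELIHGGSHEAAHW"
matches = find_hexxh(toy_sequence)
```
]]></p>
            <p>Because HEXXH is short, chance occurrences are expected in large sequence sets, so matches are best interpreted together with the surrounding alignment, as in Figure <figr fid="F3">3</figr>.</p>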
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Collagenase-like sequences within the flagellin hypervariable region</p>
               </caption>
               <text>
                  <p><b>Collagenase-like sequences within the flagellin hypervariable region</b>. A) A multiple alignment of CBO0798 and collagenase-containing flagellins identified by PSI-BLAST. Vertical black bars in the alignment correspond to the collagenase-containing region identified by a PSI-BLAST search using <it>C. haemolyticum </it>FliA(H) as the query. B) A schematic of a representative collagenase-containing flagellin based on the FliA(H) sequence. An alignment of similar HEXXH-containing segments from BoNT/B, a microbial collagenase, and the collagenase-containing flagellins are shown below the schematic. Accession numbers are provided in the Methods.</p>
               </text>
               <graphic file="1471-2148-8-316-3"/>
            </fig>
            <p>The identified link to collagenase sequences by analysis of the flagellin hypervariable region is a striking result given the strong similarities between collagenases and the CNT's Peptidase M27 domain. Both collagenases (Peptidase M9 family) and the Peptidase M27 family are zinc-endopeptidases and are grouped under the same peptidase clan (the thermolysin-like Peptidase MA clan) by the MEROPS database <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>. As exotoxins, collagenases play a major role in clostridial toxicity by degrading collagenous host tissues <abbrgrp><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr></abbrgrp>. For instance, <it>C. perfringens</it>, a species responsible for clostridial myonecrosis (gas gangrene), produces a tissue-degrading collagenase known as kappa-toxin <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. Collagenases are therefore an excellent candidate evolutionary precursor of CNTs, as both collagenases and CNTs function as clostridial toxins and share the same fundamental proteolytic mechanism.</p>
            <p>As the hypervariable region encodes the outer exposed portion of the flagellin filament, it would be ideally situated to interact with (and potentially degrade) host cell wall components such as collagen. The identified sequences may therefore encode a novel family of virulent flagellins with collagenase activity. Future experimental verification of this predicted activity would be valuable, and could potentially lead to new avenues of research concerning the virulent functions of bacterial flagellins.</p>
         </sec>
         <sec>
            <st>
               <p>Additional evidence of collagenase-related functions within the neurotoxin gene cluster</p>
            </st>
            <p>Several additional links to collagenases and collagen-related domains were detected for other sequences present within the BoNT/A neurotoxin gene cluster. All sequenced <it>Clostridium </it>genomes were screened for potential homologues of each of the BoNT/A neurotoxin gene cluster components. In a dataset of over 55000 sequences, a search of BoNT/A detected flagellin as the third top ranked hit outside of the CNT family (<it>E </it>= 0.23). While HA33 expectedly displayed similarities with other ricin-like components (e.g., a ricin-domain from a <it>C. acetobutylicum </it>cellulase, NP_347343, <it>E </it>= 0.019), HA70 displayed the strongest similarity to <it>C. perfringens </it>enterotoxin (YP_697710, <it>E </it>= 0.0042) followed by <it>C. tetani </it>collagenase (NP_783761, <it>E </it>= 0.22). A HEXXH binding motif was also identified within this collagenase sequence. A PSI-BLAST search of flagellin CBO0798 restricted to the <it>Clostridium </it>genus also detected collagen-adhesion proteins with alignments spanning the hypervariable region after three iterations (<it>E </it>= 0.017, ZP_02635881). This result is consistent with the analysis linking CBO0798 with flagellins containing collagenase-like hypervariable regions.</p>
            <p>Another key result was obtained when examining sequence and structural similarities between the HCRn domain and the full NCBI nr database, including eukaryotic sequences. After two iterations starting with BoNT/A's HCRn domain, PSI-BLAST detected a region of chicken type XII collagen (AAA48635, <it>E </it>= 0.03). The detected sequence similarity occurred with collagen's thrombospondin N-terminal like domains. Recently, the structure of this family of domains has been determined for the NC4 domain of collagen IX <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. The fold of NC4 (PDB ID <ext-link ext-link-type="pdb" ext-link-id="2UUR">2UUR</ext-link>) is remarkably similar to that of HCRn (Figure <figr fid="F4">4</figr>). To determine the extent of structural similarity between these two domains, we analyzed structural neighbours of the NC4 domain using the VAST structural alignment algorithm <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. Sorted by VAST <it>E</it>-value, the two most structurally similar domains to PDB ID <ext-link ext-link-type="pdb" ext-link-id="2UUR">2UUR</ext-link> were members of its identified fold family, the thrombospondin N-terminal domain (PDB IDs <ext-link ext-link-type="pdb" ext-link-id="2ES3">2ES3</ext-link> and <ext-link ext-link-type="pdb" ext-link-id="2OUJ">2OUJ</ext-link>), followed by the HCRn domain (PDB ID <ext-link ext-link-type="pdb" ext-link-id="1DLL">1DLL</ext-link>) of the tetanus neurotoxin (<it>E </it>= 10e-9.9). Ranked by sequence similarity based on structural alignments, the tetanus HCRn domain (PDB ID <ext-link ext-link-type="pdb" ext-link-id="1YYN">1YYN</ext-link>) ranked first out of all known structures in the Protein Data Bank (%ID = 17.7).</p>
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>Structural similarity between HCRn and the NC4 domain of collagen IX</p>
               </caption>
               <text>
                  <p><b>Structural similarity between HCRn and the NC4 domain of collagen IX</b>. A structural superposition of the human collagen IX NC4 domain (<ext-link ext-link-type="pdb" ext-link-id="2UUR">2UUR</ext-link>) and the TeNT HCRn domain (<ext-link ext-link-type="pdb" ext-link-id="1YYN">1YYN</ext-link>) was performed using the VAST alignment algorithm <url>http://www.ncbi.nlm.nih.gov/Structure/VAST/vastsearch.html</url>. In the structural alignment, <ext-link ext-link-type="pdb" ext-link-id="2UUR">2UUR</ext-link> and <ext-link ext-link-type="pdb" ext-link-id="1YYN">1YYN</ext-link> are colored pink and blue, respectively.</p>
               </text>
               <graphic file="1471-2148-8-316-4"/>
            </fig>
            <p>As the detected similarities between the HCRn domain and the collagen NC4 domain occur across kingdoms, this may represent an instance of structural mimicry rather than a direct evolutionary relationship. Given the multiple identified links to collagenases, and that structural mimicry of collagen has been proposed as a mechanism for other collagenase enzymes <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>, the link between HCRn and the collagen NC4 domain may be indicative of a similar mechanism. A role in collagen binding is plausible for CNTs, as previous studies have shown that expression of TeNT enhances adhesion of epithelial cells to collagen, laminin, and fibronectin <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>. While the observed similarities support the hypothesis of convergent evolution and structural mimicry, the possibility that HCRn was transferred to <it>Clostridium </it>from a eukaryotic source cannot be completely ruled out. Such a transfer has recently been demonstrated for the <it>Clostridium </it>glyceraldehyde-3-phosphate dehydrogenase gene <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>A comprehensive search was conducted for potential distant homologues of CNTs, starting with a genomic analysis of <it>C. botulinum </it>strain A, followed by a more general search involving additional clostridial and eukaryotic species. Multiple independent links to collagenase-related sequences were identified, including the detected similarities involving the upstream flagellin gene (CBO0798) in the BoNT/A neurotoxin gene cluster, distant BLAST hits to collagenase-related domains, and detected structural similarities to the collagen NC4 domain. As microbial collagenases are phyletically widespread compared to CNTs, they represent a protein family likely to be ancestral to CNTs. Given this and the multiple detected links to collagenase-related sequences, it is proposed that the ancestral function of the neurotoxin gene cluster may have been related to collagen binding and degradation, a hypothesis that places CNT sequence, structure, and function within the broader context of other clostridial toxins and the evolution of clostridial pathogenesis.</p>
         <p>The CBO0798 flagellin gene appears to be a divergent member of a unique class of flagellins containing a collagenase-like hypervariable domain, an ideal arrangement for the development of novel virulence functions and co-evolution with host cell walls. It is possible that repeats and rearrangements of a gene ancestral to CBO0798 were involved in the origin of the ancestral CNT gene. To date, the CBO0798 flagellin sequence has been identified in only a number of group I strains (see <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>), and identification and analysis of additional CBO0798 homologues in other strains could provide a broader context to the evolutionary relationship between flagellins, collagenases, and CNTs.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>Sequence dataset and database searches</p>
            </st>
            <p>Sequences of botulinum neurotoxins A-G (P10845, ABM73983, BAA08418, AAB24244, CAA43999, 1904210A, CAA52275) and NTNH/A (YP_001253341) were retrieved from NCBI. The flagellin and collagenase sequences used in the alignment of the HEXXH-containing segment (Figure <figr fid="F3">3</figr>) were <it>Clostridium haemolyticum </it>flagellin [FliA(H)], BAB87738; <it>Pseudoalteromonas tunicata </it>flagellin, ZP_01132756; <it>Azoarcus </it>BH72 flagellin, YP_934037; <it>Desulfuromonas acetoxidans </it>flagellin, ZP_01312630; and <it>Burkholderia pseudomallei </it>collagenase, ZP_01765667. The following default parameters were used in all PSI-BLAST <abbrgrp><abbr bid="B23">23</abbr></abbrgrp> searches unless specified otherwise: BLOSUM62 matrix, gap existence: 11, gap extension: 1, <it>E</it>-value cutoff = 0.005, with conditional compositional score matrix adjustment. The SSEARCH <abbrgrp><abbr bid="B21">21</abbr></abbrgrp> program from the FASTA package (version 3.515) was used to search the <it>C. botulinum </it>protein database, which was obtained from the Sanger website <url>ftp://ftp.sanger.ac.uk/pub/pathogens/cb/</url>. SSEARCH was run with default parameters, except for the -z 11 flag, which computes the regression by reshuffling the target sequence library (removing the influence of homologous sequences present within the genome). For searching additional <it>Clostridium </it>species, the following protein sequence databases were retrieved from the NCBI FTP server: <it>C. acetobutylicum </it>ATCC 824, <it>C. beijerinckii </it>NCIMB 8052, <it>C. botulinum </it>A ATCC 3502, <it>C. botulinum </it>A ATCC 19397, <it>C. botulinum </it>A Hall, <it>C. botulinum </it>A3 Loch Maree, <it>C. botulinum </it>B1 Okra, <it>C. botulinum </it>F Langeland, <it>C. difficile </it>630, <it>C. kluyveri </it>DSM 555, <it>C. novyi </it>NT, <it>C. perfringens </it>13, <it>C. perfringens </it>ATCC 13124, <it>C. perfringens </it>SM101, <it>C. phytofermentans </it>ISDg, <it>C. tetani </it>E88, <it>C. thermocellum </it>ATCC 27405.</p>
         </sec>
         <sec>
            <st>
               <p>Construction of sequence similarity heat map</p>
            </st>
            <p>A Perl program was written to generate a 2D sequence similarity matrix based on all-against-all Smith-Waterman alignment scores using 3615 sequences from the <it>C. botulinum </it>protein database. Proteins were ranked by <it>E</it>-values computed by the SSEARCH program with default parameters. The matrix has query sequences on the Y-axis and target database proteins on the X-axis, and its data values correspond to percentile ranks. This approach was used to detect distant pairwise similarities within gene clusters that may reflect ancient gene duplication blocks. The matrix was visualized using Treeview version 1.1.1 <url>http://rana.lbl.gov/EisenSoftware.htm</url>.</p>
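            <p>The conversion from ranked <it>E</it>-values to a percentile-rank matrix can be sketched as follows. This is a minimal Python illustration, not the original Perl program: the function names, the tie-breaking rule, the value assigned to missing hits, and the toy data are all assumptions made for the example.</p>

```python
def percentile_ranks(evalues):
    """Map each target's E-value to a percentile rank (100 = most similar).

    evalues: dict of target id -> E-value for one query (lower is better).
    Ties are broken by input order, an arbitrary choice made for this sketch.
    """
    ordered = sorted(evalues, key=evalues.get, reverse=True)  # worst hit first
    n = len(ordered)
    return {target: 100.0 * (i + 1) / n for i, target in enumerate(ordered)}


def similarity_matrix(all_hits, targets):
    """Build rows (queries, sorted by id) x columns (targets) of percentile ranks."""
    matrix = []
    for query in sorted(all_hits):
        ranks = percentile_ranks(all_hits[query])
        # Targets with no reported hit get rank 0.0 (assumption of this sketch).
        matrix.append([ranks.get(t, 0.0) for t in targets])
    return matrix


# Toy E-values for two hypothetical queries against three hypothetical targets.
hits = {
    "q1": {"t1": 1e-30, "t2": 0.5, "t3": 3.0},
    "q2": {"t1": 2.0, "t2": 1e-8, "t3": 0.1},
}
matrix = similarity_matrix(hits, ["t1", "t2", "t3"])
```

            <p>In this sketch, each row of the matrix describes one query and the best-scoring target in that row receives a rank of 100, so that rows remain comparable even when their raw <it>E</it>-values differ by many orders of magnitude.</p>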
         </sec>
         <sec>
            <st>
               <p>Permutation testing</p>
            </st>
            <p>The PRSS component of the FASTA package <abbrgrp><abbr bid="B21">21</abbr></abbrgrp> was used for sequence reshuffling and the permutation test. The permutation reshuffling test calculates the optimal Smith-Waterman alignments of the first query sequence with N reshuffled versions of the second query sequence. The alignment score of the unshuffled sequences is compared to the distribution of scores obtained using the reshuffled query sequence, which is fit to an extreme value distribution. From this distribution, the probability that the observed alignment score could have resulted from a random sequence of the same composition is estimated. Default parameters were used and 1000 reshuffled sequences were used to generate the random distribution of alignment scores.</p>
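            <p>The logic of the reshuffling test can be illustrated with the short Python sketch below. It is not the PRSS implementation: it assumes a simple match/mismatch scoring scheme with a linear gap penalty instead of a substitution matrix, and it reports an empirical fraction of shuffled scores rather than fitting an extreme value distribution. The sequences and parameter values are invented for the example.</p>

```python
import random


def sw_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score (Smith-Waterman, linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


def permutation_p_value(seq1, seq2, n_shuffles=1000, seed=0):
    """Fraction of shuffled seq2 copies scoring at least as well as seq2."""
    rng = random.Random(seed)
    observed = sw_score(seq1, seq2)
    residues = list(seq2)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(residues)  # preserves composition, destroys order
        if sw_score(seq1, "".join(residues)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return observed, (hits + 1) / (n_shuffles + 1)


# Toy sequences sharing a nine-residue segment, invented for illustration.
score, p = permutation_p_value("MKTNNLVAGNNSWE", "QQMKTNNLVAGPPP", n_shuffles=200)
```

            <p>Because shuffling preserves the amino acid composition of the second sequence, a small value of p in this sketch indicates that the observed score depends on residue order and not merely on composition.</p>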
            <p>To detect potential compositional bias, the composition of CNTs and CBO0798 was analyzed relative to all protein sequences in <it>C. botulinum </it>strain A as a reference. One amino acid type, asparagine, was found to be significantly elevated in both CBO0798 and CNT sequences (Z > 2 standard deviations). To verify that PSI-BLAST hits from CBO0798 to CNT sequences were not due to composition, all asparagine residues were removed from CBO0798 and from the top-scoring hit detected via PSI-BLAST (<it>C. butyricum </it>BoNT/E), and permutation reshuffling tests were repeated using the altered sequences.</p>
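            <p>One way such a compositional comparison could be made is sketched below in Python. The exact statistic used in the analysis is not specified beyond the Z threshold, so the per-sequence fraction, the use of the sample standard deviation, the function names, and the toy sequences are all assumptions of this example.</p>

```python
from statistics import mean, stdev


def residue_fraction(seq, residue):
    """Fraction of positions in seq occupied by the given residue."""
    return seq.count(residue) / len(seq)


def composition_z_score(seq, reference_seqs, residue):
    """Z-score of a residue's frequency in seq against a reference set."""
    ref = [residue_fraction(s, residue) for s in reference_seqs]
    return (residue_fraction(seq, residue) - mean(ref)) / stdev(ref)


def strip_residue(seq, residue):
    """Return seq with every occurrence of residue deleted."""
    return seq.replace(residue, "")


# Toy sequences, invented for illustration only.
reference = ["MKLAVNGT", "MSTNLLKA", "MAAGNKLV", "MKVLNSTA", "MLLKAGTV"]
query = "MNNKNLNNAN"
z = composition_z_score(query, reference, "N")
stripped = strip_residue(query, "N")
```

            <p>In this sketch, a sequence flagged by a large Z value for asparagine would be passed through strip_residue before the permutation test is repeated.</p>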
         </sec>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>A.C.D. designed and performed the analysis and wrote the paper. B.J.M. designed the analysis and co-wrote the paper. E.M.M., M.D.J.L. and K.M.M. assisted with analysis and preparation of the manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>This work was supported by grants from the Natural Sciences and Engineering Research Council (NSERC) to B.J.M., K.M.M., and E.M.M., and Ontario Early Researcher Awards (ERA) to B.J.M. and K.M.M. A.C.D. is a recipient of an NSERC Canada Graduate Scholarship and M.D.J.L. is a recipient of an Ontario Graduate Scholarship (OGS).</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Botulinal neurotoxins: revival of an old killer</p>
            </title>
            <aug>
               <au>
                  <snm>Montecucco</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Molgo</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Curr Opin Pharmacol</source>
            <pubdate>2005</pubdate>
            <volume>5</volume>
            <fpage>274</fpage>
            <lpage>279</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.coph.2004.12.006</pubid>
                  <pubid idtype="pmpid" link="fulltext">15907915</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Kinetic studies on the interaction between botulinum toxin type A and the cholinergic neuromuscular junction</p>
            </title>
            <aug>
               <au>
                  <snm>Simpson</snm>
                  <fnm>LL</fnm>
               </au>
            </aug>
            <source>J Pharmacol Exp Ther</source>
            <pubdate>1980</pubdate>
            <volume>212</volume>
            <fpage>16</fpage>
            <lpage>21</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">6243359</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Botulinum neurotoxin A selectively cleaves the synaptic protein SNAP-25</p>
            </title>
            <aug>
               <au>
                  <snm>Blasi</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Chapman</snm>
                  <fnm>ER</fnm>
               </au>
               <au>
                  <snm>Link</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Binz</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Yamasaki</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>De Camilli</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>S&#252;dhof</snm>
                  <fnm>TC</fnm>
               </au>
               <au>
                  <snm>Niemann</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Jahn</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>1993</pubdate>
            <volume>365</volume>
            <fpage>160</fpage>
            <lpage>163</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/365160a0</pubid>
                  <pubid idtype="pmpid" link="fulltext">8103915</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Crystal structure of botulinum neurotoxin type A and implications for toxicity</p>
            </title>
            <aug>
               <au>
                  <snm>Lacy</snm>
                  <fnm>DB</fnm>
               </au>
               <au>
                  <snm>Tepp</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Cohen</snm>
                  <fnm>AC</fnm>
               </au>
               <au>
                  <snm>DasGupta</snm>
                  <fnm>BR</fnm>
               </au>
               <au>
                  <snm>Stevens</snm>
                  <fnm>RC</fnm>
               </au>
            </aug>
            <source>Nat Struct Biol</source>
            <pubdate>1998</pubdate>
            <volume>5</volume>
            <fpage>898</fpage>
            <lpage>902</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/2338</pubid>
                  <pubid idtype="pmpid" link="fulltext">9783750</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Cocrystal structure of synaptobrevin-II bound to botulinum neurotoxin type B at 2.0 A resolution</p>
            </title>
            <aug>
               <au>
                  <snm>Hanson</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Stevens</snm>
                  <fnm>RC</fnm>
               </au>
            </aug>
            <source>Nat Struct Biol</source>
            <pubdate>2000</pubdate>
            <volume>7</volume>
            <fpage>687</fpage>
            <lpage>692</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/77997</pubid>
                  <pubid idtype="pmpid">10932255</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Structural analysis of the catalytic and binding sites of <it>Clostridium botulinum </it>neurotoxin B</p>
            </title>
            <aug>
               <au>
                  <snm>Swaminathan</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Eswaramoorthy</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Nat Struct Biol</source>
            <pubdate>2000</pubdate>
            <volume>7</volume>
            <fpage>693</fpage>
            <lpage>699</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/78005</pubid>
                  <pubid idtype="pmpid">10932256</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>SV2 Is the Protein Receptor For Botulinum Neurotoxin A</p>
            </title>
            <aug>
               <au>
                  <snm>Dong</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Yeh</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Tepp</snm>
                  <fnm>WH</fnm>
               </au>
               <au>
                  <snm>Dean</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Johnson</snm>
                  <fnm>EA</fnm>
               </au>
               <au>
                  <snm>Janz</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Chapman</snm>
                  <fnm>ER</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2006</pubdate>
            <volume>312</volume>
            <fpage>592</fpage>
            <lpage>596</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1123654</pubid>
                  <pubid idtype="pmpid" link="fulltext">16543415</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Botulinum neurotoxin B recognizes its protein receptor with high affinity and specificity</p>
            </title>
            <aug>
               <au>
                  <snm>Jin</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Rummel</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Binz</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Brunger</snm>
                  <fnm>AT</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2006</pubdate>
            <volume>444</volume>
            <fpage>1092</fpage>
            <lpage>1095</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature05387</pubid>
                  <pubid idtype="pmpid" link="fulltext">17167421</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Structural basis of cell surface receptor recognition by botulinum neurotoxin B</p>
            </title>
            <aug>
               <au>
                  <snm>Chai</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Arndt</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>Dong</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Tepp</snm>
                  <fnm>WH</fnm>
               </au>
               <au>
                  <snm>Johnson</snm>
                  <fnm>EA</fnm>
               </au>
               <au>
                  <snm>Chapman</snm>
                  <fnm>ER</fnm>
               </au>
               <au>
                  <snm>Stevens</snm>
                  <fnm>RC</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2006</pubdate>
            <volume>444</volume>
            <fpage>1096</fpage>
            <lpage>1100</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature05411</pubid>
                  <pubid idtype="pmpid" link="fulltext">17167418</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Phylogeny and taxonomy of the food-borne pathogen <it>Clostridium botulinum </it>and its neurotoxins</p>
            </title>
            <aug>
               <au>
                  <snm>Collins</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>East</snm>
                  <fnm>AK</fnm>
               </au>
            </aug>
            <source>J Appl Microbiol</source>
            <pubdate>1998</pubdate>
            <volume>84</volume>
            <fpage>5</fpage>
            <lpage>17</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1365-2672.1997.00313.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">15244052</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Analysis of neurotoxin cluster genes in <it>Clostridium botulinum </it>strains producing botulinum neurotoxin serotype A subtypes</p>
            </title>
            <aug>
               <au>
                  <snm>Jacobson</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Lin</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Raphael</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Andreadis</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Johnson</snm>
                  <fnm>EA</fnm>
               </au>
            </aug>
            <source>Appl Environ Microbiol</source>
            <pubdate>2008</pubdate>
            <volume>74</volume>
            <fpage>2778</fpage>
            <lpage>2786</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1128/AEM.02828-07</pubid>
                  <pubid idtype="pmpid" link="fulltext">18326685</pubid>
                  <pubid idtype="pmcid">2394882</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Plasmid encoded neurotoxin genes in <it>Clostridium botulinum </it>serotype A subtypes</p>
            </title>
            <aug>
               <au>
                  <snm>Marshall</snm>
                  <fnm>KM</fnm>
               </au>
               <au>
                  <snm>Bradshaw</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Pellett</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Johnson</snm>
                  <fnm>EA</fnm>
               </au>
            </aug>
            <source>Biochem Biophys Res Commun</source>
            <pubdate>2007</pubdate>
            <volume>361</volume>
            <fpage>49</fpage>
            <lpage>54</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.bbrc.2007.06.166</pubid>
                  <pubid idtype="pmpid" link="fulltext">17658467</pubid>
                  <pubid idtype="pmcid">2346372</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Neurotoxin gene clusters in <it>Clostridium botulinum </it>type A strains: sequence comparison and evolutionary implications</p>
            </title>
            <aug>
               <au>
                  <snm>Dineen</snm>
                  <fnm>SS</fnm>
               </au>
               <au>
                  <snm>Bradshaw</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Johnson</snm>
                  <fnm>EA</fnm>
               </au>
            </aug>
            <source>Curr Microbiol</source>
            <pubdate>2003</pubdate>
            <volume>46</volume>
            <fpage>345</fpage>
            <lpage>352</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s00284-002-3851-1</pubid>
                  <pubid idtype="pmpid" link="fulltext">12732962</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>SCOP: a structural classification of proteins database for the investigation of sequences and structures</p>
            </title>
            <aug>
               <au>
                  <snm>Murzin</snm>
                  <fnm>AG</fnm>
               </au>
               <au>
                  <snm>Brenner</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Hubbard</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Chothia</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1995</pubdate>
            <volume>247</volume>
            <fpage>536</fpage>
            <lpage>540</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">7723011</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Pfam: clans, web tools and services</p>
            </title>
            <aug>
               <au>
                  <snm>Finn</snm>
                  <fnm>RD</fnm>
               </au>
               <au>
                  <snm>Mistry</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Schuster-B&#246;ckler</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Griffiths-Jones</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Hollich</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Lassmann</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Moxon</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Marshall</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Khanna</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Durbin</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Eddy</snm>
                  <fnm>SR</fnm>
               </au>
               <au>
                  <snm>Sonnhammer</snm>
                  <fnm>EL</fnm>
               </au>
               <au>
                  <snm>Bateman</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <volume>34</volume>
            <fpage>D247</fpage>
            <lpage>D251</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/nar/gkj149</pubid>
                  <pubid idtype="pmpid" link="fulltext">16381856</pubid>
                  <pubid idtype="pmcid">1347511</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>beta-Trefoil fold. Patterns of structure and sequence in the Kunitz inhibitors interleukins-1 beta and 1 alpha and fibroblast growth factors</p>
            </title>
            <aug>
               <au>
                  <snm>Murzin</snm>
                  <fnm>AG</fnm>
               </au>
               <au>
                  <snm>Lesk</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Chothia</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1992</pubdate>
            <volume>223</volume>
            <fpage>531</fpage>
            <lpage>543</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0022-2836(92)90668-A</pubid>
                  <pubid idtype="pmpid" link="fulltext">1738162</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Crystal structure of colicin Ia</p>
            </title>
            <aug>
               <au>
                  <snm>Wiener</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Freymann</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Ghosh</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Stroud</snm>
                  <fnm>RM</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>1997</pubdate>
            <volume>385</volume>
            <fpage>461</fpage>
            <lpage>464</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/385461a0</pubid>
                  <pubid idtype="pmpid" link="fulltext">9009197</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>MEROPS: the peptidase database</p>
            </title>
            <aug>
               <au>
                  <snm>Rawlings</snm>
                  <fnm>ND</fnm>
               </au>
               <au>
                  <snm>Morton</snm>
                  <fnm>FR</fnm>
               </au>
               <au>
                  <snm>Barrett</snm>
                  <fnm>AJ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <volume>34</volume>
            <fpage>D270</fpage>
            <lpage>D272</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/nar/gkj089</pubid>
                  <pubid idtype="pmpid" link="fulltext">16381862</pubid>
                  <pubid idtype="pmcid">1347452</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Structure, diversity, and evolution of protein toxins from spore-forming entomopathogenic bacteria</p>
            </title>
            <aug>
               <au>
                  <snm>de Maagd</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Bravo</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Berry</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Crickmore</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Schnepf</snm>
                  <fnm>HE</fnm>
               </au>
            </aug>
            <source>Annu Rev Genet</source>
            <pubdate>2003</pubdate>
            <volume>37</volume>
            <fpage>409</fpage>
            <lpage>433</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1146/annurev.genet.37.110801.143042</pubid>
                  <pubid idtype="pmpid" link="fulltext">14616068</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Genome sequence of a proteolytic (Group I) <it>Clostridium botulinum </it>strain Hall A and comparative analysis of the clostridial genomes</p>
            </title>
            <aug>
               <au>
                  <snm>Sebaihia</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Peck</snm>
                  <fnm>MW</fnm>
               </au>
               <au>
                  <snm>Minton</snm>
                  <fnm>NP</fnm>
               </au>
               <au>
                  <snm>Thomson</snm>
                  <fnm>NR</fnm>
               </au>
               <au>
                  <snm>Holden</snm>
                  <fnm>MT</fnm>
               </au>
               <au>
                  <snm>Mitchell</snm>
                  <fnm>WJ</fnm>
               </au>
               <au>
                  <snm>Carter</snm>
                  <fnm>AT</fnm>
               </au>
               <au>
                  <snm>Bentley</snm>
                  <fnm>SD</fnm>
               </au>
               <au>
                  <snm>Mason</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Crossman</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Paul</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Ivens</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Wells-Bennik</snm>
                  <fnm>MH</fnm>
               </au>
               <au>
                  <snm>Davis</snm>
                  <fnm>IJ</fnm>
               </au>
               <au>
                  <snm>Cerde&#241;o-T&#225;rraga</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Churcher</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Quail</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Chillingworth</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Feltwell</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Fraser</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Goodhead</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Hance</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Jagels</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Larke</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Maddison</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Moule</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Mungall</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Norbertczak</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Rabbinowitsch</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Sanders</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Simmonds</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Whithead</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Parkhill</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2007</pubdate>
            <volume>17</volume>
            <fpage>1082</fpage>
            <lpage>1092</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">17519437</pubid>
                  <pubid idtype="doi">10.1101/gr.6282807</pubid>
                  <pubid idtype="pmcid">1899119</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Improved tools for biological sequence comparison</p>
            </title>
            <aug>
               <au>
                  <snm>Pearson</snm>
                  <fnm>WR</fnm>
               </au>
               <au>
                  <snm>Lipman</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1988</pubdate>
            <volume>85</volume>
            <fpage>2444</fpage>
            <lpage>2448</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1073/pnas.85.8.2444</pubid>
                  <pubid idtype="pmpid" link="fulltext">3162770</pubid>
                  <pubid idtype="pmcid">280013</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>The structure of the neurotoxin-associated protein HA33/A from <it>Clostridium botulinum </it>suggests a reoccurring beta-trefoil fold in the progenitor toxin complex</p>
            </title>
            <aug>
               <au>
                  <snm>Arndt</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>Gu</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Jaroszewski</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Schwarzenbacher</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Hanson</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Lebeda</snm>
                  <fnm>FJ</fnm>
               </au>
               <au>
                  <snm>Stevens</snm>
                  <fnm>RC</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2005</pubdate>
            <volume>346</volume>
            <fpage>1083</fpage>
            <lpage>1093</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.jmb.2004.12.039</pubid>
                  <pubid idtype="pmpid" link="fulltext">15701519</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Gapped BLAST and PSI-BLAST: a new generation of protein database search programs</p>
            </title>
            <aug>
               <au>
                  <snm>Altschul</snm>
                  <fnm>SF</fnm>
               </au>
               <au>
                  <snm>Madden</snm>
                  <fnm>TL</fnm>
               </au>
               <au>
                  <snm>Sch&#228;ffer</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Lipman</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1997</pubdate>
            <volume>25</volume>
            <fpage>3389</fpage>
            <lpage>3402</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/nar/25.17.3389</pubid>
                  <pubid idtype="pmpid" link="fulltext">9254694</pubid>
                  <pubid idtype="pmcid">146917</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>CAFASP-1: critical assessment of fully automated structure prediction methods</p>
            </title>
            <aug>
               <au>
                  <snm>Fischer</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Barret</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Bryson</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Elofsson</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Godzik</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Jones</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Karplus</snm>
                  <fnm>KJ</fnm>
               </au>
               <au>
                  <snm>Kelley</snm>
                  <fnm>LA</fnm>
               </au>
               <au>
                  <snm>MacCallum</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>Paw&#322;owski</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Rost</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Rychlewski</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Sternberg</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>1999</pubdate>
            <volume>Suppl 3</volume>
            <fpage>209</fpage>
            <lpage>217</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/(SICI)1097-0134(1999)37:3+&lt;209::AID-PROT27>3.0.CO;2-Y</pubid>
                  <pubid idtype="pmpid">10526371</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Analysis of the Neurotoxin Complex Genes in <it>Clostridium botulinum </it>A1&#8211;A4 and B1 Strains: BoNT/A3, /Ba4 and /B1 Clusters Are Located within Plasmids</p>
            </title>
            <aug>
               <au>
                  <snm>Smith</snm>
                  <fnm>TJ</fnm>
               </au>
               <au>
                  <snm>Hill</snm>
                  <fnm>KK</fnm>
               </au>
               <au>
                  <snm>Foley</snm>
                  <fnm>BT</fnm>
               </au>
               <au>
                  <snm>Detter</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Munk</snm>
                  <fnm>AC</fnm>
               </au>
               <au>
                  <snm>Bruce</snm>
                  <fnm>DC</fnm>
               </au>
               <au>
                  <snm>Doggett</snm>
                  <fnm>NA</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>LA</fnm>
               </au>
               <au>
                  <snm>Marks</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Xie</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Brettin</snm>
                  <fnm>TS</fnm>
               </au>
            </aug>
            <source>PLoS ONE</source>
            <pubdate>2007</pubdate>
            <volume>2</volume>
            <fpage>e1271</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1371/journal.pone.0001271</pubid>
                  <pubid idtype="pmpid" link="fulltext">18060065</pubid>
                  <pubid idtype="pmcid">2092393</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Bacterial flagellins: mediators of pathogenicity and host immune responses in mucosa</p>
            </title>
            <aug>
               <au>
                  <snm>Ramos</snm>
                  <fnm>HC</fnm>
               </au>
               <au>
                  <snm>Rumbo</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sirard</snm>
                  <fnm>JC</fnm>
               </au>
            </aug>
            <source>Trends Microbiol</source>
            <pubdate>2004</pubdate>
            <volume>12</volume>
            <fpage>509</fpage>
            <lpage>517</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.tim.2004.09.002</pubid>
                  <pubid idtype="pmpid" link="fulltext">15488392</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Characterization of botulinum progenitor toxins by mass spectrometry</p>
            </title>
            <aug>
               <au>
                  <snm>Hines</snm>
                  <fnm>HB</fnm>
               </au>
               <au>
                  <snm>Lebeda</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Hale</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Brueggemann</snm>
                  <fnm>EE</fnm>
               </au>
            </aug>
            <source>Appl Environ Microbiol</source>
            <pubdate>2005</pubdate>
            <volume>71</volume>
            <fpage>4478</fpage>
            <lpage>4486</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1128/AEM.71.8.4478-4486.2005</pubid>
                  <pubid idtype="pmpid" link="fulltext">16085839</pubid>
                  <pubid idtype="pmcid">1183299</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Structure of the bacterial flagellar protofilament and implications for a switch for supercoiling</p>
            </title>
            <aug>
               <au>
                  <snm>Samatey</snm>
                  <fnm>FA</fnm>
               </au>
               <au>
                  <snm>Imada</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Nagashima</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Vonderviszt</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Kumasaka</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Yamamoto</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Namba</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2001</pubdate>
            <volume>410</volume>
            <fpage>331</fpage>
            <lpage>337</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35066504</pubid>
                  <pubid idtype="pmpid" link="fulltext">11268201</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Toxigenic Clostridia</p>
            </title>
            <aug>
               <au>
                  <snm>Hatheway</snm>
                  <fnm>CL</fnm>
               </au>
            </aug>
            <source>Clin Microbiol Rev</source>
            <pubdate>1990</pubdate>
            <volume>3</volume>
            <fpage>66</fpage>
            <lpage>98</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">358141</pubid>
                  <pubid idtype="pmpid" link="fulltext">2404569</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Bacterial collagenases and collagen-degrading enzymes and their potential role in human disease</p>
            </title>
            <aug>
               <au>
                  <snm>Harrington</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>Infect Immun</source>
            <pubdate>1996</pubdate>
            <volume>64</volume>
            <fpage>1885</fpage>
            <lpage>1891</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">174012</pubid>
                  <pubid idtype="pmpid" link="fulltext">8675283</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Crystal structure of the N-terminal NC4 domain of collagen IX, a zinc binding member of the laminin-neurexin-sex hormone binding globulin (LNS) domain family</p>
            </title>
            <aug>
               <au>
                  <snm>Lepp&#228;nen</snm>
                  <fnm>VM</fnm>
               </au>
               <au>
                  <snm>Tossavainen</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Permi</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Lehti&#246;</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>R&#246;nnholm</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Goldman</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kilpel&#228;inen</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Pihlajamaa</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2007</pubdate>
            <volume>282</volume>
            <fpage>23219</fpage>
            <lpage>23230</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.M702514200</pubid>
                  <pubid idtype="pmpid" link="fulltext">17553797</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Surprising similarities in structure comparison</p>
            </title>
            <aug>
               <au>
                  <snm>Gibrat</snm>
                  <fnm>JF</fnm>
               </au>
               <au>
                  <snm>Madej</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Bryant</snm>
                  <fnm>SH</fnm>
               </au>
            </aug>
            <source>Curr Opin Struct Biol</source>
            <pubdate>1996</pubdate>
            <volume>6</volume>
            <fpage>377</fpage>
            <lpage>385</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0959-440X(96)80058-3</pubid>
                  <pubid idtype="pmpid" link="fulltext">8804824</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Collagen/collagenase interaction: does the enzyme mimic the conformation of its own substrate?</p>
            </title>
            <aug>
               <au>
                  <snm>De Souza</snm>
                  <fnm>SJ</fnm>
               </au>
               <au>
                  <snm>Pereira</snm>
                  <fnm>HM</fnm>
               </au>
               <au>
                  <snm>Jacchieri</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Brentani</snm>
                  <fnm>RR</fnm>
               </au>
            </aug>
            <source>FASEB J</source>
            <pubdate>1996</pubdate>
            <volume>10</volume>
            <fpage>927</fpage>
            <lpage>930</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">8666171</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>Tetanus neurotoxin-mediated cleavage of cellubrevin impairs epithelial cell migration and integrin-dependent cell adhesion</p>
            </title>
            <aug>
               <au>
                  <snm>Proux-Gillardeaux</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Gavard</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Irinopoulou</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>M&#232;ge</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>Galli</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2005</pubdate>
            <volume>102</volume>
            <fpage>6362</fpage>
            <lpage>6367</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1073/pnas.0409613102</pubid>
                  <pubid idtype="pmpid" link="fulltext">15851685</pubid>
                  <pubid idtype="pmcid">1088364</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Eukaryotic origin of glyceraldehyde-3-phosphate dehydrogenase genes in <it>Clostridium thermocellum </it>and <it>Clostridium cellulolyticum </it>genomes and putative fates of the exogenous gene in the subsequent genome evolution</p>
            </title>
            <aug>
               <au>
                  <snm>Takishita</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Inagaki</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Gene</source>
            <inpress/>
            <note>[<it>2008, Mar 10, doi:10.1016/j.gene.2008.03.001</it>]</note>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pubmed">18420358</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
