<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2164-9-138</ui>
   <ji>1471-2164</ji>
   <fm>
      <dochead>Research article</dochead>
      <bibl>
         <title>
            <p>Comparative genomics-based investigation of resequencing targets in <it>Vibrio fischeri</it>: Focus on point miscalls and artefactual expansions</p>
         </title>
         <aug>
            <au id="A1" ca="yes">
               <snm>Mandel</snm>
               <mi>J</mi>
               <fnm>Mark</fnm>
               <insr iid="I1"/>
               <email>mmandel@wisc.edu</email>
            </au>
            <au id="A2">
               <snm>Stabb</snm>
               <mi>V</mi>
               <fnm>Eric</fnm>
               <insr iid="I2"/>
               <email>estabb@uga.edu</email>
            </au>
            <au id="A3">
               <snm>Ruby</snm>
               <mi>G</mi>
               <fnm>Edward</fnm>
               <insr iid="I1"/>
               <email>egruby@wisc.edu</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Medical Microbiology and Immunology, University of Wisconsin School of Medicine and Public Health, 1550 Linden Drive, Madison WI 53706-1521, USA</p>
            </ins>
            <ins id="I2">
               <p>Department of Microbiology, University of Georgia, 828 Biological Sciences, Athens, GA 30602-2605, USA</p>
            </ins>
         </insg>
         <source>BMC Genomics</source>
         <issn>1471-2164</issn>
         <pubdate>2008</pubdate>
         <volume>9</volume>
         <issue>1</issue>
         <fpage>138</fpage>
         <url>http://www.biomedcentral.com/1471-2164/9/138</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">18366731</pubid>
               <pubid idtype="doi">10.1186/1471-2164-9-138</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>05</day>
               <month>12</month>
               <year>2007</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>25</day>
               <month>3</month>
               <year>2008</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>25</day>
               <month>3</month>
               <year>2008</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2008</year>
         <collab>Mandel et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Sequence closure often represents the end-point of a genome project, without a system in place for subsequent improvement and refinement. Building on the genome project of <it>Vibrio fischeri </it>ES114, we used a comparative approach to identify and investigate genes that had a high likelihood of sequence error.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>Comparison of the <it>V. fischeri </it>ES114 genome with that of conspecific strain MJ11 identified 82 target loci in ES114 as containing likely errors, and thus of high priority for resequencing. Analysis of the targets identified 75 loci in which an error had occurred, resulting in the correction of 10,457 base pairs to generate the new ES114 genomic sequence. A majority of the inaccurate loci involved frameshift errors, correction of which fused adjacent ORFs. Although insertions/deletions are thought to be rare in microbial genome assemblies, fourteen of the loci contained extraneous sequence of over 300 bp, likely due to imperfect contig ends that were misassembled in tandem rather than as overlapping segments. Additionally, we updated the entire genome annotation with 113 new features, including previously uncalled protein-coding genes, regulatory RNA genes and operon leader peptides, and we analyzed the transcriptional apparatus encoded by ES114.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>We demonstrate that errors in microbial genome sequences, thought to be largely confined to point miscalls, can also include prevalent large-scale errors such as artefactual insertions. Ongoing genome quality control and annotation programs are necessary to accompany technological advancements in data generation. These updates further advance <it>V. fischeri </it>as an important model for understanding intercellular communication and colonization of animal tissue.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>In the thirteen years since the announcement of the first complete organism genome <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>, there has been a rapid accumulation of sequence data from complete and draft genomes. The number of complete or almost-complete projects is in the range of 3,000 <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>, but this number is a "moving target," and improvements in sequencing technologies over the past decade ensure continued rapid expansion in the number and diversity of organisms that are analyzed by complete genome sequencing.</p>
         <p>Despite these significant advances in data acquisition, there have not been commensurate improvements in data-quality assessment and refinement during this period. Individual miscalled bases are assumed to be present in practically all completed genome sequences; their frequency has been suggested to be between 1 and 100 errors per 100 kb <abbrgrp><abbr bid="B3">3</abbr></abbrgrp> and has been measured in some instances to be at most 1 error per 88 kb <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B4">4</abbr></abbrgrp>. Errors in microbial genomes are believed to be generally restricted to point miscalls, with large-scale rearrangements rarely occurring <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. To identify and correct errors, recent studies have utilized microarray-based detection, in which errors in a subject genome are identified by comparison to a reference genome that served as the basis for array construction. For example, this method has been employed successfully in <it>Escherichia coli </it><abbrgrp><abbr bid="B5">5</abbr></abbrgrp> and <it>Bacillus anthracis </it><abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. However, these analyses are unidirectional: "errors" are defined as sequence distinct from that of the reference genome, and therefore errors in the reference genome cannot be detected.</p>
         <p>As small nucleotide changes in a genome model often manifest as large protein errors &#8211; for instance, due to introduction of frameshift and nonsense errors &#8211; multiple approaches have capitalized on this protein signal to detect DNA errors in complete genomes <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr></abbrgrp>. By comparing protein-coding sequences in a subject strain to those in a closely related strain or to closely related proteins in molecular databases, one can identify those that are potentially truncated inappropriately in the subject strain and target those regions for resequencing. Targeted resequencing has been applied successfully in <it>B. subtilis </it><abbrgrp><abbr bid="B10">10</abbr></abbrgrp> and <it>Mycobacterium smegmatis </it><abbrgrp><abbr bid="B11">11</abbr></abbrgrp>, and in both cases the errors were restricted to changes in 1&#8211;2 nucleotides. Importantly, Perrodou et al. <abbrgrp><abbr bid="B8">8</abbr></abbrgrp> generalized this method <it>in silico </it>to make it available to any subject organism of interest. Targeted resequencing is efficient and available to a wide range of investigators because: (i) the initial steps are completed <it>in silico </it>prior to proceeding to the wet laboratory; and (ii) when a closely related strain is available, targeted resequencing provides an efficient means to identify discrepancies that alter coding sequence predictions.</p>
         <p>In this study, we focus on the genome of the luminous Gram-negative bacterium <it>Vibrio fischeri </it>ES114. <it>V. fischeri </it>forms symbiotic associations with squid and fish, and the association between <it>V. fischeri </it>and the Hawaiian bobtail squid <it>Euprymna scolopes </it>represents one of the most powerful natural models for the study of mutualistic animal-microbe relationships. Specific strains of symbiotic <it>V. fischeri </it>colonize a dedicated "light organ" in the squid host, multiply to high density, and exhibit luminescence in a density-dependent manner <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>. The light produced by the bacteria is believed to aid the squid host by providing protection from predators: the shadow revealed from the nocturnal-foraging squid in moonlight is camouflaged by the downward-welling light of the host-associated <it>V. fischeri </it><abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. In return, the bacterium benefits from a protected, nutrient-rich environment. This was the first system in which it was shown that a specific symbiont directs normal animal development <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>, and now represents an emerging model for cross-kingdom genomics-based studies.</p>
         <p>The genomic potential for this system is based on a strong history of molecular inquiry on both the symbiont and host sides of the interaction. First, the complete genome sequence of squid symbiont <it>V. fischeri </it>ES114 has been published and studied, and the sequence revealed novel insights into pilin gene diversity and the distribution of toxin genes in beneficial bacteria <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. Second, based on the genome sequence a number of global studies have been initiated; the first sets to be published yield novel results about how chemical communication among <it>V. fischeri </it>strains regulates bacterial behavior <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr></abbrgrp> and how two-component signal transduction affects host interaction <abbrgrp><abbr bid="B19">19</abbr><abbr bid="B20">20</abbr></abbrgrp>. Third, an EST library of the squid host <abbrgrp><abbr bid="B21">21</abbr></abbrgrp> has provided novel insight into cephalopod genetic capabilities and widely conserved signaling pathways such as the NF-&#954;B pathway <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>. Fourth, the phenomenon we now call quorum sensing &#8211; autoinduced density-dependent cell-cell communication &#8211; was first described in <it>V. fischeri </it><abbrgrp><abbr bid="B23">23</abbr></abbrgrp>, and a number of evolutionary and modeling studies of this process have focused on the well-characterized systems in <it>V. fischeri</it>. Fifth, by having access to the natural host &#8211; a rarity among systems in which high-throughput genetic and genomics approaches are applicable &#8211; we can exploit the high information content in the coevolved squid-<it>Vibrio </it>relationship to learn how closely related pathogenic marine microbes interact with natural hosts that have yet to be identified. Sixth, the draft genome of a second strain of <it>V. fischeri</it>, the fish symbiont MJ11, is being completed and will provide a strong platform for applying comparative genomic approaches to the study of host-specificity.</p>
         <p>While undertaking such a comparative study among <it>V. fischeri </it>strains, we detected a high incidence of suspected genomic anomalies in the published sequence of <it>V. fischeri </it>ES114. We resequenced these suspect regions and found 91% of these loci to be in error. Notably, in fourteen of the cases we detected misincorporation of extraneous sequence in the published assembly, leading to the appearance of duplicated DNA where none existed. In five other cases, the sequence in the suspect region was correct in the published sequence and the resulting gene product would be predicted to be nonfunctional; we therefore designated these features as pseudogenes in ES114. In addition to correcting these features, we completed a full genomic update of ES114 gene annotations, and added 113 previously unannotated genes to release 2.0 of the ES114 annotation. Together these updates advance <it>V. fischeri </it>as a platform for functional and comparative genomic studies, and demonstrate how a targeted set of approaches can substantially improve genome quality.</p>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <sec>
            <st>
               <p>Identification of suspect genomic regions</p>
            </st>
            <p>We obtained the draft genome sequence of <it>V. fischeri </it>strain MJ11 and, as part of our initial analysis, we conducted a number of reciprocal BLAST analyses to compare its predicted proteome with that of the completely sequenced conspecific strain ES114 <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. We used BLASTP <abbrgrp><abbr bid="B24">24</abbr></abbrgrp> to identify orthologs between the two strains, using a modified reciprocal best-hit approach as outlined in the Methods. A surprising outcome from this analysis was the occurrence of over seventy protein-coding genes in MJ11 with reciprocal best-hits to two neighboring genes in ES114. At the time that we were performing this analysis, a handful of cases were being identified empirically in which neighboring genes in ES114 were actually one gene, and in which the appearance of two genes resulted from frameshift or nonsense errors in the original sequence data. Examples that were identified independently of this work include <it>ptsI </it><abbrgrp><abbr bid="B25">25</abbr></abbrgrp>, <it>fnr </it>(J.L. Bose and E.V.S., unpublished data), and <it>acs </it>(S.V. Studer &amp; E.G.R., unpublished data).</p>
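            <p>The screen for split genes described above can be illustrated with a minimal sketch. It assumes that a best-hit table has already been computed (for example, from BLASTP output) and flags reference genes that are matched by two or more adjacent subject genes. The function name, the toy gene identifiers, and the strict adjacency rule are hypothetical, and the sketch does not reproduce the modified reciprocal best-hit procedure given in the Methods.

```python
from collections import defaultdict


def find_split_candidates(best_hit, gene_order):
    """Return {reference_gene: [adjacent subject genes]} for suspected split ORFs.

    best_hit   -- maps each subject gene to its best hit in the reference proteome
    gene_order -- subject genes listed in chromosomal order
    Both inputs are hypothetical stand-ins for parsed BLASTP results.
    """
    position = {gene: index for index, gene in enumerate(gene_order)}

    # Group subject genes by the reference gene they hit best.
    by_reference = defaultdict(list)
    for gene, reference in best_hit.items():
        by_reference[reference].append(gene)

    candidates = {}
    for reference, genes in by_reference.items():
        genes.sort(key=position.__getitem__)
        # Keep only groups of two or more genes that sit next to each other.
        adjacent = all(position[b] - position[a] == 1
                       for a, b in zip(genes, genes[1:]))
        if len(genes) >= 2 and adjacent:
            candidates[reference] = genes
    return candidates


# Toy data: g2 and g3 are neighbors that both hit r2, so r2 is flagged.
# g1 and g5 both hit r1 but are not neighbors, so r1 is not flagged.
gene_order = ["g1", "g2", "g3", "g4", "g5"]
best_hit = {"g1": "r1", "g2": "r2", "g3": "r2", "g4": "r3", "g5": "r1"}
print(find_split_candidates(best_hit, gene_order))
```

            In this simplified form, each flagged reference gene would correspond to one candidate locus for resequencing; a real analysis would also need alignment coverage and score thresholds, which are not modeled here.</p>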
            <p>Analysis of the suspect regions supported the hypothesis that there were a large number of loci in ES114 in which sequencing errors had led to the miscalling of one gene as multiple ORFs. First, we identified a number of genes that are essential in <it>Escherichia coli </it>and other bacteria, but that were split in version 1.0 of the ES114 sequence. These included <it>dnaG</it>, <it>ftsQ</it>, <it>mukB</it>, <it>nusG</it>, <it>rplC</it>, <it>rplN</it>, <it>rplO</it>, <it>rpoB</it>, <it>rpoC</it>, <it>thrS</it>, and <it>tilS </it><abbrgrp><abbr bid="B26">26</abbr><abbr bid="B27">27</abbr></abbrgrp>, and the conditionally-essential <it>rpoH </it><abbrgrp><abbr bid="B28">28</abbr></abbrgrp>. Second, we identified eleven ambiguous bases (i.e., "N" listed in the nucleotide sequence) that had been called in the original sequence, and the incidence of these bases correlated with the presence of suspect ORFs.</p>
            <p>In addition to suspected frameshifts and substitutions, we also identified fourteen regions in which it appeared that extraneous sequence had been incorporated that was highly similar to neighboring sequence, with the size of the duplicated/extraneous region ranging from 318 bp to 1264 bp. In one case, pre-genomic sequencing of a suspect region did not identify any repeated sequence <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>. Therefore, we hypothesized that these regions represented assembly errors in which the same stretch of DNA was mistakenly incorporated twice into the genome's sequence. These regions typically contained a few unique base pairs at either end &#8211; likely due to low-coverage sequencing &#8211; that led to the misincorporation, but were otherwise essentially a direct repeat of DNA that had the effect of introducing extra and/or truncated ORFs.</p>
            <p>A list of the loci targeted for resequencing was assembled and each was assigned a "target number"; that number is used consistently in tables and figures so that the primer sequences used to analyze the data may be correlated with the resulting sequence and analysis.</p>
            <p>In addition to the BLASTP-based identification of potential errors, we undertook a full-length visual comparison of the chromosomes of <it>V. fischeri </it>ES114 and MJ11. Given the prevalence of errors detected by identifying adjacent ORFs that likely represented a single ORF, we hypothesized that there were probably other cases of errors that would not have manifested themselves in this way. Examples of other suspected errors that warranted investigation included situations in which one of the fragmented ORFs was too small to be detected as an ortholog candidate by the BLASTP filters, or in which the second fragment did not lead to a predicted open reading frame. Using the program Mauve <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>, we analyzed ORFs along the length of the chromosomes, identifying candidates that had suspect 5' or 3' ends. In some cases, these appeared to result solely from annotation differences, in which identical sequences had predicted translational start sites (5' boundaries) that were called at distinct points in the two annotations. In other cases, sequence differences underlay the unique ORF boundaries, and we targeted those for our analysis. Furthermore, there were three cases in which putative extraneous sequence was visually identified in intergenic sequence, which could not have been detected by BLASTP analysis in the absence of annotated ORFs (target nos. 130, 172, 178). These cases were added to the list of targeted loci. Finally, any remaining ambiguous bases in the sequence were targeted for resequencing.</p>
         </sec>
         <sec>
            <st>
               <p>Sequence clarification</p>
            </st>
            <p>We examined a total of 82 targets for resequencing. Our general approach involved amplifying across the target, and then sequencing the amplified product with the PCR primers. In cases where we were clarifying the sequence following a large detected "deletion" (missing sequence from what is predicted from the published sequence), we amplified a larger product and sequenced from a set of sequencing primers across both strands. For the oligonucleotide primers used for PCR and sequencing, see Additional file <supplr sid="S1">1</supplr>. With one exception, all of the primer pairs amplified products in which there was a clear, predominant band, and thus served as satisfactory templates for sequencing. The primer pair that failed to amplify (target no. 180) included a primer that was in a region that does not exist in the true ES114 sequence, as clarified by our analysis of target no. 182. Therefore, the absence of a band in this case supports the deletion that emerged from target no. 182.</p>
            <suppl id="S1">
               <title>
                  <p>Additional file 1</p>
               </title>
               <text>
                  <p>Table listing oligonucleotide primers. The PCR and sequencing primers used to analyze resequencing targets.</p>
               </text>
               <file name="1471-2164-9-138-S1.xls">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>Seventy-five of the 82 sets (91%) of resequencing targets examined were found to be in error in the published ES114 sequence. The errors, subsequent changes, new locus tags, and new annotations are listed in Table <tblr tid="T1">1</tblr>. Conceptual diagrams of representative sequencing and other annotation changes discussed in this report are illustrated in Figure <figr fid="F1">1</figr>. Note that with this update, the locus tag format has been modified to the new NCBI format for locus tags (underscore following the "VF" prefix, which denotes <it>V. fischeri </it>ES114). As a convention, in cases of gene fusion, the 5'-proximal (N-terminal-encoding) fragment retained its locus tag identifier, while the identifier(s) for the remaining gene fragment(s) were deaccessioned.</p>
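            <p>As an illustration of the locus tag change, the following sketch converts an old-style tag such as VF0039 to the underscore form VF_0039. The function name is hypothetical, and the pattern handles only tags made of the "VF" prefix followed by digits, which is an assumption based on the tags shown in Table 1; other tag forms are rejected rather than guessed at.

```python
import re


def to_new_locus_tag(old_tag):
    """Insert an underscore after the VF prefix (VF0039 becomes VF_0039).

    Only tags of the form VF followed by digits are accepted; this narrow
    pattern is an assumption made for illustration.
    """
    match = re.fullmatch(r"VF(\d+)", old_tag)
    if match is None:
        raise ValueError(f"not an old-style VF locus tag: {old_tag!r}")
    return f"VF_{match.group(1)}"


# Example using a deaccessioned identifier listed in Table 1.
print(to_new_locus_tag("VF0039"))
```

            Rejecting unrecognized input keeps the conversion from silently producing tags that were never assigned.</p>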
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Types of genomic changes described</p>
               </caption>
               <text>
                  <p><b>Types of genomic changes described</b>. Examples of the types of chromosomal corrections (A-C) and annotation corrections (D-F) described throughout the paper. The case in (B) shows the artefactual expansions that were removed in this analysis. v1 refers to the previously published version 1.0 release, and v2 refers to the version 2.0 release reported here.</p>
               </text>
               <graphic file="1471-2164-9-138-1"/>
            </fig>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p><it>V. fischeri </it>ES114 loci modified due to sequence changes.</p>
               </caption>
               <tblbdy cols="8">
                  <r>
                     <c ca="left">
                        <p>
                           <b>Locus tag</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Gene</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Description</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Correction</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>s/m</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Effect on ORFs</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Locus tag deaccessioned</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Target</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="8">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_0040</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>yidZ</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>transcriptional regulator, LysR family</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF0039</p>
                     </c>
                     <c ca="left">
                        <p>101</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_0044</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>rmuC</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>predicted recombination limiting protein</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF0045</p>
                     </c>
                     <c ca="left">
                        <p>102</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_0056</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>rhlB</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>ATP-dependent RNA helicase</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF0055</p>
                     </c>
                     <c ca="left">
                        <p>103</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_0093</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>add</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>adenosine deaminase</p>
                     </c>
                     <c ca="left">
                        <p>dl</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF0092</p>
                     </c>
                     <c ca="left">
                        <p>104</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_0124</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>slmA</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>division inhibitor</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF0123</p>
                     </c>
                     <c ca="left">
                        <p>105</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_0157</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>wbfB</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>WbfB protein</p>
                     </c>
                     <c ca="left">
                        <p>fs, ms, n</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF0156</p>
                     </c>
                     <c ca="left">
                        <p>106</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_0160</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>wbfD</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>WbfD protein</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF0159</p>
                     </c>
                     <c ca="left">
                        <p>107</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_0214</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>prkB</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>phosphoribulokinase</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF0213</p>
                     </c>
                     <c ca="left">
                        <p>109</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_0220</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>kefB</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>potassium:proton antiporter</p>
                     </c>
                     <c ca="left">
                        <p>fs, ns</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF0221</p>
                     </c>
                     <c ca="left">
                        <p>110</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_0235</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>rplC</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>50S ribosomal subunit protein L3</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF0236</p>
                     </c>
                     <c ca="left">
                        <p>111</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_0246</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>rplN</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>50S ribosomal subunit protein L14</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF0247</p>
                     </c>
                     <c ca="left">
                        <p>112</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_0256</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>rplO</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>50S ribosomal subunit protein L15</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>3' extension</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>168</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_0281</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>yjjP</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>predicted inner membrane protein</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF0282</p>
                     </c>
                     <c ca="left">
                        <p>113</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_0300</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>putative salt-induced outer membrane protein</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF0299</p>
                     </c>
                     <c ca="left">
                        <p>114</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_0397</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>yrbC</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>predicted ABC-type organic solvent transporter</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF0398</p>
                     </c>
                     <c ca="left">
                        <p>116</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_0418</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>dgkA</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>diacylglycerol kinase</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>3' extension</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>169</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_0420</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>mltC</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>membrane-bound lytic murein transglycosylase C</p>
                     </c>
                     <c ca="left">
                        <p>fs, ms</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF0419</p>
                     </c>
                     <c ca="left">
                        <p>117</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_0481</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>glmM</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>phosphoglucosamine mutase</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF0482</p>
                     </c>
                     <c ca="left">
                        <p>118</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_0651</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>amino-acid ABC transporter binding protein</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>3' extension</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>170</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_0657</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>succinylglutamate desuccinylase/aspartoacylase family protein</p>
                     </c>
                     <c ca="left">
                        <p>n</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>ambiguous residue clarified</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>179</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_0729</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>nqrE</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>sodium-translocating NADH:quinone oxidoreductase, subunit E</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF0730</p>
                     </c>
                     <c ca="left">
                        <p>119</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_0762</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>ychF</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>predicted GTP-binding protein</p>
                     </c>
                     <c ca="left">
                        <p>fs, ms</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF0761</p>
                     </c>
                     <c ca="left">
                        <p>120</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_0960</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>tolA</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>membrane-anchored protein in TolA-TolQ-TolR complex</p>
                     </c>
                     <c ca="left">
                        <p>dl</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF0961</p>
                     </c>
                     <c ca="left">
                        <p>171</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_0993</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>icmF</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>secretion protein IcmF</p>
                     </c>
                     <c ca="left">
                        <p>dl</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF0992</p>
                     </c>
                     <c ca="left">
                        <p>182</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_1031</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>trpG</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>anthranilate phosphoribosyltransferase</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF1030</p>
                     </c>
                     <c ca="left">
                        <p>122</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_1214</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>thrS</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>threonyl-tRNA synthetase</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF1215</p>
                     </c>
                     <c ca="left">
                        <p>123</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_1304</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>copper-exporting ATPase</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF1305</p>
                     </c>
                     <c ca="left">
                        <p>125</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_1308</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>fnr</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>transcriptional regulatory protein Fnr, global regulator of anaerobic growth</p>
                     </c>
                     <c ca="left">
                        <p>ms, ns</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF1309</p>
                     </c>
                     <c ca="left">
                        <p>183</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_1358</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>fdnI</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>formate dehydrogenase N, gamma subunit</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF1357</p>
                     </c>
                     <c ca="left">
                        <p>126</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_1515</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>GGDEF domain protein</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>3' extension</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>185</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_1669</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>menB</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>dihydroxynaphthoic acid synthetase</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF1668</p>
                     </c>
                     <c ca="left">
                        <p>127</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_1771</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>prkA</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>serine kinase PrkA</p>
                     </c>
                     <c ca="left">
                        <p>dl</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF1772</p>
                     </c>
                     <c ca="left">
                        <p>128</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_2633</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>lipoprotein, putative</p>
                     </c>
                     <c ca="left">
                        <p>dl</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>none</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>172</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_1828</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>C-terminal CheW domain, putative chemotaxis coupling protein</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>fusion, 3' extension</p>
                     </c>
                     <c ca="left">
                        <p>VF1827</p>
                     </c>
                     <c ca="left">
                        <p>129</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>None</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Intergenic: VF_1856 &#8211; VF_1858</p>
                     </c>
                     <c ca="left">
                        <p>dl</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>deletion</p>
                     </c>
                     <c ca="left">
                        <p>VF1857</p>
                     </c>
                     <c ca="left">
                        <p>130</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_1895</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>ptsI</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>PEP-protein phosphotransferase of PTS system (enzyme I)</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF1896</p>
                     </c>
                     <c ca="left">
                        <p>184</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_1932</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>fadE</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>acyl coenzyme A dehydrogenase</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF1933</p>
                     </c>
                     <c ca="left">
                        <p>131</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_1938</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>hydroxyacylglutathione hydrolase</p>
                     </c>
                     <c ca="left">
                        <p>ms, n</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>amino acid substitutions</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>121</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_1945</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>tilS</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>tRNA(Ile)-lysidine synthetase</p>
                     </c>
                     <c ca="left">
                        <p>dl</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF1944</p>
                     </c>
                     <c ca="left">
                        <p>132</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_2049</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>malZ</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>maltodextrin glucosidase</p>
                     </c>
                     <c ca="left">
                        <p>fs, ns</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF2050</p>
                     </c>
                     <c ca="left">
                        <p>133</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_2078</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>mazG</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>nucleoside triphosphate pyrophosphohydrolase</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF2077</p>
                     </c>
                     <c ca="left">
                        <p>134</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_2152</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>amtB</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>ammonium transporter</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>3' extension</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>173</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_2166</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>pcnB</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>poly(A) polymerase I</p>
                     </c>
                     <c ca="left">
                        <p>fs, ms, ns</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF2167</p>
                     </c>
                     <c ca="left">
                        <p>135</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_2181</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>aceE</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>pyruvate dehydrogenase, decarboxylase component E1, thiamin-binding</p>
                     </c>
                     <c ca="left">
                        <p>dl</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF2180</p>
                     </c>
                     <c ca="left">
                        <p>136</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_2199</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>ftsQ</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>cell division protein FtsQ</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF2198</p>
                     </c>
                     <c ca="left">
                        <p>137</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_2220</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>ubiquinol-cytochrome c reductase iron-sulfur subunit</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF2219</p>
                     </c>
                     <c ca="left">
                        <p>138</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_2252</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>dnaG</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>DNA primase</p>
                     </c>
                     <c ca="left">
                        <p>fs, ms, ns</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF2253</p>
                     </c>
                     <c ca="left">
                        <p>139</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_2347</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>cysE</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>serine acetyltransferase</p>
                     </c>
                     <c ca="left">
                        <p>fs, ms</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF2346</p>
                     </c>
                     <c ca="left">
                        <p>140</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_2366</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>znuA2</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>high-affinity zinc uptake system protein ZnuA2</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF2365</p>
                     </c>
                     <c ca="left">
                        <p>141</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_2370</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>yeiR</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>predicted enzyme</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF2371</p>
                     </c>
                     <c ca="left">
                        <p>142</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_2377</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>hypothetical protein</p>
                     </c>
                     <c ca="left">
                        <p>dl</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF2378</p>
                     </c>
                     <c ca="left">
                        <p>143</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_2383</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>acs</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>acetyl-CoA synthetase</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF2384</p>
                     </c>
                     <c ca="left">
                        <p>144</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_2389</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>dusB</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>tRNA-dihydrouridine synthase B</p>
                     </c>
                     <c ca="left">
                        <p>fs, ms</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF2390</p>
                     </c>
                     <c ca="left">
                        <p>145</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_2412</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>rpoC</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>RNA polymerase, beta prime subunit</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF2411</p>
                     </c>
                     <c ca="left">
                        <p>146</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_2414</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>rpoB</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>RNA polymerase, beta subunit</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion, 5' extension</p>
                     </c>
                     <c ca="left">
                        <p>VF2413</p>
                     </c>
                     <c ca="left">
                        <p>147&#8211;148</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_2418</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>rplA</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>50S ribosomal subunit protein L1</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF2417</p>
                     </c>
                     <c ca="left">
                        <p>149</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_2421</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>nusG</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>transcription termination factor NusG</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF2420</p>
                     </c>
                     <c ca="left">
                        <p>150</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_2450</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>rpoH</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>RNA polymerase, sigma-32 (sigma-H) factor</p>
                     </c>
                     <c ca="left">
                        <p>fs, ms, n</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF2449</p>
                     </c>
                     <c ca="left">
                        <p>151</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_2463</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>nudE</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>ADP-ribose diphosphatase</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>5' extension</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>174</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_2528</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>ilvC</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>ketol-acid reductoisomerase, NAD(P)-binding</p>
                     </c>
                     <c ca="left">
                        <p>dl</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VF2526, VF2527</p>
                     </c>
                     <c ca="left">
                        <p>152</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_A0046</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>acriflavin resistance plasma membrane protein</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VFA0047</p>
                     </c>
                     <c ca="left">
                        <p>153</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_A0244</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>GGDEF/EAL domains protein</p>
                     </c>
                     <c ca="left">
                        <p>fs, dl</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VFA0242, VFA0243</p>
                     </c>
                     <c ca="left">
                        <p>154&#8211;155</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_A0251</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>fdhF</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>formate dehydrogenase-H</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VFA0252</p>
                     </c>
                     <c ca="left">
                        <p>156</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_A0304</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>hypothetical protein</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>5' extension</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>176</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_A0338</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>putative glucosyl hydrolase precursor</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VFA0337</p>
                     </c>
                     <c ca="left">
                        <p>158</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_A0353</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>galT</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>galactose-1-phosphate uridylyltransferase</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VFA0354</p>
                     </c>
                     <c ca="left">
                        <p>159</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_A0432</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>mukB</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>fused chromosome partitioning protein: predicted nucleotide hydrolase</p>
                     </c>
                     <c ca="left">
                        <p>fs, ms</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VFA0433</p>
                     </c>
                     <c ca="left">
                        <p>160</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_A0460</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>mfd</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>transcription-repair coupling factor</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>s</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VFA0459</p>
                     </c>
                     <c ca="left">
                        <p>161</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>None</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Intergenic: VF_A0655-VF_A0666</p>
                     </c>
                     <c ca="left">
                        <p>fs, ms, n</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>178</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_A0832</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>putA</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>proline dehydrogenase</p>
                     </c>
                     <c ca="left">
                        <p>dl</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VFA0831</p>
                     </c>
                     <c ca="left">
                        <p>162</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_A0856</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>hypothetical protein</p>
                     </c>
                     <c ca="left">
                        <p>dl</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VFA0855</p>
                     </c>
                     <c ca="left">
                        <p>163</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_A1008</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>hypothetical protein</p>
                     </c>
                     <c ca="left">
                        <p>fs, ms</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VFA1009</p>
                     </c>
                     <c ca="left">
                        <p>165</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_A1152</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>acrA</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>multidrug efflux system</p>
                     </c>
                     <c ca="left">
                        <p>fs</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VFA1151, VFA1150</p>
                     </c>
                     <c ca="left">
                        <p>166</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_A1156</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>ATP-dependent DEXH-box helicase</p>
                     </c>
                     <c ca="left">
                        <p>dl</p>
                     </c>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>fusion</p>
                     </c>
                     <c ca="left">
                        <p>VFA1157</p>
                     </c>
                     <c ca="left">
                        <p>167</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>Correction types: dl, large deletion; fs, frameshift; ms, missense; ns, nonsense; n, ambiguous nucleotide. s/m indicates whether (s)ingle or (m)ultiple nucleotides were affected by the sequence change.</p>
               </tblfn>
            </tbl>
            <p>False large-scale genomic rearrangements such as insertions are thought to arise only rarely in microbial genome projects <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B11">11</abbr></abbrgrp>; however, we confirmed that each of the fourteen predicted insertions in the published sequence was erroneous by amplifying across each site from its unique flanking regions and demonstrating that the bands obtained were inconsistent with the previous sequence model (Figure <figr fid="F2">2</figr>). In each case, the bands observed were smaller than predicted, and the sequence obtained led to the precise deletion of the extraneous repeated DNA in the new model.</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Evidence of expansions at multiple chromosomal sites</p>
               </caption>
               <text>
                  <p><b>Evidence of expansions at multiple chromosomal sites</b>. The fourteen resequencing targets examined had extraneous sequence in the published version. In each case, correction of the error required a large deletion (over 300 bp). For each target, the closed arrowhead indicates the band observed upon amplification with the PCR primers listed, whereas the open arrowhead indicates the product size expected from the published version 1.0 sequence. Marker sizes are indicated in kb.</p>
               </text>
               <graphic file="1471-2164-9-138-2"/>
            </fig>
            <p>Most of the resulting changes led to the fusion of two &#8211; or in some cases three &#8211; neighboring ORFs, and/or the extension of ORFs at the 5' or 3' end (Figure <figr fid="F1">1A&#8211;C</figr>; Table <tblr tid="T1">1</tblr>). In one case (target no. 178), the deletion affected only an intergenic region that contains no annotated features. In another case (target no. 172), the deletion identified by visual analysis in Mauve affected what was believed to be a 1498-bp intergenic region between <it>rluE</it> and VF_1777. The corrected sequence revealed that this region is only 368 bp in length, and also that it contains a gene encoding a predicted lipoprotein conserved in <it>V. fischeri</it> MJ11. The new release reflects the sequence deletion as well as the added annotation for this gene (VF_2633).</p>
            <p>The sequence corrections led us to propose a number of protein annotations that were consistent with our predictions. With the corrected sequence, many conserved genes now more closely resemble their orthologs in other species. In other cases, the domain structure of even poorly characterized proteins supported the accuracy of the corrections. For example, correcting a frameshift error at target no. 185 extended the 3' end of VF_1515. Analysis of protein domains by conserved domain search (CDD; <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>) had identified an incomplete GGDEF (diguanylate cyclase) domain at the protein's C-terminus, and correction of the frameshift brought the entire domain into the ORF.</p>
         </sec>
         <sec>
            <st>
               <p>Pseudogenes and degeneration in <it>umuC</it></p>
            </st>
            <p>In some cases, we confirmed that the published ES114 sequence was correct and that the ORF boundaries (5' or 3' end, or the presence of two genes instead of one) had been called correctly in ES114 version 1.0. Table <tblr tid="T2">2</tblr> lists the five cases that we can now more confidently assume to be pseudogenes in ES114, because their predicted amino acid sequences suggest that they are nonfunctional. In each case, the indicated defect is predicted to interrupt a significant portion of the coding sequence required for function in well-characterized homologs. The N-acetylglucosaminyltransferase VF_A0466 has two (apparently functional) paralogs in the genome, and ES114 is capable of utilizing N-acetylglucosamine as a sole source of nitrogen and carbon (data not shown); therefore, the appearance of a pseudogene at this location does not have obvious functional consequences for the cell.</p>
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Pseudogenes described in ES114 version 2.0.</p>
               </caption>
               <tblbdy cols="5">
                  <r>
                     <c ca="left">
                        <p>
                           <b>Locus tag</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Previous</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Homolog</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Defect in <it>V. fischeri </it>ES114 versus MJ11</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Target</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_0198</p>
                     </c>
                     <c ca="left">
                        <p>VF0198, VF0199</p>
                     </c>
                     <c ca="left">
                        <p><it>ugd</it>, UDP-glucose 6-dehydrogenase, capsule biosynthetic gene</p>
                     </c>
                     <c ca="left">
                        <p>+1 frameshift</p>
                     </c>
                     <c ca="left">
                        <p>108</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_1268</p>
                     </c>
                     <c ca="left">
                        <p>VF1267, VF1268</p>
                     </c>
                     <c ca="left">
                        <p><it>umuC</it>, DNA polymerase V subunit</p>
                     </c>
                     <c ca="left">
                        <p>amber nonsense codon and 5 bp repeat expansion</p>
                     </c>
                     <c ca="left">
                        <p>124</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_A0141</p>
                     </c>
                     <c ca="left">
                        <p>VFA0141</p>
                     </c>
                     <c ca="left">
                        <p>putative transporter, NadC family protein</p>
                     </c>
                     <c ca="left">
                        <p>-1 frameshift</p>
                     </c>
                     <c ca="left">
                        <p>175</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_A0270</p>
                     </c>
                     <c ca="left">
                        <p>VFA0270, VFA0271</p>
                     </c>
                     <c ca="left">
                        <p>transcriptional regulator, LysR family</p>
                     </c>
                     <c ca="left">
                        <p>amber nonsense codon</p>
                     </c>
                     <c ca="left">
                        <p>157</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_A0466</p>
                     </c>
                     <c ca="left">
                        <p>VFA0466</p>
                     </c>
                     <c ca="left">
                        <p>N-acetylglucosaminyltransferase</p>
                     </c>
                     <c ca="left">
                        <p>-1 frameshift</p>
                     </c>
                     <c ca="left">
                        <p>177</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <p>Little is known about the remaining four pseudogenes, with the exception of <it>umuC</it>. The transcriptional organization of the genes encoding the DNA polymerase V subunits, <it>umuD </it>and <it>umuC</it>, is conserved between ES114 and MJ11 (Additional file <supplr sid="S5">5</supplr>). However, <it>umuC </it>has uniquely degenerated in ES114, with both a nonsense codon and a 5-bp repeat expansion following the nonsense codon. DNA polymerase V is responsible for error-prone translesion synthesis (e.g., following UV-irradiation), which allows DNA synthesis to proceed despite a high rate of error incorporation <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>, yet there are organisms, including <it>V. cholerae </it>El Tor, that apparently do not encode these functions <abbrgrp><abbr bid="B33">33</abbr><abbr bid="B34">34</abbr></abbrgrp>. Whether the situation in ES114 represents an evolutionary transition state, or instead this arrangement (<it>umuD</it><sup>+</sup><it>umuC</it><sup>-</sup>) has relevant functional implications, remains to be determined.</p>
            <suppl id="S5">
               <title>
                  <p>Additional file 5</p>
               </title>
               <text>
                  <p>Figure of <it>umuDC </it>degeneration in ES114. Alignment of <it>umuDC </it>in strains ES114 and MJ11.</p>
               </text>
               <file name="1471-2164-9-138-S5.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
         </sec>
         <sec>
            <st>
               <p>Annotation of previously uncalled protein-coding genes, regulatory RNAs, and operon leader peptides</p>
            </st>
            <p>Because examination of the intergenic region corrected by target no. 172 revealed a likely protein-coding gene, we asked whether there were other genes present within the ES114 sequence that were previously unannotated. Additionally, regulatory RNA genes had not been previously annotated in the <it>V. fischeri </it>genome, yet they are known to play important roles in <it>V. fischeri </it>and other diverse bacteria <abbrgrp><abbr bid="B35">35</abbr><abbr bid="B36">36</abbr></abbrgrp>. Therefore, we undertook an effort to systematically identify ORFs and regulatory RNA genes that had not been called in the published version 1.0 sequence.</p>
            <p>To accomplish this search we took advantage of the annotations present in the J. Craig Venter Institute's Comprehensive Microbial Resource (JCVI CMR), which include <it>ab initio </it>gene-calls that can differ from those in the deposited GenBank flatfiles <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>. We examined the list of approximately 150 novel genes identified in the CMR. We excluded candidates that were unlikely to be biologically significant &#8211; generally, short ORFs that were encoded on the opposite strand against much larger ORFs &#8211; and were left with 53 likely novel ORFs (Additional file <supplr sid="S4">4</supplr>, Basis code "A"). We also took advantage of the presence of the closely related MJ11 strain as a source for novel gene annotations. Of the MJ11 proteins that did not have an ortholog in ES114, we examined those in which a TBLASTN query of MJ11 proteins against the ES114 genome yielded a percent amino acid identity value of at least 85%. We excluded candidates in which there was low biological basis for the assignment (as described above), or in which the open reading frame was not conserved in ES114. There were 72 novel ES114 ORFs assigned by comparison with MJ11 (Additional file <supplr sid="S4">4</supplr>, Basis "B"). Thirty ORFs were called by both methods (CMR and MJ11 conservation), whereas 65 genes were called by only one method, for a total of 95 chromosomal protein-coding genes that we predict to have been previously uncalled in ES114 (Table <tblr tid="T3">3</tblr>, Additional file <supplr sid="S4">4</supplr>).</p>
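            <p>The arithmetic behind these counts can be checked with a minimal sketch, shown here in Python. The three input numbers (53, 72 and 30) are taken from the text above; the function name and dictionary keys are illustrative assumptions and are not part of the annotation pipeline.</p>

```python
# Counts come from the text; names below are illustrative only.
def combine_calls(n_a: int, n_b: int, n_both: int) -> dict:
    """Inclusion-exclusion for two overlapping sets of gene calls."""
    only_a = n_a - n_both  # called by method A alone
    only_b = n_b - n_both  # called by method B alone
    return {
        "only_a": only_a,
        "only_b": only_b,
        "one_method": only_a + only_b,
        "union": n_a + n_b - n_both,
    }


# A = JCVI CMR calls (53), B = MJ11 comparison (72), 30 shared.
summary = combine_calls(n_a=53, n_b=72, n_both=30)
print(summary)
```

            <p>Under these inputs, the number of genes called by exactly one method and the size of the combined set should match the 65 and 95 reported above.</p>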
            <suppl id="S4">
               <title>
                  <p>Additional file 4</p>
               </title>
               <text>
                  <p>Table of novel gene features. Detailed ES114 gene annotations added in version 2.0.</p>
               </text>
               <file name="1471-2164-9-138-S4.xls">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <tbl id="T3">
               <title>
                  <p>Table 3</p>
               </title>
               <caption>
                  <p>Summary of 113 new gene features in ES114 version 2.0.</p>
               </caption>
               <tblbdy cols="5">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Regulatory RNAs</p>
                     </c>
                     <c ca="left">
                        <p>Operon leader peptides</p>
                     </c>
                     <c ca="left">
                        <p>Protein-coding genes</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>TOTAL</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Chromosome I</p>
                     </c>
                     <c ca="left">
                        <p>9 (9)<sup><it>a</it></sup></p>
                     </c>
                     <c ca="left">
                        <p>6 (6)</p>
                     </c>
                     <c ca="left">
                        <p>73 (13)</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>88 (28)</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Chromosome II</p>
                     </c>
                     <c ca="left">
                        <p>1 (1)</p>
                     </c>
                     <c ca="left">
                        <p>0 (0)</p>
                     </c>
                     <c ca="left">
                        <p>22 (3)</p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>23 (4)</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>CHROMOSOMES TOTAL</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>10 (10)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>6 (6)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>95 (16)</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>111 (32)</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Plasmid pES100</p>
                     </c>
                     <c ca="left">
                        <p>0 (0)</p>
                     </c>
                     <c ca="left">
                        <p>0 (0)</p>
                     </c>
                     <c ca="left">
                        <p>2 (2)<sup><it>b</it></sup></p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>2 (2)</b>
                        </p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>Numbers in parentheses indicate subset of features that have an annotation other than "hypothetical."</p>
                  <p><sup><it>a </it></sup>Includes <it>csrB1 </it>and <it>csrB2 </it>[36].</p>
                  <p><sup><it>b </it></sup>Includes two genes predicted from [61].</p>
               </tblfn>
            </tbl>
            <p>In all of these cases, no sequence was changed in the genomic model, but annotations imposed on the sequence were added. Included in the list of new genes predicted from both approaches is biotin synthase (<it>bioB</it>). Because ES114 grows on minimal medium lacking biotin <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>, BioB is likely expressed by the organism. Another example of a gene predicted from both approaches is the oxaloacetate decarboxylase gamma subunit (<it>oadC</it>), which is predicted to be encoded in an operon with the already-predicted alpha and beta subunits (new operon prediction of <it>oadCAB</it>).</p>
            <p>In addition to genes that were identified from both the MJ11-comparative and JCVI CMR approaches, we believe that genes identified by only one of the approaches are still worthy of inclusion, subject to the filters imposed above. Genes that were identified by comparison with MJ11 have the support that the open reading frame is conserved in at least these two strains. A similar measure has been used to call genes in <it>Saccharomyces cerevisiae </it>for genome inclusion <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>. Genes that were identified solely from the JCVI CMR annotation include a number of regions that are unique to ES114, such as a prophage that is present in ES114 and absent in MJ11 (Additional file <supplr sid="S4">4</supplr>, new loci VF_2640 through VF_2649), and therefore would not be expected to be called by comparison with MJ11. The coding density of the prophage was markedly increased due to the addition of the novel gene annotations, consistent with phage genome organization and supporting the assignments predicted by the CMR. It is clear that the consolidated results from both methods, though partially overlapping, identify a significant number of novel, bona fide gene annotations in ES114.</p>
            <p>We added regulatory RNA genes to the annotation as identified from multiple sources (Table <tblr tid="T3">3</tblr>, Additional file <supplr sid="S4">4</supplr>). Prediction of CsrB regulatory RNAs has been pioneered using <it>V. fischeri </it><abbrgrp><abbr bid="B36">36</abbr></abbrgrp>, and additional regulatory RNAs were added based on motifs found in the RFAM database <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>. These methods identified a total of 10 regulatory RNAs and 1 operon leader peptide. Although operon leader peptide predictions are not typically found in databases, we speculated that additional such genes were present in ES114 and, using the Ecocyc database <abbrgrp><abbr bid="B41">41</abbr></abbrgrp> as a guide, we manually searched for operon leader peptides in ES114 and identified five additional high-confidence members in the genome (Additional file <supplr sid="S4">4</supplr>, Basis "E").</p>
            <p>In total, we called 95 new protein-coding genes, 8 regulatory RNAs, and 6 operon leader peptides, and we incorporated 2 protein-coding genes and 2 regulatory RNAs that were published previously, for a total of 113 new annotations incorporated into ES114 version 2.0 (Table <tblr tid="T3">3</tblr>, Additional file <supplr sid="S4">4</supplr>).</p>
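            <p>As a consistency check, the totals in this paragraph can be recomputed from the rows of Table 3. The sketch below (Python) uses only counts stated in the table and the text; the variable names are illustrative assumptions.</p>

```python
# Rows of Table 3 as (regulatory RNAs, operon leader peptides,
# protein-coding genes). Variable names are illustrative only.
table3 = {
    "Chromosome I": (9, 6, 73),
    "Chromosome II": (1, 0, 22),
    "Plasmid pES100": (0, 0, 2),
}

row_totals = {name: sum(counts) for name, counts in table3.items()}
column_totals = tuple(sum(col) for col in zip(*table3.values()))
grand_total = sum(row_totals.values())

# The same total, split as in the text: newly called features
# plus features incorporated from earlier publications.
newly_called = 95 + 8 + 6
previously_published = 2 + 2

print(row_totals, column_totals, grand_total)
```

            <p>Both ways of counting should agree on the 113 new annotations: 111 on the two chromosomes plus 2 on plasmid pES100.</p>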
         </sec>
         <sec>
            <st>
               <p>Comprehensive annotation update</p>
            </st>
            <p>In the process of correcting sequence errors and adding missing annotations, we additionally took the opportunity to update the annotations of the genes in the ES114 genome and to establish a framework for future genomic and genetic studies in <it>V. fischeri</it>. To update the product annotations of <it>V. fischeri</it>, we assembled a database of <it>V. fischeri </it>genetic and genomic analyses from the PubMed database <abbrgrp><abbr bid="B42">42</abbr></abbrgrp>. Our initial curated <it>V. fischeri </it>list included 545 unique gene-publication associations from 60 publications, encompassing 339 distinct genes represented in strain ES114. This list served as the core of the reannotation effort, which further gave us the opportunity to update a number of genes whose functions have been discovered since the initial genome publication.</p>
            <p>For all genes in ES114, we additionally compared protein annotations from multiple sources: (1) Orthologous protein annotations in the recently reannotated <it>Escherichia coli </it>K-12 MG1655 sequence, and updates made subsequent to sequence publication through the ASAP and Ecocyc databases <abbrgrp><abbr bid="B41">41</abbr><abbr bid="B43">43</abbr></abbrgrp>; (2) Orthologous protein annotations in <it>V. cholerae </it><abbrgrp><abbr bid="B34">34</abbr></abbrgrp>; (3) the JCVI CMR <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>; and (4) UniprotKB <abbrgrp><abbr bid="B44">44</abbr></abbrgrp>. These comparisons allowed us to update the annotations and to make the annotations more consistent with current practice and NCBI guidelines.</p>
            <p>We found the annotations of <it>E. coli </it>&#8211; though most distant phylogenetically &#8211; to be the most valuable empirically. The timeliness of the update and the availability of curated, referenced descriptions in the Ecocyc entries allowed us to improve a number of entries that appeared to lag behind the other data sources. As one example, we point to the case of <it>yihY </it>(VF_0100, ortholog of <it>E. coli </it>locus tag b3886). This gene was previously annotated as encoding ribonuclease BN <abbrgrp><abbr bid="B45">45</abbr></abbrgrp>, and that annotation has been propagated through numerous sources, including most of the <it>Vibrionaceae </it>genomes. A subsequent report identified the <it>E. coli rbn/elaC </it>gene (locus tag b2268) as the gene that encodes RNase BN, and the most recent genome annotation for b3886 has been updated to <it>yihY</it>, "predicted inner membrane protein" <abbrgrp><abbr bid="B46">46</abbr></abbrgrp>. We compared data from the sources described above, as well as the literature described, and captured this update by calling VF_0100 as <it>yihY </it>with a product of "predicted inner membrane protein". In fact, <it>V. fischeri</it>, like most sequenced <it>Vibrio </it>spp., does not contain an <it>rbn </it>ortholog, and therefore having any product labeled as "ribonuclease BN" would have been misleading from the perspective of predicting genome capabilities. We note that the old annotation persists in major databases <abbrgrp><abbr bid="B31">31</abbr><abbr bid="B47">47</abbr><abbr bid="B48">48</abbr><abbr bid="B49">49</abbr></abbrgrp> and in most of the <it>Vibrionaceae </it>genomes available at the time of data submission. This example highlights the value and relevance of the <it>E. coli </it>K-12 update to this and related annotation projects, as well as our ability to capture the latest information about genes encoded in the ES114 genome.</p>
            <p>Fine-scale annotation changes, such as those shown in Figure <figr fid="F1">1F</figr>, are detailed both in that figure and in the Methods. We also wish to highlight the updated entry for <it>prfB </it>(noted in Table <tblr tid="T3">3</tblr>), the peptide chain release factor RF-2. The programmed frameshift in <it>prfB </it>is not called correctly by machine-call algorithms, and this gene is improperly entered in the GenBank flatfiles of all of the previously submitted <it>Vibrio </it>spp. genomes.</p>
            <p>With the blossoming number of sequencing projects, utilization of locus tags (e.g., VF_0001) as identifiers for both genes and their products has become commonplace as the increase in genomic characterization has outpaced genetic and biochemical characterization of gene products. Nonetheless, biological analysis in a genomic context depends on understanding gene function, and proper nomenclature has been adopted in a number of species to facilitate meaningful communication about genes and their products. In fact, we (and others) repeatedly refer to genes by their identifiers and, without tracking in a database, this practice can lead to incorrect conclusions <abbrgrp><abbr bid="B50">50</abbr></abbrgrp>. Therefore, whereas the previous ES114 version did not contain 3&#8211;5 character "gene" identifiers, we added those for approximately 1,995 genes for which the identity of the gene could be determined or inferred from published work in <it>V. fischeri</it>, or from orthologous genes in other organisms. Due to the availability of well-curated database resources, most of the names were derived from their orthologs in <it>E. coli </it>MG1655 <abbrgrp><abbr bid="B41">41</abbr><abbr bid="B43">43</abbr><abbr bid="B51">51</abbr></abbrgrp>.</p>
            <p>We required that unique gene identifiers be a minimum of three lowercase letters (e.g., <it>fnr</it>), with an optional uppercase letter (e.g., <it>dnaA</it>), and/or an optional numeral (e.g., <it>nagA2</it>), for a maximum of five characters total. Such numeric suffixes were assigned to distinguish among members of paralogous families or genes of related function. For approximately half of the genes, no gene identifier could be assigned at this time.</p>
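            <p>One possible reading of this naming rule can be expressed as a short validity check, sketched here in Python. The pattern assumes at least three lowercase letters, followed by at most one uppercase letter and at most one numeral, with no more than five characters in total. The function name, the constant names and this exact interpretation are assumptions for illustration and do not describe the tooling used for the annotation.</p>

```python
import re

# Assumed reading of the rule: 3+ lowercase letters, then an
# optional uppercase letter, then an optional single digit.
_PATTERN = re.compile(r"[a-z]{3,}[A-Z]?[0-9]?")
MAX_LENGTH = 5  # maximum of five characters in total


def is_valid_gene_identifier(name: str) -> bool:
    """Return True if name fits the assumed identifier format."""
    if len(name) > MAX_LENGTH:
        return False
    return _PATTERN.fullmatch(name) is not None


# The first three examples come from the text; the rest are
# included to show rejected forms.
examples = ["fnr", "dnaA", "nagA2", "rpoE5", "VF_0001", "ab", "rpoE10"]
results = {name: is_valid_gene_identifier(name) for name in examples}
print(results)
```

            <p>Under this reading, the three examples given in the text are accepted, whereas a locus tag such as VF_0001 is rejected because it is not a gene identifier in this format.</p>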
         </sec>
         <sec>
            <st>
               <p><it>V. fischeri </it>transcription machinery</p>
            </st>
            <p>Because three RNA polymerase subunit genes were affected by the resequencing (<it>rpoB</it>, <it>rpoC</it>, <it>rpoH</it>), we took a genomic inventory of the corrected ES114 transcriptional apparatus in a manner that was not possible prior to the targeted resequencing. The subunits identified in the genome are listed in Table <tblr tid="T4">4</tblr> and include the core subunits &#945;, &#946;, &#946;', and &#969;. Classification of the eleven identified sigma factors is described below and follows the scheme outlined in Gruber &amp; Gross <abbrgrp><abbr bid="B52">52</abbr></abbrgrp>.</p>
            <tbl id="T4">
               <title>
                  <p>Table 4</p>
               </title>
               <caption>
                  <p>ES114 genes encoding transcriptional machinery.</p>
               </caption>
               <tblbdy cols="4">
                  <r>
                     <c ca="left">
                        <p>
                           <b>Locus_tag</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Gene</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Product</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Notes</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="4" ca="left">
                        <p>
                           <b>
                              <it>RNA polymerase core</it>
                           </b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_0262</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>rpoA</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>&#945; subunit</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_2414</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>rpoB</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>&#946; subunit</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_2412</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>rpoC</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>&#946;' subunit</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_0105</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>rpoZ</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>&#969; subunit</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c cspan="4" ca="left">
                        <p>
                           <b>
                              <it>Sigma subunits (11 predicted)</it>
                           </b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_2254</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>rpoD</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>&#963;<sup>D</sup>/&#963;<sup>70</sup></p>
                     </c>
                     <c ca="left">
                        <p>Group 1: &#963;<sup>70</sup>-type</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_2067</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>rpoS</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>&#963;<sup>S</sup></p>
                     </c>
                     <c ca="left">
                        <p>Group 2: &#963;<sup>70</sup>-type, &#963;<sup>38</sup>-subtype</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_A1015</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>rpoQ</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>&#963;<sup>Q</sup></p>
                     </c>
                     <c ca="left">
                        <p>Group 2: &#963;<sup>70</sup>-type, &#963;<sup>38</sup>-subtype</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_2450</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>rpoH</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>&#963;<sup>H</sup></p>
                     </c>
                     <c ca="left">
                        <p>Group 3: &#963;<sup>70</sup>-type, &#963;<sup>32</sup>-subtype</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_1834</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>fliA</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>&#963;<sup>F</sup></p>
                     </c>
                     <c ca="left">
                        <p>Group 3: &#963;<sup>70</sup>-type, &#963;<sup>28</sup>-subtype</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_2093</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>rpoE</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>&#963;<sup>E</sup></p>
                     </c>
                     <c ca="left">
                        <p>Group 4: &#963;<sup>70</sup>-type, &#963;<sup>24</sup>-subtype</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_0972</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>rpoE2</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>&#963;<sup>E2</sup></p>
                     </c>
                     <c ca="left">
                        <p>Group 4: &#963;<sup>70</sup>-type, &#963;<sup>24</sup>-subtype</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_A0820</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>rpoE3</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>&#963;<sup>E3</sup></p>
                     </c>
                     <c ca="left">
                        <p>Group 4: &#963;<sup>70</sup>-type, &#963;<sup>24</sup>-subtype</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_A0766</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>rpoE4</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>&#963;<sup>E4</sup></p>
                     </c>
                     <c ca="left">
                        <p>Group 4: &#963;<sup>70</sup>-type, &#963;<sup>24</sup>-subtype</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_2498</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>rpoE5</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>&#963;<sup>E5</sup></p>
                     </c>
                     <c ca="left">
                        <p>Group 4: &#963;<sup>70</sup>-type, &#963;<sup>24</sup>-subtype</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>VF_0387</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>rpoN</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>&#963;<sup>N</sup></p>
                     </c>
                     <c ca="left">
                        <p>&#963;<sup>54</sup>-type</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <p>Group 1 sigma factors contain regions 1.1, 1.2, 2, 3, and 4; in ES114, &#963;<sup>70 </sup>is the only member of this group. Group 2 sigma factors (regions 1.2, 2, 3, and 4) include the closely related &#963;<sup>S </sup>subunits; as mentioned above, <it>V. fischeri </it>curiously contains two of these sigma subunits. In addition to a clear ortholog of <it>rpoS </it>(VF_2067), which encodes the stationary-phase sigma subunit (&#963;<sup>S</sup>), ES114 contains a second gene (VF_A1015) that is expected to encode a &#963;<sup>S</sup>-like subunit. Transcript levels of this second &#963;<sup>S</sup>-family subunit increase upon C8-homoserine-lactone (AinS-dependent) quorum-