<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2105-9-554</ui>
   <ji>1471-2105</ji>
   <fm>
      <dochead>Software</dochead>
      <bibl>
         <title>
            <p>ProfileGrids as a new visual representation of large multiple sequence alignments: a case study of the RecA protein family</p>
         </title>
         <aug>
            <au id="A1" ca="yes">
               <snm>Roca</snm>
               <mi>I</mi>
               <fnm>Alberto</fnm>
               <insr iid="I1"/>
               <email>aroca@uci.edu</email>
            </au>
            <au id="A2">
               <snm>Almada</snm>
               <mi>E</mi>
               <fnm>Albert</fnm>
               <insr iid="I1"/>
               <email>aalmada@alumni.uci.edu</email>
            </au>
            <au id="A3">
               <snm>Abajian</snm>
               <mi>C</mi>
               <fnm>Aaron</fnm>
               <insr iid="I1"/>
               <email>aabajian@alumni.uci.edu</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Molecular Biology and Biochemistry, 560 Steinhaus Hall, University of California, Irvine, California 92697-3900, USA</p>
            </ins>
         </insg>
         <source>BMC Bioinformatics</source>
         <issn>1471-2105</issn>
         <pubdate>2008</pubdate>
         <volume>9</volume>
         <issue>1</issue>
         <fpage>554</fpage>
         <url>http://www.biomedcentral.com/1471-2105/9/554</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">19102758</pubid>
               <pubid idtype="doi">10.1186/1471-2105-9-554</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>30</day>
               <month>7</month>
               <year>2008</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>22</day>
               <month>12</month>
               <year>2008</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>22</day>
               <month>12</month>
               <year>2008</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2008</year>
         <collab>Roca et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Multiple sequence alignments are a fundamental tool for the comparative analysis of proteins and nucleic acids. However, large data sets are no longer manageable for visualization and investigation using the traditional stacked sequence alignment representation.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>We introduce ProfileGrids that represent a multiple sequence alignment as a matrix color-coded according to the residue frequency occurring at each column position. JProfileGrid is a Java application for computing and analyzing ProfileGrids. A dynamic interaction with the alignment information is achieved by changing the ProfileGrid color scheme, by extracting sequence subsets at selected residues of interest, and by relating alignment information to residue physical properties. Conserved family motifs can be identified by the overlay of similarity plot calculations on a ProfileGrid. Figures suitable for publication can be generated from the saved spreadsheet output of the colored matrices as well as by the export of conservation information for use in the PyMOL molecular visualization program.</p>
               <p>We demonstrate the utility of ProfileGrids on 300 bacterial homologs of the RecA family &#8211; a universally conserved protein involved in DNA recombination and repair. Careful attention was paid to curating the collected RecA sequences since ProfileGrids allow the easy identification of rare residues in an alignment. We relate the RecA alignment sequence conservation to the following three topics: the recently identified DNA binding residues, the unexplored MAW motif, and a unique <it>Bacillus subtilis </it>RecA homolog sequence feature.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>ProfileGrids allow large protein families to be visualized more effectively than the traditional stacked sequence alignment form. This new graphical representation facilitates the determination of the sequence conservation at residue positions of interest, enables the examination of structural patterns by using residue physical properties, and permits the display of rare sequence features within the context of an entire alignment. JProfileGrid is free for non-commercial use and is available from <url>http://www.profilegrid.org</url>. Furthermore, we present a curated RecA protein collection that is more diverse than previous data sets; therefore, this RecA ProfileGrid is a rich source of information for nanoanatomy analysis.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="bmc" subtype="user_supplied_xml" id="endnote"/>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Comparative nanoanatomy and phylogenetic studies of macromolecules depend upon multiple sequence alignments (MSAs). However, the traditional stacked sequence representation of an alignment proves cumbersome for large numbers of homologs, which are now common given the proliferation of genome sequences. Early MSA formatting programs facilitated analysis by emphasizing residues with boxes, colors, and shading <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr></abbrgrp>. However, these programs (and many subsequent implementations) still represent an MSA as stacked sequences. Regular expressions, major components <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>, and sequence logos <abbrgrp><abbr bid="B5">5</abbr></abbrgrp> are solutions that compress the sequence alignment information of motifs into a consensus format, as reviewed in 2005 <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. In addition, a graphical view of MSA conservation can be achieved with an "overview" mode <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr></abbrgrp> or with plots of similarity values <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. However, none of these representations conveys the details of each character's frequency distribution at each homologous position in the entire alignment. Thus, potentially valuable information for the interpretation of macromolecular structure and function is lost. Clearly, there is a need for a new visual representation paradigm for MSAs.</p>
         <p>Here we introduce the JProfileGrid Java software for generating ProfileGrids &#8211; a new graphical, tabular representation of alignments. Historically, profiles scored by a distance matrix were used for database searches <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>, although simple frequency profiles have been used to tabulate the amino acid content of linear motifs <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>. By contrast, ProfileGrids are color-coded tables of the residue frequency occurring at every homologous position across the entire length of an MSA. Therefore, all MSA information is represented, including the variable regions and the rare residues that may yield clues about function. As in ColorGrids <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>, the frequency determines the color shading, but ProfileGrids are specific for MSAs. In particular, our JProfileGrid software enables a dynamic visualization of structural patterns by analyzing protein alignments with respect to amino acid physical properties. Notably, JProfileGrid provides a unique method for generating publishable figures of the entire sequence content of an alignment with many homologs. A ProfileGrid facilitates the inspection of large MSAs and, thus, solves the problem of text legibility of traditional MSAs <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>. Below we describe the features of the JProfileGrid software and demonstrate a ProfileGrid's usefulness by examining the bacterial RecA protein family, which we introduce next.</p>
         <p>The RecA protein is the premier genomic sentinel of <it>Escherichia coli </it>because of its crucial protective roles in both recombinational DNA repair <abbrgrp><abbr bid="B14">14</abbr></abbrgrp> and the SOS response <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. RecA homologs are present in all domains of life <abbrgrp><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr></abbrgrp> and well distributed among bacteria <abbrgrp><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr></abbrgrp>. As the vanguard of bacterial RecA homologs, the <it>E. coli </it>RecA protein (352 residues; [GenBank:AAC75741.1]) has been intensively studied starting with its discovery <abbrgrp><abbr bid="B22">22</abbr></abbrgrp> and the subsequent sequencing of its gene <abbrgrp><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr></abbrgrp>. Later, many RecA sequences became available as microbiologists cloned <it>recA </it>genes from different culturable bacteria to construct knockout derivatives <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. Furthermore, the ubiquity of the RecA homolog made it a common marker for phylogenetic studies <abbrgrp><abbr bid="B20">20</abbr></abbrgrp> using the most conserved parts of the RecA protein &#8211; the adjacent MAW and P-loop motifs. The precise function of the former is unknown <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>, while the latter motif is the well-characterized ATP-binding site <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>.</p>
         <p>RecA MSAs have been analyzed from a structural perspective to understand RecA function <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B27">27</abbr></abbrgrp>. For example, molecular genetics approaches have generated over 1400 <it>E. coli </it>RecA missense mutations <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>, and the phenotypes are discussed within the context of the sequence conservation occurring at the mutation location. Furthermore, conserved residues often have functional roles such as ligand binding, so such positions are targets for inspection when studying protein structure. The recent determination of a RecA-DNA cocrystal structure <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>, with the first clear identification of a DNA binding site, provides a new motivation for examining RecA MSA information.</p>
         <p>As the number of RecA homologs has increased, however, the visualization and analysis of an MSA has become unwieldy using the traditional stacked sequence representation. In fact, the last complete RecA MSAs available as published figures come from the mid-1990s, when there were only about 60 homologs <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B19">19</abbr><abbr bid="B30">30</abbr></abbrgrp>. More recently, no MSA figures were included in the data sets of 144 <abbrgrp><abbr bid="B20">20</abbr></abbrgrp> and 113 <abbrgrp><abbr bid="B21">21</abbr></abbrgrp> RecA homologs. Since there are more RecA sequences available now, this family makes an excellent case study for showing how ProfileGrids succinctly display the information content of a large MSA. The present work describes a curated data set of 300 RecA protein sequences from a larger diversity of bacterial species than in previously reported alignments. The breadth of this sequence collection creates a robust description of the conserved sequence motifs of the RecA protein family and, therefore, may shed light on unexplored regions of this protein such as the aforementioned MAW motif.</p>
      </sec>
      <sec>
         <st>
            <p>Implementation</p>
         </st>
         <p>JProfileGrid is a Java program that combines the tasks of examining amino acid frequencies across an entire MSA, identifying conserved motif regions, and comparing species-specific residues against a sequence family. Both a command-line and a graphical user interface are available with the latter allowing interactive ProfileGrid analysis. The program accepts protein and nucleic acid MSAs in either MSF or FASTA formats. The former is preferred because of the inclusion of sequence weight values in the MSF file header. The similarity plot calculations are based on the plotcon algorithm <abbrgrp><abbr bid="B9">9</abbr></abbrgrp> with a modification that the values are normalized between 0 and 1. The program saves matrix output as a spreadsheet file using the JExcel API <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. The color formatted ProfileGrid and the similarity values are stored in separate worksheets. A third worksheet identifies outlier characters (such as "X") in the MSA that the program flags for verification. JProfileGrid can also write PyMOL scripts <abbrgrp><abbr bid="B32">32</abbr></abbrgrp> that identify the conserved regions of the MSA on a protein structure.</p>
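         <p>The normalization of similarity values to the range 0 to 1 can be illustrated with a short Python sketch. The text does not state which rescaling JProfileGrid applies, so the linear min-max rescaling below, the function name normalize_scores, and the sample scores are all assumptions made for illustration only.</p>
         <p><![CDATA[
```python
def normalize_scores(scores):
    """Rescale raw similarity scores linearly onto the interval [0, 1].

    Assumption: min-max rescaling. The article only states that the
    values are normalized between 0 and 1, not how.
    """
    lowest, highest = min(scores), max(scores)
    if highest == lowest:
        # A flat profile carries no contrast; map every position to 0.0.
        return [0.0 for _ in scores]
    span = highest - lowest
    return [(value - lowest) / span for value in scores]


# Hypothetical raw window scores, not taken from the RecA data set.
raw_scores = [-1.5, 0.0, 2.5, 4.5]
print(normalize_scores(raw_scores))
```
]]></p>
         <p>With values on a common 0 to 1 scale, a single threshold cutoff can be compared across alignments that use different scoring matrices.</p>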
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>Sequence data set</p>
            </st>
            <p>RecA protein sequences were collected from the following databases: the National Center for Biotechnology Information GenBank database <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>, The Institute for Genomic Research Comprehensive Microbial Resource <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>, the DNA Data Bank of Japan <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>, the European Molecular Biology Laboratory Sequence Database <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>, and UniProt <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>. Keyword searches were used at the aforementioned database websites especially for annotated genomes where RecA orthologs had already been identified. In addition, sequence similarity searches were performed using the <it>E. coli </it>RecA homolog as the query sequence in BLASTp and TBLASTN searches <abbrgrp><abbr bid="B38">38</abbr></abbrgrp> with default parameters. After manually verifying the presence of conserved RecA family motifs, we added the protein sequences from the keyword search results and significant BLAST search hits (E-value &lt;10<sup>-70</sup>) to our previous collection of validated bacterial RecA orthologs <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. Since we focused on fully sequenced homologs from known bacterial species, no explicit attempt was made to collect RecA homologs from environmental sequencing projects such as from the Sargasso Sea collection <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>. In a previous analysis of 64 RecA homologs, 12 sequences were found to contain errors <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B40">40</abbr><abbr bid="B41">41</abbr></abbrgrp>. Although some of those have not yet been updated in GenBank, we used the corrected versions in all cases. Finally, we limited the RecA data set to unique sequences for each bacterial species. 
Specifically, we eliminated redundant sequences from duplicate sequencing efforts (genome versus individual gene projects) and from strains of the same bacterial species (e.g., <it>E. coli </it>CFT073 versus K12). While these sequences do not appear in our RecA MSA and ProfileGrid, the redundant sequences serve to verify any rare residue observations that could be the result of errors. This underscores the curation performed on the individual sequences, as described in more detail below.</p>
         </sec>
         <sec>
            <st>
               <p>Alignment</p>
            </st>
            <p>The multiple sequence alignments were calculated using the DNASTAR MegAlign program <abbrgrp><abbr bid="B42">42</abbr></abbrgrp> that implements the ClustalW algorithm <abbrgrp><abbr bid="B43">43</abbr></abbrgrp>. Default parameters were used except that the gap penalty was increased to 30 to minimize the introduction of gaps. The resulting alignment was manually curated by visual inspection to optimize the position of small gaps. Weight values were assigned to each protein sequence using the ClustalX program <abbrgrp><abbr bid="B44">44</abbr></abbrgrp> to remove any bias from similar sequences potentially overrepresented in the alignment. The MegAlign program was also used to identify alignment positions that were either invariant or chemically similar (Additional file <supplr sid="S1">1</supplr>) according to previously described amino acid classes <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>.</p>
            <suppl id="S1">
               <title>
                  <p>Additional file 1</p>
               </title>
               <text>
                  <p><b>Multiple sequence alignment of bacterial RecA homologs</b>. A subset of the 300 sequences is shown representing each of the major bacterial phyla. In the alignment, a dash (-) indicates a gap and a period indicates an amino acid identical to the <it>E. coli </it>RecA protein. NCBI Protein database accession numbers are listed at the end unless the data were taken from the TIGR unfinished microbial genomes database. Summary lines above the alignment were calculated from all 300 sequences. The "Bioin" line indicates the bioinformatic structural elements (nanoanatomy) across the entire RecA protein: 12 motifs and the 10 connecting variable regions. "Secon" are the secondary structural elements from the <it>E. coli </it>RecA crystal structure where "a" are &#945; helices, "b" are &#946; strands, "l" are disordered loops, and "?" are disordered termini <abbrgrp><abbr bid="B62">62</abbr></abbrgrp>. In each case the letter or number name of the element is given in the second position. "Ident" are the 21 residues identical in all 300 sequences. "Chemi" are the 39 chemically conservative substitutions based on the following amino acid classification: a = (DE), b = (HKR), f = (AGILV), m = (NQ), o = (FWY), h = (ST), i = (P), s = (CM). "Funct" lists the 55 functionally conservative residue substitutions based on the classification: a = (DE), b = (HKR), f = (AFILMPVW), p = (CGNQSTY). Finally, "Major" are the 187 residues conserved above a 70% majority threshold (210 sequences) with invariant residues shown in uppercase. The numbering of the alignment is based upon the <it>E. coli </it>RecA protein sequence.</p>
               </text>
               <file name="1471-2105-9-554-S1.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
         </sec>
         <sec>
            <st>
               <p>Data curation</p>
            </st>
            <p>In the genomic era, database web interfaces make it easy for the novice user to find and align many RecA sequences. However, the quality of the sequence data sets and their subsequent alignment cannot be taken for granted. Instead, it is imperative that bioinformatic data be curated so that researchers can be confident of the conclusions that they draw <abbrgrp><abbr bid="B45">45</abbr></abbrgrp>. This can be particularly important in the conserved motifs of a protein sequence alignment. Below, we belabor this point as a caution about the interpretation of rare residues in MSAs.</p>
            <p>Inspection of the MSA (Additional file <supplr sid="S1">1</supplr>) and ProfileGrid (Additional file <supplr sid="S2">2</supplr>) shows that the family motifs are very well conserved among the 300 RecA homologs. However, there are exceptions in which residues do not follow the consensus patterns for the motifs. These rare residues are readily visible in ProfileGrid representations. Such rare amino acids may be interesting exceptions or just noise in the bioinformatic data. We paid particular attention to the MAW and P-loop motifs that are the most conserved parts of the RecA family. For example, a single serine is observed in the MAW motif at <it>E. coli </it>position 52 where 298 other RecA sequences have glycine at that position (Additional file <supplr sid="S2">2</supplr>). This is not considered a conservative substitution. By contrast, a single serine in the P-loop at position 73 could be a conservative substitution when compared to the 299 other threonine residues. Structure and function inferences drawn from exceptions to conserved motifs would be a waste of effort if such exceptions were based upon faulty data. We also note that phylogenetic analyses are greatly affected by sequence errors <abbrgrp><abbr bid="B46">46</abbr></abbrgrp>.</p>
            <suppl id="S2">
               <title>
                  <p>Additional file 2</p>
               </title>
               <text>
                  <p><b>Detailed ProfileGrid of the RecA protein family</b>. The frequency values were calculated from the 300 RecA sequences over the full length (352 residues) of the <it>E. coli </it>RecA homolog (top sequence) that determines the position numbering. The "Major" summary line is the 187 residues conserved above a 70% majority threshold. The 12 RecA family motifs are boxed and labeled (as in Additional file 1) while the connecting variable regions are only labeled. Frequency values are shaded in the ranges of 50 to 69% (light gray), 70 to 89% (dark gray), and 90 to 100% (black). Since we anticipate updating the analysis in the future, this is version 1.0 of the RecA ProfileGrid.</p>
               </text>
               <file name="1471-2105-9-554-S2.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>Problems in sequence data sets can result from experimental artifacts or data handling mistakes. These issues are diminishing in the genomic era, but anomalies still occur. As mentioned above, we have identified errors in <it>recA </it>gene sequences determined using traditional gel techniques <abbrgrp><abbr bid="B41">41</abbr></abbrgrp>. More importantly, genome projects are introducing a new problem where the complete determination of an organism's DNA content yields sequences that may not be true chromosomal RecA orthologs. For example, the <it>Salmonella enterica </it>genome project <abbrgrp><abbr bid="B47">47</abbr></abbrgrp> uncovered both plasmid-encoded [GenBank:CAD09875.1] and chromosome-encoded [GenBank:CAD05935.1] RecA proteins. Only the latter was included in the work presented here. In addition, JProfileGrid flags as outliers any one-letter characters that do not represent the common amino acids or gap codes. For example, in the RecA protein alignment reported here, we unexpectedly identified "X" characters in two sequences [GenBank:CAD79373.1, GenBank:AAN06665.1].</p>
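            <p>The outlier check described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the JProfileGrid source: the set of accepted gap symbols, the function name find_outliers, and the toy sequences are hypothetical.</p>
            <p><![CDATA[
```python
# The 20 common amino acids in one-letter code.
STANDARD_CODES = set("ACDEFGHIKLMNPQRSTVWY")
# Assumption: a dash and a period are the accepted gap symbols.
GAP_CODES = set("-.")


def find_outliers(alignment):
    """Return (sequence name, 1-based column, character) for every
    character that is neither a common amino acid nor a gap code."""
    flagged = []
    for name, sequence in alignment.items():
        for column, character in enumerate(sequence.upper(), start=1):
            if character not in STANDARD_CODES and character not in GAP_CODES:
                flagged.append((name, column, character))
    return flagged


# Toy alignment with hypothetical names; "X" marks an undetermined residue.
toy_alignment = {"seq1": "MAIDE-K", "seq2": "MAXDENK"}
for name, column, character in find_outliers(toy_alignment):
    print(name, column, character)
```
]]></p>
            <p>Reporting the sequence name and column together makes it straightforward to return to the database record and verify each flagged character by hand.</p>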
            <p>Significantly, this point about data curation is not just a hypothetical cautionary comment. Attention <abbrgrp><abbr bid="B48">48</abbr></abbrgrp> was drawn to the observation of a rare tyrosine residue in the <it>Proteus vulgaris </it>RecA protein <abbrgrp><abbr bid="B49">49</abbr></abbrgrp> where the vast majority of RecA homologs have serine at <it>E. coli </it>position 70 (Additional file <supplr sid="S2">2</supplr>). However, the discrepancy was resolved <abbrgrp><abbr bid="B41">41</abbr></abbrgrp> when it was determined that the tyrosine observation was actually a simple typographical error in the publication figure. Compounding this problem, though, was a data handling error involving the <it>P. vulgaris </it>[GenBank:CAB56804.1] and <it>Pectobacterium carotovorum </it>(formerly <it>Erwinia carotovora</it>) [GenBank:CAB56783.1] RecA protein sequences, both determined by the same group <abbrgrp><abbr bid="B49">49</abbr></abbrgrp>. The sequence database records for these homologs were apparently mixed together such that the sequences do not agree with the protein sequences reported in the reference publication. The corrected sequences are used in this work. Thus, we encourage users of ProfileGrids to be cautious of overinterpreting rare residues identified in motifs. Currently, the accurate biocuration of sequence and alignment data sets can only be achieved by slow, tedious, manual efforts by protein family experts <abbrgrp><abbr bid="B50">50</abbr></abbrgrp>.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Results and Discussion</p>
         </st>
         <sec>
            <st>
               <p>JProfileGrid software</p>
            </st>
            <p>The program is controlled from the parameter settings window (Figure <figr fid="F1">1</figr>) which is arranged from top-to-bottom for loading an alignment, customizing the appearance of a ProfileGrid, calculating the similarity plot values, and exporting the results. The ProfileGrid viewer (Figure <figr fid="F2">2</figr>) shows the results of the JProfileGrid calculation after opening the alignment file (here of the RecA family of 300 sequences). The first 3 rows are a position ruler, a majority consensus, and a template sequence (here of the <it>E. coli </it>RecA homolog). The next 21 rows tabulate the frequency of the amino acid and gap characters at the corresponding MSA column position. ProfileGrid cells are color shaded according to the residue frequency value (Figure <figr fid="F3">3</figr>) with the legend in the lower-left corner of the ProfileGrid viewer read from left to right as low to high conservation, respectively. The top-left corner identifies the character and the frequency of the ProfileGrid cell currently selected by the cursor. Note that each column total equals the number of sequences in the alignment. Since the ProfileGrid matrix needs only 21 residue rows to represent protein sequences, there is practically no limit to the number of homologs that can be visualized.</p>
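            <p>The frequency tabulation behind a ProfileGrid can be sketched as follows. This minimal Python example assumes an alignment given as equal-length strings; the names profile_grid and majority_consensus, the period used as a placeholder below the threshold, and the toy sequences are hypothetical and do not reflect the JProfileGrid source code.</p>
            <p><![CDATA[
```python
from collections import Counter

# 21 rows: the 20 amino acids in one-letter alphabetical order plus a gap row.
ROW_ORDER = "ACDEFGHIKLMNPQRSTVWY-"


def profile_grid(alignment):
    """Count how often each of the 21 characters occurs in every column.

    Characters outside ROW_ORDER (for example "X") are not tabulated here,
    so a column containing them will sum to less than the sequence count.
    """
    length = len(alignment[0])
    if any(len(sequence) != length for sequence in alignment):
        raise ValueError("all aligned sequences must have the same length")
    columns = [Counter(sequence[i] for sequence in alignment) for i in range(length)]
    return {residue: [column[residue] for column in columns] for residue in ROW_ORDER}


def majority_consensus(alignment, threshold=0.70):
    """Report the most frequent character of each column when its share
    reaches the threshold; otherwise emit "." (placeholder is an assumption)."""
    consensus = []
    for i in range(len(alignment[0])):
        residue, count = Counter(sequence[i] for sequence in alignment).most_common(1)[0]
        consensus.append(residue if count / len(alignment) >= threshold else ".")
    return "".join(consensus)


# Toy alignment of four hypothetical sequences.
toy_alignment = ["MAID", "MAIE", "MGID", "MA-D"]
grid = profile_grid(toy_alignment)
print(grid["A"], grid["-"])
print(majority_consensus(toy_alignment))
```
]]></p>
            <p>The grid always has 21 rows regardless of how many sequences are counted, which is why the representation scales to very large alignments.</p>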
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>A screen shot of the JProfileGrid parameter settings window</p>
               </caption>
               <text>
                  <p><b>A screen shot of the JProfileGrid parameter settings window</b>.</p>
               </text>
               <graphic file="1471-2105-9-554-1"/>
            </fig>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>The ProfileGrid viewer showing the RecA protein family results</p>
               </caption>
               <text>
                  <p><b>The ProfileGrid viewer showing the RecA protein family results</b>. The first 3 rows of the ProfileGrid are a position ruler (Posn), a majority consensus (Major), and a template sequence (here of the <it>E. coli </it>RecA homolog). The remaining rows tabulate the frequency of the amino acid and gap characters at each position of the alignment. Cells are color shaded according to the frequency value (Figure 3). The top-left corner identifies the character and the frequency of the ProfileGrid cell currently selected by the cursor.</p>
               </text>
               <graphic file="1471-2105-9-554-2"/>
            </fig>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>The frequency settings determining a ProfileGrid cell color</p>
               </caption>
               <text>
                  <p><b>The frequency settings determining a ProfileGrid cell color</b>.</p>
               </text>
               <graphic file="1471-2105-9-554-3"/>
            </fig>
            <p>The parameter settings window (Figure <figr fid="F1">1</figr>) allows the user to change the template sequence, the position ruler numbering, the majority consensus sequence threshold cutoff (default 70%), and the residue sort order. By default, the template is the first sequence of the alignment, and the amino acids are alphabetized by the one-letter code to facilitate looking up a residue of interest. JProfileGrid provides a menu of the following amino acid physical constants for analysis: age <abbrgrp><abbr bid="B51">51</abbr></abbrgrp>, flexibility <abbrgrp><abbr bid="B52">52</abbr></abbrgrp>, frequency among <it>E. coli </it>proteins <abbrgrp><abbr bid="B53">53</abbr></abbrgrp>, hydropathy <abbrgrp><abbr bid="B54">54</abbr></abbrgrp>, hydrophobicity <abbrgrp><abbr bid="B55">55</abbr></abbrgrp>, helix propensity <abbrgrp><abbr bid="B56">56</abbr></abbrgrp>, mutability <abbrgrp><abbr bid="B57">57</abbr><abbr bid="B58">58</abbr></abbrgrp>, surface area <abbrgrp><abbr bid="B59">59</abbr></abbrgrp>, and volume <abbrgrp><abbr bid="B60">60</abbr></abbrgrp>. Many more constants are available for those coding their own ProfileGrid implementations <abbrgrp><abbr bid="B61">61</abbr></abbrgrp>. The "Frequency Colors" button opens a window listing the 6 default frequency color bins (Figure <figr fid="F3">3</figr>). A ProfileGrid cell is colored according to the bin with the largest threshold value that is less than or equal to the cell's residue frequency: &lt;10% (white), &#8805; 10% (gray), &#8805; 25% (yellow), &#8805; 50% (orange), &#8805; 70% (green), and &#8805; 90% (red). This color scheme was chosen to maximize the visual differences between bins for the inspection of ProfileGrids for patterns (see below). By contrast, a color ramp (<it>i.e.</it>, shades of one color) would not facilitate such analysis. However, users can define their own frequency color scheme by choosing the number, size, and color of the bins. 
To assist the inspection of ProfileGrids, the frequency values can be hidden. This same menu allows the values to be reported as a percentage.</p>
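            <p>The assignment of a cell to one of the default color bins can be written as a small lookup. The Python sketch below uses the six default thresholds and colors listed above; the function name cell_color and the list representation of the bins are assumptions for illustration, and a user-defined scheme would simply pass a different list.</p>
            <p><![CDATA[
```python
# Default bins as (lower threshold in percent, color), taken from the text.
DEFAULT_BINS = [
    (0, "white"),
    (10, "gray"),
    (25, "yellow"),
    (50, "orange"),
    (70, "green"),
    (90, "red"),
]


def cell_color(frequency_percent, bins=DEFAULT_BINS):
    """Return the color of the bin with the largest threshold that the
    cell's residue frequency meets or exceeds."""
    ordered = sorted(bins)
    color = ordered[0][1]
    for threshold, name in ordered:
        if frequency_percent >= threshold:
            color = name
    return color


for value in (5, 10, 30, 69, 70, 100):
    print(value, cell_color(value))
```
]]></p>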
            <p>Two features allow one to visualize other sequences of the ProfileGrid besides the template sequence. First, the highlight sequence option allows one to detect and to represent unique features of one sequence with respect to the entire information content of a MSA. Such a feature may indicate specialization with respect to function or activity. When the highlight menu is used to select a sequence different from the template sequence, then the highlight feature is turned on (Figure <figr fid="F4">4</figr>). Specifically, the highlight sequence will appear immediately below the template sequence in the ProfileGrid. Furthermore, a pairwise comparison is made such that the corresponding residue is boxed if the highlight sequence differs from the template sequence. The user may choose other colors besides the default blue selection. Note that in the highlight sequence figure, the cell value identification feature (top left corner) reports the current cell frequency even when the ProfileGrid colors and values are hidden. The second feature to visualize MSA sequences is the alignment viewer window (Figure <figr fid="F5">5</figr>) that displays a traditional alignment representation of sequences from the currently selected ProfileGrid cell. In this example, the 21 homologs that have glycine in the third column are shown. For comparison purposes, the first row in the alignment is the template sequence.</p>
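            <p>Both features reduce to simple operations on the aligned sequences, as the following Python sketch suggests. The function names, the dictionary-based alignment, and the toy sequences are hypothetical; the sketch only mirrors the behavior described above and is not the JProfileGrid implementation.</p>
            <p><![CDATA[
```python
def highlight_differences(template, highlight):
    """1-based positions where the highlight sequence differs from the
    template; these are the cells that would be boxed."""
    if len(template) != len(highlight):
        raise ValueError("aligned sequences must have the same length")
    return [
        position
        for position, (t, h) in enumerate(zip(template, highlight), start=1)
        if t != h
    ]


def sequences_with_residue(alignment, column, residue):
    """Names of the sequences that carry `residue` at the 1-based `column`,
    i.e. the subset behind one selected ProfileGrid cell."""
    return [name for name, sequence in alignment.items() if sequence[column - 1] == residue]


# Toy alignment with hypothetical sequence names.
toy_alignment = {
    "template": "MAIDENK",
    "homolog_a": "MAVDENR",
    "homolog_b": "MGIDENK",
}
print(highlight_differences(toy_alignment["template"], toy_alignment["homolog_a"]))
print(sequences_with_residue(toy_alignment, 3, "I"))
```
]]></p>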
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p><it>B. subtilis </it>RecA highlight sequence example with frequency colors and values turned off</p>
               </caption>
               <text>
                  <p><b><it>B. subtilis </it>RecA highlight sequence example with frequency colors and values turned off</b>.</p>
               </text>
               <graphic file="1471-2105-9-554-4"/>
            </fig>
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>The alignment viewer showing sequences from the currently selected ProfileGrid cell</p>
               </caption>
               <text>
                  <p><b>The alignment viewer showing sequences from the currently selected ProfileGrid cell</b>.</p>
               </text>
               <graphic file="1471-2105-9-554-5"/>
            </fig>
            <p>JProfileGrid calculates similarity plot values (Figure <figr fid="F6">6</figr>) based on the plotcon algorithm <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. A user-defined sliding window (default 5 residues) is used to calculate conservation across the MSA using the BLOSUM62 or EDNAFULL scoring matrices for proteins and nucleic acids, respectively. Weights for each sequence are taken from MSF input files to correct for overrepresented sequences. By contrast, calculations based upon FASTA files do not have such a correction. The similarity plot results can be visualized directly within a ProfileGrid: a threshold cutoff value determines the endpoints of similarity boxes, which are outlined in black in the ProfileGrid (Figure <figr fid="F7">7</figr>). These boxes emphasize conserved regions in the protein family. The similarity boxes also serve as landmarks when the ProfileGrid frequency cell colors are not shown.</p>
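            <p>How a threshold turns a similarity curve into boxes can be sketched as follows. This Python example does not reproduce the plotcon scoring itself (no scoring matrix or sequence weights); it assumes that per-column scores are already available, smooths them with a centered window that is truncated at the alignment ends (an assumption), and reports contiguous runs at or above the threshold. All names are hypothetical.</p>

```python
def sliding_mean(scores, window=5):
    """Average each position over a centered window, truncated at both ends."""
    half = window // 2
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - half)
        hi = min(len(scores), i + half + 1)
        smoothed.append(sum(scores[lo:hi]) / (hi - lo))
    return smoothed


def similarity_boxes(values, threshold):
    """Return (start, end) 0-based inclusive column ranges where values reach the threshold."""
    boxes = []
    start = None
    for i, value in enumerate(values):
        above = value >= threshold
        if above and start is None:
            start = i                      # a new box opens here
        elif not above and start is not None:
            boxes.append((start, i - 1))   # the box closed on the previous column
            start = None
    if start is not None:
        boxes.append((start, len(values) - 1))
    return boxes


if __name__ == "__main__":
    toy_scores = [0.9, 0.85, 0.2, 0.8, 0.95, 0.1]  # made-up values
    print(similarity_boxes(toy_scores, 0.8))
```

            <p>Raising the threshold shortens or removes boxes, so the cutoff directly controls how many regions are reported as potential motifs.</p>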
            <fig id="F6">
               <title>
                  <p>Figure 6</p>
               </title>
               <caption>
                  <p>Similarity plot of the RecA protein family</p>
               </caption>
               <text>
                  <p><b>Similarity plot of the RecA protein family</b>. Similarity values over the first 150 residues of the alignment were calculated using the BLOSUM62 scoring matrix and a window size of 9. A threshold value of 0.8 is indicated by the dashed line. A complete plot using a smaller RecA data set has been previously published <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>.</p>
               </text>
               <graphic file="1471-2105-9-554-6"/>
            </fig>
            <fig id="F7">
               <title>
                  <p>Figure 7</p>
               </title>
               <caption>
                  <p>ProfileGrid of 300 bacterial RecA protein sequences</p>
               </caption>
               <text>
                  <p><b>ProfileGrid of 300 bacterial RecA protein sequences</b>. The first row is the <it>E. coli </it>RecA protein sequence. The ProfileGrid cells are colored according to the following bins: &lt;10% (white), &#8805;10% (gray), &#8805;25% (yellow), &#8805;50% (orange), &#8805;70% (green), &#8805;90% (red). The boxed regions (potential motifs) were drawn by JProfileGrid from the similarity plot calculations using an 80% threshold cutoff. For visual clarity, only the first 150 residues of the alignment are shown and the frequency values are omitted. Additional file <supplr sid="S2">2</supplr> is the entire RecA ProfileGrid including frequency values. This figure was generated from the JProfileGrid spreadsheet output.</p>
               </text>
               <graphic file="1471-2105-9-554-7"/>
            </fig>
            <p>JProfileGrid exports output in two formats. The first is an Excel spreadsheet file, from which ProfileGrid figures for publication are made; the appearance of the matrix can be optimized in the spreadsheet, for example by selecting the text font. The user can specify a subset range of MSA columns as well as the size of each ProfileGrid tier, which in this example was set to 50 (Figure <figr fid="F7">7</figr>). The second output format is a script for the PyMOL molecular visualization program (Figure <figr fid="F8">8</figr>), shown here with the <it>E. coli </it>RecA crystal structure <abbrgrp><abbr bid="B62">62</abbr></abbrgrp>. Residues that are completely conserved, <it>i.e.</it>, identical, in the MSA are saved as a PyMOL selection named "ident" in the script file. Residues that pass the highest conservation threshold (default bin of &#8805;90%) are saved as a selection named "bin90". Finally, the motifs and connecting variable regions are labeled numerically starting from the N-terminus.</p>
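            <p>The text does not show the contents of the script file, so the following Python sketch only illustrates one plausible way to emit such named selections. It relies on the standard PyMOL "select name, selection" command and the "resi" selector with plus-separated residue numbers; the helper names and the output file name are hypothetical. The example residues 168 and 211 are the two positions listed at 100% in Table 2.</p>

```python
def pml_selection(name, residue_numbers):
    """Build one PyMOL 'select' command from a collection of residue numbers."""
    resi = "+".join(str(number) for number in sorted(residue_numbers))
    return "select {}, resi {}".format(name, resi)


def write_pml(path, selections):
    """Write one 'select' line per named selection; empty selections are skipped."""
    with open(path, "w") as handle:
        for name, residues in selections.items():
            if residues:
                handle.write(pml_selection(name, residues) + "\n")


if __name__ == "__main__":
    # Only two invariant positions are used here; the full 'ident' set is larger.
    print(pml_selection("ident", [211, 168]))
    write_pml("example_selections.pml", {"ident": [168, 211]})
```

            <p>A script of this form can be loaded in PyMOL after the structure, at which point the named selections are available for coloring or display.</p>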
            <fig id="F8">
               <title>
                  <p>Figure 8</p>
               </title>
               <caption>
                  <p>Visualization of PyMOL script output</p>
               </caption>
               <text>
                  <p><b>Visualization of PyMOL script output</b>. JProfileGrid can write a ".pml" file that defines the following named selections based upon the ProfileGrid information: identical residues (black sidechains); conserved motifs ("mot#") colored from most amino terminal (red) to most carboxyl terminal (green); and connecting variable ("var#") regions (gray). These selections are mapped onto the <it>E. coli </it>RecA crystal structure [PDB:<ext-link ext-link-type="pdb" ext-link-id="2REB">2REB</ext-link>]. This orientation is defined as the anterior view of the RecA monomer anatomical position. Some of the named selections are indicated by arrows in this PyMOL screenshot.</p>
               </text>
               <graphic file="1471-2105-9-554-8"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>RecA family data set</p>
            </st>
            <p>We analyzed a set of bacterial RecA homologs consisting of 300 near full-length protein sequences (Table <tblr tid="T1">1</tblr>). Approximately 280 of the sequences are full-length; the rest are missing short segments at the termini. The 300-sequence data set contains 245 unique bacterial species. We included sequences from multiple strains of a single species whenever such sequences were unique. For example, five strains of <it>Streptococcus pyogenes </it>provided RecA sequences that differed at a small number (1 to 8) of residues. The sizes of the full-length sequences ranged from 318 (<it>Bacteroides fragilis</it>; GenBank:AAA22918.1) to 447 amino acids (<it>Tropheryma whipplei</it>; GenBank:AAO44708.1) with an average length of 354 &#177; 18. The degree of identity to the <it>E. coli </it>RecA protein sequence ranged from 37% (<it>Ureaplasma parvum</it>; GenBank:AAF30489.1) to 100% (<it>Shigella flexneri</it>; GenBank:AAP18040.1) with an average identity of 62% &#177; 10%. These calculations excluded the intein sequences found in the <it>Mycobacterium </it>RecA protein homologs <abbrgrp><abbr bid="B63">63</abbr></abbrgrp>.</p>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Bacterial RecA Homologs</p>
               </caption>
               <tblbdy cols="4">
                  <r>
                     <c ca="left">
                        <p>
                           <b>Phyla</b>
                        </p>
                     </c>
                     <c ca="right">
                        <p>
                           <b>1997</b>
                        </p>
                     </c>
                     <c ca="right">
                        <p>
                           <b>Current</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Representative Species</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Actinobacteria</p>
                     </c>
                     <c ca="right">
                        <p>6</p>
                     </c>
                     <c ca="right">
                        <p>37</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Mycobacterium tuberculosis</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Aquificae</p>
                     </c>
                     <c ca="right">
                        <p>1</p>
                     </c>
                     <c ca="right">
                        <p>3</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Aquifex pyrophilus</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Bacteroidetes/Chlorobi</p>
                     </c>
                     <c ca="right">
                        <p>1</p>
                     </c>
                     <c ca="right">
                        <p>8</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Bacteroides fragilis</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Chlamydiae/Verrucomicrobia</p>
                     </c>
                     <c ca="right">
                        <p>1</p>
                     </c>
                     <c ca="right">
                        <p>8</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Chlamydia trachomatis</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Chloroflexi</p>
                     </c>
                     <c ca="right">
                        <p>0</p>
                     </c>
                     <c ca="right">
                        <p>2</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Dehalococcoides ethenogenes</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Cyanobacteria</p>
                     </c>
                     <c ca="right">
                        <p>3</p>
                     </c>
                     <c ca="right">
                        <p>12</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Anabaena variabilis</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Deinococcus-Thermus</p>
                     </c>
                     <c ca="right">
                        <p>3</p>
                     </c>
                     <c ca="right">
                        <p>5</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Deinococcus radiodurans</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Dictyoglomi</p>
                     </c>
                     <c ca="right">
                        <p>0</p>
                     </c>
                     <c ca="right">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Dictyoglomus thermophilum</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Fibrobacteres/Acidobacteria</p>
                     </c>
                     <c ca="right">
                        <p>0</p>
                     </c>
                     <c ca="right">
                        <p>2</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Fibrobacter succinogenes</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Firmicutes</p>
                     </c>
                     <c ca="right">
                        <p>8</p>
                     </c>
                     <c ca="right">
                        <p>73</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Bacillus subtilis</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Fusobacteria</p>
                     </c>
                     <c ca="right">
                        <p>0</p>
                     </c>
                     <c ca="right">
                        <p>2</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Fusobacterium nucleatum</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Nitrospirae</p>
                     </c>
                     <c ca="right">
                        <p>0</p>
                     </c>
                     <c ca="right">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Thermodesulfovibrio yellowstonii</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Planctomycetes</p>
                     </c>
                     <c ca="right">
                        <p>0</p>
                     </c>
                     <c ca="right">
                        <p>2</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Gemmata obscuriglobus</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Proteobacteria</p>
                     </c>
                     <c ca="right">
                        <p>(39)</p>
                     </c>
                     <c ca="right">
                        <p>(133)</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="right">
                        <p>Alpha</p>
                     </c>
                     <c ca="right">
                        <p>11</p>
                     </c>
                     <c ca="right">
                        <p>30</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Rhodobacter capsulatus</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="right">
                        <p>Beta</p>
                     </c>
                     <c ca="right">
                        <p>6</p>
                     </c>
                     <c ca="right">
                        <p>25</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Neisseria gonorrhoeae</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="right">
                        <p>Delta/Epsilon</p>
                     </c>
                     <c ca="right">
                        <p>3</p>
                     </c>
                     <c ca="right">
                        <p>20</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Campylobacter jejuni</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="right">
                        <p>Gamma</p>
                     </c>
                     <c ca="right">
                        <p>19</p>
                     </c>
                     <c ca="right">
                        <p>58</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Escherichia coli</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Spirochaetes</p>
                     </c>
                     <c ca="right">
                        <p>1</p>
                     </c>
                     <c ca="right">
                        <p>8</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Borrelia burgdorferi</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Thermodesulfobacteria</p>
                     </c>
                     <c ca="right">
                        <p>0</p>
                     </c>
                     <c ca="right">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Thermodesulfobacterium commune</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Thermotogae</p>
                     </c>
                     <c ca="right">
                        <p>1</p>
                     </c>
                     <c ca="right">
                        <p>2</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Thermotoga maritima</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="right">
                        <p>Total</p>
                     </c>
                     <c ca="right">
                        <p>64</p>
                     </c>
                     <c ca="right">
                        <p>300</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>Bacterial phylogenetic classification was taken from the NCBI Taxonomy database <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>. Column "1997" gives the number of bacterial RecA homologs used in the multiple sequence alignment from a previous analysis <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. The adjacent column shows the number of homologs used in the present work. The last column lists a representative species from the corresponding phylum.</p>
               </tblfn>
            </tbl>
            <p>The data sets from the mid-1990s <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B19">19</abbr><abbr bid="B30">30</abbr></abbrgrp> were biased toward RecA homologs from the phylum Proteobacteria (60% of sequences). In the current work, the Proteobacteria (purple bacteria) represent only 44% of the sequences (Table <tblr tid="T1">1</tblr>). Furthermore, we now include homologs from several newly sequenced bacterial phyla including the Chloroflexi and the Fusobacteria. The diversity of the current data set permits a robust description of motifs of the RecA protein family. Additional file <supplr sid="S1">1</supplr> shows a summary of the information from the RecA MSA.</p>
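            <p>The Proteobacteria shares quoted above can be checked against the counts in Table 1 with a few lines of Python. The variable names are illustrative; the numbers are copied from the table. The "1997" column alone gives about 61%, consistent with the roughly 60% stated for the mid-1990s data sets.</p>

```python
# (1997 count, current count) for each Proteobacteria class, copied from Table 1.
proteobacteria = {
    "Alpha": (11, 30),
    "Beta": (6, 25),
    "Delta/Epsilon": (3, 20),
    "Gamma": (19, 58),
}
total_1997, total_current = 64, 300  # table totals

count_1997 = sum(old for old, _ in proteobacteria.values())
count_current = sum(new for _, new in proteobacteria.values())

share_1997 = 100 * count_1997 / total_1997
share_current = 100 * count_current / total_current

print(count_1997, round(share_1997, 1))
print(count_current, round(share_current, 1))
```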
         </sec>
         <sec>
            <st>
               <p>RecA family ProfileGrid applications</p>
            </st>
            <p>An alignment of 300 bacterial RecA homologs is graphically represented by a ProfileGrid (Figure <figr fid="F7">7</figr>). This visualization gives a succinct overview of MSA information, especially when the frequency values are hidden to reduce clutter. The residue frequencies for all columns of the RecA MSA are found in Additional file <supplr sid="S2">2</supplr>. We used the sequence conservation denoted by the similarity boxes to define RecA motifs that serve as a nomenclature across the full length of the RecA protein family (see Additional files <supplr sid="S1">1</supplr> and <supplr sid="S2">2</supplr>). The labeling (and subsequent analysis) of every part of the RecA protein is a fundamental technique adapted from traditional anatomy <abbrgrp><abbr bid="B64">64</abbr></abbrgrp> and applied to macromolecules, <it>i.e.</it>, nanoanatomy.</p>
            <p>The detailed RecA ProfileGrid information will allow researchers to examine conservation at RecA positions of interest. For example, a recently reported suppressor mutation <abbrgrp><abbr bid="B65">65</abbr></abbrgrp> ameliorates the effects of an impaired [KR]x[KR] motif <abbrgrp><abbr bid="B66">66</abbr></abbrgrp>. The suppressor maps to <it>E. coli </it>RecA position 11 and is a change from alanine to valine, a residue that is <it>not </it>observed at this position in any of the 300 sequences in the MSA (Figure <figr fid="F2">2</figr>, Additional file <supplr sid="S2">2</supplr>). Since the current sequence data set is larger and more diverse than previous RecA homolog collections, one can have more confidence in the <it>lack </it>of an observed residue change.</p>
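            <p>Checking whether a residue is observed at a position amounts to reading one column of the ProfileGrid. The Python sketch below computes the residue fractions of a single alignment column; the function name and the toy sequences are hypothetical and are not taken from the RecA MSA.</p>

```python
from collections import Counter


def column_frequencies(sequences, column):
    """Return the fraction of each character (residues and gaps) in one 0-based column."""
    counts = Counter(seq[column] for seq in sequences)
    total = sum(counts.values())
    return {residue: n / total for residue, n in counts.items()}


# Toy aligned sequences for illustration only.
toy_sequences = ["MAID", "MAVD", "MSID", "MAIE"]
freqs = column_frequencies(toy_sequences, 1)

if __name__ == "__main__":
    print(freqs)
    # A residue that never occurs in the column has no entry, hence the 0.0 default.
    print(freqs.get("V", 0.0))
```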
            <p>The sequence conservation can also be related to RecA protein structure. For example, most of the 21 invariant residues (100% identity) are located on the anterior side of the monomer (Figure <figr fid="F8">8</figr>), which faces the central axis of the right-handed helical protein filament. The RecA filament interior is where the DNA strand exchange activity takes place. More specifically, a recent crystal structure of a RecA-DNA complex identifies residues involved in DNA binding <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>, but the report did not discuss the sequence conservation of these amino acids. We observe that most of the positions involved in direct DNA contacts are almost completely conserved throughout bacterial RecA evolution (Table <tblr tid="T2">2</tblr>), as would be expected for ligand-binding residues. However, there are some exceptions. In the <it>E. coli </it>RecA protein cocrystal structure, 164-met makes DNA ribose contacts. Surprisingly, methionine occurs at this position in only 20% of the RecA homologs in the MSA; valine is the more frequent residue (62%) among bacterial RecA proteins. In addition, two residues involved in DNA base contacts (197-met and 199-ile) have potentially non-conservative substitutions with respect to charge (glutamate) or steric (valine) considerations, respectively. An <it>E. coli </it>RecA mutant 197-met to glu is defective for <it>in vivo </it>repair activities <abbrgrp><abbr bid="B67">67</abbr></abbrgrp>. There are conflicting reports on whether a 199-ile to val RecA mutant is impaired for repair activity <abbrgrp><abbr bid="B67">67</abbr><abbr bid="B68">68</abbr></abbrgrp>. Parenthetically, we also checked these residue positions in MSAs of distant RecA homologs such as the eukaryotic Rad51/Dmc1, archaeal RadA, and viral UvsX proteins <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B69">69</abbr></abbrgrp>. In contrast to the bacterial RecA MSA, only 211-gly and 212-gly are completely conserved among the distant homologs, while there is weak sequence similarity at positions 164, 176, 200, and 213. Models for the roles of the DNA-interacting positions should account for this sequence diversity.</p>
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Conservation of DNA binding residues</p>
               </caption>
               <tblbdy cols="3">
                  <r>
                     <c ca="left">
                        <p>
                           <b>Residue</b>
                        </p>
                     </c>
                     <c ca="right">
                        <p>
                           <b>% Freq.</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Other residues</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>162-Ser</p>
                     </c>
                     <c ca="right">
                        <p>59</p>
                     </c>
                     <c ca="left">
                        <p>Ala 14%, Gln 12%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>164-Met</p>
                     </c>
                     <c ca="right">
                        <p>20</p>
                     </c>
                     <c ca="left">
                        <p>Val 62%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>165-Gly</p>
                     </c>
                     <c ca="right">
                        <p>99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>168-Ala</p>
                     </c>
                     <c ca="right">
                        <p>100</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>169-Arg</p>
                     </c>
                     <c ca="right">
                        <p>99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>172-Ser</p>
                     </c>
                     <c ca="right">
                        <p>99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>176-Arg</p>
                     </c>
                     <c ca="right">
                        <p>99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>196-Arg</p>
                     </c>
                     <c ca="right">
                        <p>99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>197-Met</p>
                     </c>
                     <c ca="right">
                        <p>47</p>
                     </c>
                     <c ca="left">
                        <p>Glu 42%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>198-Lys</p>
                     </c>
                     <c ca="right">
                        <p>98</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>199-Ile</p>
                     </c>
                     <c ca="right">
                        <p>74</p>
                     </c>
                     <c ca="left">
                        <p>Val 25%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>200-Gly</p>
                     </c>
                     <c ca="right">
                        <p>99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>207-Glu*</p>
                     </c>
                     <c ca="right">
                        <p>99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>208-Thr</p>
                     </c>
                     <c ca="right">
                        <p>90</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>211-Gly</p>
                     </c>
                     <c ca="right">
                        <p>100</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>212-Gly</p>
                     </c>
                     <c ca="right">
                        <p>99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>213-Asn</p>
                     </c>
                     <c ca="right">
                        <p>52</p>
                     </c>
                     <c ca="left">
                        <p>Arg 31%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>226-Arg*</p>
                     </c>
                     <c ca="right">
                        <p>97</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>243-Arg*</p>
                     </c>
                     <c ca="right">
                        <p>56</p>
                     </c>
                     <c ca="left">
                        <p>Lys 41%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>245-Lys*</p>
                     </c>
                     <c ca="right">
                        <p>96</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>280-Lys*</p>
                     </c>
                     <c ca="right">
                        <p>30</p>
                     </c>
                     <c ca="left">
                        <p>Glu 32%, Asp 16%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>282-Lys*</p>
                     </c>
                     <c ca="right">
                        <p>19</p>
                     </c>
                     <c ca="left">
                        <p>Gly 36%, Asp 29%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>286-Lys*</p>
                     </c>
                     <c ca="right">
                        <p>93</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>302-Lys*</p>
                     </c>
                     <c ca="right">
                        <p>46</p>
                     </c>
                     <c ca="left">
                        <p>Arg 47%</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>The first column lists <it>E. coli </it>RecA residues directly involved in DNA binding and those residues proposed (*) to interact with DNA <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>. The "% Freq." column reports the percent frequency of the indicated amino acid among 300 RecA homologs. The last column shows the percent frequency of other residues at that position of the alignment. See text for a description of conservation at these positions among eukaryotic and archaeal RecA homologs.</p>
               </tblfn>
            </tbl>
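            <p>To make the "% Freq." values in the table above concrete, the following minimal Python sketch shows one way such column frequencies could be computed from an alignment. It is an illustration only and is not taken from the JProfileGrid Java source. The function name column_frequencies and the toy alignment are hypothetical, and counting gap characters in the denominator is an assumption.</p>

```python
from collections import Counter


def column_frequencies(sequences, column):
    """Return {character: percent} for one alignment column (0-based index).

    Assumption: every character, including a gap, counts toward the total.
    """
    counts = Counter(seq[column] for seq in sequences)
    total = len(sequences)
    return {char: 100.0 * n / total for char, n in counts.items()}


# Toy alignment of four sequences (hypothetical data, not the RecA data set).
toy_msa = ["GKT", "GRT", "GKS", "GKT"]

# Frequencies at the second alignment column: K in 3 of 4, R in 1 of 4.
freqs = column_frequencies(toy_msa, 1)
print(freqs)
```

            <p>Applied to every column of an alignment, these per-character percentages correspond to the kind of values that a ProfileGrid cell displays.</p>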
         </sec>
         <sec>
            <st>
               <p>ProfileGrid structural pattern analysis of the MAW motif</p>
            </st>
            <p>When combined with different amino acid properties <abbrgrp><abbr bid="B61">61</abbr></abbrgrp>, ProfileGrids are a useful tool for visualizing structural patterns across the interspecies diversity of a protein family. We illustrate this with two adjacent motifs (MAW and P-loop) that comprise the most conserved part of RecA homologs of bacteria, eukaryotes, and archaea <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. Of the two, only the function of the P-loop (the cofactor binding site) has been determined <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>. By contrast, little is known about the MAW motif (residues 40&#8211;65) <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. In the RecA crystal structures, the MAW motif (or "motif 1a"; see Additional file <supplr sid="S1">1</supplr> for motif and variable names) consists of a loop, &#945;-helix B, and a tight turn, and ends with &#946;-strand 1. This glycine-rich motif threads through the RecA hydrophobic core and interacts with motifs (1b, 4a, and 5b) that form part of the ATP binding site; however, the MAW region itself has not been shown to contact the cofactor ligand. The MAW motif also connects the P-loop to a hinge (variable 1) that undergoes a dramatic change in the transition from the inactive to the active RecA conformation <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>. We note that, aside from the protein termini, this hinge region is one of the least conserved parts of the RecA protein (Figure <figr fid="F6">6</figr>, Additional files <supplr sid="S1">1</supplr> and <supplr sid="S2">2</supplr>).</p>
            <p>The ProfileGrid in Figure <figr fid="F9">9</figr> displays the MAW and P-loop motifs sorted by the residue properties of helicity and volume. Among RecA homologs, the region separating helix B and strand 1 is dominated by residues that do not favor helix formation (Figure <figr fid="F9">9A</figr>). The conserved glycines are probably necessary for the tight turn that occurs in this area <abbrgrp><abbr bid="B70">70</abbr></abbrgrp>. Sorting the MAW motif ProfileGrid by amino acid sidechain volume (Figure <figr fid="F9">9B</figr>) allows the visualization of two other structural features. First, the loop from residues 41 to 44 is composed of small amino acids, namely threonine or smaller. Intriguingly, an <it>E. coli </it>RecA mutant in which 44-serine is changed to the much larger leucine residue is proficient for <it>in vivo </it>recombination activity, yet it is resistant to the inhibition of recombination caused by overexpression of the UmuD'C complex <abbrgrp><abbr bid="B71">71</abbr></abbrgrp>. The second observed volume feature is that large residues between positions 45 and 58 are, in general, flanked on either side by small amino acids, resulting in an alternating pattern of small-large-small residues.</p>
            <fig id="F9">
               <title>
                  <p>Figure 9</p>
               </title>
               <caption>
                  <p>Structural analysis of MAW and P-loop motif regions</p>
               </caption>
               <text>
                  <p><b>Structural analysis of MAW and P-loop motif regions</b>. The MAW and P-loop motifs are highly conserved parts of the RecA protein family found at <it>E. coli </it>homolog positions 40&#8211;65 and 66&#8211;73, respectively. Labels denote the locations of &#945;-helix B and &#946;-strand 1 from the <it>E. coli </it>RecA crystal structure. Sorting the ProfileGrid rows by various amino acid physical constants reveals structural patterns within the context of the entire MSA. (A) Sorting by decreasing helical propensity shows that residues which do not favor helical formation (circled) immediately follow a helix in the MAW motif. (B) Sorting by decreasing volume displays the pattern (blue lines) that large amino acids are flanked by residues smaller than threonine. Whereas these panels were generated from the spreadsheet output, the JProfileGrid software allows an interactive analysis by switching between residue properties and color schemes.</p>
               </text>
               <graphic file="1471-2105-9-554-9"/>
            </fig>
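            <p>The row sorting described above can be sketched in a few lines of Python. This is a hedged illustration, not the JProfileGrid implementation. The dictionary APPROX_VOLUME holds rounded, commonly tabulated residue volumes (in cubic angstroms) for a handful of amino acids and may differ from the property scale used by the software. The function name sort_rows_by_property is hypothetical.</p>

```python
# Approximate residue volumes in cubic angstroms (rounded, illustrative only).
# Assumption: the scale actually used by the software may differ.
APPROX_VOLUME = {
    "G": 60, "A": 89, "S": 89, "T": 116,
    "V": 140, "L": 167, "R": 173, "F": 190,
}


def sort_rows_by_property(residues, scale, descending=True):
    """Order the residue rows of a grid by a numeric property value."""
    return sorted(residues, key=lambda r: scale[r], reverse=descending)


# Rows ordered by decreasing volume, as in a volume-sorted grid.
rows = sort_rows_by_property(["L", "G", "T", "F", "A"], APPROX_VOLUME)
print(rows)

# Residues that are "threonine or smaller" on this scale.
small = [r for r in rows if APPROX_VOLUME["T"] >= APPROX_VOLUME[r]]
print(small)
```

            <p>Because only the row order changes, the same column frequencies can be viewed under any property scale by passing a different dictionary.</p>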
            <p>When considering distant RecA homologs from all domains of life, the MAW motif is better conserved than the recently defined DNA interacting residues (Table <tblr tid="T2">2</tblr>). It is curious, then, that no clear function has been attributed to the MAW motif, so here we speculate on possible roles. Universally conserved residues can be involved in ligand interactions or in protein folding <abbrgrp><abbr bid="B72">72</abbr><abbr bid="B73">73</abbr><abbr bid="B74">74</abbr></abbrgrp>. While a ligand interacting role is a formal possibility for the MAW motif, this region of the protein forms part of the RecA hydrophobic core. However, one or more residues in the segment spanning positions 61&#8211;72 can be crosslinked to bound single-stranded DNA <abbrgrp><abbr bid="B75">75</abbr></abbrgrp>. This suggests that parts of the MAW motif may not remain buried in the protein core at all times and that the motif may be involved in DNA binding. With respect to a protein folding role, the RecA ProfileGrid shows a high prevalence of isoleucine, leucine, and valine residues among bacterial RecA MAW motifs (Additional file <supplr sid="S2">2</supplr>). Specifically, two conserved leucines are on the same face of helix B (positions 47 and 51). Two properties of leucine may be relevant to this observation. First, in a study of crystal structures, leucine was found to have the largest amount of sidechain flexibility when buried <abbrgrp><abbr bid="B52">52</abbr></abbrgrp>. Second, leucine is known to stabilize helices <abbrgrp><abbr bid="B76">76</abbr></abbrgrp>, which agrees with a theoretical study of RecA family helices: the residues from 44 to 51 of helix B have a near optimal sequence for thermostability when compared to other central domain helices <abbrgrp><abbr bid="B77">77</abbr></abbrgrp>. Also, mutation of position 51 from leucine to phenylalanine results in a RecA mutant that is inactive both <it>in vivo </it>and <it>in vitro </it><abbrgrp><abbr bid="B78">78</abbr><abbr bid="B79">79</abbr></abbrgrp>. Thus, a role for the MAW motif may be to initiate protein folding or to stabilize the RecA protein core through the structural features described above. Perhaps such a protein folding role is significant for a motif that connects an ATP binding site to the hinge region that undergoes conformational changes upon cofactor binding.</p>
         </sec>
         <sec>
            <st>
               <p>Highlighting unique <it>B. subtilis </it>RecA residues</p>
            </st>
            <p>The JProfileGrid "highlight sequence" feature can draw attention to any unique residues of a particular sequence within the context of the entire MSA. Here, we analyze the <it>B. subtilis </it>RecA protein [GenBank:CAB13567.1]. The ProfileGrid of Figure <figr fid="F10">10</figr> clearly shows that the characters 85-gln, 87-gap, 88-arg, and 90-ser are rarely found between the highly conserved positions 84 and 91. In addition, 88-arg is significantly larger than the more frequently observed glycine. Bearing in mind the aforementioned caution about overinterpreting rare residues, we do not believe that the unique <it>B. subtilis </it>RecA feature described here is due to a sequence error, because we found the same result in two redundant <it>B. subtilis </it>RecA sequences determined by different research groups [GenBank:CAA36377.1, GenBank:AAB47709.1]. What could be the functional role for these residues? We note that there is controversy regarding the ability of the <it>B. subtilis </it>RecA protein to hydrolyze the cofactor ATP <abbrgrp><abbr bid="B80">80</abbr><abbr bid="B81">81</abbr><abbr bid="B82">82</abbr></abbrgrp>. We suggest that this region of the <it>B. subtilis </it>RecA protein be targeted for site-directed mutagenesis to ascertain whether this rare sequence feature influences a potentially unique biochemical activity.</p>
            <fig id="F10">
               <title>
                  <p>Figure 10</p>
               </title>
               <caption>
                  <p>Representing a unique <it>B. subtilis </it>RecA sequence feature</p>
               </caption>
               <text>
                  <p><b>Representing a unique <it>B. subtilis </it>RecA sequence feature</b>. In this ProfileGrid where the residues are sorted by volume, the <it>B. subtilis </it>RecA homolog is chosen as the "highlight sequence" and appears in the row immediately under the <it>E. coli </it>RecA template sequence. JProfileGrid performs a pair-wise comparison and represents any differences between the two sequences with blue boxes. It is clear within the context of the entire MSA that <it>B. subtilis </it>has a rarely occurring sequence from residues 85 to 90 (<it>E. coli </it>RecA numbering).</p>
               </text>
               <graphic file="1471-2105-9-554-10"/>
            </fig>
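            <p>The pair-wise comparison behind the "highlight sequence" view can be approximated with the following Python sketch. It is an assumption-laden illustration rather than the JProfileGrid code. The function name highlight_differences and the toy alignment are hypothetical, positions are 1-based alignment columns rather than <it>E. coli </it>RecA numbering, and gaps are treated as ordinary characters.</p>

```python
from collections import Counter


def highlight_differences(msa, template, highlight):
    """List mismatches between a highlight sequence and the template.

    Each entry is (position, template_char, highlight_char, percent), where
    percent is how often the highlight character occurs in that column.
    """
    report = []
    for i, (t, h) in enumerate(zip(template, highlight)):
        if t != h:
            column = Counter(seq[i] for seq in msa)
            percent = 100.0 * column[h] / len(msa)
            report.append((i + 1, t, h, percent))
    return report


# Toy alignment (hypothetical data); the last sequence is the highlight.
toy_msa = ["AGGS", "AGGS", "AGGT", "AQ-S"]
diffs = highlight_differences(toy_msa, toy_msa[0], toy_msa[3])
for position, t, h, percent in diffs:
    print(position, t, h, percent)
```

            <p>Reporting the column frequency next to each mismatch makes it easy to separate differences that are common in the family from those that are rare, which is the distinction drawn for the <it>B. subtilis </it>sequence above.</p>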
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>ProfileGrids serve as a new visual representation of large sequence alignments in which the entire information content is presented in a concise form. The JProfileGrid Java software facilitates the creation and analysis of this alignment depiction. As sequence databases and software programs adopt MSA viewers, the traditional stacked sequence presentation becomes burdensome for large alignments, especially for the interactive analysis of structural patterns and rare features. Thus, we anticipate that the ProfileGrid paradigm will have widespread application in bioinformatics. Finally, we describe and analyze a curated RecA protein data set whose representation as a ProfileGrid will serve as a valuable resource for researchers studying this ubiquitous protein.</p>
      </sec>
      <sec>
         <st>
            <p>Availability and requirements</p>
         </st>
         <p><b>Project name</b>: JProfileGrid version 1.1.1</p>
         <p><b>Project home page</b>: <url>http://www.profilegrid.org</url></p>
         <p><b>Operating systems</b>: Platform independent</p>
         <p><b>Programming language</b>: Java 1.5 or higher</p>
         <p><b>License</b>: University of California license; see <url>http://www.profilegrid.org/downloads.shtml#license</url></p>
         <p><b>Any restrictions to use by non-academics</b>: license required for commercial use</p>
      </sec>
      <sec>
         <st>
            <p>Abbreviations</p>
         </st>
         <p>MSA: Multiple Sequence Alignment</p>
      </sec>
      <sec>
         <st>
            <p>Competing interests</p>
         </st>
         <p>The authors declare that they have no competing interests.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>AIR designed the software, collected RecA sequences, performed the bioinformatic analysis &amp; biocuration, and wrote the majority of the manuscript and documentation. AEA collected sequences. ACA wrote Java code and contributed to writing the manuscript and documentation. All authors read and approved the final manuscript and the response to reviewer comments.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We thank Marcin Joachimiak (LBNL), Markus Kaufman (UCLA; CPS), Juan Alonso (CNB, Spain) and Michael Cox (UW-Madison) for insightful discussions. AIR was supported by a University of California President's Postdoctoral Fellowship, the Erasmo Foundation (grant TSC13702), and a National Institutes of Health Diversity Supplement (parent grant GM058868 to Alexander McPherson). AEA was supported by NIH MBRS grant GM55246 awarded to the UC-Irvine Minority Science Undergraduate Program. ACA was supported by the UC-Irvine Undergraduate Research Opportunities Program.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>A comprehensive set of sequence analysis programs for the VAX</p>
            </title>
            <aug>
               <au>
                  <snm>Devereux</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Haeberli</snm>
                  <fnm>PH</fnm>
               </au>
               <au>
                  <snm>Smithies</snm>
                  <fnm>OS</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1984</pubdate>
            <volume>12</volume>
            <issue>1</issue>
            <fpage>387</fpage>
            <lpage>395</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">321012</pubid>
                  <pubid idtype="pmpid" link="fulltext">6546423</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>SOMAP: a novel interactive approach to multiple protein sequences alignment</p>
            </title>
            <aug>
               <au>
                  <snm>Parry-Smith</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Attwood</snm>
                  <fnm>TK</fnm>
               </au>
            </aug>
            <source>Comput Appl Biosci</source>
            <pubdate>1991</pubdate>
            <volume>7</volume>
            <issue>2</issue>
            <fpage>233</fpage>
            <lpage>235</lpage>
            <xrefbib>
               <pubid idtype="pmpid">2059849</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>ALSCRIPT: a tool to format multiple sequence alignments</p>
            </title>
            <aug>
               <au>
                  <snm>Barton</snm>
                  <fnm>GJ</fnm>
               </au>
            </aug>
            <source>Protein Eng</source>
            <pubdate>1993</pubdate>
            <volume>6</volume>
            <issue>1</issue>
            <fpage>37</fpage>
            <lpage>40</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">8433969</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>A major component approach to presenting consensus sequences</p>
            </title>
            <aug>
               <au>
                  <snm>Smith</snm>
                  <fnm>DK</fnm>
               </au>
               <au>
                  <snm>Xue</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>1998</pubdate>
            <volume>14</volume>
            <issue>2</issue>
            <fpage>151</fpage>
            <lpage>156</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9545447</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Sequence logos: a new way to display consensus sequences</p>
            </title>
            <aug>
               <au>
                  <snm>Schneider</snm>
                  <fnm>TD</fnm>
               </au>
               <au>
                  <snm>Stephens</snm>
                  <fnm>RM</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1990</pubdate>
            <volume>18</volume>
            <issue>20</issue>
            <fpage>6097</fpage>
            <lpage>6100</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">332411</pubid>
                  <pubid idtype="pmpid" link="fulltext">2172928</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Nomenclature for protein modules and their cognate motifs</p>
            </title>
            <aug>
               <au>
                  <snm>Puntervoll</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Aasland</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Modular Protein Domains</source>
            <publisher>Weinheim, Germany: Wiley-VCH</publisher>
            <editor>Cesareni G, Gimona M, Sudol M, Yaffe M</editor>
            <pubdate>2005</pubdate>
            <fpage>477</fpage>
            <lpage>486</lpage>
         </bibl>
         <bibl id="B7">
            <title>
               <p>CINEMA &#8211; a novel colour INteractive editor for multiple alignments</p>
            </title>
            <aug>
               <au>
                  <snm>Parry-Smith</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Payne</snm>
                  <fnm>AW</fnm>
               </au>
               <au>
                  <snm>Michie</snm>
                  <fnm>AD</fnm>
               </au>
               <au>
                  <snm>Attwood</snm>
                  <fnm>TK</fnm>
               </au>
            </aug>
            <source>Gene</source>
            <pubdate>1998</pubdate>
            <volume>221</volume>
            <issue>1</issue>
            <fpage>GC57</fpage>
            <lpage>63</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9852962</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>The Jalview Java alignment editor</p>
            </title>
            <aug>
               <au>
                  <snm>Clamp</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Cuff</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Searle</snm>
                  <fnm>SM</fnm>
               </au>
               <au>
                  <snm>Barton</snm>
                  <fnm>GJ</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>20</volume>
            <issue>3</issue>
            <fpage>426</fpage>
            <lpage>427</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">14960472</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>EMBOSS: the European Molecular Biology Open Software Suite</p>
            </title>
            <aug>
               <au>
                  <snm>Rice</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Longden</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Bleasby</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2000</pubdate>
            <volume>16</volume>
            <issue>6</issue>
            <fpage>276</fpage>
            <lpage>277</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10827456</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Profile analysis: detection of distantly related proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Gribskov</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>McLachlan</snm>
                  <fnm>AD</fnm>
               </au>
               <au>
                  <snm>Eisenberg</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1987</pubdate>
            <volume>84</volume>
            <fpage>4355</fpage>
            <lpage>4358</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">305087</pubid>
                  <pubid idtype="pmpid" link="fulltext">3474607</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Insights into DNA recombination from the structure of a RAD51-BRCA2 complex</p>
            </title>
            <aug>
               <au>
                  <snm>Pellegrini</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Yu</snm>
                  <fnm>DS</fnm>
               </au>
               <au>
                  <snm>Lo</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Anand</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Blundell</snm>
                  <fnm>TL</fnm>
               </au>
               <au>
                  <snm>Venkitaraman</snm>
                  <fnm>AR</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2002</pubdate>
            <volume>420</volume>
            <issue>6913</issue>
            <fpage>287</fpage>
            <lpage>293</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12442171</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>JColorGrid: software for the visualization of biological measurements</p>
            </title>
            <aug>
               <au>
                  <snm>Joachimiak</snm>
                  <fnm>MP</fnm>
               </au>
               <au>
                  <snm>Weisman</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>May</snm>
                  <fnm>BCH</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2006</pubdate>
            <volume>7</volume>
            <fpage>225</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1479842</pubid>
                  <pubid idtype="pmpid" link="fulltext">16640789</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>BMC Author instructions: Sequence alignments</p>
            </title>
            <url>http://www.biomedcentral.com/info/ifora/figuretypes#sequence</url>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Recombinational DNA repair in bacteria and the RecA protein</p>
            </title>
            <aug>
               <au>
                  <snm>Cox</snm>
                  <fnm>MM</fnm>
               </au>
            </aug>
            <source>Prog Nucleic Acid Res Mol Biol</source>
            <pubdate>1999</pubdate>
            <volume>63</volume>
            <fpage>311</fpage>
            <lpage>366</lpage>
            <xrefbib>
               <pubid idtype="pmpid">10506835</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>SOS responses and DNA damage tolerance in prokaryotes</p>
            </title>
            <aug>
               <au>
                  <snm>Friedberg</snm>
                  <fnm>EC</fnm>
               </au>
               <au>
                  <snm>Walker</snm>
                  <fnm>GC</fnm>
               </au>
               <au>
                  <snm>Siede</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>DNA Repair and Mutagenesis</source>
            <publisher>Washington, D.C.: ASM Press</publisher>
            <pubdate>1995</pubdate>
            <fpage>407</fpage>
            <lpage>464</lpage>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Evolutionary comparisons of RecA-like proteins across all major kingdoms of living organisms</p>
            </title>
            <aug>
               <au>
                  <snm>Brendel</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Brocchieri</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Sandler</snm>
                  <fnm>SJ</fnm>
               </au>
               <au>
                  <snm>Clark</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Karlin</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>J Mol Evol</source>
            <pubdate>1997</pubdate>
            <volume>44</volume>
            <issue>5</issue>
            <fpage>528</fpage>
            <lpage>541</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9115177</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>RecA protein: structure, function, and role in recombinational DNA repair</p>
            </title>
            <aug>
               <au>
                  <snm>Roca</snm>
                  <fnm>AI</fnm>
               </au>
               <au>
                  <snm>Cox</snm>
                  <fnm>MM</fnm>
               </au>
            </aug>
            <source>Prog Nucleic Acid Res Mol Biol</source>
            <pubdate>1997</pubdate>
            <volume>56</volume>
            <fpage>129</fpage>
            <lpage>223</lpage>
            <xrefbib>
               <pubid idtype="pmpid">9187054</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Evolution of the <it>recA </it>gene and the molecular phylogeny of bacteria</p>
            </title>
            <aug>
               <au>
                  <snm>Lloyd</snm>
                  <fnm>AT</fnm>
               </au>
               <au>
                  <snm>Sharp</snm>
                  <fnm>PM</fnm>
               </au>
            </aug>
            <source>J Mol Evol</source>
            <pubdate>1993</pubdate>
            <volume>37</volume>
            <issue>4</issue>
            <fpage>399</fpage>
            <lpage>407</lpage>
            <xrefbib>
               <pubid idtype="pmpid">8308907</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>The RecA protein as a model molecule for molecular systematic studies of bacteria: comparison of trees of RecAs and 16S rRNAs from the same species</p>
            </title>
            <aug>
               <au>
                  <snm>Eisen</snm>
                  <fnm>JA</fnm>
               </au>
            </aug>
            <source>J Mol Evol</source>
            <pubdate>1995</pubdate>
            <volume>41</volume>
            <fpage>1105</fpage>
            <lpage>1123</lpage>
            <xrefbib>
               <pubid idtype="pmpid">8587109</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Identification and phylogenetic sorting of bacterial lineages with universally conserved genes and proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Santos</snm>
                  <fnm>SR</fnm>
               </au>
               <au>
                  <snm>Ochman</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Environ Microbiol</source>
            <pubdate>2004</pubdate>
            <volume>6</volume>
            <issue>7</issue>
            <fpage>754</fpage>
            <lpage>759</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15186354</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Comparative and evolutionary analysis of the bacterial homologous recombination systems</p>
            </title>
            <aug>
               <au>
                  <snm>Rocha</snm>
                  <fnm>EP</fnm>
               </au>
               <au>
                  <snm>Cornet</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Michel</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>PLoS Genet</source>
            <pubdate>2005</pubdate>
            <volume>1</volume>
            <issue>2</issue>
            <fpage>e15</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1193525</pubid>
                  <pubid idtype="pmpid" link="fulltext">16132081</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Isolation and characterization of recombination-deficient mutants of <it>Escherichia coli </it>K-12</p>
            </title>
            <aug>
               <au>
                  <snm>Clark</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Margulies</snm>
                  <fnm>AD</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1965</pubdate>
            <volume>53</volume>
            <fpage>451</fpage>
            <lpage>459</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">219534</pubid>
                  <pubid idtype="pmpid">14294081</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Sequences of the <it>recA </it>gene and protein</p>
            </title>
            <aug>
               <au>
                  <snm>Sancar</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Stachelek</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Konigsberg</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Rupp</snm>
                  <fnm>WD</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1980</pubdate>
            <volume>77</volume>
            <fpage>2611</fpage>
            <lpage>2615</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">349452</pubid>
                  <pubid idtype="pmpid">6930655</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Organization of the <it>recA </it>gene of <it>Escherichia coli</it></p>
            </title>
            <aug>
               <au>
                  <snm>Horii</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Ogawa</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Ogawa</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1980</pubdate>
            <volume>77</volume>
            <issue>1</issue>
            <fpage>313</fpage>
            <lpage>317</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">348260</pubid>
                  <pubid idtype="pmpid">6244554</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>General microbiology of <it>recA</it>: environmental and evolutionary significance</p>
            </title>
            <aug>
               <au>
                  <snm>Miller</snm>
                  <fnm>RV</fnm>
               </au>
               <au>
                  <snm>Kokjohn</snm>
                  <fnm>TA</fnm>
               </au>
            </aug>
            <source>Annu Rev Microbiol</source>
            <pubdate>1990</pubdate>
            <volume>44</volume>
            <fpage>365</fpage>
            <lpage>394</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">2252387</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>The P-loop: a common motif in ATP- and GTP-binding proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Saraste</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sibbald</snm>
                  <fnm>PR</fnm>
               </au>
               <au>
                  <snm>Wittinghofer</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Trends Biochem Sci</source>
            <pubdate>1990</pubdate>
            <volume>15</volume>
            <fpage>430</fpage>
            <lpage>434</lpage>
            <xrefbib>
               <pubid idtype="pmpid">2126155</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>The bacterial replicative helicase DnaB evolved from a RecA duplication</p>
            </title>
            <aug>
               <au>
                  <snm>Leipe</snm>
                  <fnm>DD</fnm>
               </au>
               <au>
                  <snm>Aravind</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Grishin</snm>
                  <fnm>NV</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2000</pubdate>
            <volume>10</volume>
            <fpage>5</fpage>
            <lpage>16</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10645945</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Molecular design and functional organization of the RecA protein</p>
            </title>
            <aug>
               <au>
                  <snm>McGrew</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Knight</snm>
                  <fnm>KL</fnm>
               </au>
            </aug>
            <source>Crit Rev Biochem Mol Biol</source>
            <pubdate>2003</pubdate>
            <volume>38</volume>
            <issue>5</issue>
            <fpage>385</fpage>
            <lpage>432</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">14693725</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Mechanism of homologous recombination from the RecA-ssDNA/dsDNA structures</p>
            </title>
            <aug>
               <au>
                  <snm>Chen</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Pavletich</snm>
                  <fnm>NP</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2008</pubdate>
            <volume>453</volume>
            <issue>7194</issue>
            <fpage>489</fpage>
            <lpage>494</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">18497818</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Bacterial classifications derived from RecA protein sequence comparisons</p>
            </title>
            <aug>
               <au>
                  <snm>Karlin</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Weinstock</snm>
                  <fnm>GM</fnm>
               </au>
               <au>
                  <snm>Brendel</snm>
                  <fnm>V</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>1995</pubdate>
            <volume>177</volume>
            <issue>23</issue>
            <fpage>6881</fpage>
            <lpage>6893</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">177557</pubid>
                  <pubid idtype="pmpid" link="fulltext">7592482</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>JExcelAPI</p>
            </title>
            <url>http://jexcelapi.sourceforge.net</url>
         </bibl>
         <bibl id="B32">
            <title>
               <p>The PyMOL Molecular Graphics System</p>
            </title>
            <url>http://pymol.sourceforge.net</url>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Database resources of the National Center for Biotechnology Information</p>
            </title>
            <aug>
               <au>
                  <snm>Wheeler</snm>
                  <fnm>DL</fnm>
               </au>
               <au>
                  <snm>Barrett</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Benson</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Bryant</snm>
                  <fnm>SH</fnm>
               </au>
               <au>
                  <snm>Canese</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>DiCuccio</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Edgar</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Federhen</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Helmberg</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Kenton</snm>
                  <fnm>DL</fnm>
               </au>
               <au>
                  <snm>Khovayko</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Lipman</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Madden</snm>
                  <fnm>TL</fnm>
               </au>
               <au>
                  <snm>Maglott</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Ostell</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Pontius</snm>
                  <fnm>JU</fnm>
               </au>
               <au>
                  <snm>Pruitt</snm>
                  <fnm>KD</fnm>
               </au>
               <au>
                  <snm>Schuler</snm>
                  <fnm>GD</fnm>
               </au>
               <au>
                  <snm>Schriml</snm>
                  <fnm>LM</fnm>
               </au>
               <au>
                  <snm>Sequeira</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Sherry</snm>
                  <fnm>ST</fnm>
               </au>
               <au>
                  <snm>Sirotkin</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Starchenko</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Suzek</snm>
                  <fnm>TO</fnm>
               </au>
               <au>
                  <snm>Tatusov</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Tatusova</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Wagner</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Yaschenko</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <fpage>D39</fpage>
            <lpage>45</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">540016</pubid>
                  <pubid idtype="pmpid" link="fulltext">15608222</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>The Comprehensive Microbial Resource</p>
            </title>
            <aug>
               <au>
                  <snm>Peterson</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Umayam</snm>
                  <fnm>LA</fnm>
               </au>
               <au>
                  <snm>Dickinson</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Hickey</snm>
                  <fnm>EK</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>O</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2001</pubdate>
            <volume>29</volume>
            <fpage>123</fpage>
            <lpage>125</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">29848</pubid>
                  <pubid idtype="pmpid" link="fulltext">11125067</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>DDBJ in collaboration with mass-sequencing teams on annotation</p>
            </title>
            <aug>
               <au>
                  <snm>Tateno</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Saitou</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Okubo</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Sugawara</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Gojobori</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <fpage>D25</fpage>
            <lpage>28</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">539974</pubid>
                  <pubid idtype="pmpid" link="fulltext">15608189</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>The EMBL Nucleotide Sequence Database</p>
            </title>
            <aug>
               <au>
                  <snm>Kanz</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Aldebert</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Althorpe</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Baker</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Baldwin</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Bates</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Browne</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Broek</snm>
                  <mnm>van den</mnm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Castro</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Cochrane</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Duggan</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Eberhardt</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Faruque</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Gamble</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Diez</snm>
                  <fnm>FG</fnm>
               </au>
               <au>
                  <snm>Harte</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Kulikova</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Lin</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Lombard</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Lopez</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Mancuso</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>McHale</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Nardone</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Silventoinen</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Sobhany</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Stoehr</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Tuli</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Tzouvara</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Vaughan</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Wu</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Zhu</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Apweiler</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <fpage>D29</fpage>
            <lpage>33</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">540052</pubid>
                  <pubid idtype="pmpid" link="fulltext">15608199</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>The Universal Protein Resource (UniProt)</p>
            </title>
            <aug>
               <au>
                  <snm>Bairoch</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Apweiler</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Wu</snm>
                  <fnm>CH</fnm>
               </au>
               <au>
                  <snm>Barker</snm>
                  <fnm>WC</fnm>
               </au>
               <au>
                  <snm>Boeckmann</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Ferro</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Gasteiger</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Huang</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Lopez</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Magrane</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Martin</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Natale</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>O'Donovan</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Redaschi</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Yeh</snm>
                  <fnm>LS</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <fpage>D154</fpage>
            <lpage>159</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">540024</pubid>
                  <pubid idtype="pmpid" link="fulltext">15608167</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>Basic local alignment search tool</p>
            </title>
            <aug>
               <au>
                  <snm>Altschul</snm>
                  <fnm>SF</fnm>
               </au>
               <au>
                  <snm>Gish</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Myers</snm>
                  <fnm>EW</fnm>
               </au>
               <au>
                  <snm>Lipman</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1990</pubdate>
            <volume>215</volume>
            <fpage>403</fpage>
            <lpage>410</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">2231712</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>Environmental genome shotgun sequencing of the Sargasso Sea</p>
            </title>
            <aug>
               <au>
                  <snm>Venter</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Remington</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Heidelberg</snm>
                  <fnm>JF</fnm>
               </au>
               <au>
                  <snm>Halpern</snm>
                  <fnm>AL</fnm>
               </au>
               <au>
                  <snm>Rusch</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Eisen</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Wu</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Paulsen</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Nelson</snm>
                  <fnm>KE</fnm>
               </au>
               <au>
                  <snm>Nelson</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Fouts</snm>
                  <fnm>DE</fnm>
               </au>
               <au>
                  <snm>Levy</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Knap</snm>
                  <fnm>AH</fnm>
               </au>
               <au>
                  <snm>Lomas</snm>
                  <fnm>MW</fnm>
               </au>
               <au>
                  <snm>Nealson</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Peterson</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Hoffman</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Parsons</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Baden-Tillson</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Pfannkoch</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Rogers</snm>
                  <fnm>YH</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>HO</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2004</pubdate>
            <volume>304</volume>
            <issue>5667</issue>
            <fpage>66</fpage>
            <lpage>74</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15001713</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>The deduced <it>Vibrio cholerae </it>RecA amino acid sequence</p>
            </title>
            <aug>
               <au>
                  <snm>Margraf</snm>
                  <fnm>RL</fnm>
               </au>
               <au>
                  <snm>Roca</snm>
                  <fnm>AI</fnm>
               </au>
               <au>
                  <snm>Cox</snm>
                  <fnm>MM</fnm>
               </au>
            </aug>
            <source>Gene</source>
            <pubdate>1995</pubdate>
            <volume>152</volume>
            <issue>1</issue>
            <fpage>135</fpage>
            <lpage>136</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">7828921</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <title>
               <p>Initial characterization of mutants in a universally conserved RecA structural motif</p>
            </title>
            <aug>
               <au>
                  <snm>Roca</snm>
                  <fnm>AI</fnm>
               </au>
            </aug>
            <source>PhD thesis</source>
            <publisher>Madison: University of Wisconsin-Madison</publisher>
            <pubdate>1997</pubdate>
         </bibl>
         <bibl id="B42">
            <title>
               <p>DNASTAR's Lasergene sequence analysis software</p>
            </title>
            <aug>
               <au>
                  <snm>Burland</snm>
                  <fnm>TG</fnm>
               </au>
            </aug>
            <source>Methods Mol Biol</source>
            <pubdate>2000</pubdate>
            <volume>132</volume>
            <fpage>71</fpage>
            <lpage>91</lpage>
            <xrefbib>
               <pubid idtype="pmpid">10547832</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B43">
            <title>
               <p>CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice</p>
            </title>
            <aug>
               <au>
                  <snm>Thompson</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Higgins</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Gibson</snm>
                  <fnm>TJ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1994</pubdate>
            <volume>22</volume>
            <issue>22</issue>
            <fpage>4673</fpage>
            <lpage>4680</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">308517</pubid>
                  <pubid idtype="pmpid" link="fulltext">7984417</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B44">
            <title>
               <p>The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools</p>
            </title>
            <aug>
               <au>
                  <snm>Thompson</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Gibson</snm>
                  <fnm>TJ</fnm>
               </au>
               <au>
                  <snm>Plewniak</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Jeanmougin</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Higgins</snm>
                  <fnm>DG</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1997</pubdate>
            <volume>25</volume>
            <issue>24</issue>
            <fpage>4876</fpage>
            <lpage>4882</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">147148</pubid>
                  <pubid idtype="pmpid" link="fulltext">9396791</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B45">
            <aug>
               <au>
                  <snm>Pool</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Esnayra</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Bioinformatics: Converting Data to Knowledge: A Workshop Summary</source>
            <publisher>Washington, D.C.: National Academy Press</publisher>
            <pubdate>2000</pubdate>
         </bibl>
         <bibl id="B46">
            <title>
               <p>Sequencing errors and molecular evolutionary analysis</p>
            </title>
            <aug>
               <au>
                  <snm>Clark</snm>
                  <fnm>AG</fnm>
               </au>
               <au>
                  <snm>Whittam</snm>
                  <fnm>TS</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>1992</pubdate>
            <volume>9</volume>
            <issue>4</issue>
            <fpage>744</fpage>
            <lpage>752</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">1630310</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B47">
            <title>
               <p>Complete genome sequence of a multiple drug resistant <it>Salmonella enterica </it>serovar Typhi CT18</p>
            </title>
            <aug>
               <au>
                  <snm>Parkhill</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Dougan</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>James</snm>
                  <fnm>KD</fnm>
               </au>
               <au>
                  <snm>Thomson</snm>
                  <fnm>NR</fnm>
               </au>
               <au>
                  <snm>Pickard</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Wain</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Churcher</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Mungall</snm>
                  <fnm>KL</fnm>
               </au>
               <au>
                  <snm>Bentley</snm>
                  <fnm>SD</fnm>
               </au>
               <au>
                  <snm>Holden</snm>
                  <fnm>MT</fnm>
               </au>
               <au>
                  <snm>Sebaihia</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Baker</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Basham</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Brooks</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Chillingworth</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Connerton</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Cronin</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Davis</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Davies</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>Dowd</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Farrar</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Feltwell</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Hamlin</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Haque</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Hien</snm>
                  <fnm>TT</fnm>
               </au>
               <au>
                  <snm>Holroyd</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Jagels</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Krogh</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Larsen</snm>
                  <fnm>TS</fnm>
               </au>
               <au>
                  <snm>Leather</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Moule</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>O'Gaora</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Parry</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Quail</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Rutherford</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Simmonds</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Skelton</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Stevens</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Whitehead</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Barrell</snm>
                  <fnm>BG</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2001</pubdate>
            <volume>413</volume>
            <issue>6858</issue>
            <fpage>848</fpage>
            <lpage>852</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11677608</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B48">
            <title>
               <p>Functional characterization of residues in the P-loop motif of the RecA protein ATP binding site</p>
            </title>
            <aug>
               <au>
                  <snm>Konola</snm>
                  <fnm>JT</fnm>
               </au>
               <au>
                  <snm>Logan</snm>
                  <fnm>KM</fnm>
               </au>
               <au>
                  <snm>Knight</snm>
                  <fnm>KL</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1994</pubdate>
            <volume>237</volume>
            <issue>1</issue>
            <fpage>20</fpage>
            <lpage>34</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">8133517</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B49">
            <title>
               <p>DNA sequence analysis of the <it>recA </it>genes from <it>Proteus vulgaris, Erwinia carotovora, Shigella flexneri </it>and <it>Escherichia coli </it>B/r</p>
            </title>
            <aug>
               <au>
                  <snm>Zhao</snm>
                  <fnm>XJ</fnm>
               </au>
               <au>
                  <snm>McEntee</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>Mol Gen Genet</source>
            <pubdate>1990</pubdate>
            <volume>222</volume>
            <issue>2&#8211;3</issue>
            <fpage>369</fpage>
            <lpage>376</lpage>
            <xrefbib>
               <pubid idtype="pmpid">2274037</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B50">
            <title>
               <p>Biocurators: contributors to the world of science</p>
            </title>
            <aug>
               <au>
                  <snm>Bourne</snm>
                  <fnm>PE</fnm>
               </au>
               <au>
                  <snm>McEntyre</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>PLoS Comput Biol</source>
            <pubdate>2006</pubdate>
            <volume>2</volume>
            <issue>10</issue>
            <fpage>e142</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1626157</pubid>
                  <pubid idtype="pmpid" link="fulltext">17411327</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B51">
            <title>
               <p>The triplet code from first principles</p>
            </title>
            <aug>
               <au>
                  <snm>Trifonov</snm>
                  <fnm>EN</fnm>
               </au>
            </aug>
            <source>J Biomol Struct Dyn</source>
            <pubdate>2004</pubdate>
            <volume>22</volume>
            <issue>1</issue>
            <fpage>1</fpage>
            <lpage>11</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15214800</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B52">
            <title>
               <p>Analysis of a data set of paired uncomplexed protein structures: new metrics for side-chain flexibility and model evaluation</p>
            </title>
            <aug>
               <au>
                  <snm>Zhao</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Goodsell</snm>
                  <fnm>DS</fnm>
               </au>
               <au>
                  <snm>Olson</snm>
                  <fnm>AJ</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>2001</pubdate>
            <volume>43</volume>
            <issue>3</issue>
            <fpage>271</fpage>
            <lpage>279</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11288177</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B53">
            <title>
               <p>Codon usage tabulated from international DNA sequence databases: status for the year 2000</p>
            </title>
            <aug>
               <au>
                  <snm>Nakamura</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Gojobori</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Ikemura</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2000</pubdate>
            <volume>28</volume>
            <issue>1</issue>
            <fpage>292</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">102460</pubid>
                  <pubid idtype="pmpid" link="fulltext">10592250</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B54">
            <title>
               <p>A simple method for displaying the hydropathic character of a protein</p>
            </title>
            <aug>
               <au>
                  <snm>Kyte</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Doolittle</snm>
                  <fnm>RF</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1982</pubdate>
            <volume>157</volume>
            <issue>1</issue>
            <fpage>105</fpage>
            <lpage>132</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">7108955</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B55">
            <title>
               <p>Correlation of sequence hydrophobicities measures similarity in three-dimensional protein structure</p>
            </title>
            <aug>
               <au>
                  <snm>Sweet</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>Eisenberg</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1983</pubdate>
            <volume>171</volume>
            <issue>4</issue>
            <fpage>479</fpage>
            <lpage>488</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">6663622</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B56">
            <title>
               <p>Helix propagation and N-cap propensities of the amino acids measured in alanine-based peptides in 40 volume percent trifluoroethanol</p>
            </title>
            <aug>
               <au>
                  <snm>Rohl</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Chakrabartty</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Baldwin</snm>
                  <fnm>RL</fnm>
               </au>
            </aug>
            <source>Protein Sci</source>
            <pubdate>1996</pubdate>
            <volume>5</volume>
            <issue>12</issue>
            <fpage>2623</fpage>
            <lpage>2637</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2143311</pubid>
                  <pubid idtype="pmpid" link="fulltext">8976571</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B57">
            <title>
               <p>Amino acid difference formula to help explain protein evolution</p>
            </title>
            <aug>
               <au>
                  <snm>Grantham</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1974</pubdate>
            <volume>185</volume>
            <issue>4154</issue>
            <fpage>862</fpage>
            <lpage>864</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">4843792</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B58">
            <title>
               <p>Matrices for detecting distant relationships</p>
            </title>
            <aug>
               <au>
                  <snm>Schwartz</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>Dayhoff</snm>
                  <fnm>MO</fnm>
               </au>
            </aug>
            <source>Atlas of Protein Sequence &amp; Structure</source>
            <publisher>Washington, D.C.: Natl Biomed Res Found</publisher>
            <editor>Dayhoff MO</editor>
            <pubdate>1978</pubdate>
            <volume>5</volume>
            <fpage>353</fpage>
            <lpage>358</lpage>
         </bibl>
         <bibl id="B59">
            <title>
               <p>The nature of the accessible and buried surfaces in proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Chothia</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1976</pubdate>
            <volume>105</volume>
            <issue>1</issue>
            <fpage>1</fpage>
            <lpage>12</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">994183</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B60">
            <title>
               <p>The interpretation of protein structures: total volume, group volume distributions and packing density</p>
            </title>
            <aug>
               <au>
                  <snm>Richards</snm>
                  <fnm>FM</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1974</pubdate>
            <volume>82</volume>
            <issue>1</issue>
            <fpage>1</fpage>
            <lpage>14</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">4818482</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B61">
            <title>
               <p>AAindex: amino acid index database</p>
            </title>
            <aug>
               <au>
                  <snm>Kawashima</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kanehisa</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2000</pubdate>
            <volume>28</volume>
            <issue>1</issue>
            <fpage>374</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">102411</pubid>
                  <pubid idtype="pmpid" link="fulltext">10592278</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B62">
            <title>
               <p>The structure of the <it>E. coli </it>RecA protein monomer and polymer</p>
            </title>
            <aug>
               <au>
                  <snm>Story</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>Weber</snm>
                  <fnm>IT</fnm>
               </au>
               <au>
                  <snm>Steitz</snm>
                  <fnm>TA</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>1992</pubdate>
            <volume>355</volume>
            <issue>6358</issue>
            <fpage>318</fpage>
            <lpage>325</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">1731246</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B63">
            <title>
               <p>Inteins invading mycobacterial RecA proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Saves</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Laneelle</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Daffe</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Masson</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>FEBS Lett</source>
            <pubdate>2000</pubdate>
            <volume>480</volume>
            <issue>2&#8211;3</issue>
            <fpage>221</fpage>
            <lpage>225</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11034333</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B64">
            <aug>
               <au>
                  <snm>Dullemeijer</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Concepts and Approaches in Animal Morphology</source>
            <publisher>Assen, The Netherlands: Van Gorcum &amp; Comp</publisher>
            <pubdate>1974</pubdate>
         </bibl>
         <bibl id="B65">
            <title>
               <p>Defective dissociation of a "slow" RecA mutant protein imparts an <it>Escherichia coli </it>growth defect</p>
            </title>
            <aug>
               <au>
                  <snm>Cox</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Wood</snm>
                  <fnm>EA</fnm>
               </au>
               <au>
                  <snm>Chitteni-Pattu</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Inman</snm>
                  <fnm>RB</fnm>
               </au>
               <au>
                  <snm>Cox</snm>
                  <fnm>MM</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2008</pubdate>
            <volume>283</volume>
            <issue>36</issue>
            <fpage>24909</fpage>
            <lpage>24921</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">18603529</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B66">
            <title>
               <p>Complementation of one RecA protein point mutation by another. Evidence for trans catalysis of ATP hydrolysis</p>
            </title>
            <aug>
               <au>
                  <snm>Cox</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Abbott</snm>
                  <fnm>SN</fnm>
               </au>
               <au>
                  <snm>Chitteni-Pattu</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Inman</snm>
                  <fnm>RB</fnm>
               </au>
               <au>
                  <snm>Cox</snm>
                  <fnm>MM</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2006</pubdate>
            <volume>281</volume>
            <issue>18</issue>
            <fpage>12968</fpage>
            <lpage>12975</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16527806</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B67">
            <title>
               <p>Saturation mutagenesis of the <it>E. coli </it>RecA loop L2 homologous DNA pairing region reveals residues essential for recombination and recombinational repair</p>
            </title>
            <aug>
               <au>
                  <snm>H&#246;rtnagel</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Voloshin</snm>
                  <fnm>ON</fnm>
               </au>
               <au>
                  <snm>Kinal</snm>
                  <fnm>HH</fnm>
               </au>
               <au>
                  <snm>Ma</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Schaffer-Judge</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Camerini-Otero</snm>
                  <fnm>RD</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1999</pubdate>
            <volume>286</volume>
            <fpage>1097</fpage>
            <lpage>1106</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10047484</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B68">
            <title>
               <p>Site-directed mutagenesis in the <it>Escherichia coli recA </it>gene</p>
            </title>
            <aug>
               <au>
                  <snm>Cazaux</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Larminat</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Defais</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Biochimie</source>
            <pubdate>1991</pubdate>
            <volume>73</volume>
            <issue>2&#8211;3</issue>
            <fpage>281</fpage>
            <lpage>284</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">1883886</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B69">
            <title>
               <p>Crystal structure of archaeal recombinase RADA: a snapshot of its extended conformation</p>
            </title>
            <aug>
               <au>
                  <snm>Wu</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>He</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Moya</snm>
                  <fnm>IA</fnm>
               </au>
               <au>
                  <snm>Qian</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Luo</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Mol Cell</source>
            <pubdate>2004</pubdate>
            <volume>15</volume>
            <issue>3</issue>
            <fpage>423</fpage>
            <lpage>435</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15304222</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B70">
            <title>
               <p>Structural relationship of bacterial RecA proteins to recombination proteins from bacteriophage T4 and yeast</p>
            </title>
            <aug>
               <au>
                  <snm>Story</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>Bishop</snm>
                  <fnm>DK</fnm>
               </au>
               <au>
                  <snm>Kleckner</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Steitz</snm>
                  <fnm>TA</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1993</pubdate>
            <volume>259</volume>
            <issue>5103</issue>
            <fpage>1892</fpage>
            <lpage>1896</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">8456313</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B71">
            <title>
               <p>Specific RecA amino acid changes affect RecA-UmuD'C interaction</p>
            </title>
            <aug>
               <au>
                  <snm>Sommer</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Boudsocq</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Devoret</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Bailone</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Mol Microbiol</source>
            <pubdate>1998</pubdate>
            <volume>28</volume>
            <issue>2</issue>
            <fpage>281</fpage>
            <lpage>291</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9622353</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B72">
            <title>
               <p>Universally conserved positions in protein folds: reading evolutionary signals about stability, folding kinetics and function</p>
            </title>
            <aug>
               <au>
                  <snm>Mirny</snm>
                  <fnm>LA</fnm>
               </au>
               <au>
                  <snm>Shakhnovich</snm>
                  <fnm>EI</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1999</pubdate>
            <volume>291</volume>
            <issue>1</issue>
            <fpage>177</fpage>
            <lpage>196</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10438614</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B73">
            <title>
               <p>Conserved key amino acid positions (CKAAPs) derived from the analysis of common substructures in proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Reddy</snm>
                  <fnm>BV</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>WW</fnm>
               </au>
               <au>
                  <snm>Shindyalov</snm>
                  <fnm>IN</fnm>
               </au>
               <au>
                  <snm>Bourne</snm>
                  <fnm>PE</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>2001</pubdate>
            <volume>42</volume>
            <issue>2</issue>
            <fpage>148</fpage>
            <lpage>163</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11119639</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B74">
            <title>
               <p>Understanding eukaryotic linear motifs and their role in cell signaling and regulation</p>
            </title>
            <aug>
               <au>
                  <snm>Diella</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Haslam</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Chica</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Budd</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Michael</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>NP</fnm>
               </au>
               <au>
                  <snm>Trave</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Gibson</snm>
                  <fnm>TJ</fnm>
               </au>
            </aug>
            <source>Front Biosci</source>
            <pubdate>2008</pubdate>
            <volume>13</volume>
            <fpage>6580</fpage>
            <lpage>6603</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">18508681</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B75">
            <title>
               <p>The DNA binding site(s) of the <it>Escherichia coli </it>RecA protein</p>
            </title>
            <aug>
               <au>
                  <snm>Rehrauer</snm>
                  <fnm>WM</fnm>
               </au>
               <au>
                  <snm>Kowalczykowski</snm>
                  <fnm>SC</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>1996</pubdate>
            <volume>271</volume>
            <fpage>11996</fpage>
            <lpage>12002</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">8662640</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B76">
            <title>
               <p>Helix propensities of the amino acids measured in alanine-based peptides without helix-stabilizing side-chain interactions</p>
            </title>
            <aug>
               <au>
                  <snm>Chakrabartty</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kortemme</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Baldwin</snm>
                  <fnm>RL</fnm>
               </au>
            </aug>
            <source>Protein Sci</source>
            <pubdate>1994</pubdate>
            <volume>3</volume>
            <fpage>843</fpage>
            <lpage>852</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2142718</pubid>
                  <pubid idtype="pmpid" link="fulltext">8061613</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B77">
            <title>
               <p>Insights into thermal resistance of proteins from the intrinsic stability of their alpha-helices</p>
            </title>
            <aug>
               <au>
                  <snm>Petukhov</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kil</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Kuramitsu</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Lanzov</snm>
                  <fnm>V</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>1997</pubdate>
            <volume>29</volume>
            <issue>3</issue>
            <fpage>309</fpage>
            <lpage>320</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9365986</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B78">
            <title>
               <p>Mutants of <it>Escherichia coli </it>K-12 defective in DNA repair and in genetic recombination</p>
            </title>
            <aug>
               <au>
                  <snm>Howard-Flanders</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Theriot</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>1966</pubdate>
            <volume>53</volume>
            <fpage>1137</fpage>
            <lpage>1150</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1211086</pubid>
                  <pubid idtype="pmpid" link="fulltext">5335129</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B79">
            <title>
               <p>Negative co-dominant inhibition of RecA protein function: biochemical properties of the RecA1, RecA13 and RecA56 proteins and the effect of RecA56 protein on the activities of the wild-type RecA protein function <it>in vitro</it></p>
            </title>
            <aug>
               <au>
                  <snm>Lauder</snm>
                  <fnm>SD</fnm>
               </au>
               <au>
                  <snm>Kowalczykowski</snm>
                  <fnm>SC</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1993</pubdate>
            <volume>234</volume>
            <issue>1</issue>
            <fpage>72</fpage>
            <lpage>86</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">8230208</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B80">
            <title>
               <p>Purification of a RecA protein analogue from <it>Bacillus subtilis</it></p>
            </title>
            <aug>
               <au>
                  <snm>Lovett</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Roberts</snm>
                  <fnm>JW</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>1985</pubdate>
            <volume>260</volume>
            <issue>6</issue>
            <fpage>3305</fpage>
            <lpage>3313</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">3156134</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B81">
            <title>
               <p>Reevaluation of the nucleotide cofactor specificity of the RecA protein from <it>Bacillus subtilis</it></p>
            </title>
            <aug>
               <au>
                  <snm>Steffen</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Bryant</snm>
                  <fnm>FR</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>1999</pubdate>
            <volume>274</volume>
            <fpage>25990</fpage>
            <lpage>25994</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10473543</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B82">
            <title>
               <p><it>Bacillus subtilis </it>SsbA and dATP regulate RecA nucleation onto single-stranded DNA</p>
            </title>
            <aug>
               <au>
                  <snm>Carrasco</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Manfredi</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Ayora</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Alonso</snm>
                  <fnm>JC</fnm>
               </au>
            </aug>
            <source>DNA Repair</source>
            <pubdate>2008</pubdate>
            <volume>7</volume>
            <issue>6</issue>
            <fpage>990</fpage>
            <lpage>996</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">18472308</pubid>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
