Additional file 1.
Multiple sequence alignment of bacterial RecA homologs. A subset of the 300 sequences is shown, representing each of the major bacterial phyla. In the alignment, a dash (-) indicates a gap and a period (.) indicates an amino acid identical to that of the E. coli RecA protein. NCBI Protein database accession numbers are listed at the end unless the data were taken from the TIGR unfinished microbial genomes database. Summary lines above the alignment were calculated from all 300 sequences. The "Bioin" line indicates the bioinformatic structural elements (nanoanatomy) across the entire RecA protein: 12 motifs and the 10 connecting variable regions. "Secon" gives the secondary structural elements from the E. coli RecA crystal structure, where "a" marks α helices, "b" marks β strands, "l" marks disordered loops, and "?" marks disordered termini. In each case the letter or number name of the element is given in the second position. "Ident" marks the 21 residues identical in all 300 sequences. "Chemi" marks the 39 chemically conservative substitutions based on the following amino acid classification: a = (DE), b = (HKR), f = (AGILV), m = (NQ), o = (FWY), h = (ST), i = (P), s = (CM). "Funct" lists the 55 functionally conservative residue substitutions based on the classification: a = (DE), b = (HKR), f = (AFILMPVW), p = (CGNQSTY). Finally, "Major" marks the 187 residues conserved above a 70% majority threshold (210 sequences), with invariant residues shown in uppercase. The numbering of the alignment is based upon the E. coli RecA protein sequence.
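As a rough illustration of how summary lines such as "Ident", "Chemi", "Funct", and "Major" could be derived for a single alignment column, the following minimal Python sketch applies the two amino acid classifications given above. It is not the authors' code. The function names are hypothetical, and several details are assumptions: a column containing a gap is never reported as conserved, the majority cutoff is inclusive (at least 210 of 300 sequences), and majority residues that are not invariant are written in lowercase.

```python
from collections import Counter

# Amino acid classes as given in the file description.
CHEMICAL = {"a": "DE", "b": "HKR", "f": "AGILV", "m": "NQ",
            "o": "FWY", "h": "ST", "i": "P", "s": "CM"}
FUNCTIONAL = {"a": "DE", "b": "HKR", "f": "AFILMPVW", "p": "CGNQSTY"}


def invert(classes):
    """Map each residue letter to the label of its class."""
    return {aa: label for label, members in classes.items() for aa in members}


def identical(column):
    """Return the residue if every sequence has it, else None."""
    first = column[0]
    if first != "-" and all(c == first for c in column):
        return first
    return None


def conserved_class(column, classes):
    """Return the class label if all residues share one class, else None.

    Assumption: a gap (or any unclassified symbol) breaks conservation.
    """
    lookup = invert(classes)
    labels = {lookup.get(c) for c in column}
    if len(labels) == 1 and None not in labels:
        return labels.pop()
    return None


def majority(column, min_count):
    """Return the most common residue if it occurs at least min_count times.

    Invariant residues are uppercase, other majority residues lowercase
    (the lowercase convention is an assumption). For the alignment
    described here, min_count would be 210 of 300 sequences.
    """
    residue, count = Counter(column).most_common(1)[0]
    if residue == "-" or count < min_count:
        return None
    return residue.upper() if count == len(column) else residue.lower()
```

Under these assumptions, a column such as `"ILVM"` is not chemically conservative (M falls in class s) but is functionally conservative (class f), which is consistent with the "Funct" line marking more positions than the "Chemi" line.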
Format: PDF. Size: 39 KB.
Roca et al. BMC Bioinformatics 2008 9:554 doi:10.1186/1471-2105-9-554