<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
	<ui>gb-2004-5-6-r38</ui>
	<ji>GBJ</ji>
	<fm>
		<dochead>Research</dochead>
		<bibl>
			<title>
				<p>Bacterial &#945;<sub>2</sub>-macroglobulins: colonization factors acquired by horizontal gene transfer from the metazoan genome?</p>
			</title>
			<aug>
				<au id="A1">
					<snm>Budd</snm>
					<fnm>Aidan</fnm>
					<insr iid="I1"/>
				</au>
				<au id="A2">
					<snm>Blandin</snm>
					<fnm>Stephanie</fnm>
					<insr iid="I1"/>
				</au>
				<au id="A3">
					<snm>Levashina</snm>
					<mi>A</mi>
					<fnm>Elena</fnm>
					<insr iid="I2"/>
				</au>
				<au id="A4" ca="yes">
					<snm>Gibson</snm>
					<mi>J</mi>
					<fnm>Toby</fnm>
					<insr iid="I1"/>
					<email>toby.gibson@embl.de</email>
				</au>
			</aug>
			<insg>
				<ins id="I1">
					<p>European Molecular Biology Laboratory, 69012 Heidelberg, Germany</p>
				</ins>
				<ins id="I2">
					<p>UPR 9022 du CNRS, IBMC, rue Ren&#233; Descartes, F-67087 Strasbourg CEDEX, France</p>
				</ins>
			</insg>
			<source>Genome Biology</source>
			<issn>1465-6906</issn>
			<pubdate>2004</pubdate>
			<volume>5</volume>
			<issue>6</issue>
			<fpage>R38</fpage>
			<url>http://genomebiology.com/2004/5/6/R38</url>
			<xrefbib>
				<pubidlist><pubid idtype="pmpid">15186489</pubid><pubid idtype="doi">10.1186/gb-2004-5-6-r38</pubid>
				</pubidlist></xrefbib>
		</bibl>
		<history>
			<rec>
				<date>
					<day>20</day>
					<month>2</month>
					<year>2004</year>
				</date>
			</rec>
			<revrec>
				<date>
					<day>2</day>
					<month>4</month>
					<year>2004</year>
				</date>
			</revrec>
			<acc>
				<date>
					<day>8</day>
					<month>4</month>
					<year>2004</year>
				</date>
			</acc>
			<pub>
				<date>
					<day>26</day>
					<month>5</month>
					<year>2004</year>
				</date>
			</pub>
		</history>
		<cpyrt>
			<year>2004</year>
			<collab>Budd et al.; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.</collab>
		</cpyrt>
		<shorttitle>
			<p>Bacterial &#945;<sub>2</sub>-macroglobulins: colonization factors acquired by horizontal gene transfer from the metazoan genome?</p>
		</shorttitle>
		<shortabs>
			<p>Homologs of metazoan &#945;<sub>2</sub>-macroglobulins have been found in bacteria. The distribution of these genes in diverse bacterial clades suggests they have been acquired by multiple horizontal transfers.</p>
		</shortabs>
		<abs>
			<sec>
				<st>
					<p>Abstract</p>
				</st>
				<sec>
					<st>
						<p>Background</p>
					</st>
					<p>Invasive bacteria are known to have captured and adapted eukaryotic host genes. They also readily acquire colonizing genes from other bacteria by horizontal gene transfer. Closely related species such as <it>Helicobacter pylori </it>and <it>Helicobacter hepaticus</it>, which exploit different host tissues, share almost none of their colonization genes. The protease inhibitor &#945;<sub>2</sub>-macroglobulin provides a major metazoan defense against invasive bacteria, trapping attacking proteases required by parasites for successful invasion.</p>
				</sec>
				<sec>
					<st>
						<p>Results</p>
					</st>
					<p>Database searches with metazoan &#945;<sub>2</sub>-macroglobulin sequences revealed homologous sequences in bacterial proteomes. The bacterial &#945;<sub>2</sub>-macroglobulin phylogenetic distribution is patchy and violates the vertical descent model. Bacterial &#945;<sub>2</sub>-macroglobulin genes are found in diverse clades, including purple bacteria (proteobacteria), fusobacteria, spirochetes, bacteroidetes, deinococcids, cyanobacteria, planctomycetes and thermotogae. Most bacterial species with bacterial &#945;<sub>2</sub>-macroglobulin genes exploit higher eukaryotes (multicellular plants and animals) as hosts. Both pathogenically invasive and saprophytically colonizing species possess bacterial &#945;<sub>2</sub>-macroglobulins, indicating that bacterial &#945;<sub>2</sub>-macroglobulin is a colonization rather than a virulence factor.</p>
				</sec>
				<sec>
					<st>
						<p>Conclusions</p>
					</st>
					<p>Metazoan &#945;<sub>2</sub>-macroglobulins inhibit proteases of pathogens. The bacterial homologs may function in reverse to block host antimicrobial defenses. &#945;<sub>2</sub>-macroglobulin was probably acquired one or more times from metazoan hosts and has then spread widely through other colonizing bacterial species by more than 10 independent horizontal gene transfers. <it>yfhM</it>-like bacterial &#945;<sub>2</sub>-macroglobulin genes are often found tightly linked with <it>pbpC</it>, encoding an atypical peptidoglycan transglycosylase, PBP1C, that does not function in vegetative peptidoglycan synthesis. We suggest that YfhM and PBP1C are coupled together as a periplasmic defense and repair system. Bacterial &#945;<sub>2</sub>-macroglobulins might provide useful targets for enhancing vaccine efficacy in combating infections.</p>
				</sec>
			</sec>
		</abs>
	</fm>
	<meta>
		<classifications>
			<classification type="BMC" subtype="man_spc_id" id="30010014">Microbiology and parasitology</classification>
			<classification type="BMC" subtype="man_spc_id" id="30010009">Genetics</classification>
			<classification type="BMC" subtype="man_spc_id" id="30010010">Genome studies</classification>
			<classification type="BMC" subtype="man_spc_id" id="30010016">Molecular biology</classification>
		</classifications>
	</meta>
	<bdy>
		<sec>
			<st>
				<p>Background</p>
			</st>
			<p>The broad-spectrum protease inhibitor &#945;<sub>2</sub>-macroglobulin (&#945;<sub>2</sub>M) and the complement factors C3, C4 and C5 belong to a gene family present in all metazoans ranging from corals to humans. These large (approximately 1,500 residue) proteins all undergo proteolytic processing and structural rearrangement as part of their role in host defense. The family is characterized by a unique thioester motif (CxEQ; single-letter amino-acid code), and a propensity for multiple conformationally sensitive binding interactions <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>, which define their functional properties. The highly reactive thioester bond is buried inside the molecule in the native protein, protected from precocious inactivation <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. Upon proteolytic cleavage, the thioester bond becomes exposed and can then mediate covalent attachment to activating self and non-self surfaces, in the case of complement factors, or covalent or noncovalent crosslinking to the attacking proteases in the case of &#945;<sub>2</sub>Ms <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. The proteolytic activation of these proteins also mediates interactions with receptors.</p>
			<p>In contrast to complement factors, which are activated by specific 'convertase' protease complexes, &#945;<sub>2</sub>Ms have an accessible 'bait' region with target sites for many proteases. The rearrangement of &#945;<sub>2</sub>M that follows cleavage of the bait region entraps the attacking protease in a cage-like structure, hindering protein substrates from reaching the protease active site <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. In this way, exported proteases that are essential for parasitic infections can be rendered ineffective by &#945;<sub>2</sub>M entrapment <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr></abbrgrp>. Protease-reacted &#945;<sub>2</sub>M is then cleared from circulation by binding to the receptor CD91, triggering endocytosis. In addition, &#945;<sub>2</sub>Ms bind cytokines and growth factors and regulate their clearance and activity <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr></abbrgrp>.</p>
			<p>Vertebrate complement factors C3, C4 and C5 are part of an activation cascade that leads to the assembly of the membrane-attack complex and lysis of the pathogen. Binding of C3 also targets pathogens for phagocytosis. Proteolytic activation of all three complement proteins yields anaphylatoxins (cleaved amino-terminal fragments) which are recognized by specific receptors and activate the inflammatory response at the site of infection. In contrast to &#945;<sub>2</sub>Ms, complement factors also possess a carboxy-terminal domain extension, the netrin or NTR module (PFAM:PF01759) <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. Some members of the complement/&#945;<sub>2</sub>M family (for example, C5 and ovostatin) have lost the thioester motif.</p>
			<p>No &#945;<sub>2</sub>M-related proteins have been found in any eukaryotes outside metazoans. Within the Metazoa, representatives have been found in all species examined, with a so-called 'C3-like' protein sequenced from the cnidarian <it>Swiftia exserta </it>(SWISS-PROT acc:Q8IYP1). There is no information from sponges as yet. We may speculate that the gene family evolved in an early metazoan in response to challenge from invasive microorganisms exploiting the new niche provided by the interstitial spaces and body cavities. The more derived role of the complement factors, together with their extra netrin domain, suggests that they arose by gene duplication from an ancestral &#945;<sub>2</sub>M-like gene. Apart from vertebrates, &#945;<sub>2</sub>M-group proteins have been most actively studied in arthropods. The horseshoe crab <it>Limulus </it>has a plasma &#945;<sub>2</sub>M that is a component of an ancient invertebrate defense system; it is able to inhibit a wide range of proteases as well as to modulate plasma cytolytic activity <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>. <it>Limulus </it>&#945;<sub>2</sub>M forms tetramers, binding covalently across the multimers rather than to the attacking proteases, but still traps these in a cage-like structure after proteolytic activation <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. In dipteran insects, there are multiple &#945;<sub>2</sub>M homologs, the thioester-containing proteins (TEPs). The <it>TEP </it>genes have been amplified by a process of tandem duplication into linked multigene families. <it>Drosophila melanogaster </it>has six <it>TEP </it>genes, whereas the mosquito <it>Anopheles gambiae </it>has 15 <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>. It is thought that the impressive expansion of <it>TEP </it>genes in the mosquito might be linked to the parasitic challenge provided by its blood-sucking lifestyle <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>. 
The first characterized TEP in mosquitoes, TEP1, binds to and promotes phagocytosis of bacteria <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. TEP1 also binds to <it>Plasmodium berghei </it>and mediates its killing <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. Thus the complement/&#945;<sub>2</sub>M protein family is part of an innate immune system in metazoans that long pre-dates the immunoglobulin-based immune system of vertebrates, yet remains vital for combating parasites in all animal lineages examined.</p>
			<p>While reviewing the distribution of &#945;<sub>2</sub>M/TEP proteins from invertebrates <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>, we conducted BLAST searches of the protein databases and were surprised to discover a number of bacterial sequences with BLAST E-values indicating homology with &#945;<sub>2</sub>M. Given the absence of &#945;<sub>2</sub>Ms in all non-metazoan eukaryotic lineages, it immediately seemed clear that horizontal gene transfer (HGT) of &#945;<sub>2</sub>Ms must have occurred between metazoans and bacteria. But which way? Here we summarize the evidence for numerous horizontal transfers between bacterial lineages and discuss some biochemical and medical implications of the finding.</p>
		</sec>
		<sec>
			<st>
				<p>Results</p>
			</st>
			<p>Our BLAST2SRS server lists the source species on the BLAST output page: this is useful for quick visual surveys of the taxonomic distribution of a protein family. A BLAST2SRS search with human &#945;<sub>2</sub>M unexpectedly listed an entry (SWISS-PROT accession number Q9X079) with E-value 2.3e-8 from <it>Thermotoga maritima</it>, a thermophilic eubacterium. With a length of 1,538 residues, a signal sequence and a matching CxEQ motif, there was no doubt that this was a genuine &#945;<sub>2</sub>M homolog. Numerous other bacterial sequences with less significant (numerically higher) E-values but obvious topological equivalence were also listed: for example, <it>Escherichia coli </it>YfhM (P76578) at 5.8e-5; <it>Pseudomonas putida </it>AAN66197 at 1.3e-4; <it>Rhizobium meliloti </it>Q92VA6 at 5.0e-3. Profile searches with a metazoan &#945;<sub>2</sub>M alignment and subsequently with an alignment of the stronger bacterial hits revealed a number of additional, highly diverged homologs, some lacking the CxEQ motif. For example, <it>E. coli </it>has a second divergent homolog, YfaS (P76464). It is noteworthy that not a single instance of an archaeal &#945;<sub>2</sub>M sequence could be found. Thus &#945;<sub>2</sub>M-like sequences are restricted to eubacteria and metazoans. No function has been experimentally ascribed to any of the bacterial &#945;<sub>2</sub>Ms (bact-&#945;<sub>2</sub>Ms).</p>
			<sec>
				<st>
					<p>Bacterial &#945;<sub>2</sub>-macroglobulin sequences</p>
				</st>
				<p>Figure <figr fid="F1">1a</figr> shows an alignment of the segment spanning the CxEQ motif for a representative set of bacterial &#945;<sub>2</sub>M homologs. Not all bact-&#945;<sub>2</sub>Ms possess the CxEQ motif. Using <it>E. coli </it>as the reference, YfhM is the archetype of a large group, mostly with the thioester motif, and YfaS is the archetype of a smaller, diverged group always lacking the motif. The sequences of the YfhM group are sufficiently divergent that accurate alignment proved time-consuming, but it was achieved over almost the whole sequence length, other than the highly variable amino termini. We did not attempt to align the YfhM and YfaS groups together with the metazoan &#945;<sub>2</sub>Ms. Such an alignment would be useful only if the resulting trees were informative, but the high divergence between the groups precludes accurate alignment and would lead to unreliable tree calculation. (In future, given more YfaS sequences, &#945;<sub>2</sub>Ms from more metazoan lineages and a solved three-dimensional structure to guide alignment, this might be worth revisiting.) One feature apparent in many of the aligned YfhM sequences is a conserved cysteine directly following the signal peptide (Figure <figr fid="F1">1b</figr>), indicating palmitoylation. The presence of an aspartic acid residue following the palmitoylated cysteine has been shown in <it>E. coli </it>to dictate sorting to the inner membrane <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr></abbrgrp>, in which case YfhM will be found in the periplasmic space, attached to the inner membrane. Given the CxEQ motif, covalent trapping of proteases in the periplasmic space seems to be the most likely function (whether the covalent links are to the trapped protease or between the &#945;<sub>2</sub>M multimers, as in the horseshoe crab <it>Limulus </it><abbrgrp><abbr bid="B12">12</abbr></abbrgrp>). 
The YfaS group of bact-&#945;<sub>2</sub>Ms lacks a palmitoylable cysteine, so its members may be secreted. Absence of the CxEQ motif indicates that their molecular function must differ, at least in part, although this does not of itself rule out protease entrapment, as shown by chicken ovostatin, which also lacks the reactive thioester motif <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>.</p>
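				<p>The two sequence features discussed above, the CxEQ thioester motif and the aspartate that follows the lipid-modified cysteine, can be checked with a simple scan. The following is a minimal sketch in Python and is not the procedure used in this study: the function names are hypothetical, the toy sequence is invented, and the sorting function assumes that the signal peptide has already been removed and that absence of the aspartate implies transfer to the outer membrane, which is a simplification of the <it>E. coli </it>rule.</p>
				<p><![CDATA[
```python
import re

# Thioester motif as given in the text: C, any residue, E, Q (single-letter code).
THIOESTER_MOTIF = re.compile(r"C.EQ")


def find_thioester_motifs(sequence):
    """Return the 1-based start position of every CxEQ match in a protein sequence."""
    return [match.start() + 1 for match in THIOESTER_MOTIF.finditer(sequence.upper())]


def predicted_lipoprotein_sorting(mature_n_terminus):
    """Apply the aspartate retention rule to the residues after the signal peptide.

    Returns None when the first residue is not cysteine (no lipid anchor expected),
    "inner membrane" when aspartate directly follows the cysteine, and
    "outer membrane" otherwise (a simplifying assumption).
    """
    seq = mature_n_terminus.upper()
    if not seq.startswith("C"):
        return None
    if len(seq) > 1 and seq[1] == "D":
        return "inner membrane"
    return "outer membrane"


if __name__ == "__main__":
    toy_sequence = "MKKLLAGCDSTPLGCGEQNMV"  # invented example, not a real protein
    print(find_thioester_motifs(toy_sequence))
    print(predicted_lipoprotein_sorting("CDSTPLG"))
```
]]></p>
				<p>A scan of this kind only flags candidate sites; whether a cysteine is actually lipid-modified, or a CxEQ site actually forms a thioester, has to be established experimentally.</p>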
				<fig id="F1">
					<title>
						<p>Figure 1</p>
					</title>
					<caption>
						<p>Sequence alignments</p>
					</caption>
					<text>
						<p>Sequence alignments. <b>(a) </b>Alignment detail of YfhM group bacterial &#945;<sub>2</sub>-macroglobulin sequences from bacterial proteomes plus human &#945;<sub>2</sub>-macroglobulin (&#945;<sub>2</sub>M), centred on the conserved CxEQ thioester motif. <b>(b) </b>Alignment of selected bacterial &#945;<sub>2</sub>-macroglobulin signal peptides possessing the conserved cysteine (C) residue. Signal peptides require a run of hydrophobic residues preceded by a positively charged residue. Cleavage is at the small (glycine (G)/alanine (A)) residue terminating the signal peptide (marked by a dot). Aminoacylation of lipoproteins occurs in the inner membrane at a C (marked by *) directly following the signal peptide. An aspartate residue (D) after the C acts as a retention signal to the inner membrane in <it>E. coli</it>, preventing lipoprotein transfer to the outer membrane <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr></abbrgrp>. Alignments are color-coded using the Clustal X defaults <abbrgrp><abbr bid="B66">66</abbr></abbrgrp>. Blue denotes conserved hydrophobicity, as in the signal peptide, while a strongly conserved C is colored pink. Accession numbers are SWISS-PROT or NCBI genomes (NP, finished genome; ZP, provisional assignment in unfinished genome). Species names follow the SWISS-PROT convention.</p>
					</text>
					<graphic file="gb-2004-5-6-r38-1"/>
				</fig>
			</sec>
			<sec>
				<st>
					<p>Genomic context of bacterial &#945;<sub>2</sub>-macroglobulins</p>
				</st>
				<p>A survey of completely sequenced bacterial genomes was undertaken to establish which lineages possess bact-&#945;<sub>2</sub>Ms and which do not. Representative results are summarized in Figure <figr fid="F2">2</figr>. Possession of bact-&#945;<sub>2</sub>M correlates poorly with phylogenetic relationship, except among very closely related species.</p>
				<fig id="F2">
					<title>
						<p>Figure 2</p>
					</title>
					<caption>
						<p>Phylogenetic distribution of bacterial &#945;<sub>2</sub>-macroglobulin homologs (&#945;<sub>2</sub>M)</p>
					</caption>
					<text>
						<p>Phylogenetic distribution of bacterial &#945;<sub>2</sub>-macroglobulin homologs (&#945;<sub>2</sub>M). Pink, species that possess bacterial &#945;<sub>2</sub>-macroglobulin genes; yellow, species without bacterial &#945;<sub>2</sub>-macroglobulin genes. Shared genomic context is indicated for genes found to co-occur with bacterial &#945;<sub>2</sub>-macroglobulin genes. Because bacterial phylogeny has many uncertainties, the tree is simplified into multiple nodes representing three levels of divergence. There is little phylogenetic consistency for bacterial &#945;<sub>2</sub>-macroglobulin possession. Colonizing proteobacteria are overwhelmingly expected to have a bacterial &#945;<sub>2</sub>-macroglobulin gene, although exceptions occur, notably <it>Helicobacter pylori</it>, <it>Vibrio cholerae </it>and <it>Neisseria meningitidis</it>. No examples of bacterial &#945;<sub>2</sub>-macroglobulin genes have been found in colonizing Gram-positives in the Firmicutes or Actinobacteria, which include such major infectious clades as streptococci and mycobacteria. <it>Anabaena </it>is a facultative plant symbiont, while other free-living cyanobacteria (here represented by <it>Synechocystis</it>) lack bacterial &#945;<sub>2</sub>-macroglobulin. <it>Thermotoga maritima</it>, <it>Magnetospirillum magnetotacticum </it>and <it>Caulobacter crescentus </it>are the only species possessing bacterial &#945;<sub>2</sub>-macroglobulin for which no apparent connection exists with niches linked to exploitation of higher eukaryotes. Genome context of bacterial &#945;<sub>2</sub>Ms is based on automated STRING annotation <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>, supplemented by re-analysis of individual genomes. Double slanted bars between genes indicate that they are not tightly linked. Bacterial &#945;<sub>2</sub>-macroglobulins make up two distinct groups typified by the <it>E. coli </it>genes <it>yfhM </it>and <it>yfaS</it>. 
The members of the <it>yfhM </it>group (on the left side of the figure) almost always co-occur with <it>pbpC </it>and are often, but not always, found adjacent to and on the same strand as one another in an operon configuration. Members of the <it>yfaS </it>group (grouped on the right side of the figure), when present in &#946;- or &#947;-proteobacteria, are linked to four other gene families. All their predicted gene products also possess signal peptides, but are otherwise of unknown function. In other taxa, members of the <it>yfaS </it>group of bacterial &#945;<sub>2</sub>-macroglobulins are either unassociated with any of these gene families (planctomycetes and deinococci), or linked to a member of just one of the families (thermotogae).</p>
					</text>
					<graphic file="gb-2004-5-6-r38-2"/>
				</fig>
				<p>Bact-&#945;<sub>2</sub>Ms are absent from the full proteomes of the following anciently diverged free-living species: the hyperthermophilic chemolithoautotroph <it>Aquifex aeolicus</it>, the thermophilic photolithoautotroph <it>Chlorobium tepidum</it>, the cyanobacteria <it>Synechocystis</it>, <it>Synechococcus </it>and <it>Prochlorococcus</it>, all firmicutes including <it>Bacillus subtilis</it>, all actinobacteria including <it>Streptomyces coelicolor</it>, the &#946;-proteobacterium <it>Nitrosomonas europaea </it>and the &#948;-proteobacterium <it>Geobacter metallireducens</it>. Furthermore, possession of bact-&#945;<sub>2</sub>M is inconsistently represented within clades such as the proteobacteria, spirochetes and cyanobacteria. This is well illustrated by the two species of <it>Helicobacter</it>, one exploiting the acidic stomach and the other the very different environment of the liver: only the latter has a bact-&#945;<sub>2</sub>M. The <it>H. hepaticus </it>genome lacks essentially all the proposed <it>H. pylori </it>virulence factors and is believed to possess a quite different set, adapted to its hepatobiliary habitat <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. The irregular phylogenetic correlation suggests that bact-&#945;<sub>2</sub>Ms are 'lifestyle' genes, affecting which niches a bacterium is able to exploit. Although an association with colonization seems clear (Figure <figr fid="F2">2</figr>), there is a strong bias in bacterial genome sequencing in favor of pathogenic species: this currently precludes a statistical assessment and might create a misleading phylogenetic perspective.</p>
				<p>The STRING server <abbrgrp><abbr bid="B21">21</abbr></abbrgrp> was used to check for neighboring genes that persistently co-occur with bact-&#945;<sub>2</sub>Ms. Using either <it>yfhM </it>or <it>yfaS </it>as seed, STRING reported two conserved gene sets that are widely found with bact-&#945;<sub>2</sub>Ms. The results are summarized in Figure <figr fid="F2">2</figr>. The <it>yfhM </it>group always co-occurs with <it>pbpC</it>, which encodes penicillin-binding protein 1C (PBP1C). The gene topology is almost always consistent with <it>pbpC </it>and <it>yfhM </it>being in the same operon (or co-transcribed from a bidirectional promoter, as in <it>Anabaena</it>). The more strongly an operon structure is conserved across species, the more likely the encoded proteins are to have associated functions <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>. Moreover, products of conserved gene pairs very often associate physically <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>. Therefore, if YfhM is involved in colonizing or pathogenic lifestyles, its partner should be too. PBP1C is a paralog of the periplasmic cell-wall biosynthesis proteins PBP1A and PBP1B, though with the addition of a carboxy-terminal non-enzymatic domain of approximately 100 residues (PFAM:PF06832). The PBP1A and PBP1B peptidoglycan synthases each have two enzymatic domains, an amino-terminal transglycosylase and a carboxy-terminal transpeptidase (reviewed in <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>). Although PBP1C possesses the two enzymatic domains, studies have shown that it does not substitute for these proteins in cell-wall biosynthesis during vegetative growth <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>: indeed, deletion of <it>pbpC </it>gives a weak phenotype that does not affect cell viability in the laboratory, although the number of peptide crosslinks is increased <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. 
The transpeptidase domain in PBP1C is thought not to bind to most of the &#946;-lactams that inhibit the paralogous enzymes, nor to be a functional transpeptidase <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. One curious finding is that, <it>in vitro</it>, PBP1C accounts for 75% of transglycosylase activity, yet is responsible for only 3% of <it>de novo </it>peptidoglycan biosynthesis in the cell <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. As PBP1C does not substitute for the biosynthetic enzymes, a possible role would be in emergency repairs to the peptidoglycan, where its efficient transglycosylase activity would be appropriate.</p>
				<p>The <it>yfaS </it>group of bact-&#945;<sub>2</sub>Ms is likewise usually found in a candidate operon, at least within the proteobacteria (Figure <figr fid="F2">2</figr>), in this case with four other gene families, defined by the <it>E. coli yfaA</it>, <it>yfaQ</it>, <it>yfaP </it>and <it>yfaT </it>genes. All these genes encode signal sequences and their products are expected to be secreted or periplasmic but, otherwise, sequence analysis has yielded no clues to their function. It is possible that all the encoded proteins function to disrupt or resist host defenses. The YfaS-like bact-&#945;<sub>2</sub>Ms of the free-living and highly divergent <it>Thermotoga</it>, <it>Deinococcus </it>and <it>Rhodopirellula </it>(none of which is known to be invasive) are not found associated with most of these other genes.</p>
			</sec>
			<sec>
				<st>
					<p>Microarray expression data</p>
				</st>
				<p>The STRING server was also used to check for any significant coexpression of <it>yfhM</it>, <it>yfaS </it>and other members of the two candidate operons, using <it>E. coli </it>data from the Stanford microarray database <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>. All the genes associated with those for bact-&#945;<sub>2</sub>Ms are present in the experiments included in the STRING database, and are expressed at levels significantly above background. However, none of the genes exhibits coordinated variation in expression levels either with each other or with any other genes in the <it>E. coli </it>genome under the conditions investigated.</p>
			</sec>
			<sec>
				<st>
					<p>Calculation of sequence trees</p>
				</st>
				<p>An initial rough tree calculated from an alignment of <it>yfhM </it>family sequences gave strong indications that several horizontal transfers had occurred among the available set. As <it>yfhM </it>is always found together with <it>pbpC</it>, the paired genes should have a shared phylogenetic history, so we also made a quick check of the PBP1C tree. The two trees, which provide controls for each other's topologies, were very similar, indicating that the apparent HGTs were unlikely to be artifacts. Therefore, we undertook a more careful phylogenetic analysis with a view to improving the phylogenetic signal-to-noise ratio and using a method that is less prone to rate-variation artifacts than neighbor-joining.</p>
				<p>Alignments were reviewed and edited by hand, then processed to remove especially noisy segments, as outlined in Materials and methods. Trees were calculated with MrBayes, a Bayesian resampling protocol that is now widely adopted <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>: MrBayes approaches the quality of maximum-likelihood methods while being quicker to calculate (though still computationally demanding). Results of the tree calculations are presented in Figure <figr fid="F3">3</figr>. The two trees differ by only three branch placements, indicating that the topologies are mostly sound, except for a few branches with low support (low posterior probabilities). As the calculated trees are unrooted, the ordering of the deepest branches cannot be mapped onto time.</p>
				<fig id="F3">
					<title>
						<p>Figure 3</p>
					</title>
					<caption>
						<p>Trees calculated from amino-acid sequence alignments</p>
					</caption>
					<text>
						<p>Trees calculated from amino-acid sequence alignments. <b>(a) </b>The YfhM group of bacterial &#945;<sub>2</sub>-macroglobulins; <b>(b) </b>the PBP1Cs that always co-occur and are usually found adjacent in the same operon. As shown by the key, branches are color-coded by taxon for easy visualization of phylogenetic inconsistencies. All branches have Bayesian posterior probabilities of 1.0 (that is, are completely stable during resampling) unless otherwise indicated. Three branches not shared between the trees are indicated by dotted lines: all other branches are congruent. The roots of the trees are not known, so the time vector of deep internal branches is not clear. See Materials and methods for details of the tree calculation.</p>
					</text>
					<graphic file="gb-2004-5-6-r38-3"/>
				</fig>
			</sec>
			<sec>
				<st>
					<p>Fitting the observed tree topologies to the vertical descent model</p>
				</st>
				<p>The number of ancestral genes required to explain an observed tree topology can be determined by embedding the sequence tree within a species tree. We prepared a species tree for the bacterial species in Figure <figr fid="F3">3</figr> such that currently uncertain affinities were assigned in favor of the observed trees: this provides a minimum estimate of ancestral gene number. The sequence tree topology was embedded into the bacterial species tree using GeneTree <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>. The reconciled tree required six gene-duplication events and 29 lineage-specific deletions. The last common ancestor (LCA) of the full set had a minimum of three genes, the LCA of the proteobacteria had four genes, while the LCA of the &#945;/&#946;-proteobacteria had six genes. Under strict vertical descent, the reconciled tree therefore implies a tendency toward increasing gene number over time.</p>
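				<p>The idea of embedding a sequence tree within a species tree can be illustrated with least-common-ancestor mapping, a standard way of inferring the duplications that a gene tree requires. The following Python sketch is an illustration under stated assumptions and is not GeneTree's code: both trees are assumed to be rooted and strictly binary, each gene is labelled by the species it comes from, the species names A, B and C are hypothetical, and only duplications are counted, not lineage-specific deletions.</p>
				<p><![CDATA[
```python
def index_species_tree(tree):
    """Record the parent and depth of every node in a species tree of nested 2-tuples."""
    parent, depth = {}, {}

    def walk(node, up, level):
        parent[node] = up
        depth[node] = level
        if isinstance(node, tuple):
            for child in node:
                walk(child, node, level + 1)

    walk(tree, None, 0)
    return parent, depth


def last_common_ancestor(a, b, parent, depth):
    """Climb from two species-tree nodes until they meet."""
    while depth[a] > depth[b]:
        a = parent[a]
    while depth[b] > depth[a]:
        b = parent[b]
    while a != b:
        a, b = parent[a], parent[b]
    return a


def count_duplications(gene_tree, species_tree):
    """Count gene-tree nodes that map to the same species-tree node as one of their children."""
    parent, depth = index_species_tree(species_tree)
    duplications = 0

    def place(node):
        nonlocal duplications
        if not isinstance(node, tuple):
            return node  # a leaf gene is labelled by its species
        left, right = (place(child) for child in node)
        here = last_common_ancestor(left, right, parent, depth)
        if here == left or here == right:
            duplications += 1  # the node cannot be explained by speciation alone
        return here

    place(gene_tree)
    return duplications


if __name__ == "__main__":
    species = (("A", "B"), "C")  # hypothetical three-species tree
    print(count_duplications((("A", "B"), "C"), species))  # gene tree agrees with species tree
    print(count_duplications((("A", "C"), "B"), species))  # gene tree conflicts with species tree
```
]]></p>
				<p>In this scheme a gene tree that conflicts with the species tree can be explained under vertical descent only by postulating ancestral duplications followed by losses, which is why strongly incongruent trees inflate the inferred number of ancestral genes.</p>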
				<p>The problems of the vertical descent model are manifold. First, all sequenced extant genomes have single copies of the <it>yfhM/pbpC </it>genes, yet the vertical descent model implies a progression toward increasing gene number over time. This requires late but fully independent massive gene loss to have occurred in all lineages. Second, the observed robust sequence tree topologies would require a clear affinity between cyanobacteria and spirochetes, an affinity that has hitherto gone entirely unnoticed in the field of bacterial phylogeny. Third, the number of events (gene duplications and deletions) required under a model of vertical descent is based on a species tree chosen to minimize this number (see Materials and methods). As the species tree used is unlikely to be accurate in places where bacterial phylogeny is unresolved, the number of such events required under a vertical descent model is probably greater than described (and the model, hence, correspondingly less likely).</p>
				<p>Although bizarre evolutionary scenarios can always be invoked, the given tree topologies are difficult to explain solely by vertical descent from a common ancestral eubacterium.</p>
			</sec>
			<sec>
				<st>
					<p>Horizontal transfers of the <it>yfhM </it>and <it>pbpC </it>gene couplet</p>
				</st>
				<p>Difficulties in accounting for the observed YfhM and PBP1C trees disappear if it is assumed that a number of horizontal gene transfers have occurred. Vertical transmission then only occurred among some sets of quite closely related bacteria. There are four deeply diverged sets within the tree, which will be discussed in turn.</p>
				<sec>
					<st>
						<p>The major proteobacterial grouping</p>
					</st>
					<p>Of the 22 proteobacterial species sampled, 18 are exclusively grouped together in the two trees. The species are all plant or animal pathogens and symbionts - even the anaerobic sulfate-reducing <it>Desulfovibrio desulfuricans </it>is a symbiont of deep-sea hydrothermal vent polychete worms <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>. Sub-branches compatible with vertical descent are present for five &#945;-proteobacteria including <it>Agrobacterium tumefaciens </it>and for seven &#947;-proteobacteria including <it>E. coli</it>. For bact-&#945;<sub>2</sub>M and PBP1C to have existed in proteobacteria before the &#945;/&#947; split, these gene sequences would have to be evolving more slowly than in other parts of the tree. It is more likely that the genes spread via HGT through these groups some time ago and then have been vertically inherited (at least in part). The remainder of the grouping consists of unambiguous HGT, although the direction of transfer is not always clear-cut. The &#946;-proteobacterium <it>Bordetella pertussis </it>has acquired the genes from a &#947;-proteobacterium. The &#948;-proteobacterium <it>D. desulfuricans </it>has acquired the genes from an &#945;-proteobacterium. An outlier set of &#945;- and &#947;-proteobacteria, including <it>Rickettsia conorii </it>and <it>Yersinia pestis</it>, indicate two further transfers, but in this case the order of the transfers is not determined. Therefore to create the topology of this grouping, a minimum of four unique horizontal transfers has occurred.</p>
				</sec>
				<sec>
					<st>
						<p>The bacteroidete/fusobacteria/&#949;-proteobacteria grouping</p>
					</st>
					<p>This group consists of three unrelated taxa which exploit niches related to the animal digestive system. The &#949;-proteobacterium <it>Helicobacter hepaticus </it>colonizes mouse liver ducts, <it>Fusobacterium </it>species colonize the teeth, <it>Bacteroides thetaiotaomicron </it>(not shown on the tree owing to an incomplete bact-&#945;<sub>2</sub>M sequence) is a major gut bacterium, while a second bacteroidete, <it>Cytophaga hutchinsonii</it>, exploits cellulose-rich animal waste. Horizontal transfer into the &#949;-proteobacterium <it>H. hepaticus </it>is clear-cut, as it is isolated on the trees from all other proteobacteria, whereas other <it>Helicobacter </it>species lack these genes. Another transfer has occurred between fusobacterial and bacteroidete lineages, but the direction is not clear. A third HGT is likely to have originally introduced the genes into these lineages but cannot be formally assigned without a root.</p>
				</sec>
				<sec>
					<st>
						<p>The isolated <it>Magnetospirillum </it>&#945;-proteobacteria branch</p>
					</st>
					<p><it>Magnetospirillum magnetotacticum </it>bact-&#945;<sub>2</sub>M and PBP1C are deeply diverged from all other species, including other &#945;-proteobacteria. This positioning away from its relatives indicates that HGT occurred into the <it>Magnetospirillum </it>lineage. The strong divergence from other sequences may indicate that the sequence has undergone rapid evolution. This latter point may be addressed in future if the branch becomes populated by some closer relatives.</p>
				</sec>
				<sec>
					<st>
						<p>The cyanobacteria/spirochete/&#946;-proteobacteria grouping</p>
					</st>
					<p>This branch consists of three unrelated taxa: cyanobacteria facultatively symbiotic with plants, spirochetes pathogenic to metazoans and a pair of closely related genera of &#946;-proteobacteria that each include free-living, symbiotic and pathogenic forms. The most deeply diverged members of the group are the <it>Anabaena</it>-like symbiotic cyanobacteria. The economically significant <it>Anabaena-Azolla </it>symbiosis provides the nitrogen fixation that fertilizes paddy fields <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. As other free-living cyanobacteria, such as <it>Synechococcus</it>, lack these genes, HGT into this lineage is very likely. The isolation of the <it>Ralstonia </it>and <it>Chromobacterium </it>clade from other proteobacteria also indicates HGT into their lineage. HGT for <it>Leptospira </it>(the causal agent of leptospirosis) is also indicated, as other spirochetes such as <it>Borrelia burgdorferi </it>(the causal agent of Lyme disease) and <it>Treponema pallidum </it>(the causal agent of syphilis) lack these genes. Thus, this set of genes, which is clearly grouped together by molecular phylogeny yet is found within very diverse taxa, appears to have been transmitted three times.</p>
				</sec>
			</sec>
		</sec>
		<sec>
			<st>
				<p>Discussion</p>
			</st>
			<sec>
				<st>
					<p>Sifting the evidence for bacterial HGT</p>
				</st>
				<p>There is increasing evidence that HGT has had - and continues to have - a major role in the adaptation of organisms, especially prokaryotes, to exploiting new environments. Nevertheless, it is often hard to demonstrate HGT, and there is considerable confusion about how to do so. The default hypothesis should remain vertical transmission unless there is good evidence for HGT. The over-hasty assignment of recent bacterial-to-vertebrate gene transfers, solely on the basis of BLAST E-values <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>, has been firmly refuted <abbrgrp><abbr bid="B32">32</abbr><abbr bid="B33">33</abbr></abbrgrp>. Such premature HGT assignments have been surveyed and used to provide guidelines for evaluating HGT <abbrgrp><abbr bid="B34">34</abbr><abbr bid="B35">35</abbr></abbrgrp>. Sometimes the evidence is clear-cut, as when adaptive genes are carried on phage, plasmid or transposon. Inconsistent phylogenetic distribution may be evidence for HGT but must be carefully balanced against gene-loss models, recognizing that the two processes are not mutually exclusive. Phylogenetic trees only provide good evidence for HGT when branching is robust and clearly delimited by appropriate outgroups: the HGT must carry a diagnostic molecular evolutionary signal.</p>
				<p>One of the best paradigms for investigating recent and ongoing HGT in parasitic prokaryotes is the &#947;-proteobacterium <it>Vibrio cholerae</it>, which acquired pathogenicity late in recorded history. Free-living <it>Vibrio </it>species are common, harmless aquatic microorganisms. The first recorded cholera pandemic occurred in 1817, the sixth and seventh occurred recently enough to be investigated with modern molecular techniques, and the eighth is probably underway now (see <abbrgrp><abbr bid="B36">36</abbr></abbrgrp> for details). The basic pathogenicity genes <it>ctxAB</it>, which encode cholera toxin, lie within the genome of the filamentous phage CTX&#966; <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>. Other pathogenicity gene 'islands' include the toxin-co-regulated pilus, needed for colonization, and the VSP-1 and VSP-2 islands, which appeared in strains of the seventh pandemic and are suggested to have been integral to that event <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>. The recent O139 serotype arose by wholesale replacement of the pre-existing gene cluster encoding lipopolysaccharide O side-chain synthesis, yielding an outer surface with a different architecture, less susceptible to pre-existing immunity <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>. Thus, pathogenic <it>V. cholerae </it>continues to adapt to the invasive lifestyle, to a large extent through HGT-mediated acquisition of new capabilities, including, but not limited to, better avoidance of host defenses. Although many of the functions encoded by the genes within pathogenic islands are not understood, their absence from the free-living <it>Vibrio </it>species is good evidence that they have been incorporated, and then conserved, because of a direct or indirect role in enhancing virulence. Even though it is a &#947;-proteobacterium, the genomic sequence data show that <it>V. cholerae </it>has not (re-)acquired a bact-&#945;<sub>2</sub>M gene. At least, not yet.</p>
			</sec>
			<sec>
				<st>
					<p>HGT of &#945;<sub>2</sub>-macroglobulin among colonizing bacteria</p>
				</st>
				<p>Our unexpected finding that &#945;<sub>2</sub>-macroglobulins, hitherto only known from metazoans, are widely present in eubacterial genomes has provided one of the most clear-cut examples of widespread HGT between extremely divergent bacterial taxa that can be monitored by molecular phylogenetic approaches. We have been able to infer a minimum of 11 independent HGTs for the major <it>yfhM </it>group among 27 sequences tested. Because this group always coexists with a second gene, <it>pbpC</it>, shared evolutionary history means the trees are controlled for topological consistency, so that the assignment of HGT is not in doubt. This work does not address an earlier evolutionary history preceding the link-up of this gene pair.</p>
				<p>It is striking that all four deeply diverged groups in the trees include proteobacterial species. This alone clearly indicates that HGT has occurred. Because this is the most heavily researched bacterial taxon and provides most of the sequenced genomes, it is not yet clear whether other taxa will also show multiple independent acquisitions of bact-&#945;<sub>2</sub>M and <it>pbpC</it>. Currently, the trees show a minimum of 11 independent HGT events, even if the originating (but unknown) taxon were represented here. A twelfth HGT is indicated if bact-&#945;<sub>2</sub>M was originally captured from a metazoan (or vice versa). Extensive gene loss is also likely to have contributed to the phylogenetic distributions in Figure <figr fid="F2">2</figr>, particularly among the &#945;-, &#946;- and &#947;-proteobacteria, where possession seems the default yet both vertical and horizontal transmission occur. Quite possibly, a cycle of gain-loss-gain has repeatedly occurred as strains adapt between colonization and free-living environments. The role of gene loss cannot be quantified with current data, but this may become possible in the future with more comprehensive genome coverage.</p>
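				<p>The minimum of 11 events can be cross-checked against the per-group minima given in the Results. The following Python sketch simply sums those minima. The group labels are illustrative names chosen here, and the count of three for the bacteroidete/fusobacteria/&#949;-proteobacteria grouping assumes that the likely third transfer, which first introduced the genes into those lineages, is included.</p>

```python
# Minimum HGT counts per deeply diverged grouping, as read from the Results text.
# The dictionary keys are illustrative labels, not identifiers used by the authors.
MIN_HGT_PER_GROUP = {
    "major proteobacterial grouping": 4,
    # Assumption: includes the likely third transfer that cannot be
    # formally assigned without a root.
    "bacteroidete/fusobacteria/epsilon-proteobacteria grouping": 3,
    "Magnetospirillum branch": 1,
    "cyanobacteria/spirochete/beta-proteobacteria grouping": 3,
}


def total_min_hgt(counts):
    """Sum the per-group minima to obtain the overall minimum."""
    return sum(counts.values())


TOTAL = total_min_hgt(MIN_HGT_PER_GROUP)
print(TOTAL)
```

				<p>The possible twelfth transfer, between a bacterium and a metazoan, is deliberately left out of this tally because its direction is undetermined.</p>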
				<p>Where pathogenic bacteria and their eukaryotic hosts share related genes that appear to be transferred from one to the other, it is believed that the direction is overwhelmingly from the eukaryote to the bacterium. The failure to find phylogenetic evidence for bacterium-to-vertebrate gene transfers is consistent with this direction <abbrgrp><abbr bid="B32">32</abbr><abbr bid="B33">33</abbr></abbrgrp>. We expect that bact-&#945;<sub>2</sub>M was transferred from a metazoan host to a pathogenic bacterium, but this is not yet demonstrable and remains supposition. Given a simple early metazoan, where the germ cells would not be physically isolated from any bacterial infection, one can see how selection could act to fix a bact-&#945;<sub>2</sub>M gene transferred in the opposite direction, if bact-&#945;<sub>2</sub>M was originally bacterial. This issue may become resolvable in future given much more extensive phylogenetic coverage.</p>
			</sec>
			<sec>
				<st>
					<p>Bacterial &#945;<sub>2</sub>-macroglobulin in apparently free-living bacteria</p>
				</st>
				<p>Many bacterial taxa contain a plethora of strains adapted for free-living, symbiotic and pathogenic lifestyles. Examples include the <it>Ralstonia </it>and <it>Anabaena </it>genera adapted to plants, <it>Escherichia </it>and <it>Treponema </it>adapted to animals and pseudomonads adapted to both. Many free-living bacterial strains are also facultative colonizers. This creates some difficulty in cataloguing genes that are adapted to colonizing niches versus free-living: it is rarely certain that an apparently free-living species never colonizes a higher organism, or that it is not part of a continuum of strains frequently exchanging lifestyle genes. Given this caveat, we reviewed all the currently completed genomes of bacteria that are not in any way known to have close associations with higher eukaryotes. The available set of Gram-positive bacterial genomes stands out as never possessing a bact-&#945;<sub>2</sub>M gene (see below). Only three apparently free-living Gram-negatives (<it>Magnetospirillum</it>, <it>Caulobacter </it>and <it>Thermotoga</it>) have bact-&#945;<sub>2</sub>Ms while seven (<it>Aquifex</it>, <it>Chlorobium</it>, <it>Synechocystis</it>, <it>Synechococcus</it>, <it>Prochlorococcus</it>, <it>Nitrosomonas </it>and <it>Geobacter</it>) do not. Thus this crude estimate would suggest that possession of a bact-&#945;<sub>2</sub>M gene is associated with colonization, not as a core colonization factor, but as an accessory that enhances fitness for the colonization environment. Further, it may imply that the three 'free-living' species possessing a bact-&#945;<sub>2</sub>M gene have undocumented facultative symbiotic capabilities with higher eukaryotes.</p>
			</sec>
			<sec>
				<st>
					<p>Usage of host &#945;<sub>2</sub>-macroglobulin by invasive Gram-positive bacteria</p>
				</st>
				<p>The Gram-positive firmicutes and actinobacteria stand out as always lacking bact-&#945;<sub>2</sub>M genes (Figure <figr fid="F2">2</figr>). However, certain Gram-positives have found a more direct way to take advantage of &#945;<sub>2</sub>M proteins. Pathogenic <it>Streptococcus pyogenes </it>directly co-opts host &#945;<sub>2</sub>M for defense against host proteases through the cell-surface proteins GRAB and protein G <abbrgrp><abbr bid="B40">40</abbr><abbr bid="B41">41</abbr></abbrgrp>. As Gram-positive bacteria do not possess an outer membrane, defensive strategies are likely to differ from those of Gram-negatives. Invasive Gram-positives are found to coat themselves in a selected set of host proteins to obstruct host defenses. Streptococcal GRAB mutants that are unable to bind &#945;<sub>2</sub>M have attenuated virulence <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>. It seems remarkable that prokaryotes have evolved two totally independent strategies to take advantage of &#945;<sub>2</sub>M. On the one hand, Gram-positives are able to use the host's own protein; on the other, Gram-negatives have acquired their own gene. The clear implication is that &#945;<sub>2</sub>M functionality has a wide and general significance spanning many bacterial taxa.</p>
			</sec>
			<sec>
				<st>
					<p>Bacterial &#945;<sub>2</sub>-macroglobulin YfhM/PBP1C: a second line of defense?</p>
				</st>
				<p>The lipopolysaccharide (LPS) layer of the outer membrane of Gram-negative bacteria provides a first line of defense. The outer membrane barrier is sufficient to prevent the enzyme lysozyme from lysing Gram-negative bacteria in culture <abbrgrp><abbr bid="B42">42</abbr></abbrgrp>. Under attack from host immunity and antimicrobial peptides <abbrgrp><abbr bid="B43">43</abbr></abbrgrp>, LPS can be disrupted or stripped away - for example, when released into the circulation, it can lead to septic shock <abbrgrp><abbr bid="B44">44</abbr></abbrgrp> - leaving the peptidoglycan cell wall and inner membrane exposed. There is current interest in antibacterial strategies that endeavor to enhance lysozyme activity by co-administration with agents that disrupt the outer membrane, such as EDTA <abbrgrp><abbr bid="B42">42</abbr></abbrgrp>.</p>
				<p>The following assumptions lead us to a hypothesis for YfhM bact-&#945;<sub>2</sub>M/PBP1C as a periplasmic defense system. First, bact-&#945;<sub>2</sub>M and PBP1C form a complex, probably through the carboxy-terminal non-enzymatic domain of PBP1C. Second, the complex resides in the periplasmic space, attached by acylation to the inner membrane. Third, bact-&#945;<sub>2</sub>M functions to entrap attacking proteases. Fourth, PBP1C is a transglycosylase that polymerizes glycan chains. Fifth, a periplasmic defense is only needed when the outer membrane has been breached and peptidoglycan is under attack.</p>
				<p>The role of the bact-&#945;<sub>2</sub>M/PBP1C system is then perceived to be defense at, and repair of, peptidoglycan breaches induced by the host (Figure <figr fid="F4">4</figr>). PBP1C provides 75% of the transglycosylase activity <it>in vitro</it>, but only 3% of peptidoglycan biosynthesis <it>in vivo </it><abbrgrp><abbr bid="B25">25</abbr></abbrgrp>: it is a fast linear transglycosylase, ideal for traversing and repairing a breach. During repair it will, however, be exposed to attacking proteases and may be rapidly rendered dysfunctional. The role of bact-&#945;<sub>2</sub>M will be to entrap attacking proteases, protecting PBP1C and other periplasmic proteins such as the high-affinity lysozyme inhibitor Ivy in <it>E. coli </it><abbrgrp><abbr bid="B45">45</abbr></abbrgrp>. In this way, the fate of the invading bacterial cell will depend on the relative balance of the host's attacking forces versus the bacterial defense systems. Under an optimized host attack, such defenses would be rapidly overwhelmed but when (or where) the host is not well prepared, these defenses may serve to prolong colonization.</p>
				<fig id="F4">
					<title>
						<p>Figure 4</p>
					</title>
					<caption>
						<p>Schematic outline of the proposed defense of breaches of the bacterial outer membrane</p>
					</caption>
					<text>
						<p>Schematic outline of the proposed defense of breaches of the bacterial outer membrane. Host systems (whether antimicrobial peptides, antibody and/or complement) have opened the outer membrane, allowing lysozyme and host proteases to attack periplasmic components, leading to a further breach of the peptidoglycan. Host attack is hampered by protease trapping (bacterial &#945;<sub>2</sub>-macroglobulin) and lysozyme inhibition (Ivy), giving PBP1C a chance to repair the glycan chains. The fate of the colonizing bacterial cell will now depend on whether the bacterial defenses are exhausted or the host attacking components are too limited to achieve cell lysis. Elements of the scheme are not drawn to scale.</p>
					</text>
					<graphic file="gb-2004-5-6-r38-4"/>
				</fig>
			</sec>
			<sec>
				<st>
					<p>Potential experimental and medical applications</p>
				</st>
				<p>The <it>yfhM/pbpC </it>gene pair in bacteria not only suggests experimental research strategies, but may have medical potential to help combat pathogenic organisms. The predicted periplasmic location of bact-&#945;<sub>2</sub>M and PBP1C, and their complexing with each other (and with any other periplasmic proteins), should be straightforward to investigate biochemically. Elucidation of the host proteases entrapped by bact-&#945;<sub>2</sub>Ms should reveal which host defense proteases are targeted at which parasites, leading to enhanced understanding of host defense mechanisms. Bact-&#945;<sub>2</sub>M-inhibited proteases should be directly active against pathogen proteins - or else act indirectly as, for example, do the proteases of the complement cascade. Deletion mutants of <it>pbpC </it>should show increased sensitivity to lysozyme treatment, and <it>pbpC/ivy </it>double mutants even more so.</p>
				<p>The bact-&#945;<sub>2</sub>M/PBP1C proteins also provide targets for medical intervention, for example by training host immunity, by administration of anti-bact-&#945;<sub>2</sub>M monoclonal antibody or in combination therapies. Antibodies to bact-&#945;<sub>2</sub>Ms should act not just by promoting immune clearance but also by blocking bact-&#945;<sub>2</sub>M activity, so that the host antibacterial proteases are unhindered. This dual effect may provide an enhanced prophylactic efficacy for vaccines that are augmented with extra bact-&#945;<sub>2</sub>M protein (probably as an inactive variant) or be directly invoked by targeted anti-bact-&#945;<sub>2</sub>M antibody administration for combating acute infection. PBP1C should also be rendered dysfunctional by specific antibodies, perhaps in combination with transglycosylase inhibitors such as the antibiotic moenomycin.</p>
			</sec>
		</sec>
		<sec>
			<st>
				<p>Conclusions</p>
			</st>
			<p>Bact-&#945;<sub>2</sub>Ms are spread widely amongst symbiotic and pathogenic bacteria. The implication is that protease inhibition is often an aid to colonizing higher eukaryotes. The major form of bact-&#945;<sub>2</sub>Ms is typified by <it>E. coli </it>YfhM and is a periplasmic protein that co-occurs with periplasmic PBP1C, a candidate peptidoglycan repair enzyme. The distribution of the <it>yfhM</it>/<it>pbpC </it>gene pair is inconsistent with the established bacterial phylogeny. Molecular trees calculated for each of the proteins are in good agreement with each other. Each tree provides a control for the other tree's topology, allowing confidence in the general topology. This allows us to state with high confidence that at least 11 separate gene transfers have occurred between highly diverged bacterial taxa. An additional gene transfer has occurred between bacteria and metazoans. We are not yet able to determine in which direction this transfer occurred, and therefore the title question is not yet answerable.</p>
			<p>The known properties of &#945;<sub>2</sub>Ms and PBP1C point to a periplasmic line of defense at cell-wall breaches, mounted by the YfhM bact-&#945;<sub>2</sub>M and PBP1C. This defensive line should be sensitive to antibody-based therapeutic approaches, whether enhanced vaccine efficacy or direct administration of antibody.</p>
		</sec>
		<sec>
			<st>
				<p>Materials and methods</p>
			</st>
			<sec>
				<st>
					<p>Sequence database searches</p>
				</st>
				<p>Bacterial &#945;<sub>2</sub>Ms were clearly revealed in a search of SWISSALL <abbrgrp><abbr bid="B46">46</abbr></abbrgrp> using BLAST2SRS <abbrgrp><abbr bid="B47">47</abbr></abbrgrp> in which the species names are included in the BLAST output <abbrgrp><abbr bid="B48">48</abbr></abbrgrp>. Profile searches as described <abbrgrp><abbr bid="B49">49</abbr></abbrgrp> using the EMBL Bioccelerators <abbrgrp><abbr bid="B50">50</abbr></abbrgrp> supported and extended the findings and were used to retrieve a set of bacterial sequences. Reciprocal searches with bact-&#945;<sub>2</sub>M profiles reconfirmed the findings with good E-values (&lt;1e-25). The sets of proteomes provided by the BLAST server <abbrgrp><abbr bid="B51">51</abbr><abbr bid="B52">52</abbr></abbrgrp> at the National Center for Biotechnology Information (NCBI) <abbrgrp><abbr bid="B53">53</abbr></abbrgrp> were surveyed to determine the presence or absence of &#945;<sub>2</sub>Ms in bacteria and in non-metazoan eukaryotes.</p>
			</sec>
			<sec>
				<st>
					<p>Survey of genomic context</p>
				</st>
				<p>The STRING server <abbrgrp><abbr bid="B54">54</abbr></abbrgrp> is a resource for exploring genome context (for example, identifying groups of genes found in close proximity in many different genomes <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>). Queries with bact-&#945;<sub>2</sub>Ms from <it>E. coli </it>or other bacteria yielded a recurring result: in most species the bact-&#945;<sub>2</sub>Ms cluster consistently with certain other gene families. This behavior is typical of gene sets belonging to the same operon. These families were retrieved and used for further database explorations, alignments and trees. To identify the location of these gene families in other genomes where linkage to bact-&#945;<sub>2</sub>Ms is less direct than those presented by STRING, we downloaded genomic database entries from the NCBI, converted the format of these files to EMBL using BioPerl <abbrgrp><abbr bid="B55">55</abbr></abbrgrp>, and assessed the location of the genes using Artemis <abbrgrp><abbr bid="B56">56</abbr></abbrgrp>. In addition, linkage of these gene families was investigated in organisms not included in STRING using the same method.</p>
			</sec>
			<sec>
				<st>
					<p>Sequence alignment and editing</p>
				</st>
				<p>Sequences were aligned using Clustal X 1.83 <abbrgrp><abbr bid="B57">57</abbr></abbrgrp>. Because many sequences are very dissimilar to each other, misaligned regions were to be expected. These were identified using the 'low scoring segments' check and either realigned using the 'realign selected range' option or hand-edited in SeaView <abbrgrp><abbr bid="B58">58</abbr></abbrgrp>. Corrections were assessed both by improvements to conserved hydrophobic columns (indicating structurally important residues) and by the 'low scoring segments' check. Sequences excluded because they were either too divergent to be aligned or may contain sequencing errors included <it>Deinococcus radiodurans </it>and <it>Bacteroides thetaiotaomicron</it>.</p>
			</sec>
			<sec>
				<st>
					<p>Calculation of sequence trees</p>
				</st>
				<p>Preliminary trees were made by neighbor-joining <abbrgrp><abbr bid="B59">59</abbr></abbrgrp> as implemented in Clustal X, excluding gaps and correcting for multiple substitutions with the Kimura PAM model. These initial trees indicated that HGT had occurred, warranting more careful assessment. Alignments were processed with the Gblocks server <abbrgrp><abbr bid="B60">60</abbr></abbrgrp> (for the divergent bact-&#945;<sub>2</sub>Ms, the low stringency settings were used). Gblocks heuristically removes poorly conserved, excessively divergent segments of alignments with low signal-to-noise ratio in order to enhance the phylogenetic signal <abbrgrp><abbr bid="B61">61</abbr></abbrgrp>. Processed alignments were used to derive tree topologies using Bayesian inference of phylogeny as implemented by MrBayes v2.01 <abbrgrp><abbr bid="B27">27</abbr></abbrgrp> with maximum-likelihood branch-length estimates provided by PUZZLE <abbrgrp><abbr bid="B62">62</abbr></abbrgrp>. MrBayes was used with four heated chains over 250,000 generations, sampling a tree every 20 generations. The likelihoods of these trees were examined to estimate the length of the burn-in phase, and all trees sampled 20,000 generations later than this point were used to create a consensus tree using the 50% majority rule. Both MrBayes and PUZZLE were used with the JTT model of amino-acid substitution <abbrgrp><abbr bid="B63">63</abbr></abbrgrp>, assuming the presence of invariant sites and using a gamma distribution approximated by four different rate categories to model rate variation between sites, estimating amino-acid frequencies from the alignment. Trees were displayed and rooted in Njplot <abbrgrp><abbr bid="B64">64</abbr></abbrgrp>.</p>
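				<p>As an illustration of the sampling arithmetic and of the 50% majority rule, the following Python sketch counts the trees retained after burn-in and extracts majority clades from a toy sample. It is not the MrBayes implementation: the sampling interval is assumed to be one tree per 20 generations, the burn-in end point of 30,000 generations is a hypothetical value, and the function names and taxon labels are invented for the example.</p>

```python
from collections import Counter

GENERATIONS = 250_000   # chain length stated in the text
SAMPLE_EVERY = 20       # assumption: one tree saved every 20 generations
SAFETY_MARGIN = 20_000  # generations skipped after the estimated burn-in


def retained_sample_count(burn_in_end):
    """Count saved trees lying more than SAFETY_MARGIN generations past
    the generation at which burn-in was judged to end."""
    first_kept = burn_in_end + SAFETY_MARGIN
    if first_kept >= GENERATIONS:
        return 0
    # Trees are saved at generations SAMPLE_EVERY, 2*SAMPLE_EVERY, ...
    return GENERATIONS // SAMPLE_EVERY - first_kept // SAMPLE_EVERY


def majority_rule_clades(trees, threshold=0.5):
    """Return {clade: frequency} for clades found in more than
    `threshold` of the trees.

    Each tree is a set of clades; each clade is a frozenset of taxon
    names. Parsing real MrBayes output is outside this sketch.
    """
    counts = Counter(clade for tree in trees for clade in tree)
    n = len(trees)
    return {clade: c / n for clade, c in counts.items() if c / n > threshold}


# Toy posterior sample of four trees over hypothetical taxa A, B and C.
toy_trees = [
    {frozenset({"A", "B"}), frozenset({"A", "B", "C"})},
    {frozenset({"A", "B"}), frozenset({"A", "B", "C"})},
    {frozenset({"A", "C"}), frozenset({"A", "B", "C"})},
    {frozenset({"A", "B"}), frozenset({"A", "B", "C"})},
]
consensus = majority_rule_clades(toy_trees)

total_saved = GENERATIONS // SAMPLE_EVERY
kept = retained_sample_count(30_000)  # hypothetical burn-in end point
print(total_saved, kept, len(consensus))
```

				<p>In the toy sample the clade grouping A with C appears in only one of four trees and is therefore dropped, whereas the clade grouping A with B is kept with a frequency of 0.75.</p>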
			</sec>
			<sec>
				<st>
					<p>Estimation of minimum <it>yfhM </it>gene number in the bacterial last common ancestor</p>
				</st>
				<p>The program GeneTree <abbrgrp><abbr bid="B28">28</abbr></abbrgrp> was used to evaluate the cost of embedding the YfhM sequence tree in a bacterial species tree. To compute the minimum gene number required in the last common ancestor of the given bacterial set, we set the unresolved bacterial affinities to match the YfhM/PBP1C trees (that is, cyanobacteria and spirochetes form a clade, as do bacteroidetes and fusobacteria; within the proteobacteria, the subgroup affinities were allocated to minimize the number of duplications required in the observed trees). <it>Magnetospirillum </it>was excluded from the analysis as its position is not stable in the YfhM and PBP1C trees. Embedding the observed tree topology in this bacterial species tree yielded a reconciled tree requiring six duplication and 29 deletion events.</p>
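				<p>The duplication count in such a reconciliation can be illustrated with the standard least-common-ancestor mapping: each node of the gene tree is mapped to the smallest species-tree clade containing all of its species, and a node is scored as a duplication when it maps to the same clade as one of its children. The Python sketch below is a toy under stated assumptions, not GeneTree's code: the three species labels and the function names are hypothetical, trees are assumed to be binary nested tuples, and deletions are not counted.</p>

```python
def clusters(tree):
    """Return (leaf_set, all_clusters) for a binary nested-tuple tree
    whose leaves are species names given as strings."""
    if isinstance(tree, str):
        leaf = frozenset([tree])
        return leaf, [leaf]
    left_set, left_all = clusters(tree[0])
    right_set, right_all = clusters(tree[1])
    here = left_set | right_set
    return here, left_all + right_all + [here]


def lca_cluster(species_clusters, species):
    """Smallest species-tree cluster that contains every given species."""
    return min((c for c in species_clusters if species.issubset(c)), key=len)


def count_duplications(gene_tree, species_clusters):
    """Return (species_below, mapping, duplications) for a gene tree.

    Gene-tree leaves are labelled with the species they were sampled
    from. A node counts as a duplication when its mapping equals the
    mapping of either child.
    """
    if isinstance(gene_tree, str):
        leaf = frozenset([gene_tree])
        return leaf, lca_cluster(species_clusters, leaf), 0
    l_set, l_map, l_dup = count_duplications(gene_tree[0], species_clusters)
    r_set, r_map, r_dup = count_duplications(gene_tree[1], species_clusters)
    here = l_set | r_set
    here_map = lca_cluster(species_clusters, here)
    is_dup = here_map == l_map or here_map == r_map
    return here, here_map, l_dup + r_dup + int(is_dup)


# Hypothetical species tree ((A,B),C).
_, SPECIES_CLUSTERS = clusters((("A", "B"), "C"))

congruent = count_duplications((("A", "B"), "C"), SPECIES_CLUSTERS)[2]
conflicting = count_duplications((("A", "C"), "B"), SPECIES_CLUSTERS)[2]
print(congruent, conflicting)
```

				<p>A gene tree that matches the species tree needs no duplications, whereas the conflicting toy topology needs one. This is the sense in which a sequence tree that disagrees with the species tree inflates the number of ancestral genes required under vertical descent.</p>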
			</sec>
			<sec>
				<st>
					<p>Microarray expression data</p>
				</st>
				<p>STRING was used to investigate the expression patterns of genes as detected by DNA microarray. The Stanford Microarray Database (SMD) <abbrgrp><abbr bid="B26">26</abbr><abbr bid="B65">65</abbr></abbrgrp> was used to verify that these genes were indeed spotted on the arrays used by STRING, and that the spots displayed intensities significantly higher than background levels.</p>
			</sec>
		</sec>
	</bdy>
	<bm>
		<ack>
			<sec>
				<st>
					<p>Acknowledgements</p>
				</st>
				<p>We thank Christian von Mering and Lars Jensen for helpful discussions. We are grateful to Fotis Kafatos for his consistent support of the <it>Anopheles </it>TEP studies.</p>
			</sec>
		</ack>
		<refgrp>
			<bibl id="B1">
				<title>
					<p>alpha 2-macroglobulin, complement, and biologic defense: antigens, growth factors, microbial proteases, and receptor ligation.</p>
				</title>
				<aug>
					<au>
						<snm>Chu</snm>
						<fnm>CT</fnm>
					</au>
					<au>
						<snm>Pizzo</snm>
						<fnm>SV</fnm>
					</au>
				</aug>
				<source>Lab Invest</source>
				<pubdate>1994</pubdate>
				<volume>71</volume>
				<fpage>792</fpage>
				<lpage>812</lpage>
				<xrefbib>
					<pubid idtype="pmpid">7528831</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B2">
				<title>
					<p>Further characterization of the covalent linking reaction of alpha 2-macroglobulin.</p>
				</title>
				<aug>
					<au>
						<snm>Salvesen</snm>
						<fnm>GS</fnm>
					</au>
					<au>
						<snm>Sayers</snm>
						<fnm>CA</fnm>
					</au>
					<au>
						<snm>Barrett</snm>
						<fnm>AJ</fnm>
					</au>
				</aug>
				<source>Biochem J</source>
				<pubdate>1981</pubdate>
				<volume>195</volume>
				<fpage>453</fpage>
				<lpage>461</lpage>
				<xrefbib>
					<pubid idtype="pmpid">6172116</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B3">
				<title>
					<p>The internal thioester and the covalent binding properties of the complement proteins C3 and C4.</p>
				</title>
				<aug>
					<au>
						<snm>Law</snm>
						<fnm>SK</fnm>
					</au>
					<au>
						<snm>Dodds</snm>
						<fnm>AW</fnm>
					</au>
				</aug>
				<source>Protein Sci</source>
				<pubdate>1997</pubdate>
				<volume>6</volume>
				<fpage>263</fpage>
				<lpage>274</lpage>
				<xrefbib>
					<pubid idtype="pmpid">9041627</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B4">
				<title>
					<p>Alpha-macroglobulins: structure, shape, and mechanism of proteinase complex formation.</p>
				</title>
				<aug>
					<au>
						<snm>Sottrup-Jensen</snm>
						<fnm>L</fnm>
					</au>
				</aug>
				<source>J Biol Chem</source>
				<pubdate>1989</pubdate>
				<volume>264</volume>
				<fpage>11539</fpage>
				<lpage>11542</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">2473064</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B5">
				<title>
					<p>The contribution of proteinase inhibitors to immune defense.</p>
				</title>
				<aug>
					<au>
						<snm>Armstrong</snm>
						<fnm>PB</fnm>
					</au>
				</aug>
				<source>Trends Immunol</source>
				<pubdate>2001</pubdate>
				<volume>22</volume>
				<fpage>47</fpage>
				<lpage>52</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S1471-4906(00)01803-2</pubid>
						<pubid idtype="pmpid" link="fulltext">11286692</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B6">
				<title>
					<p>The C-terminal domain promotes the hemorrhagic damage caused by <it>Vibrio vulnificus </it>metalloprotease.</p>
				</title>
				<aug>
					<au>
						<snm>Miyoshi</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Kawata</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Tomochika</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Shinoda</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Yamamoto</snm>
						<fnm>S</fnm>
					</au>
				</aug>
				<source>Toxicon</source>
				<pubdate>2001</pubdate>
				<volume>39</volume>
				<fpage>1883</fpage>
				<lpage>1886</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0041-0101(01)00171-4</pubid>
						<pubid idtype="pmpid" link="fulltext">11600151</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B7">
				<title>
					<p><it>Cryptobia </it>(<it>Trypanoplasma</it>) <it>salmositica </it>and salmonid cryptobiosis.</p>
				</title>
				<aug>
					<au>
						<snm>Woo</snm>
						<fnm>PT</fnm>
					</au>
				</aug>
				<source>J Fish Dis</source>
				<pubdate>2003</pubdate>
				<volume>26</volume>
				<fpage>627</fpage>
				<lpage>646</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1046/j.1365-2761.2003.00500.x</pubid>
						<pubid idtype="pmpid">14710756</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B8">
				<title>
					<p>Interactions between cytokines and alpha 2-macroglobulin.</p>
				</title>
				<aug>
					<au>
						<snm>Chu</snm>
						<fnm>CT</fnm>
					</au>
					<au>
						<snm>Pizzo</snm>
						<fnm>SV</fnm>
					</au>
				</aug>
				<source>Immunol Today</source>
				<pubdate>1991</pubdate>
				<volume>12</volume>
				<fpage>249</fpage>
				<xrefbib>
					<pubid idtype="pmpid">1716108</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B9">
				<title>
					<p>Alpha 2-macroglobulin functions as a cytokine carrier to induce nitric oxide synthesis and cause nitric oxide-dependent cytotoxicity in the RAW 264.7 macrophage cell line.</p>
				</title>
				<aug>
					<au>
						<snm>Lysiak</snm>
						<fnm>JJ</fnm>
					</au>
					<au>
						<snm>Hussaini</snm>
						<fnm>IM</fnm>
					</au>
					<au>
						<snm>Webb</snm>
						<fnm>DJ</fnm>
					</au>
					<au>
						<snm>Glass</snm>
						<fnm>WF</fnm>
						<suf>2nd</suf>
					</au>
					<au>
						<snm>Allietta</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Gonias</snm>
						<fnm>SL</fnm>
					</au>
				</aug>
				<source>J Biol Chem</source>
				<pubdate>1995</pubdate>
				<volume>270</volume>
				<fpage>21919</fpage>
				<lpage>21927</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1074/jbc.270.37.21919</pubid>
						<pubid idtype="pmpid" link="fulltext">7545171</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B10">
				<title>
					<p>The NTR module: domains of netrins, secreted frizzled related proteins, and type I procollagen C-proteinase enhancer protein are homologous with tissue inhibitors of metalloproteases.</p>
				</title>
				<aug>
					<au>
						<snm>Banyai</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Patthy</snm>
						<fnm>L</fnm>
					</au>
				</aug>
				<source>Protein Sci</source>
				<pubdate>1999</pubdate>
				<volume>8</volume>
				<fpage>1636</fpage>
				<lpage>1642</lpage>
				<xrefbib>
					<pubid idtype="pmpid">10452607</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B11">
				<title>
					<p>Alpha2-macroglobulin: an evolutionarily conserved arm of the innate immune system.</p>
				</title>
				<aug>
					<au>
						<snm>Armstrong</snm>
						<fnm>PB</fnm>
					</au>
					<au>
						<snm>Quigley</snm>
						<fnm>JP</fnm>
					</au>
				</aug>
				<source>Dev Comp Immunol</source>
				<pubdate>1999</pubdate>
				<volume>23</volume>
				<fpage>375</fpage>
				<lpage>390</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0145-305X(99)00018-X</pubid>
						<pubid idtype="pmpid">10426429</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B12">
				<title>
					<p>Localisation of the major reactive lysine residue involved in the self-crosslinking of proteinase-activated <it>Limulus </it>alpha 2-macroglobulin.</p>
				</title>
				<aug>
					<au>
						<snm>Dolmer</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Husted</snm>
						<fnm>LB</fnm>
					</au>
					<au>
						<snm>Armstrong</snm>
						<fnm>PB</fnm>
					</au>
					<au>
						<snm>Sottrup-Jensen</snm>
						<fnm>L</fnm>
					</au>
				</aug>
				<source>FEBS Lett</source>
				<pubdate>1996</pubdate>
				<volume>393</volume>
				<fpage>37</fpage>
				<lpage>40</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/0014-5793(96)00852-6</pubid>
						<pubid idtype="pmpid" link="fulltext">8804419</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B13">
				<title>
					<p>Immunity-related genes and gene families in <it>Anopheles gambiae</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Christophides</snm>
						<fnm>GK</fnm>
					</au>
					<au>
						<snm>Zdobnov</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Barillas-Mury</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Birney</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Blandin</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Blass</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Brey</snm>
						<fnm>PT</fnm>
					</au>
					<au>
						<snm>Collins</snm>
						<fnm>FH</fnm>
					</au>
					<au>
						<snm>Danielli</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Dimopoulos</snm>
						<fnm>G</fnm>
					</au>
					<etal/>
				</aug>
				<source>Science</source>
				<pubdate>2002</pubdate>
				<volume>298</volume>
				<fpage>159</fpage>
				<lpage>165</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.1077136</pubid>
						<pubid idtype="pmpid" link="fulltext">12364793</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B14">
				<title>
					<p>Conserved role of a complement-like protein in phagocytosis revealed by dsRNA knockout in cultured cells of the mosquito, <it>Anopheles gambiae</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Levashina</snm>
						<fnm>EA</fnm>
					</au>
					<au>
						<snm>Moita</snm>
						<fnm>LF</fnm>
					</au>
					<au>
						<snm>Blandin</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Vriend</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Lagueux</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Kafatos</snm>
						<fnm>FC</fnm>
					</au>
				</aug>
				<source>Cell</source>
				<pubdate>2001</pubdate>
				<volume>104</volume>
				<fpage>709</fpage>
				<lpage>718</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">11257225</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B15">
				<title>
					<p>Complement-like protein TEP1 is a determinant of vectorial capacity in the malaria vector <it>Anopheles gambiae</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Blandin</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Shiao</snm>
						<fnm>S-H</fnm>
					</au>
					<au>
						<snm>Moita</snm>
						<fnm>LF</fnm>
					</au>
					<au>
						<snm>Janse</snm>
						<fnm>CJ</fnm>
					</au>
					<au>
						<snm>Waters</snm>
						<fnm>AP</fnm>
					</au>
					<au>
						<snm>Kafatos</snm>
						<fnm>FC</fnm>
					</au>
					<au>
						<snm>Levashina</snm>
						<fnm>EA</fnm>
					</au>
				</aug>
				<source>Cell</source>
				<pubdate>2004</pubdate>
				<volume>116</volume>
				<fpage>661</fpage>
				<lpage>670</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0092-8674(04)00173-4</pubid>
						<pubid idtype="pmpid" link="fulltext">15006349</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B16">
				<title>
					<p>Thioester-containing proteins and insect immunity.</p>
				</title>
				<aug>
					<au>
						<snm>Blandin</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Levashina</snm>
						<fnm>EA</fnm>
					</au>
				</aug>
				<source>Mol Immunol</source>
				<pubdate>2004</pubdate>
				<volume>40</volume>
				<fpage>903</fpage>
				<lpage>908</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/j.molimm.2003.10.010</pubid>
						<pubid idtype="pmpid" link="fulltext">14698229</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B17">
				<title>
					<p>Aminoacylation of the N-terminal cysteine is essential for Lol-dependent release of lipoproteins from membranes but does not depend on lipoprotein sorting signals.</p>
				</title>
				<aug>
					<au>
						<snm>Fukuda</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Matsuyama</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Hara</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Nakayama</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Nagasawa</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Tokuda</snm>
						<fnm>H</fnm>
					</au>
				</aug>
				<source>J Biol Chem</source>
				<pubdate>2002</pubdate>
				<volume>277</volume>
				<fpage>43512</fpage>
				<lpage>43518</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1074/jbc.M206816200</pubid>
						<pubid idtype="pmpid" link="fulltext">12198129</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B18">
				<title>
					<p>A single amino acid determinant of the membrane localization of lipoproteins in <it>E. coli</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Yamaguchi</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Yu</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Inouye</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>Cell</source>
				<pubdate>1988</pubdate>
				<volume>53</volume>
				<fpage>423</fpage>
				<lpage>432</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/0092-8674(88)90162-6</pubid>
						<pubid idtype="pmpid">3284654</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B19">
				<title>
					<p>Ovostatin: a novel proteinase inhibitor from chicken egg white. II. Mechanism of inhibition studied with collagenase and thermolysin.</p>
				</title>
				<aug>
					<au>
						<snm>Nagase</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Harris</snm>
						<fnm>ED</fnm>
						<suf>Jr</suf>
					</au>
				</aug>
				<source>J Biol Chem</source>
				<pubdate>1983</pubdate>
				<volume>258</volume>
				<fpage>7490</fpage>
				<lpage>7498</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">6305943</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B20">
				<title>
					<p>The complete genome sequence of the carcinogenic bacterium <it>Helicobacter hepaticus</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Suerbaum</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Josenhans</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Sterzenbach</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Drescher</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Brandt</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Bell</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Droge</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Fartmann</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Fischer</snm>
						<fnm>HP</fnm>
					</au>
					<au>
						<snm>Ge</snm>
						<fnm>Z</fnm>
					</au>
					<etal/>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>2003</pubdate>
				<volume>100</volume>
				<fpage>7901</fpage>
				<lpage>7906</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1073/pnas.1332093100</pubid>
						<pubid idtype="pmpid" link="fulltext">12810954</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B21">
				<title>
					<p>STRING: a database of predicted functional associations between proteins.</p>
				</title>
				<aug>
					<au>
						<snm>von Mering</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Huynen</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Jaeggi</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Schmidt</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Bork</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Snel</snm>
						<fnm>B</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2003</pubdate>
				<volume>31</volume>
				<fpage>258</fpage>
				<lpage>261</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/nar/gkg034</pubid>
						<pubid idtype="pmpid" link="fulltext">12519996</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B22">
				<title>
					<p>The use of gene clusters to infer functional coupling.</p>
				</title>
				<aug>
					<au>
						<snm>Overbeek</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Fonstein</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>D'Souza</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Pusch</snm>
						<fnm>GD</fnm>
					</au>
					<au>
						<snm>Maltsev</snm>
						<fnm>N</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>1999</pubdate>
				<volume>96</volume>
				<fpage>2896</fpage>
				<lpage>2901</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1073/pnas.96.6.2896</pubid>
						<pubid idtype="pmpid" link="fulltext">10077608</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B23">
				<title>
					<p>Conservation of gene order: a fingerprint of proteins that physically interact.</p>
				</title>
				<aug>
					<au>
						<snm>Dandekar</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Snel</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Huynen</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Bork</snm>
						<fnm>P</fnm>
					</au>
				</aug>
				<source>Trends Biochem Sci</source>
				<pubdate>1998</pubdate>
				<volume>23</volume>
				<fpage>324</fpage>
				<lpage>328</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0968-0004(98)01274-2</pubid>
						<pubid idtype="pmpid" link="fulltext">9787636</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B24">
				<title>
					<p>Multimodular penicillin-binding proteins: an enigmatic family of orthologs and paralogs.</p>
				</title>
				<aug>
					<au>
						<snm>Goffin</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Ghuysen</snm>
						<fnm>JM</fnm>
					</au>
				</aug>
				<source>Microbiol Mol Biol Rev</source>
				<pubdate>1998</pubdate>
				<volume>62</volume>
				<fpage>1079</fpage>
				<lpage>1093</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">9841666</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B25">
				<title>
					<p>Cloning and characterization of PBP 1C, a third member of the multimodular class A penicillin-binding proteins of <it>Escherichia coli</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Schiffer</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Holtje</snm>
						<fnm>JV</fnm>
					</au>
				</aug>
				<source>J Biol Chem</source>
				<pubdate>1999</pubdate>
				<volume>274</volume>
				<fpage>32031</fpage>
				<lpage>32039</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1074/jbc.274.45.32031</pubid>
						<pubid idtype="pmpid" link="fulltext">10542235</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B26">
				<title>
					<p>The Stanford Microarray Database: data access and quality assessment tools.</p>
				</title>
				<aug>
					<au>
						<snm>Gollub</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Ball</snm>
						<fnm>CA</fnm>
					</au>
					<au>
						<snm>Binkley</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Demeter</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Finkelstein</snm>
						<fnm>DB</fnm>
					</au>
					<au>
						<snm>Hebert</snm>
						<fnm>JM</fnm>
					</au>
					<au>
						<snm>Hernandez-Boussard</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Jin</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Kaloper</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Matese</snm>
						<fnm>JC</fnm>
					</au>
					<etal/>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2003</pubdate>
				<volume>31</volume>
				<fpage>94</fpage>
				<lpage>96</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/nar/gkg078</pubid>
						<pubid idtype="pmpid" link="fulltext">12519956</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B27">
				<title>
					<p>MRBAYES: Bayesian inference of phylogenetic trees.</p>
				</title>
				<aug>
					<au>
						<snm>Huelsenbeck</snm>
						<fnm>JP</fnm>
					</au>
					<au>
						<snm>Ronquist</snm>
						<fnm>F</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>2001</pubdate>
				<volume>17</volume>
				<fpage>754</fpage>
				<lpage>755</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/17.8.754</pubid>
						<pubid idtype="pmpid" link="fulltext">11524383</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B28">
				<title>
					<p>GeneTree: comparing gene and species phylogenies using reconciled trees.</p>
				</title>
				<aug>
					<au>
						<snm>Page</snm>
						<fnm>RD</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>1998</pubdate>
				<volume>14</volume>
				<fpage>819</fpage>
				<lpage>820</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/14.9.819</pubid>
						<pubid idtype="pmpid" link="fulltext">9918954</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B29">
				<title>
					<p>Diversity of dissimilatory bisulfite reductase genes of bacteria associated with the deep-sea hydrothermal vent polychaete annelid <it>Alvinella pompejana</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Cottrell</snm>
						<fnm>MT</fnm>
					</au>
					<au>
						<snm>Cary</snm>
						<fnm>SC</fnm>
					</au>
				</aug>
				<source>Appl Environ Microbiol</source>
				<pubdate>1999</pubdate>
				<volume>65</volume>
				<fpage>1127</fpage>
				<lpage>1132</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">10049872</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B30">
				<title>
					<p>Pattern of development of <it>Anabaena </it>in <it>Azolla-Anabaena </it>symbiosis.</p>
				</title>
				<aug>
					<au>
						<snm>Hill</snm>
						<fnm>DJ</fnm>
					</au>
				</aug>
				<source>Planta</source>
				<pubdate>1975</pubdate>
				<volume>122</volume>
				<fpage>179</fpage>
				<lpage>184</lpage>
			</bibl>
			<bibl id="B31">
				<title>
					<p>Initial sequencing and analysis of the human genome.</p>
				</title>
				<aug>
					<au>
						<snm>Lander</snm>
						<fnm>ES</fnm>
					</au>
					<au>
						<snm>Linton</snm>
						<fnm>LM</fnm>
					</au>
					<au>
						<snm>Birren</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Nusbaum</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Zody</snm>
						<fnm>MC</fnm>
					</au>
					<au>
						<snm>Baldwin</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Devon</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Dewar</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Doyle</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>FitzHugh</snm>
						<fnm>W</fnm>
					</au>
					<etal/>
				</aug>
				<source>Nature</source>
				<pubdate>2001</pubdate>
				<volume>409</volume>
				<fpage>860</fpage>
				<lpage>921</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/35057062</pubid>
						<pubid idtype="pmpid" link="fulltext">11237011</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B32">
				<title>
					<p>Microbial genes in the human genome: lateral transfer or gene loss?</p>
				</title>
				<aug>
					<au>
						<snm>Salzberg</snm>
						<fnm>SL</fnm>
					</au>
					<au>
						<snm>White</snm>
						<fnm>O</fnm>
					</au>
					<au>
						<snm>Peterson</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Eisen</snm>
						<fnm>JA</fnm>
					</au>
				</aug>
				<source>Science</source>
				<pubdate>2001</pubdate>
				<volume>292</volume>
				<fpage>1903</fpage>
				<lpage>1906</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.1061036</pubid>
						<pubid idtype="pmpid" link="fulltext">11358996</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B33">
				<title>
					<p>Phylogenetic analyses do not support horizontal gene transfers from bacteria to vertebrates.</p>
				</title>
				<aug>
					<au>
						<snm>Stanhope</snm>
						<fnm>MJ</fnm>
					</au>
					<au>
						<snm>Lupas</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Italia</snm>
						<fnm>MJ</fnm>
					</au>
					<au>
						<snm>Koretke</snm>
						<fnm>KK</fnm>
					</au>
					<au>
						<snm>Volker</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Brown</snm>
						<fnm>JR</fnm>
					</au>
				</aug>
				<source>Nature</source>
				<pubdate>2001</pubdate>
				<volume>411</volume>
				<fpage>940</fpage>
				<lpage>944</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/35082058</pubid>
						<pubid idtype="pmpid" link="fulltext">11418856</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B34">
				<title>
					<p>Much ado about bacteria-to-vertebrate lateral gene transfer.</p>
				</title>
				<aug>
					<au>
						<snm>Genereux</snm>
						<fnm>DP</fnm>
					</au>
					<au>
						<snm>Logsdon</snm>
						<fnm>JM</fnm>
						<suf>Jr</suf>
					</au>
				</aug>
				<source>Trends Genet</source>
				<pubdate>2003</pubdate>
				<volume>19</volume>
				<fpage>191</fpage>
				<lpage>195</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0168-9525(03)00055-6</pubid>
						<pubid idtype="pmpid" link="fulltext">12683971</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B35">
				<title>
					<p>Detection of lateral gene transfer among microbial genomes.</p>
				</title>
				<aug>
					<au>
						<snm>Ragan</snm>
						<fnm>MA</fnm>
					</au>
				</aug>
				<source>Curr Opin Genet Dev</source>
				<pubdate>2001</pubdate>
				<volume>11</volume>
				<fpage>620</fpage>
				<lpage>626</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0959-437X(00)00244-6</pubid>
						<pubid idtype="pmpid" link="fulltext">11682304</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B36">
				<title>
					<p>Pathogenicity islands and phages in <it>Vibrio cholerae </it>evolution.</p>
				</title>
				<aug>
					<au>
						<snm>Faruque</snm>
						<fnm>SM</fnm>
					</au>
					<au>
						<snm>Mekalanos</snm>
						<fnm>JJ</fnm>
					</au>
				</aug>
				<source>Trends Microbiol</source>
				<pubdate>2003</pubdate>
				<volume>11</volume>
				<fpage>505</fpage>
				<lpage>510</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/j.tim.2003.09.003</pubid>
						<pubid idtype="pmpid" link="fulltext">14607067</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B37">
				<title>
					<p>Lysogenic conversion by a filamentous phage encoding cholera toxin.</p>
				</title>
				<aug>
					<au>
						<snm>Waldor</snm>
						<fnm>MK</fnm>
					</au>
					<au>
						<snm>Mekalanos</snm>
						<fnm>JJ</fnm>
					</au>
				</aug>
				<source>Science</source>
				<pubdate>1996</pubdate>
				<volume>272</volume>
				<fpage>1910</fpage>
				<lpage>1914</lpage>
				<xrefbib>
					<pubid idtype="pmpid">8658163</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B38">
				<title>
					<p>Comparative genomic analysis of <it>Vibrio cholerae</it>: genes that correlate with cholera endemic and pandemic disease.</p>
				</title>
				<aug>
					<au>
						<snm>Dziejman</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Balon</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Boyd</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Fraser</snm>
						<fnm>CM</fnm>
					</au>
					<au>
						<snm>Heidelberg</snm>
						<fnm>JF</fnm>
					</au>
					<au>
						<snm>Mekalanos</snm>
						<fnm>JJ</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>2002</pubdate>
				<volume>99</volume>
				<fpage>1556</fpage>
				<lpage>1561</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1073/pnas.042667999</pubid>
						<pubid idtype="pmpid" link="fulltext">11818571</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B39">
				<title>
					<p>Genesis of the novel epidemic <it>Vibrio cholerae </it>O139 strain: evidence for horizontal transfer of genes involved in polysaccharide synthesis.</p>
				</title>
				<aug>
					<au>
						<snm>Bik</snm>
						<fnm>EM</fnm>
					</au>
					<au>
						<snm>Bunschoten</snm>
						<fnm>AE</fnm>
					</au>
					<au>
						<snm>Gouw</snm>
						<fnm>RD</fnm>
					</au>
					<au>
						<snm>Mooi</snm>
						<fnm>FR</fnm>
					</au>
				</aug>
				<source>EMBO J</source>
				<pubdate>1995</pubdate>
				<volume>14</volume>
				<fpage>209</fpage>
				<lpage>216</lpage>
				<xrefbib>
					<pubid idtype="pmpid">7835331</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B40">
				<title>
					<p>Protein GRAB of <it>Streptococcus pyogenes </it>regulates proteolysis at the bacterial surface by binding alpha2-macroglobulin.</p>
				</title>
				<aug>
					<au>
						<snm>Rasmussen</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>M&#252;ller</snm>
						<fnm>HP</fnm>
					</au>
					<au>
						<snm>Bj&#246;rck</snm>
						<fnm>L</fnm>
					</au>
				</aug>
				<source>J Biol Chem</source>
				<pubdate>1999</pubdate>
				<volume>274</volume>
				<fpage>15336</fpage>
				<lpage>15344</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1074/jbc.274.22.15336</pubid>
						<pubid idtype="pmpid" link="fulltext">10336419</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B41">
				<title>
					<p>Ig-binding bacterial proteins also bind proteinase inhibitors.</p>
				</title>
				<aug>
					<au>
						<snm>Sjobring</snm>
						<fnm>U</fnm>
					</au>
					<au>
						<snm>Trojnar</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Grubb</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Akerstrom</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Bj&#246;rck</snm>
						<fnm>L</fnm>
					</au>
				</aug>
				<source>J Immunol</source>
				<pubdate>1989</pubdate>
				<volume>143</volume>
				<fpage>2948</fpage>
				<lpage>2954</lpage>
				<xrefbib>
					<pubid idtype="pmpid">2478629</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B42">
				<title>
					<p>Antimicrobial properties of lysozyme in relation to foodborne vegetative bacteria.</p>
				</title>
				<aug>
					<au>
						<snm>Masschalck</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Michiels</snm>
						<fnm>CW</fnm>
					</au>
				</aug>
				<source>Crit Rev Microbiol</source>
				<pubdate>2003</pubdate>
				<volume>29</volume>
				<fpage>191</fpage>
				<lpage>214</lpage>
				<xrefbib>
					<pubid idtype="pmpid">14582617</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B43">
				<title>
					<p>Antimicrobial polypeptides.</p>
				</title>
				<aug>
					<au>
						<snm>Ganz</snm>
						<fnm>T</fnm>
					</au>
				</aug>
				<source>J Leukoc Biol</source>
				<pubdate>2003</pubdate>
				<volume>75</volume>
				<fpage>34</fpage>
				<lpage>38</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1189/jlb.0403150</pubid>
						<pubid idtype="pmpid" link="fulltext">12960278</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B44">
				<title>
					<p>Towards antibacterial strategies: studies on the mechanisms of interaction between antibacterial peptides and model membranes.</p>
				</title>
				<aug>
					<au>
						<snm>Wiese</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Gutsmann</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Seydel</snm>
						<fnm>U</fnm>
					</au>
				</aug>
				<source>J Endotoxin Res</source>
				<pubdate>2003</pubdate>
				<volume>9</volume>
				<fpage>67</fpage>
				<lpage>84</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1179/096805103125001441</pubid>
						<pubid idtype="pmpid" link="fulltext">12803879</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B45">
				<title>
					<p><it>Escherichia coli ykfE </it>ORFan gene encodes a potent inhibitor of C-type lysozyme.</p>
				</title>
				<aug>
					<au>
						<snm>Monchois</snm>
						<fnm>V</fnm>
					</au>
					<au>
						<snm>Abergel</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Sturgis</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Jeudy</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Claverie</snm>
						<fnm>JM</fnm>
					</au>
				</aug>
				<source>J Biol Chem</source>
				<pubdate>2001</pubdate>
				<volume>276</volume>
				<fpage>18437</fpage>
				<lpage>18441</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1074/jbc.M010297200</pubid>
						<pubid idtype="pmpid" link="fulltext">11278658</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B46">
				<title>
					<p>The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003.</p>
				</title>
				<aug>
					<au>
						<snm>Boeckmann</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Bairoch</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Apweiler</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Blatter</snm>
						<fnm>MC</fnm>
					</au>
					<au>
						<snm>Estreicher</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Gasteiger</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Martin</snm>
						<fnm>MJ</fnm>
					</au>
					<au>
						<snm>Michoud</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>O'Donovan</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Phan</snm>
						<fnm>I</fnm>
					</au>
					<etal/>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2003</pubdate>
				<volume>31</volume>
				<fpage>365</fpage>
				<lpage>370</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/nar/gkg095</pubid>
						<pubid idtype="pmpid" link="fulltext">12520024</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B47">
				<title>
					<p>EMBL Blast sequence retrieval tool</p>
				</title>
				<url>http://blast2srs.embl.de</url>
			</bibl>
			<bibl id="B48">
				<title>
					<p>BLAST2SRS, a web server for flexible retrieval of related protein sequences in the SWISS-PROT and SPTrEMBL databases.</p>
				</title>
				<aug>
					<au>
						<snm>Bimpikis</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Budd</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Linding</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Gibson</snm>
						<fnm>TJ</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2003</pubdate>
				<volume>31</volume>
				<fpage>3792</fpage>
				<lpage>3794</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/nar/gkg535</pubid>
						<pubid idtype="pmpid" link="fulltext">12824420</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B49">
				<title>
					<p>Improved sensitivity of profile searches through the use of sequence weights and gap excision.</p>
				</title>
				<aug>
					<au>
						<snm>Thompson</snm>
						<fnm>JD</fnm>
					</au>
					<au>
						<snm>Higgins</snm>
						<fnm>DG</fnm>
					</au>
					<au>
						<snm>Gibson</snm>
						<fnm>TJ</fnm>
					</au>
				</aug>
				<source>Comput Appl Biosci</source>
				<pubdate>1994</pubdate>
				<volume>10</volume>
				<fpage>19</fpage>
				<lpage>29</lpage>
				<xrefbib>
					<pubid idtype="pmpid">8193951</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B50">
				<title>
					<p>BIC web home page</p>
				</title>
				<url>http://eta.embl-heidelberg.de:8000</url>
			</bibl>
			<bibl id="B51">
				<title>
					<p>Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.</p>
				</title>
				<aug>
					<au>
						<snm>Altschul</snm>
						<fnm>SF</fnm>
					</au>
					<au>
						<snm>Madden</snm>
						<fnm>TL</fnm>
					</au>
					<au>
						<snm>Sch&#228;ffer</snm>
						<fnm>AA</fnm>
					</au>
					<au>
						<snm>Zhang</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Zhang</snm>
						<fnm>Z</fnm>
					</au>
					<au>
						<snm>Miller</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Lipman</snm>
						<fnm>DJ</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>1997</pubdate>
				<volume>25</volume>
				<fpage>3389</fpage>
				<lpage>3402</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/nar/25.17.3389</pubid>
						<pubid idtype="pmpid" link="fulltext">9254694</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B52">
				<title>
					<p>Database resources of the National Center for Biotechnology.</p>
				</title>
				<aug>
					<au>
						<snm>Wheeler</snm>
						<fnm>DL</fnm>
					</au>
					<au>
						<snm>Church</snm>
						<fnm>DM</fnm>
					</au>
					<au>
						<snm>Federhen</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Lash</snm>
						<fnm>AE</fnm>
					</au>
					<au>
						<snm>Madden</snm>
						<fnm>TL</fnm>
					</au>
					<au>
						<snm>Pontius</snm>
						<fnm>JU</fnm>
					</au>
					<au>
						<snm>Schuler</snm>
						<fnm>GD</fnm>
					</au>
					<au>
						<snm>Schriml</snm>
						<fnm>LM</fnm>
					</au>
					<au>
						<snm>Sequeira</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Tatusova</snm>
						<fnm>TA</fnm>
					</au>
					<au>
						<snm>Wagner</snm>
						<fnm>L</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2003</pubdate>
				<volume>31</volume>
				<fpage>28</fpage>
				<lpage>33</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/nar/gkg033</pubid>
						<pubid idtype="pmpid" link="fulltext">12519941</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B53">
				<title>
					<p>NCBI BLAST</p>
				</title>
				<url>http://www.ncbi.nlm.nih.gov/BLAST</url>
			</bibl>
			<bibl id="B54">
				<title>
					<p>STRING: functional association protein networks</p>
				</title>
				<url>http://www.bork.embl-heidelberg.de/STRING</url>
			</bibl>
			<bibl id="B55">
				<title>
					<p>The Bioperl toolkit: Perl modules for the life sciences.</p>
				</title>
				<aug>
					<au>
						<snm>Stajich</snm>
						<fnm>JE</fnm>
					</au>
					<au>
						<snm>Block</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Boulez</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Brenner</snm>
						<fnm>SE</fnm>
					</au>
					<au>
						<snm>Chervitz</snm>
						<fnm>SA</fnm>
					</au>
					<au>
						<snm>Dagdigian</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Fuellen</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Gilbert</snm>
						<fnm>JG</fnm>
					</au>
					<au>
						<snm>Korf</snm>
						<fnm>I</fnm>
					</au>
					<au>
						<snm>Lapp</snm>
						<fnm>H</fnm>
					</au>
					<etal/>
				</aug>
				<source>Genome Res</source>
				<pubdate>2002</pubdate>
				<volume>12</volume>
				<fpage>1611</fpage>
				<lpage>1618</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1101/gr.361602</pubid>
						<pubid idtype="pmpid" link="fulltext">12368254</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B56">
				<title>
					<p>Artemis: sequence visualization and annotation.</p>
				</title>
				<aug>
					<au>
						<snm>Rutherford</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Parkhill</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Crook</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Horsnell</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Rice</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Rajandream</snm>
						<fnm>MA</fnm>
					</au>
					<au>
						<snm>Barrell</snm>
						<fnm>B</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>2000</pubdate>
				<volume>16</volume>
				<fpage>944</fpage>
				<lpage>945</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/16.10.944</pubid>
						<pubid idtype="pmpid" link="fulltext">11120685</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B57">
				<title>
					<p>Multiple sequence alignment with the Clustal series of programs.</p>
				</title>
				<aug>
					<au>
						<snm>Chenna</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Sugawara</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Koike</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Lopez</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Gibson</snm>
						<fnm>TJ</fnm>
					</au>
					<au>
						<snm>Higgins</snm>
						<fnm>DG</fnm>
					</au>
					<au>
						<snm>Thompson</snm>
						<fnm>JD</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2003</pubdate>
				<volume>31</volume>
				<fpage>3497</fpage>
				<lpage>3500</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/nar/gkg500</pubid>
						<pubid idtype="pmpid" link="fulltext">12824352</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B58">
				<title>
					<p>SEAVIEW and PHYLO_WIN: two graphic tools for sequence alignment and molecular phylogeny.</p>
				</title>
				<aug>
					<au>
						<snm>Galtier</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Gouy</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Gautier</snm>
						<fnm>C</fnm>
					</au>
				</aug>
				<source>Comput Appl Biosci</source>
				<pubdate>1996</pubdate>
				<volume>12</volume>
				<fpage>543</fpage>
				<lpage>548</lpage>
				<xrefbib>
					<pubid idtype="pmpid">9021275</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B59">
				<title>
					<p>The neighbor-joining method: a new method for reconstructing phylogenetic trees.</p>
				</title>
				<aug>
					<au>
						<snm>Saitou</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Nei</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>Mol Biol Evol</source>
				<pubdate>1987</pubdate>
				<volume>4</volume>
				<fpage>406</fpage>
				<lpage>425</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">3447015</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B60">
				<title>
					<p>Gblocks server</p>
				</title>
				<url>http://woody.embl-heidelberg.de/phylo/index.html</url>
			</bibl>
			<bibl id="B61">
				<title>
					<p>Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis.</p>
				</title>
				<aug>
					<au>
						<snm>Castresana</snm>
						<fnm>J</fnm>
					</au>
				</aug>
				<source>Mol Biol Evol</source>
				<pubdate>2000</pubdate>
				<volume>17</volume>
				<fpage>540</fpage>
				<lpage>552</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">10742046</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B62">
				<title>
					<p>TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing.</p>
				</title>
				<aug>
					<au>
						<snm>Schmidt</snm>
						<fnm>HA</fnm>
					</au>
					<au>
						<snm>Strimmer</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Vingron</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>von Haeseler</snm>
						<fnm>A</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>2002</pubdate>
				<volume>18</volume>
				<fpage>502</fpage>
				<lpage>504</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/18.3.502</pubid>
						<pubid idtype="pmpid" link="fulltext">11934758</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B63">
				<title>
					<p>The rapid generation of mutation data matrices from protein sequences.</p>
				</title>
				<aug>
					<au>
						<snm>Jones</snm>
						<fnm>DT</fnm>
					</au>
					<au>
						<snm>Taylor</snm>
						<fnm>WR</fnm>
					</au>
					<au>
						<snm>Thornton</snm>
						<fnm>JM</fnm>
					</au>
				</aug>
				<source>Comput Appl Biosci</source>
				<pubdate>1992</pubdate>
				<volume>8</volume>
				<fpage>275</fpage>
				<lpage>282</lpage>
				<xrefbib>
					<pubid idtype="pmpid">1633570</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B64">
				<title>
					<p>WWW-query: an on-line retrieval system for biological sequence banks.</p>
				</title>
				<aug>
					<au>
						<snm>Perri&#232;re</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Gouy</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>Biochimie</source>
				<pubdate>1996</pubdate>
				<volume>78</volume>
				<fpage>364</fpage>
				<lpage>369</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/0300-9084(96)84768-7</pubid>
						<pubid idtype="pmpid" link="fulltext">8905155</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B65">
				<title>
					<p>SMD: home page</p>
				</title>
				<url>http://genome-www.stanford.edu/microarray</url>
			</bibl>
			<bibl id="B66">
				<title>
					<p>The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools.</p>
				</title>
				<aug>
					<au>
						<snm>Thompson</snm>
						<fnm>JD</fnm>
					</au>
					<au>
						<snm>Gibson</snm>
						<fnm>TJ</fnm>
					</au>
					<au>
						<snm>Plewniak</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Jeanmougin</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Higgins</snm>
						<fnm>DG</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>1997</pubdate>
				<volume>25</volume>
				<fpage>4876</fpage>
				<lpage>4882</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/nar/25.24.4876</pubid>
						<pubid idtype="pmpid" link="fulltext">9396791</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
		</refgrp>
	</bm>
</art>
