<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
	<ui>1471-2105-8-S1-S12</ui>
	<ji>1471-2105</ji>
	<fm>
		<dochead>Research</dochead>
		<bibl>
			<title>
				<p>On the origin and evolution of biosynthetic pathways: integrating microarray data with structure and organization of the Common Pathway genes</p>
			</title>
			<aug>
				<au id="A1">
					<snm>Fondi</snm>
					<fnm>Marco</fnm>
					<insr iid="I1"/>
					<email>marco.fondi@unifi.it</email>
				</au>
				<au id="A2">
					<snm>Brilli</snm>
					<fnm>Matteo</fnm>
					<insr iid="I1"/>
					<email>matteo.brilli@dbag.unifi.it</email>
				</au>
				<au id="A3" ca="yes">
					<snm>Fani</snm>
					<fnm>Renato</fnm>
					<insr iid="I1"/>
					<email>renato.fani@unifi.it</email>
				</au>
			</aug>
			<insg>
				<ins id="I1">
					<p>Dipartimento di Biologia Animale e Genetica, Universit&#224; di Firenze, Via Romana 17\19, Firenze, Italy</p>
				</ins>
			</insg>
			<source>BMC Bioinformatics</source>
			<supplement>
				<title>
					<p>Italian Society of Bioinformatics (BITS): Annual Meeting 2006</p>
				</title>
				<editor>Rita Casadio, Manuela Helmer-Citterich, Graziano Pesole</editor>
				<note>Research</note>
			</supplement>
			<conference>
				<title>
					<p>Italian Society of Bioinformatics (BITS): Annual Meeting 2006</p>
				</title>
				<location>Bologna, Italy</location>
				<date-range>28&#8211;29 April, 2006</date-range>
				<url>http://www.biocomp.unibo.it/bits2006/</url>
			</conference>
			<issn>1471-2105</issn>
			<pubdate>2007</pubdate>
			<volume>8</volume>
			<issue>Suppl 1</issue>
			<fpage>S12</fpage>
			<url>http://www.biomedcentral.com/1471-2105/8/S1/S12</url>
			<xrefbib>
				<pubidlist><pubid idtype="pmpid">17430556</pubid><pubid idtype="doi">10.1186/1471-2105-8-S1-S12</pubid>
				</pubidlist></xrefbib>
		</bibl>
		<history>
			<pub>
				<date>
					<day>8</day>
					<month>3</month>
					<year>2007</year>
				</date>
			</pub>
		</history>
		<cpyrt>
			<year>2007</year>
			<collab>Fondi et al; licensee BioMed Central Ltd.</collab>
			<note>This is an open access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
		</cpyrt>
		<abs>
			<sec>
				<st>
					<p>Abstract</p>
				</st>
				<sec>
					<st>
						<p>Background</p>
					</st>
					<p>The lysine, threonine, and methionine biosynthetic pathways share their three initial enzymatic steps, which are referred to as the Common Pathway (CP). In <it>Escherichia coli </it>three different aspartokinases (AKI, AKII, AKIII, the products of <it>thrA</it>, <it>metL </it>and <it>lysC</it>, respectively) can perform the first step of the CP. Moreover, two of them (AKI and AKII) are bifunctional, also carrying homoserine dehydrogenase activity (<it>hom </it>product). The second step of the CP is catalyzed by a single aspartate semialdehyde dehydrogenase (ASDH, the product of <it>asd</it>). Thus, in the CP of <it>E. coli</it>, a single copy of ASDH performs the same reaction for three different metabolic routes, whereas three different AKs perform a single step. Why and how did such a situation emerge, and how has it been maintained? How is it correlated to the different regulatory mechanisms acting on these genes? The aim of this work was to trace the evolutionary pathway leading to the extant scenario in proteobacteria.</p>
				</sec>
				<sec>
					<st>
						<p>Results</p>
					</st>
					<p>The analysis of the structure, organization, phylogeny, and distribution of <it>ask </it>and <it>hom </it>genes revealed that the presence of multiple copies of these genes and their fusion events are restricted to the &#947;-subdivision of proteobacteria. This allowed us to depict a model to explain the evolution of <it>ask </it>and <it>hom</it>, according to which the fused genes are the outcome of a cascade of gene duplication and fusion events that can be traced to the ancestor of &#947;-proteobacteria. Moreover, the appearance of fused genes paralleled the assembly of operons of different sizes, suggesting a strong correlation between the structure and organization of these genes. A statistical analysis of microarray data retrieved from experiments carried out on <it>E. coli </it>and <it>Pseudomonas aeruginosa </it>was also performed.</p>
				</sec>
				<sec>
					<st>
						<p>Conclusion</p>
					</st>
					<p>The integration of data concerning gene structure, organization, phylogeny, distribution, and microarray experiments allowed us to depict a model for the evolution of <it>ask </it>and <it>hom </it>genes in proteobacteria and to suggest a biological significance for the extant scenario.</p>
				</sec>
			</sec>
		</abs>
	</fm>
	<bdy>
		<sec>
			<st>
				<p>Background</p>
			</st>
			<p>The metabolic routes leading to the synthesis of lysine/diaminopimelic acid, methionine and threonine/isoleucine are closely interconnected, forming a complex system, three steps of which represent the so-called Common Pathway (CP) <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> (Figure <figr fid="F1">1</figr>). The first step is the phosphorylation of aspartate, carried out by an aspartokinase (AK, the product of the <it>ask </it>gene), leading to &#946;-aspartyl-phosphate. This, in turn, is reduced by an aspartate semialdehyde dehydrogenase (ASDH, the enzyme encoded by <it>asd</it>) to aspartate semialdehyde. Finally, aspartate semialdehyde may be transformed either into dihydrodipicolinate, the precursor of diaminopimelic acid and lysine, by dihydrodipicolinate synthase (coded for by <it>dapA</it>), or into homoserine by homoserine dehydrogenase (HD, encoded by <it>hom</it>). Homoserine can then be channeled towards threonine and/or methionine biosynthesis. From an evolutionary point of view, the genes coding for these three enzymes are particularly interesting, since at least two different molecular mechanisms, i.e. paralogous gene duplication and gene fusion, appear to have played a key role in their origin and evolution. In addition, in some bacteria each CP step is catalyzed by enzymes coded for by single monofunctional genes, whereas in the enterobacterium <it>Escherichia coli </it>it has been shown <abbrgrp><abbr bid="B2">2</abbr></abbrgrp> (Figure <figr fid="F1">1</figr>) that:</p>
			<fig id="F1">
				<title>
					<p>Figure 1</p>
				</title>
				<caption>
					<p>The aspartate pathway</p>
				</caption>
				<text>
					<p><b>The aspartate pathway</b>. Genes marked in red (<it>ask</it>, <it>asd</it>, and <it>hom</it>) constitute the Common Pathway [1].</p>
				</text>
				<graphic file="1471-2105-8-S1-S12-1"/>
			</fig>
			<p>i) the first step of the CP can be performed by three different aspartokinases (AKI, AKII and AKIII);</p>
			<p>ii) the second step is catalyzed by a monofunctional ASDH encoded by <it>asd</it>; and, lastly,</p>
			<p>iii) the third step is carried out by two different homoserine dehydrogenases, referred to as HDI and HDII, which are fused to two of the three AKs: AKI and AKII, respectively. These two bifunctional proteins are coded for by two genes, <it>thrA </it>and <it>metL</it>, respectively.</p>
			<p>The expression of the two <it>E. coli </it>bifunctional proteins is differently regulated: threonine and isoleucine regulate the expression of <it>thrA</it>, and threonine controls both enzymatic activities by negative feedback. The transcription of <it>metL </it>is repressed by methionine, but no feedback inhibition of this enzyme by methionine has been observed. Finally, the expression of the gene coding for AKIII (<it>lysC</it>) and the activity of its product are regulated in response to lysine concentration <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>.</p>
			<p>This particular structural pattern has raised the question of how and why it emerged in the course of evolution. On the basis of limited sequence data, Cassan et al. <abbrgrp><abbr bid="B3">3</abbr></abbrgrp> proposed that the present-day bifunctional enzymes may have arisen from a fusion event involving the ancestral AK and HD coding genes. The duplication of this bifunctional gene may have originated two redundant copies carrying both AK and HD activity. Another gene duplication event may have led to the formation of the three AK copies we observe nowadays. According to this model, the monofunctional AK could have emerged in two different ways: either by a partial gene duplication event involving only the AK coding region of the bifunctional genes, or by inactivation of the HD coding sequence through the accumulation of mutations. Thus, both paralogous gene duplication and gene fusion might have been responsible for shaping the CP. The importance of gene duplication in the evolution of genomes and metabolic pathways is well established (see <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> and references therein): the production of two copies of a DNA sequence leads to an increase in genome size, and it also allows the rapid diversification of enzymatically catalyzed reactions, providing new material for the invention of new enzymatic properties and complex regulatory and developmental patterns. In addition to gene duplication (see <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> and references therein), one of the major routes of gene evolution is the fusion of independent cistrons leading to bi- or multifunctional proteins <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr></abbrgrp>. Gene fusions provide a mechanism for the physical association of different catalytic domains or of catalytic and regulatory structures <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. 
Fusions frequently involve genes coding for proteins that function in a concerted manner, such as enzymes catalyzing sequential steps within a metabolic pathway <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. Fusion of such catalytic centres likely promotes the channelling of intermediates that may be unstable and/or present in low concentration <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>; this, in turn, requires that enzymes catalysing sequential reactions are colocalized within the cell <abbrgrp><abbr bid="B11">11</abbr></abbrgrp> and may (transiently) interact to form complexes that are termed metabolons <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. The high fitness of gene fusions can also rely on the tight regulation of the expression of the fused domains. This might be the case for <it>metL </it>and <it>thrA</it>.</p>
			<p>Thus, the CP might represent a very interesting model system to shed some light on the mechanisms driving the assembly of metabolic pathways and the refinement of regulatory networks. Nonetheless, in spite of the availability of several completely sequenced genomes and of microarray data, no detailed analysis of the structure and organization of CP genes has been carried out until now, nor has any correlation between these data and expression (microarray) data been established. The aim of this work was to reconstruct the possible evolutionary pathway(s), and their timing, leading to the extant <it>ask </it>and <it>hom </it>genes, to analyse their phylogenetic distribution, and to shed some light on the molecular mechanisms responsible for the assembly of the CP genes in bacteria and on the role that gene duplication(s), fusion(s) and clustering might have had in this context. To this purpose, the structure, organization and phylogenetic distribution of all the available proteobacterial <it>ask</it>, <it>hom</it>, and <it>asd </it>genes were analysed. Data obtained were integrated with expression data deriving from microarray analyses. We focused our attention on Proteobacteria for the following reasons: i) previous works <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr><abbr bid="B9">9</abbr></abbrgrp> have shown that gene rearrangement events, such as gene duplication, fusion, and/or clustering, have strongly influenced their evolution; ii) this phylogenetic branch includes the &#947;-subdivision, which is thought to be one of the most recent branching points among Bacteria; and iii) they represent a good case study, since they comprise organisms living in very different habitats (ranging from the deep-sea hydrothermal environments of the &#949;-subdivision to the roots of plants in the case of some &#945;-proteobacteria) and with very different lifestyles, including endosymbionts and parasites.</p>
		</sec>
		<sec>
			<st>
				<p>Results and discussion</p>
			</st>
			<sec>
				<st>
					<p>Structure and phylogenetic distribution of the genes coding for AK, ASDH and HD in Proteobacteria</p>
				</st>
				<p>The amino acid sequences of the <it>E. coli </it>AK, ASDH, and HD were used as queries to probe the protein database of completely sequenced proteobacterial genomes with the BLASTP option of the BLAST program <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>, in order to retrieve the most similar sequences. To this purpose 58 proteobacterial genomes were selected and, in most cases, only one strain for each species was taken into account. Data obtained are schematically reported in Figure <figr fid="F2">2</figr>, where a phylogenetic tree constructed using the RpoD sequences of the 58 proteobacteria is shown together with the number and the structure of all the retrieved AK and HD coding genes. The <it>asd </it>genes were not included in Figure <figr fid="F2">2</figr>, since just one copy of this gene was retrieved from each of the 58 proteobacteria. The analysis of data reported in Figure <figr fid="F2">2</figr> revealed that:</p>
				<fig id="F2">
					<title>
						<p>Figure 2</p>
					</title>
					<caption>
						<p>The structure of <it>ask </it>and <it>hom </it>genes</p>
					</caption>
					<text>
						<p><b>The structure of <it>ask </it>and <it>hom </it>genes</b>. Phylogenetic tree constructed using the RpoD sequences (Neighbor Joining, 2250 Bootstrap Replicates, Complete Deletion, Poisson Correction) of the 58 proteobacteria together with the number and the structure of all the retrieved <it>ask </it>and <it>hom </it>genes.</p>
					</text>
					<graphic file="1471-2105-8-S1-S12-2"/>
				</fig>
				<p>a) in all the <it>&#945;</it>-, <it>&#946;</it>- and <it>&#948;/&#949;</it>-proteobacterial genomes a single, monofunctional, stand-alone copy of the gene coding for AK or HD was detected; moreover, neither duplicated copies nor fusion events involving these genes were detected.</p>
				<p>b) multiple as well as fused copies of AK and HD were found only in &#947;-proteobacteria, where the scenario is (apparently) more complex and intriguing. Indeed, a variable structure and copy number of genes coding for AK (1 to 5) and HD (1 to 2) can be observed. Moreover, the complexity of these genes apparently increases in parallel with the evolutionary branching of &#947;-proteobacteria, with enterobacteria and vibrionaceae showing the highest number of redundant and fused copies of AK and HD. This phylogenetic distribution suggests that the duplication of AK coding genes and their fusion to HD can be traced within &#947;-proteobacteria, or soon after the divergence of the &#947;-proteobacterial ancestor from <it>&#945;</it>-, <it>&#946;</it>- and <it>&#948;/&#949;</it>-proteobacteria.</p>
			</sec>
			<sec>
				<st>
					<p>A model for the evolution of the AK and HD coding genes</p>
				</st>
				<p>On the basis of the phylogenetic distribution of stand-alone and bifunctional genes of the CP, we propose a plausible model of the evolution, and of its timing, that explains the extant scenario. The model, which is schematically reported in Figure <figr fid="F3">3</figr>, predicts that the proteobacterial ancestor possessed a single copy of <it>hom</it>, <it>ask </it>and <it>asd </it>genes. During evolution, this organization was maintained in proteobacteria belonging to the <it>&#945;</it>-, <it>&#946;</it>- and <it>&#948;/&#949;</it>-subdivisions. One of the cross-roads for the evolution of these genes is represented by the branching point between &#946;- and &#947;-proteobacteria. It appears quite possible that, in the ancestor of &#947;-proteobacteria, a first duplication of the <it>ask </it>gene took place, generating two redundant copies that underwent evolutionary divergence. The finding that no bacterium (with the exception of <it>Vibrio </it>strains, see below) shows two copies of monofunctional <it>ask </it>genes strongly suggests that this duplication event and the subsequent fusion of one copy to <it>hom </it>occurred in a relatively short evolutionary time, giving rise to an ancestral bifunctional gene, which might have retained the function of the extant <it>metL </it>and <it>thrA</it>. This sort of "gene duplication-gene fusion coupling" is quite similar to that described recently for the evolution of &#947;-proteobacterial <it>hisN </it>and <it>hisB </it>histidine biosynthetic genes <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr><abbr bid="B9">9</abbr></abbrgrp>. Finally, a paralogous duplication of this bifunctional ancestor gene, followed by evolutionary divergence (which very likely concerned the regulatory mechanism rather than the catalytic activity), led to the extant <it>metL </it>and <it>thrA </it>genes. 
On the basis of the phylogenetic distribution of the bifunctional genes (Figure <figr fid="F3">3</figr>), this "final" step might have occurred just before the separation between "clusters" 1 and 2 of the &#947;-proteobacterial subdivision.</p>
				<fig id="F3">
					<title>
						<p>Figure 3</p>
					</title>
					<caption>
						<p>The evolutionary model</p>
					</caption>
					<text>
						<p><b>The evolutionary model</b>. Evolutionary model proposed to explain the evolution of <it>ask </it>and <it>hom </it>genes in proteobacteria.</p>
					</text>
					<graphic file="1471-2105-8-S1-S12-3"/>
				</fig>
				<p>The biological significance of this cascade of duplication and fusion events might rely on the "patchwork" hypothesis on the origin and evolution of metabolic pathways <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. According to this idea, metabolic pathways may have been assembled through the recruitment of primitive enzymes that could react with a wide range of chemically related substrates. Such relatively slow, unspecific enzymes may have enabled primitive cells containing small genomes to overcome their limited coding capabilities <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. Paralogous gene duplication event(s) followed by evolutionary divergence might have permitted the appearance of enzymes with an increased and narrower specificity and/or the diversification of function. In this way, an ancestral enzyme belonging to a given metabolic route is "recruited" to serve a single pathway or other (novel) pathways. Besides, this process may permit the <it>evolution and refinement of regulatory mechanisms</it> coincident with the development of new pathways and/or the refinement of pre-existing ones.</p>
				<p>In our opinion, the evolutionary model proposed here to explain the origin and evolution of the extant <it>metL </it>and <it>thrA </it>genes is in full agreement with the Jensen hypothesis, and the cascade of gene duplications and fusions involving <it>ask </it>and <it>hom </it>genes might actually represent a mechanism for the refinement of the feedback regulation controlling the activity of the enzymes they code for.</p>
			</sec>
			<sec>
				<st>
					<p>Phylogenetic analysis</p>
				</st>
				<p>If the evolutionary model proposed here is correct, one should expect that the fused copies of AK (AKI and AKII) and HD (HDI and HDII) share a degree of sequence similarity higher than that exhibited with AKIII and HD, respectively, and cluster together in a phylogenetic tree. In order to check this hypothesis, the AK and HD amino acid sequences were aligned using the program ClustalW <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>, and the multiple alignments obtained were used to draw the phylogenetic trees shown in Figures <figr fid="F4">4</figr> and <figr fid="F5">5</figr>. The analysis of the AK tree (Figure <figr fid="F4">4</figr>) showed that all the &#945;-, &#946;- and &#948;/&#949;-proteobacterial sequences form a single cluster separated from the &#947;-proteobacterial ones. Besides, the &#947;-proteobacterial AKI, AKII, and AKIII sequences form three distinct clusters, with AKIII representing the root of the others. A similar situation can be observed in the HD tree (Figure <figr fid="F5">5</figr>): &#945;-, &#946;- and &#948;/&#949;-proteobacterial HD sequences form a single distinct cluster, while HDI and HDII form two close clusters.</p>
				<fig id="F4">
					<title>
						<p>Figure 4</p>
					</title>
					<caption>
						<p>Phylogenetic tree of AK sequences</p>
					</caption>
					<text>
						<p><b>Phylogenetic tree of AK sequences</b>. Phylogenetic tree (Neighbor Joining, 2250 Bootstrap Replicates, Complete Deletion, Poisson Correction) constructed with all the retrieved sequences of AK.</p>
					</text>
					<graphic file="1471-2105-8-S1-S12-4"/>
				</fig>
				<fig id="F5">
					<title>
						<p>Figure 5</p>
					</title>
					<caption>
						<p>Phylogenetic tree of HD sequences</p>
					</caption>
					<text>
						<p><b>Phylogenetic tree of HD sequences</b>. Phylogenetic tree (Neighbor Joining, 2250 Bootstrap Replicates, Complete Deletion, Poisson Correction) constructed with all the retrieved sequences of HD.</p>
					</text>
					<graphic file="1471-2105-8-S1-S12-5"/>
				</fig>
				<p>The topology of the phylogenetic trees obtained fits well with the evolutionary model proposed and indicates that horizontal transfer of these genes rarely occurred and did not strongly influence the evolution of AK and HD domains. However, even though the evolutionary model reported in Figure <figr fid="F3">3</figr> is in agreement with gene structure and phylogenetic analyses, the following exceptions have to be explained:</p>
				<p>1) The absence of <it>lysC </it>and <it>metL </it>in a group of enterobacteria (<it>Buchnera aphidicola </it>strains, <it>Candidatus Blochmannia floridanus</it>, <it>Wigglesworthia glossinidia</it>) and in <it>Haemophilus influenzae</it>, the absence of bifunctional genes in <it>H. ducreyi</it>, and the lack of <it>hom </it>in <it>Coxiella burnetii</it>, <it>Rickettsia prowazekii</it>, <it>Wolbachia endosymbiont of Drosophila melanogaster </it>and <it>Bdellovibrio bacteriovorus</it>. This is very likely due to the absence of the corresponding metabolic route(s), which, in turn, is correlated to the parasitic lifestyle of these proteobacteria. Such a lifestyle may allow the bacteria to acquire essential compounds directly from the metabolic activities of their host, and the adaptation to this environmental condition might have caused the loss of entire metabolic routes or parts thereof.</p>
				<p>2) The increase in the number of AK copies in <it>Vibrio </it>strains with respect to other &#947;-proteobacteria, which is probably related to the high genomic rearrangement rate typical of these species.</p>
				<p>3) The absence of bifunctional <it>ask-hom </it>genes in <it>Pseudomonas </it>and <it>Methylococcus capsulatus </it>that, in spite of their taxonomical position within &#947;-proteobacteria, exhibit the same structural and organization pattern as bacteria belonging to the <it>&#945;</it>-, <it>&#946;</it>- and <it>&#948;/&#949;</it>-subdivisions. This is not an isolated example; in fact, the same situation has been recorded for other biosynthetic pathways, such as histidine biosynthesis <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr></abbrgrp>. The reason(s) for such a structure and organization are still unclear.</p>
				<p>4) The fusion of <it>ask </it>to <it>lysA </it>in <it>Xanthomonadaceae</it>, which represents an exception to this general model. In these bacteria the paralogous duplication of the <it>ask </it>gene originated two copies, one of which fused to <it>hom</it>, whereas the other one underwent another fusion event with <it>lysA</it>, a gene coding for diaminopimelate decarboxylase (DAPDC) activity. The biological significance of the latter fusion might lie in the spatial colocalization of the products of the two modules and in a faster feedback inhibition of the first enzyme (AK) by the end product of the pathway (lysine), whose last biosynthetic step is catalyzed by the enzyme coded for by <it>lysA</it>.</p>
			</sec>
			<sec>
				<st>
					<p>Analysis of gene organization</p>
				</st>
				<p>If the proposed model and its biological significance are correct, i.e. if the duplication and fusion events and the subsequent evolutionary divergence allowed the three copies of AK and the two of HD to narrow their specificity and to become increasingly sensitive to specific regulatory signals, then it is plausible to assume that the ancestral copy of AK (AKIII) might have served different metabolic pathways and hence might have been under the control of multiple regulatory signals (e.g. the availability of DAP, lysine, threonine, methionine, etc.). On the other hand, the expression of the bifunctional genes, <it>thrA </it>and <it>metL</it>, once they were channelled towards the biosynthesis of threonine and methionine, should have become increasingly dependent on more specific signals (for example the concentration of the final product of that route). In general, it is plausible that once a "new" gene becomes part of a pre-existing metabolic pathway, it will become co-regulated with the other genes belonging to the same pathway. In some cases, co-regulation of genes of the same biosynthetic route is achieved by organizing genes in operon structures, even though co-regulation may also be obtained by regulon construction. This is particularly true for fused genes; as reported in previous works, based on the analysis of the histidine biosynthetic pathway in &#947;-proteobacteria, the appearance of fused genes (specific for a single pathway) often parallels their presence within operons <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr><abbr bid="B9">9</abbr></abbrgrp>. This raises the question of whether the structure and distribution of duplicated and fused copies of <it>ask </it>and <it>hom </it>genes might somehow be correlated to their organization in the proteobacterial genome. 
Therefore, we analysed the organization of all the genes of <it>lys</it>, <it>met </it>and <it>thr </it>biosynthesis in all the 58 proteobacteria. Data obtained revealed that:</p>
				<p>1. Genes involved in DAP/lysine biosynthesis are scattered throughout the chromosome(s) of all the 58 proteobacteria taken into account (data not shown).</p>
				<p>2. In addition to <it>ask</it>, <it>asd </it>and <it>hom </it>genes, the other two genes involved in threonine biosynthesis (<it>thrB </it>and <it>thrC</it>) are scattered on the chromosome of bacteria belonging to the &#945;-, &#946;- and &#948;/&#949; subdivisions (except <it>Bordetella </it>strains, which have a <it>hom-thrC </it>operon) (Figure <figr fid="F6">6</figr>). The &#947;-proteobacterial scenario is completely different; in agreement with the hypothesis mentioned above, in all the organisms possessing a bifunctional <it>thrA </it>gene, this gene is embedded within a three-cistronic operon with the same relative gene order (<it>thrABC</it>), which also suggests that the operon was assembled only once during evolution.</p>
				<fig id="F6">
					<title>
						<p>Figure 6</p>
					</title>
					<caption>
						<p>Gene organization of threonine genes</p>
					</caption>
					<text>
						<p><b>Gene organization of threonine genes</b>. Structure and organization of threonine biosynthetic genes of the 58 proteobacteria correlated with their phylogenetic position as established by RpoD analysis.</p>
					</text>
					<graphic file="1471-2105-8-S1-S12-6"/>
				</fig>
				<p>3. The organization of methionine biosynthetic genes in proteobacteria partly reflects that exhibited by <it>lys </it>or <it>thr </it>genes. In fact, in the &#945;-, &#946;- and &#948;/&#949; branches <ul>all</ul> the <it>met </it>biosynthetic genes are scattered on the chromosome(s) (Figure <figr fid="F7">7</figr>). This organization is also shared by &#947;-proteobacteria; the only exception is represented by the bifunctional <it>metL</it>, which is clustered with <it>metB </it>to form a bicistronic <it>metLB </it>operon.</p>
				<fig id="F7">
					<title>
						<p>Figure 7</p>
					</title>
					<caption>
						<p>Gene organization of methionine genes</p>
					</caption>
					<text>
						<p><b>Gene organization of methionine genes</b>. Structure and organization of methionine biosynthetic genes of the 58 proteobacteria correlated with their phylogenetic position as established by RpoD analysis.</p>
					</text>
					<graphic file="1471-2105-8-S1-S12-7"/>
				</fig>
				<p>Thus, no bifunctional gene of the CP is located outside operons. Data obtained strongly suggest that the appearance of genes coding for enzymes specific to a single metabolic pathway coincides with their presence within a polycistronic transcriptional unit that includes all (or at least some of) the other genes of that route. Concerning the timing of operon construction, the comparative analysis of Figures <figr fid="F2">2</figr>, <figr fid="F5">5</figr>, and <figr fid="F6">6</figr> revealed that the "gene duplication-gene fusion coupling" occurring in &#947;-proteobacteria appears to be coincident with gene clustering and the formation of operons of different lengths.</p>
			</sec>
			<sec>
				<st>
					<p>Analysis of microarray experiments data</p>
				</st>
				<p>In order to elucidate the correlation existing between the structure and organization of <it>lys</it>, <it>met</it>, and <it>thr </it>genes and their expression within the cell, we analyzed microarray data from <it>E. coli </it>and <it>P. aeruginosa</it>, which show two different arrangements of structure and organization of CP genes. Microarray data were downloaded as supplemental material to published papers (see <supplr sid="S1">Additional File 1</supplr>: Additional References for the Expression compendium); only normalized and filtered data were used. Values that were not already in that form were transformed into the base 2 logarithm of the ratio between the wild type (untreated) and the mutant (treated) expression levels.</p>
				<suppl id="S1">
					<title>
						<p>Additional File 1</p>
					</title>
					<text>
						<p>Additional References for the Expression compendium. List of the references used to retrieve microarray experiments data.</p>
					</text>
					<file name="1471-2105-8-S1-S12-S1.pdf">
						<p>Click here for file</p>
					</file>
				</suppl>
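				<p>As a minimal sketch of the log-ratio transformation described above, the following Python fragment converts a pair of expression levels into the base 2 logarithm of their ratio. The function name, the argument names, and the intensity values are hypothetical and serve only as an illustration; they are not taken from the datasets or software used in this work.</p>

```python
import math


def log2_ratio(reference, condition):
    """Return log2(reference / condition) for two positive expression levels.

    'reference' stands for the wild type (untreated) level and 'condition'
    for the mutant (treated) level; both names are illustrative assumptions.
    """
    if not (reference > 0 and condition > 0):
        raise ValueError("expression levels must be positive")
    return math.log2(reference / condition)


# Hypothetical intensity pairs (wild type, mutant) for a single gene.
examples = [(200.0, 100.0), (100.0, 100.0), (50.0, 200.0)]
ratios = [log2_ratio(wt, mut) for wt, mut in examples]
print(ratios)
```

				<p>With this convention a value of 0 means unchanged expression, while positive and negative values indicate higher expression in the wild type and in the mutant, respectively.</p>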
				<p>For each of the three metabolic pathways we carried out a pairwise comparison of the expression patterns of the genes by calculating Pearson's correlation coefficient.</p>
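				<p>The pairwise comparison can be sketched as follows. This is an illustrative Python fragment under stated assumptions, not the procedure actually used here: the helper names are hypothetical, the gene names are used only as labels, and the expression values are invented for the example.</p>

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("profiles must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    covariance = sum(a * b for a, b in zip(dx, dy))
    norm = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if norm == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return covariance / norm


def pairwise_correlations(profiles):
    """Map each unordered gene pair (names sorted) to its Pearson coefficient."""
    return {
        (g1, g2): pearson(profiles[g1], profiles[g2])
        for g1, g2 in combinations(sorted(profiles), 2)
    }


# Invented log2 ratios across four experiments (not data from this work).
profiles = {
    "thrA": [1.0, 2.0, 3.0, 4.0],
    "thrB": [2.0, 4.0, 6.0, 8.0],
    "thrC": [4.0, 3.0, 2.0, 1.0],
}
table = pairwise_correlations(profiles)
for pair, r in table.items():
    print(pair, round(r, 3))
```

				<p>In this toy example the first two profiles rise together and are perfectly correlated, whereas the third runs in the opposite direction and is anticorrelated with both; real expression profiles would give intermediate values.</p>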
				<p>Data obtained are reported in Figure <figr fid="F8">8</figr>, whose analysis revealed:</p>
				<fig id="F8">
					<title>
						<p>Figure 8</p>
					</title>
					<caption>
						<p>Microarray data analysis</p>
					</caption>
					<text>
						<p><b>Microarray data analysis</b>. Comparison between the expression pattern of each <it>met</it>, <it>lys</it>, <it>thr </it>gene of <it>E. coli </it>(a, b, c) and <it>P. aeruginosa </it>(d, e, f).</p>
					</text>
					<graphic file="1471-2105-8-S1-S12-8"/>
				</fig>
				<p>1. A low co-regulation of the methionine biosynthetic genes (Figure <figr fid="F8">8a</figr>). Most of these genes are scarcely co-expressed and appear to be expressed independently of each other. The fact that <it>metL </it>and <it>metB </it>show a very high correlation coefficient compared with the other <it>met </it>genes is in agreement with their operonic organization.</p>
				<p>2. The three <it>E. coli thrABC </it>genes (Figure <figr fid="F8">8b</figr>) are highly co-expressed, with correlation coefficient &gt; 0.84. This is in agreement with their organization in a compact operon.</p>
				<p>3. The trend of the lysine pathway genes in the &#947;-proteobacterium <it>E. coli </it>(Figure <figr fid="F8">8c</figr>) is quite surprising; although the <it>lys </it>genes are scattered throughout the <it>E. coli </it>chromosome, they show a high degree of co-expression with correlation coefficient values often &gt; 0.8. It is not clear how these genes can be highly co-expressed in the absence of an operonic organization. However, it is known <abbrgrp><abbr bid="B16">16</abbr></abbrgrp> that lysine biosynthetic genes are regulated by the so-called LYS <it>element </it>(lysine-specific RNA element) located in their regulatory regions and able to repress or to allow their trascription in response to lysine concentration. The high coexpression pattern of lysine bosynthetic genes might be due to this mechanism.</p>
				<p>The same analysis was carried out on lysine, methionine and threonine biosynthetic genes of <it>Pseudomonas aeruginosa</it>, whose structure and organization pattern is the same of the &#945;-, &#946;-, and &#948;\&#949; subdivision of proteobacteria. Data obtained (reported in Figure <figr fid="F8">8</figr>) showed that, overall, there is a low degree of co-expression between genes belonging to the same pathway; this is particularly pronounced for methionine, where in some cases, the correlation coefficient assumes negative values (Figure <figr fid="F8">8e</figr>), and lysine genes, whereas the <it>thr </it>biosynthetic genes were more correlated between them. The low degree of co-expression of <it>P. aeruginosa </it>genes is in agreement with their scattering on the bacterial genome.</p>
			</sec>
		</sec>
		<sec>
			<st>
				<p>Conclusion</p>
			</st>
			<p>In this work we propose a likely model for the evolution of the genes involved in the common pathway (CP), based on the comparative analysis of data concerning the structure, phylogenetic distribution, organization, phylogeny and expression of <it>ask </it>and <it>hom </it>genes in proteobacteria. The analysis of the structure of the CP genes gave strong support to the hypothesis that at least two different molecular mechanisms played an important role in shaping the pathway, that is, paralogous gene duplication(s) and gene fusion <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B4">4</abbr></abbrgrp>. The analysis of <it>thr</it>, <it>met </it>and <it>lys </it>gene organization in different proteobacteria revealed that several gene arrays exist within this phylogenetic lineage, with genes completely scattered throughout the genome, partially scattered/clustered, or strictly compacted. Two scenarios could account for these different organizations, namely the presence of scattered or of clustered genes in the ancestor of proteobacteria; the data reported in this work support the former. According to the model proposed, the ancestor of proteobacteria possessed monofunctional <it>hom</it>, <it>ask</it>, and <it>asd </it>genes scattered throughout the genome. The extant multiple and fused copies of <it>ask </it>and <it>hom </it>genes are the outcome of a cascade of paralogous gene duplication and fusion events, which led to the appearance of bifunctional enzymes catalyzing the same metabolic steps, but "sensing" different regulatory signals.</p>
			<p>The evolutionary history of the CP genes lends further support to Jensen's hypothesis on the origin and evolution of metabolic pathways <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>, strengthening the idea that gene duplication, gene fusion and the recruitment of genes encoding enzymes with broad substrate specificity played a crucial role in the assembly of biosynthetic pathways and in the appearance of new and/or more sophisticated regulatory networks <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B9">9</abbr></abbrgrp>. Indeed, the biological significance of the presence of multiple copies of <it>ask </it>and <it>hom </it>genes might lie in the refinement of regulatory mechanisms, allowing each <it>ask </it>copy to be regulated by specific signals, such as the availability of the end-product of the pathway.</p>
			<p>The question of why the duplicated copies of <it>ask </it>fused to <it>hom </it>is rather intriguing. It is evident from their phylogenetic distribution that, once it occurred, the fusion was fixed; thus, it was presumably evolutionarily advantageous. Even though it cannot be excluded <it>a priori</it>, we do not favour the possibility that this fusion permits substrate tunnelling. It is possible that this gene fusion (and gene organization) resulted from both regulatory and metabolic constraints; for instance, it might permit the spatial colocalization of their products and thus a faster feedback inhibition of the first enzyme of the pathway, coded for by <it>ask</it>, by the product of <it>hom</it>.</p>
			<p>The existence of the <it>thrA </it>and <it>metL </it>gene fusions in the genome of &#947;-proteobacteria is not an isolated example; additional gene fusions occurred in these genomes, such as those involving some histidine biosynthetic genes. It is worth noting that most of the bifunctional proteins recognized to date are involved in metabolic pathways of the &#947;-subdivision of proteobacteria <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>. Even though there is no apparent reason to think that these organisms are more prone to gene fusions than any others, it is interesting that these gene fusions appear to have paralleled the increasing compactness of some operons <abbrgrp><abbr bid="B9">9</abbr></abbrgrp> or their construction, as in the case of the <it>thrABC </it>and <it>metLB </it>ones.</p>
			<p>The analysis of the organization of these genes revealed that all the <it>metL </it>and <it>thrA </it>genes are embedded within (compact) operons, whereas their monofunctional counterparts, as well as the second CP gene, <it>asd</it>, are located outside gene clusters. This is not so surprising if we accept the existence of unspecific enzymes that might serve different metabolic pathways. Indeed, it is plausible that a gene whose product catalyses a reaction yielding an intermediate shared by different metabolic pathways is constitutively expressed or controlled by multiple mechanisms, rather than by mechanisms specific for a single route.</p>
			<p>This is also broadly in agreement with the expression data retrieved from the available microarray experiments: in general, the greater the scattering of genes belonging to the same pathway, the lower the degree of correlation between them, the <it>E. coli lys </it>genes being the notable exception.</p>
			<p>If our model is correct, the building up of the <it>thrABC </it>and <it>metLB </it>operons represents a recent invention of evolution (dating back to the &#947;-proteobacterial ancestor) and is apparently coincident with the appearance of bifunctional <it>ask</it>-<it>hom </it>genes. The origin and evolution of operons is still under debate, and at least six different classes of models have been proposed to explain the existence of operons (see <abbrgrp><abbr bid="B9">9</abbr></abbrgrp> and references therein); although different forces might have driven the assembly of operons, in our opinion the major ones were those enabling the <it>fused </it>genes to be finely coregulated and the encoded proteins to be synthesized in the correct stoichiometric ratio.</p>
		</sec>
		<sec>
			<st>
				<p>Material and methods</p>
			</st>
			<sec>
				<st>
					<p>Sequence retrieval</p>
				</st>
				<p>Amino acid sequences were retrieved from the GenBank database. BLAST <abbrgrp><abbr bid="B13">13</abbr></abbrgrp> probing of the database was performed with the BLASTP option of this program using default parameters. Only those sequences retrieved with an E-value below the 0.05 threshold were taken into account.</p>
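				<p>The E-value cut-off can be illustrated with a short Python sketch. It assumes that hits are available in the 12-column tab-separated layout written by NCBI BLAST+ (subject identifier in column 2, E-value in column 11); this storage format, the function name and the two example hits are assumptions made for illustration and are not taken from the analysis described here.</p>

```python
def filter_hits(tabular_lines, max_evalue=0.05):
    """Keep (subject id, E-value) pairs whose E-value is below max_evalue."""
    kept = []
    for line in tabular_lines:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        subject_id = fields[1]       # column 2: subject sequence identifier
        evalue = float(fields[10])   # column 11: E-value
        if max_evalue > evalue:
            kept.append((subject_id, evalue))
    return kept


# Two hypothetical hits: only the first passes the 0.05 threshold.
example_hits = [
    "query1\tsubjA\t85.0\t400\t60\t0\t1\t400\t1\t400\t1e-120\t700",
    "query1\tsubjB\t25.0\t80\t60\t2\t10\t90\t5\t85\t0.8\t30",
]
kept_hits = filter_hits(example_hits)
```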
			</sec>
			<sec>
				<st>
					<p>Sequence alignment</p>
				</st>
				<p>The ClustalW <abbrgrp><abbr bid="B15">15</abbr></abbrgrp> program in the BioEdit package was used to perform pairwise and multiple amino acid sequence alignments.</p>
			</sec>
			<sec>
				<st>
					<p>Phylogenetic tree construction</p>
				</st>
				<p>Phylogenetic trees were obtained with Mega 3 <abbrgrp><abbr bid="B19">19</abbr></abbrgrp> software using the Neighbor-Joining (NJ) and the Minimum Evolution (ME) methods.</p>
			</sec>
		</sec>
		<sec>
			<st>
				<p>List of abbreviations</p>
			</st>
			<p>AKI, AKII, AKIII, Aspartokinase I, II, III; <it>askI </it>and <it>askII </it>can also be named as <it>thrA </it>and <it>metL</it>; ASHD, Aspartate semialdehyde dehydrogenase; DAPDC, meso-diaminopimelate decarboxylase; HD, homoserine dehydrogenase.</p>
		</sec>
		<sec>
			<st>
				<p>Authors' contributions</p>
			</st>
			<p>All authors contributed equally to the preparation of the final version of the manuscript; MF performed the analyses as part of his MS degree work under the supervision of Prof. RF.</p>
		</sec>
	</bdy>
	<bm>
		<ack>
			<sec>
				<st>
					<p>Acknowledgements</p>
				</st>
				<p>This article has been published as part of <it>BMC Bioinformatics </it>Volume 8, Supplement 1, 2007: Italian Society of Bioinformatics (BITS): Annual Meeting 2006. The full contents of the supplement are available online at <url>http://www.biomedcentral.com/1471-2105/8?issue=S1</url>.</p>
			</sec>
		</ack>
		<refgrp>
			<bibl id="B1">
				<title>
					<p>The common pathway to lysine, methionine and threonine</p>
				</title>
				<aug>
					<au>
						<snm>Cohen</snm>
						<fnm>GN</fnm>
					</au>
				</aug>
				<source>Amino Acids: Biosynthesis and Genetic Regulation</source>
				<editor>Hermann KM, Somerville RL.  London, Addison-Wesley</editor>
				<pubdate>1983</pubdate>
				<fpage>141</fpage>
				<lpage>147</lpage>
			</bibl>
			<bibl id="B2">
				<title>
					<p>Biosynthesis of threonine and lysine</p>
				</title>
				<aug>
					<au>
						<snm>Patte</snm>
						<fnm>JC</fnm>
					</au>
				</aug>
				<source>Escherichia coli and Salmonella typhimurium</source>
				<publisher>ASM Press, Washington, DC</publisher>
				<editor>Neidhardt FC</editor>
				<pubdate>1996</pubdate>
				<fpage>528</fpage>
				<lpage>541</lpage>
			</bibl>
			<bibl id="B3">
				<title>
					<p>Nucleotide sequence of <it>lysC </it>gene encoding the lysine-sensitive aspartokinase III of <it>Escherichia coli </it>K12. Evolutionary pathway leading to three isofunctional enzymes</p>
				</title>
				<aug>
					<au>
						<snm>Cassan</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Parsot</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Cohen</snm>
						<fnm>GN</fnm>
					</au>
					<au>
						<snm>Patte</snm>
						<fnm>JC</fnm>
					</au>
				</aug>
				<source>J Biol Chem</source>
				<pubdate>1986</pubdate>
				<volume>261</volume>
				<issue>3</issue>
				<fpage>1052</fpage>
				<lpage>1057</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">3003049</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B4">
				<title>
					<p>Gene duplication and gene loading</p>
				</title>
				<aug>
					<au>
						<snm>Fani</snm>
						<fnm>R</fnm>
					</au>
				</aug>
				<source>Microbial evolution: gene establishment, survival, and exchange</source>
				<editor>Miller RV, Day MJ</editor>
				<pubdate>2004</pubdate>
				<fpage>67</fpage>
				<lpage>81</lpage>
			</bibl>
			<bibl id="B5">
				<title>
					<p>Evolution of metabolic pathways in enteric bacteria</p>
				</title>
				<aug>
					<au>
						<snm>Jensen</snm>
						<fnm>R</fnm>
					</au>
				</aug>
				<source>Escherichia coli and Salmonella typhimurium</source>
				<publisher>ASM Press, Washington, DC</publisher>
				<editor>Neidhardt FC</editor>
				<pubdate>1996</pubdate>
				<fpage>2649</fpage>
				<lpage>2662</lpage>
			</bibl>
			<bibl id="B6">
				<title>
					<p>The origin and evolution of eucaryal <it>HIS7 </it>genes: from metabolon to bifunctional proteins?</p>
				</title>
				<aug>
					<au>
						<snm>Brilli</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Fani</snm>
						<fnm>R</fnm>
					</au>
				</aug>
				<source>Gene</source>
				<pubdate>2004</pubdate>
				<volume>339</volume>
				<fpage>149</fpage>
				<lpage>160</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/j.gene.2004.06.033</pubid>
						<pubid idtype="pmpid" link="fulltext">15363855</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B7">
				<title>
					<p>Molecular evolution of <it>hisB </it>genes</p>
				</title>
				<aug>
					<au>
						<snm>Brilli</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Fani</snm>
						<fnm>R</fnm>
					</au>
				</aug>
				<source>J Mol Evol</source>
				<pubdate>2004</pubdate>
				<volume>58</volume>
				<issue>2</issue>
				<fpage>225</fpage>
				<lpage>237</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1007/s00239-003-2547-x</pubid>
						<pubid idtype="pmpid" link="fulltext">15042344</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B8">
				<title>
					<p>Ancient origin of the tryptophan operon and the dynamics of evolutionary change</p>
				</title>
				<aug>
					<au>
						<snm>Xie</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Keyhani</snm>
						<fnm>NO</fnm>
					</au>
					<au>
						<snm>Bonner</snm>
						<fnm>CA</fnm>
					</au>
					<au>
						<snm>Jensen</snm>
						<fnm>RA</fnm>
					</au>
				</aug>
				<source>Microbiol Mol Biol Rev</source>
				<pubdate>2003</pubdate>
				<volume>67</volume>
				<issue>3</issue>
				<fpage>303</fpage>
				<lpage>342</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">193870</pubid>
						<pubid idtype="pmpid" link="fulltext">12966138</pubid>
						<pubid idtype="doi">10.1128/MMBR.67.3.303-342.2003</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B9">
				<title>
					<p>The origin and evolution of operons: the piecewise building of the proteobacterial histidine operon</p>
				</title>
				<aug>
					<au>
						<snm>Fani</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Brilli</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Li&#242;</snm>
						<fnm>P</fnm>
					</au>
				</aug>
				<source>J Mol Evol</source>
				<pubdate>2005</pubdate>
				<volume>60</volume>
				<issue>3</issue>
				<fpage>378</fpage>
				<lpage>390</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1007/s00239-004-0198-1</pubid>
						<pubid idtype="pmpid" link="fulltext">15871048</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B10">
				<title>
					<p>Evolution of gene fusions: horizontal gene transfer versus independent events</p>
				</title>
				<aug>
					<au>
						<snm>Yanai</snm>
						<fnm>I</fnm>
					</au>
					<au>
						<snm>Wolf</snm>
						<fnm>YI</fnm>
					</au>
					<au>
						<snm>Koonin</snm>
						<fnm>EV</fnm>
					</au>
				</aug>
				<source>Genome Biol</source>
				<pubdate>2002</pubdate>
				<volume>3</volume>
				<issue>5</issue>
				<fpage>research0024</fpage>
				<lpage/>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1186/gb-2002-3-5-research0024</pubid>
						<pubid idtype="pmpid" link="fulltext">12049665</pubid>
						<pubid idtype="pmcid">115226</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B11">
				<title>
					<p>The cell-bag of enzymes or network of channels?</p>
				</title>
				<aug>
					<au>
						<snm>Mathews</snm>
						<fnm>CK</fnm>
					</au>
				</aug>
				<source>J Bacteriol</source>
				<pubdate>1993</pubdate>
				<volume>175</volume>
				<issue>20</issue>
				<fpage>6377</fpage>
				<lpage>6381</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">206744</pubid>
						<pubid idtype="pmpid" link="fulltext">8407814</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B12">
				<title>
					<p>Complexes of sequential metabolic enzymes</p>
				</title>
				<aug>
					<au>
						<snm>Srere</snm>
						<fnm>PA</fnm>
					</au>
				</aug>
				<source>Ann Rev Biochem</source>
				<pubdate>1987</pubdate>
				<volume>56</volume>
				<fpage>89</fpage>
				<lpage>124</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1146/annurev.bi.56.070187.000513</pubid>
						<pubid idtype="pmpid" link="fulltext">2441660</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B13">
				<title>
					<p>Gapped BLAST and PSI-BLAST: a new generation of protein database search programs</p>
				</title>
				<aug>
					<au>
						<snm>Altschul</snm>
						<fnm>SF</fnm>
					</au>
					<au>
						<snm>Madden</snm>
						<fnm>TL</fnm>
					</au>
					<au>
						<snm>Sch&#228;ffer</snm>
						<fnm>AA</fnm>
					</au>
					<au>
						<snm>Zhang</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Zhang</snm>
						<fnm>Z</fnm>
					</au>
					<au>
						<snm>Miller</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Lipman</snm>
						<fnm>DJ</fnm>
					</au>
				</aug>
				<source>Nucl Acids Res</source>
				<pubdate>1997</pubdate>
				<volume>25</volume>
				<fpage>3389</fpage>
				<lpage>3402</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">146917</pubid>
						<pubid idtype="pmpid" link="fulltext">9254694</pubid>
						<pubid idtype="doi">10.1093/nar/25.17.3389</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B14">
				<title>
					<p>Enzyme recruitment in evolution of new function</p>
				</title>
				<aug>
					<au>
						<snm>Jensen</snm>
						<fnm>RA</fnm>
					</au>
				</aug>
				<source>Annu Rev Microbiol</source>
				<pubdate>1976</pubdate>
				<volume>30</volume>
				<fpage>409</fpage>
				<lpage>425</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1146/annurev.mi.30.100176.002205</pubid>
						<pubid idtype="pmpid" link="fulltext">791073</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B15">
				<title>
					<p>Clustal W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice</p>
				</title>
				<aug>
					<au>
						<snm>Thompson</snm>
						<fnm>JD</fnm>
					</au>
					<au>
						<snm>Higgins</snm>
						<fnm>DG</fnm>
					</au>
					<au>
						<snm>Gibson</snm>
						<fnm>TJ</fnm>
					</au>
				</aug>
				<source>Nucl Acids Res</source>
				<pubdate>1994</pubdate>
				<volume>22</volume>
				<fpage>4673</fpage>
				<lpage>4680</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">308517</pubid>
						<pubid idtype="pmpid" link="fulltext">7984417</pubid>
						<pubid idtype="doi">10.1093/nar/22.22.4673</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B16">
				<title>
					<p>Regulation of lysine biosynthesis and transport genes in bacteria: yet another RNA riboswitch?</p>
				</title>
				<aug>
					<au>
						<snm>Rodionov</snm>
						<fnm>DA</fnm>
					</au>
					<au>
						<snm>Vitreschak</snm>
						<fnm>AG</fnm>
					</au>
					<au>
						<snm>Mironov</snm>
						<fnm>AA</fnm>
					</au>
					<au>
						<snm>Gelfand</snm>
						<fnm>MS</fnm>
					</au>
				</aug>
				<source>Nucl Acids Res</source>
				<pubdate>2003</pubdate>
				<volume>31</volume>
				<issue>23</issue>
				<fpage>6748</fpage>
				<lpage>6757</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">290268</pubid>
						<pubid idtype="pmpid" link="fulltext">14627808</pubid>
						<pubid idtype="doi">10.1093/nar/gkg900</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B17">
				<title>
					<p>Molecular evolution of the histidine biosynthetic pathway</p>
				</title>
				<aug>
					<au>
						<snm>Fani</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Li&#242;</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Lazcano</snm>
						<fnm>A</fnm>
					</au>
				</aug>
				<source>J Mol Evol</source>
				<pubdate>1995</pubdate>
				<volume>41</volume>
				<issue>6</issue>
				<fpage>760</fpage>
				<lpage>774</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1007/BF00173156</pubid>
						<pubid idtype="pmpid">8587121</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B18">
				<title>
					<p>Evolution of aromatic amino acid biosynthesis and application to the fine-tuned phylogenetic positioning of enteric bacteria</p>
				</title>
				<aug>
					<au>
						<snm>Ahmad</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Weisburg</snm>
						<fnm>WG</fnm>
					</au>
					<au>
						<snm>Jensen</snm>
						<fnm>RA</fnm>
					</au>
				</aug>
				<source>J Bacteriol</source>
				<pubdate>1990</pubdate>
				<volume>172</volume>
				<issue>2</issue>
				<fpage>1051</fpage>
				<lpage>1061</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">208536</pubid>
						<pubid idtype="pmpid" link="fulltext">2298692</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B19">
				<title>
					<p>MEGA3: Integrated software for Molecular Evolutionary Genetics Analysis and sequence alignment</p>
				</title>
				<aug>
					<au>
						<snm>Kumar</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Tamura</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Nei</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>Brief Bioinform</source>
				<pubdate>2004</pubdate>
				<volume>5</volume>
				<issue>2</issue>
				<fpage>150</fpage>
				<lpage>163</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bib/5.2.150</pubid>
						<pubid idtype="pmpid" link="fulltext">15260895</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
		</refgrp>
	</bm>
</art>
