<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
	<ui>gb-2005-6-3-r25</ui>
	<ji>GBJ</ji>
	<fm>
		<dochead>Research</dochead>
		<bibl>
			<title>
				<p>Transcriptional slippage in bacteria: distribution in sequenced genomes and utilization in IS element gene expression</p>
			</title>
			<aug>
				<au id="A1">
					<snm>Baranov</snm>
					<mi>V</mi>
					<fnm>Pavel</fnm>
					<insr iid="I1"/>
					<insr iid="I2"/>
					<email>baranov@genetics.utah.edu</email>
				</au>
				<au id="A2">
					<snm>Hammer</snm>
					<mi>W</mi>
					<fnm>Andrew</fnm>
					<insr iid="I1"/>
					<email>ahammer@genetics.utah.edu</email>
				</au>
				<au id="A3">
					<snm>Zhou</snm>
					<fnm>Jiadong</fnm>
					<insr iid="I1"/>
					<insr iid="I3"/>
					<email>jiadong_zhou@gg.nitto.co.jp</email>
				</au>
				<au id="A4">
					<snm>Gesteland</snm>
					<mi>F</mi>
					<fnm>Raymond</fnm>
					<insr iid="I1"/>
					<email>ray.gesteland@genetics.utah.edu</email>
				</au>
				<au id="A5" ca="yes">
					<snm>Atkins</snm>
					<mi>F</mi>
					<fnm>John</fnm>
					<insr iid="I1"/>
					<insr iid="I2"/>
					<email>atkins@genetics.utah.edu</email>
				</au>
			</aug>
			<insg>
				<ins id="I1">
					<p>Department of Human Genetics, University of Utah, Salt Lake City, UT 84112-5330, USA</p>
				</ins>
				<ins id="I2">
					<p>Bioscience Institute, University College Cork, Cork, Ireland</p>
				</ins>
				<ins id="I3">
					<p>Current address: Gene Technology Division, Nitto Denko Technical Corporation, 401 Jones Road, Oceanside, CA 92054, USA</p>
				</ins>
			</insg>
			<source>Genome Biology</source>
			<issn>1465-6906</issn>
			<pubdate>2005</pubdate>
			<volume>6</volume>
			<issue>3</issue>
			<fpage>R25</fpage>
			<url>http://genomebiology.com/2005/6/3/R25</url>
			<xrefbib>
				<pubidlist><pubid idtype="pmpid">15774026</pubid><pubid idtype="doi">10.1186/gb-2005-6-3-r25</pubid>
				</pubidlist></xrefbib>
		</bibl>
		<history>
			<rec>
				<date>
					<day>27</day>
					<month>8</month>
					<year>2004</year>
				</date>
			</rec>
			<revrec>
				<date>
					<day>16</day>
					<month>12</month>
					<year>2004</year>
				</date>
			</revrec>
			<acc>
				<date>
					<day>25</day>
					<month>1</month>
					<year>2005</year>
				</date>
			</acc>
			<pub>
				<date>
					<day>15</day>
					<month>2</month>
					<year>2005</year>
				</date>
			</pub>
		</history>
		<cpyrt>
			<year>2005</year>
			<collab>Baranov et al.; licensee BioMed Central Ltd.</collab>
			<note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
		</cpyrt>
		<shorttitle>
			<p>Homopolymeric runs in bacterial genomes</p>
		</shorttitle>
		<shortabs>
			<p>To find the length of slippage-prone sequences at which selection against transcriptional slippage is evident, the distribution of repetitive runs of A and T of different lengths in 108 bacterial genomes was analyzed. IS element genes were found to exploit transcriptional slippage for regulation of gene expression.</p>
		</shortabs>
		<abs>
			<sec>
				<st>
					<p>Abstract</p>
				</st>
				<sec>
					<st>
						<p>Background</p>
					</st>
					<p>Transcription slippage occurs on certain patterns of repeat mononucleotides, resulting in synthesis of a heterogeneous population of mRNAs. Individual mRNA molecules within this population differ in the number of nucleotides they contain that are not specified by the template. When transcriptional slippage occurs in a coding sequence, translation of the resulting mRNAs yields more than one protein product. Except where the products of the resulting mRNAs have distinct functions, transcription slippage occurring in a coding region is expected to be disadvantageous. This probably leads to selection against most slippage-prone sequences in coding regions.</p>
				</sec>
				<sec>
					<st>
						<p>Results</p>
					</st>
					<p>To find a length at which such selection is evident, we analyzed the distribution of repetitive runs of A and T of different lengths in 108 bacterial genomes. This length varies significantly among different bacteria, but in a large proportion of available genomes corresponds to nine nucleotides. Comparative sequence analysis of these genomes was used to identify occurrences of 9A and 9T transcriptional slippage-prone sequences used for gene expression.</p>
				</sec>
				<sec>
					<st>
						<p>Conclusions</p>
					</st>
					<p>IS element genes are the largest group found to exploit this phenomenon. A number of genes with disrupted open reading frames (ORFs) have slippage-prone sequences at which transcriptional slippage would result in uninterrupted ORF restoration at the mRNA level. The ability of such genes to encode functional full-length protein products brings into question their annotation as pseudogenes and in these cases is pertinent to the significance of the term 'authentic frameshift' frequently assigned to such genes.</p>
				</sec>
			</sec>
		</abs>
	</fm>
	<meta>
		<classifications>
			<classification type="BMC" subtype="man_spc_id" id="30010016">Molecular biology</classification>
			<classification type="BMC" subtype="man_spc_id" id="30010014">Microbiology and parasitology</classification>
			<classification type="BMC" subtype="man_spc_id" id="30010009">Genetics</classification>
		</classifications>
	</meta>
	<bdy>
		<sec>
			<st>
				<p>Background</p>
			</st>
			<p>During transcription, RNA polymerase catalyzes incorporation of nucleotides into growing RNA chains on the basis of complementarity to the DNA template. While transcribing long poly(A) or poly(T) tracts, however, slippage or 'stuttering' (also known as pseudo-templated transcription) occurs, with the resulting incorporation of one or more extra nucleotides or occasional lack of a base or two corresponding to the run of repeat bases. Transcription slippage was first reported from <it>in vitro </it>studies <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>, and investigated <it>in vivo </it>later <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. Although sequences that are able to cause efficient transcriptional slippage occur infrequently in genomic DNA, they have been found and a functional role has been assigned to some of them. For example, transcription slippage is utilized for regulation of the <it>Escherichia coli pyrBI </it>and <it>codBA </it>operons and occurs shortly after transcription initiation when special conditions apply <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp>.</p>
			<p>When a transcription slippage-prone ('slippery') sequence occurs in a coding sequence, the mRNA products are heterogeneous. In such an mRNA population, the sequence downstream of a slippery pattern generally occurs in all three different reading phases relative to the reading frame 5' of the slippage-prone sequence. Translation of these mRNAs yields protein products that differ in their amino-acid sequence downstream of the slippery sequence. For genes encoding a single functional protein product, the presence of slippery sequences is expected to be detrimental, as it is likely to squander cellular resources to synthesize unwanted, or in some instances even deleterious, products. Aberrant forms of beta-amyloid precursor protein and ubiquitin B found in Alzheimer's and Down syndrome patients are associated with molecular misreading, whose mechanism is likely to be transcriptional slippage <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. Moreover, this type of molecular misreading was suggested to be relevant to the aging process <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr></abbrgrp>. Transcriptional slippage in the human <it>APC </it>gene (in addition to replicational slippage <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>) has also been proposed as a cause of colorectal cancer <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>.</p>
			<p>There are, however, at least two situations where transcriptional slippage inside a coding region can be advantageous. One is where a frameshift mutation occurs in the coding sequence and transcription slippage at a nearby site permits synthesis of a proportion of mRNAs in which a non-templated nucleotide(s) compensates for this mutation, thereby restoring the original framing. An example involving a single nucleotide deletion occurs in <it>apoB</it>, the human gene in which defects cause familial hypobetalipoproteinemia. In addition to encoding the expected truncated dysfunctional product, about a tenth of the product is full length as a result of its mRNA template having an extra A inserted in a run of eight As <abbrgrp><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr></abbrgrp>. A similar situation was recently reported for the canine <it>AP3B1 </it>gene <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>.</p>
			<p>A second situation in which transcription slippage has a positive outcome is when it leads to synthesis of more than one useful product from a single gene - during expression of the P gene in paramyxoviruses, for example. The best-studied example is in Sendai virus, where a specific number of untemplated Gs are inserted at the position corresponding to the slippery site (reviewed in <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>). Remarkably, this process depends on a hexanucleotide phasing of the slippery sequence relative to the end of genome and this is modulated by viral protein N <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. In addition to its involvement in paramyxovirus decoding, transcriptional slippage is used for the synthesis of additional functional proteins in other viruses, such as Ebola virus <abbrgrp><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr></abbrgrp>.</p>
			<p>Utilization of transcriptional slippage is not limited to viral genes. Highly efficient transcription slippage in the decoding of the cellular <it>dnaX </it>gene of <it>Thermus thermophilus </it>results in 50% of the product being shorter than the 'standard' product <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. This gene has a run of nine As in its sense strand seven-eighths of the way through its coding sequence. During transcription, RNA polymerase synthesizes mRNA that contains poly(A) runs of variable length. When the number of As is equal to the templated 9 or 9 + 3n, the full-length product, the DNA polymerase III tau subunit, is synthesized. When the number of As is anything else, for example 8, 10, 11, 13, the translating ribosomes encounter a 3' stop codon located close to the poly(A) run. They terminate, resulting in the synthesis of a shorter product (Figure <figr fid="F1">1</figr>), the gamma subunit of DNA polymerase III, which has distinctive functional properties <abbrgrp><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr></abbrgrp>. In some other bacteria such as <it>E. coli </it><abbrgrp><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr></abbrgrp> and its close relatives <abbrgrp><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr></abbrgrp>, <it>dnaX </it>also encodes both subunits, but the shorter one is synthesized via ribosomal frameshifting instead of transcriptional slippage. The same end result can be achieved by nonstandard events at different levels of readout <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>.</p>
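			<p>The frame rule described above for <it>T. thermophilus dnaX </it>can be expressed as a small calculation. The following Python sketch is illustrative only and is not part of the original analysis. The function names are hypothetical, and the sketch assumes that every out-of-frame transcript terminates at the nearby stop codon to give the shorter (gamma) product, as described in the text.</p>
			<p><![CDATA[
```python
TEMPLATED_RUN = 9  # run of nine As in the dnaX template, as stated in the text


def downstream_frame_shift(run_length, templated=TEMPLATED_RUN):
    """Return the frame offset (0, 1 or 2) of the sequence 3' of the A run.

    An offset of 0 means the downstream sequence is read in the original frame.
    """
    return (run_length - templated) % 3


def predicted_product(run_length):
    """Name the expected product for a transcript with run_length As.

    Assumption: in-frame transcripts give the full-length tau subunit and all
    out-of-frame transcripts give the truncated gamma subunit.
    """
    return "tau" if downstream_frame_shift(run_length) == 0 else "gamma"


if __name__ == "__main__":
    for n in (8, 9, 10, 11, 12, 13):
        print(n, predicted_product(n))
```
]]></p>
			<p>In this sketch, runs of 9 or 12 As preserve the reading frame, whereas runs of 8, 10, 11 or 13 As do not, matching the examples given in the text.</p>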
			<fig id="F1">
				<title>
					<p>Figure 1</p>
				</title>
				<caption>
					<p>A scheme for the nonlinear expression of <it>Thermus thermophilus dnaX </it>via transcriptional slippage</p>
				</caption>
				<text>
					<p>A scheme for the nonlinear expression of <it>Thermus thermophilus dnaX </it>via transcriptional slippage. Transcription of <it>dnaX </it>results in synthesis of a population of mRNAs in which the sequence 3' of the slippery AAAAAAAAA is framed in different molecules in all three reading frames relative to sequence 5' of the slippery motif.</p>
				</text>
				<graphic file="gb-2005-6-3-r25-1" hint_layout="double"/>
			</fig>
			<p>Another example of the use of transcriptional slippage was recently reported in the decoding of the <it>Shigella flexneri mxiE </it>gene which encodes a transcription activator <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. <it>mxiE </it>consists of two overlapping open reading frames (ORFs), <it>mxiEa </it>and <it>mxiEb</it>. Transcriptional insertion of an additional non-templated nucleotide at the run of Us results in a proportion of the mRNAs having <it>mxiEa </it>and <it>mxiEb </it>in the same reading frame <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. Therefore, in contrast to <it>T. thermophilus dnaX </it>transcriptional slippage, where the novel product is shorter than the product of standard decoding, <it>mxiE </it>transcriptional slippage is required for synthesis of the longer protein product.</p>
			<p>Transcription slippage-prone sequences are expected to be under-represented in coding regions <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>, because functional utilization of such sequences is unlikely to be common. The recent dramatic increase in the number of sequenced bacterial genomes provides an opportunity to perform wide-scale analysis of whole kingdoms of life <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>. The current study explores whether long runs of As or Ts are indeed avoided in the coding regions of 108 sequenced bacterial genomes, and where such runs do occur, whether they play a positive functional role in gene expression.</p>
		</sec>
		<sec>
			<st>
				<p>Results</p>
			</st>
			<sec>
				<st>
					<p>Distribution of homopolymeric A and T runs in bacterial genomes</p>
				</st>
				<p>If any sequence pattern is randomly distributed in a genomic sequence, the following equation should be satisfied:</p>
				<p>Pc/Pg <graphic file="gb-2005-6-3-r25-i1.gif"/> Nc/Ng</p>
				<p>where Pc is the number of pattern copies in coding regions, Pg is the number of copies in the whole genome, Nc the number of nucleotides in coding regions and Ng the size of the whole genome. We have analyzed the ratio Pc/Pg for 108 published eubacterial and archaeal genomes for homopolymeric A or T patterns of different lengths (see Additional data file 1). An example of such an analysis for a few representative genomes is illustrated in Figure <figr fid="F2">2a</figr>. For several genomes, a sharp reduction in Pc/Pg is evident on going from patterns containing <it>n </it>As or Ts to patterns containing <it>n </it>+ 1 As or Ts. The position of the transition is different among the genomes analyzed. A sharp transition is evident only for AT-rich bacterial genomes; in GC-rich bacterial genomes the existence of long A/T runs has a low probability (if random) <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>. Therefore, where long runs do occur in GC-rich genomes, they are more likely to reflect positive selection. In some AT-rich genomes, however, there is no transition in the Pc/Pg ratio at any length (for example, <it>Borrelia burgdorferi</it>). This suggests that such organisms may have developed a mechanism to suppress transcriptional slippage at long runs of As or Ts. Indeed, the frequency of 9 A/T or 10 A/T runs in such genomes is about one per gene.</p>
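				<p>As an illustration of the comparison between Pc/Pg and Nc/Ng, the following Python sketch counts homopolymeric runs in a toy sequence. It is not the code used in this study. The function names are hypothetical, and the sketch assumes that a pattern of length <it>n </it>means a maximal run of exactly <it>n </it>As or <it>n </it>Ts, and that coding regions are supplied as non-overlapping half-open intervals. Runs that cross a coding boundary are counted at their truncated length, which is a simplification.</p>
				<p><![CDATA[
```python
import re


def count_runs(seq, n):
    """Count maximal runs of exactly n As or exactly n Ts in seq."""
    return sum(1 for m in re.finditer(r"A+|T+", seq.upper()) if len(m.group()) == n)


def coding_ratio(genome, coding_intervals, n):
    """Return the pair (Pc/Pg, Nc/Ng) for runs of length n.

    coding_intervals is a list of (start, end) half-open, non-overlapping
    coordinates on the genome string (an assumption of this sketch).
    """
    pg = count_runs(genome, n)
    pc = sum(count_runs(genome[s:e], n) for s, e in coding_intervals)
    nc = sum(e - s for s, e in coding_intervals)
    pc_pg = pc / pg if pg else float("nan")
    return pc_pg, nc / len(genome)


if __name__ == "__main__":
    # Toy genome of 24 nucleotides with one 8-nucleotide "coding" interval.
    genome = "GGAAAAGG" + "CCTTTTCC" + "GGAAAAGG"
    print(coding_ratio(genome, [(0, 8)], 4))
```
]]></p>
				<p>For a randomly distributed pattern the two returned values should be close to each other; a Pc/Pg value well below Nc/Ng is the signature of avoidance in coding regions discussed above.</p>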
				<fig id="F2">
					<title>
						<p>Figure 2</p>
					</title>
					<caption>
						<p>Analysis of the distribution of runs of As and Ts in selected genomes</p>
					</caption>
					<text>
						<p>Analysis of the distribution of runs of As and Ts in selected genomes. Run length is indicated on the <it>x</it>-axis; the ratio of pattern occurrence on the <it>y</it>-axis. <b>(a) </b>Ratio of occurrences in coding regions and in entire genomic sequences. <b>(b) </b>Ratio of occurrences of A runs in real genomes and average occurrence in 1,000 randomized genomes. Biases preserved during the randomization procedure are indicated above each pair of graphs. Accession numbers are as follows: NC_000913 <it>E. coli </it>K12; NC_000915 <it>Helicobacter pylori</it>; NC_000963 <it>Rickettsia prowazekii</it>; NC_001318 <it>Borrelia burgdorferi</it>; NC_002163 <it>Campylobacter jejuni</it>; NC_003450 <it>Corynebacterium glutamicum</it>; NC_003364 <it>Pseudomonas aeruginosa</it>; NC_004344 <it>Wigglesworthia glossinidia</it>.</p>
					</text>
					<graphic file="gb-2005-6-3-r25-2" hint_layout="double"/>
				</fig>
				<p>Comparison of poly(A) and poly(T) occurrence in genomic sequences versus coding regions has two disadvantages. First, runs of As cannot be discriminated from runs of Ts at the level of genomic sequences. Second, such runs could have a positive or negative role(s) outside of coding regions. For example, long runs of Ts can serve as parts of transcriptional terminators, although poly(T) runs do not have to be uninterrupted for this purpose <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. In addition, the occurrence of A and T runs can be affected by dinucleotide bias, codon usage and amino-acid composition of the encoded proteins.</p>
				<p>To minimize the influence of these factors on our analysis, we used another approach to estimate the distribution of such patterns. A thousand random genomes were generated for every genome shown in Figure <figr fid="F2">2a</figr> using the following rules: protein sequences from the real genomes were preserved, but the codons encoding the amino acids were randomized, taking into account codon usage. Such random genomes are relieved of selective pressure to avoid slippery sequences. A similar approach was previously used for statistical analysis of frameshift-inducing patterns in <it>E. coli </it><abbrgrp><abbr bid="B31">31</abbr></abbrgrp> and secondary RNA structures in bacterial genomes <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. In addition, we used randomization approaches that preserved either dinucleotide bias alone or both dinucleotide bias and codon usage, using the DiShuffle and CodonDishuffle programs developed by Katz and Burge <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. Figure <figr fid="F2">2b</figr> shows the distribution of A/T runs in such random genomes compared to the real genomes. If there were no selective pressure on a particular pattern, its occurrence in random genomes would be similar to its occurrence in a corresponding real genome. If there were negative selection against a particular pattern, it would occur more frequently in random genomes than in real ones. This analysis confirmed our general conclusion that runs of As and Ts of a certain length are avoided in some prokaryotic genomes, but the length of the pattern that is likely to be harmful varies among different genomes. Notably, such patterns are significantly under-represented in AT-rich genomes.</p>
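				<p>The first randomization rule described above (preserve the protein sequence, randomize the codons according to codon usage) can be sketched as follows. This Python example is an illustration under stated assumptions and is not the procedure or software used in this study: it assumes the standard genetic code, estimates codon usage from the supplied coding sequences, and draws each synonymous codon independently with a probability proportional to its observed count. All function names are hypothetical. It does not reproduce the DiShuffle or CodonDishuffle programs.</p>
				<p><![CDATA[
```python
import random
from collections import Counter, defaultdict

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    a + b + c: AAS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codons_of(cds):
    """Split a coding sequence into complete codons (a trailing partial codon is dropped)."""
    cds = cds.upper()
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]


def translate(cds):
    """Translate a coding sequence; unknown codons become 'X'."""
    return "".join(CODON_TO_AA.get(c, "X") for c in codons_of(cds))


def codon_usage(cds_list):
    """Count how often each codon is used for each amino acid."""
    usage = defaultdict(Counter)
    for cds in cds_list:
        for codon in codons_of(cds):
            if codon in CODON_TO_AA:
                usage[CODON_TO_AA[codon]][codon] += 1
    return usage


def randomize_cds(cds, usage, rng):
    """Replace each codon by a synonymous codon drawn according to usage."""
    out = []
    for codon in codons_of(cds):
        counts = usage.get(CODON_TO_AA.get(codon))
        if not counts:
            out.append(codon)  # unknown codon or no usage data: keep as is
            continue
        choices = list(counts)
        weights = [counts[c] for c in choices]
        out.append(rng.choices(choices, weights=weights)[0])
    return "".join(out)


if __name__ == "__main__":
    genes = ["ATGAAAAAAAAATTTTAA", "ATGAAGAAGTTCTAA"]
    usage = codon_usage(genes)
    shuffled = randomize_cds(genes[0], usage, random.Random(0))
    print(genes[0], shuffled, translate(shuffled))
```
]]></p>
				<p>Repeating such a draw many times for every gene gives a set of randomized genomes in which the encoded proteins are unchanged, so that the frequency of A or T runs in the real genome can be compared with the average over the randomized set.</p>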
				<p>Interestingly, in the genome of <it>Wigglesworthia glossinidia</it>, A/T patterns of any length occur with the same frequency in coding and noncoding regions, suggesting that transcriptional slippage is not possible in this species at patterns of any length. However, when the occurrence of such patterns is compared with their occurrence in random genomes, a negative selection is evident for patterns of exceptional length. This suggests that very long patterns have a negative effect not associated with transcriptional slippage.</p>
			</sec>
			<sec>
				<st>
					<p>Functional roles of transcriptional slippage</p>
				</st>
				<p>The next step was to find occurrences of transcriptional slippage and to investigate, using comparative sequence analysis, whether they are likely to have any functional role. The scheme of this analysis is shown in Figure <figr fid="F3">3</figr>. We searched for occurrences of 9As and 9Ts in protein encoding genes. Only those genes were selected where transcriptional slippage would result in synthesis of a protein which is larger than the counterpart generated by standard decoding. When transcriptional slippage results in the synthesis of a truncated product, as in decoding <it>T. thermophilus dnaX</it>, it is difficult to predict functional importance on the basis of comparative sequence analysis, as there is no extensive 'new' coding sequence suitable for such an analysis. The next filter was the exclusion of genes from bacteria where transcriptional slippage is unlikely to occur on runs of 9As and Ts. Organisms with AT-rich genomes that do not demonstrate selection against 9A and 9T sequences within their coding regions may have evolved to suppress transcriptional slippage on 9A and 9T and are unlikely to exhibit it. To select bacteria in which transcriptional slippage on 9A and 9T is unlikely, we first determined the number of genes containing 9A and 9T. For those bacteria where this number was higher than the threshold number 20 (we assumed that it is unlikely that transcriptional slippage can be utilized by more than 20 genes in the same species) we searched for evidence of negative selection against these sequences. If such sequences were not under-represented, corresponding bacteria were considered as those where transcriptional slippage is unlikely to occur on 9A or 9T runs. Genes from such bacteria were excluded from further analysis.</p>
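				<p>The first selection step (keeping only genes in which slippage at a run of 9 As or 9 Ts would give a longer product than standard decoding) can be illustrated with the following Python sketch. It is a simplified, hypothetical check and not the pipeline used in this study. It assumes that the input sequence starts at the start codon and extends beyond the annotated stop codon, that only the insertion or loss of a single nucleotide in the run is considered, and that product length is measured as the number of sense codons before the first stop codon.</p>
				<p><![CDATA[
```python
import re

STOPS = {"TAA", "TAG", "TGA"}


def orf_length(seq):
    """Number of sense codons read from position 0 up to the first stop codon or the end."""
    n = 0
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOPS:
            break
        n += 1
    return n


def slippage_extends_product(seq, run_len=9):
    """Return True if adding or removing one base in a run of run_len As or Ts
    inside the ORF gives a longer product than standard decoding."""
    seq = seq.upper()
    standard = orf_length(seq)
    for m in re.finditer(r"A+|T+", seq):
        # Keep only maximal runs of exactly run_len that start inside the standard ORF.
        if len(m.group()) != run_len or m.start() >= 3 * standard:
            continue
        inserted = seq[:m.end()] + m.group()[0] + seq[m.end():]
        deleted = seq[:m.start()] + seq[m.start() + 1:]
        if max(orf_length(inserted), orf_length(deleted)) > standard:
            return True
    return False


if __name__ == "__main__":
    # Toy case 1: the standard frame stops right after the run; slippage opens a longer frame.
    fused = "ATGGCC" + "A" * 9 + "TGAGCTGCTGCTTAA"
    # Toy case 2: the standard frame is long; both shifted frames hit a stop codon early.
    control = "ATGGCC" + "A" * 9 + "CTAGCTGATGCTGCTGCTGCTTAA"
    print(slippage_extends_product(fused), slippage_extends_product(control))
```
]]></p>
				<p>Genes of the second kind, where slippage only truncates the product, are those that the text sets aside because no extended coding sequence is available for comparative analysis.</p>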
				<fig id="F3">
					<title>
						<p>Figure 3</p>
					</title>
					<caption>
						<p>Scheme for functional analysis of slippery patterns in coding sequences</p>
					</caption>
					<text>
						<p>Scheme for functional analysis of slippery patterns in coding sequences.</p>
					</text>
					<graphic file="gb-2005-6-3-r25-3" hint_layout="single"/>
				</fig>
				<p>The remaining pool of genes contained some identical genes. Some of these exist in multiple copies inside the same genome whereas others are identical because they derived from genomes of highly related species. Such identical genes were combined to reduce redundancy. In the list of these genes (<supplr sid="S2">2</supplr>) only one representative is given for each group of identical genes. The products of those genes that can be generated by transcriptional slippage were compared to each other using tBLASTn <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>, and to those derived from other sequences present in sequenced bacterial genomes. Genes that produced no significant sequence similarity were considered as ORFans <abbrgrp><abbr bid="B34">34</abbr><abbr bid="B35">35</abbr></abbrgrp>. Since ORFans are not suitable for comparative analysis, they were excluded from further analysis (shown in gray in <supplr sid="S2">2</supplr>). The number of gene groups for which homologs were found is 53.</p>
				<p>The likelihood of functional utilization of transcriptional slippage was estimated using comparative sequence analysis. According to the scheme utilized (Figure <figr fid="F4">4</figr>), we consider transcriptional slippage patterns likely to be functional if the organization of ORFs fused by transcriptional slippage is the same in at least two non-identical sequences sharing significant sequence similarity. We have not found evidence of functional utilization of transcriptional slippage for 40 cases (shown in blue in <supplr sid="S2">2</supplr>). Most probably, although transcriptional slippage is likely to occur during expression of these genes, it has no significant detrimental effect. This result is consistent with our previous finding that sequences that direct significant levels of frameshifting in the <it>E. coli </it>genome may occur without apparent function <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. Six cases were found where protein products expressed by transcriptional slippage have homologs encoded in a single ORF in genes from other species.</p>
				<p>One example is shown in Figure <figr fid="F5">5</figr>. Such genes are normally considered as pseudogenes, because their ORF is disrupted. However, transcriptional slippage should result in the synthesis of normal functional protein and consequently such genes should not be treated as inactive as a result of frameshift mutation. These genes are shown in green in <supplr sid="S2">2</supplr>. In seven cases (red in <supplr sid="S2">2</supplr>) homologs were found with both a conserved organization of the overlapping ORFs and a conserved pattern of 9As in the overlapping regions. Among them, six cases derive from IS elements whose total number of copies is 27. One group is composed of the <it>mapW </it>genes from <it>Staphylococcus aureus </it>strains; <it>mapW </it>is a functional candidate derived from a non-mobile element.</p>
				<fig id="F4">
					<title>
						<p>Figure 4</p>
					</title>
					<caption>
						<p>Different types of ORF organization for genes sharing sequence similarities</p>
					</caption>
					<text>
						<p>Different types of ORF organization for genes sharing sequence similarities.</p>
					</text>
					<graphic file="gb-2005-6-3-r25-4" hint_layout="single"/>
				</fig>
				<fig id="F5">
					<title>
						<p>Figure 5</p>
					</title>
					<caption>
						<p>Codon alignments of DNA and mRNA sequences of orthologous genes from two different strains of <it>E. coli</it></p>
					</caption>
					<text>
						<p>Codon alignments of DNA and mRNA sequences of orthologous genes from two different strains of <it>E. coli</it>. In the DNA, an A causing a frameshift mutation is underlined. In the mRNA, a tandem A inserted by transcriptional slippage which results in ORF restoration is underlined.</p>
					</text>
					<graphic file="gb-2005-6-3-r25-5" hint_layout="single"/>
				</fig>
				<p>Transcriptional slippage was recently found in the <it>S. flexneri </it>pathogenicity-encoding plasmid that carries the <it>mxiE </it>gene <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>; it is not included in the 108 sequences of complete genomes downloaded for the present study (even though the chromosomal sequence was included).</p>
			</sec>
		</sec>
		<sec>
			<st>
				<p>Discussion</p>
			</st>
			<p>We have obtained an initial view, on a multiple-genome scale, of the distribution and functional utilization of simple transcriptional slippage sites in bacterial genomes. The data obtained demonstrate that runs of As and Ts, which result in efficient transcriptional slippage, are significantly under-represented in coding regions of AT-rich genomes. One likely reason for this under-representation is the 'slippery' nature of such sites. In addition to transcriptional slippage, these sequences are likely to be hypermutable as a result of slippage during replication. This is also likely to contribute to negative selection against these sequences. It has previously been shown that in eukaryotes short repetitive sequences of specific length are usually under-represented in coding regions compared to noncoding regions <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>. The implication is that such sequences are susceptible to frameshift errors at the DNA level. We cannot distinguish whether the reason for negative selection against A or T runs is slippage at the replication level, at the transcription level, or at both. Our approach to finding genes where transcriptional slippage is functionally utilized can, however, discriminate it from replicational slippage in some instances. Since we deal with those cases where sequence extension after a slippery pattern in a shifted reading frame is conserved among several homologs, it is very likely that this extension is expressed. Theoretically, its expression can be achieved as a result of replicational and/or transcriptional slippage. In the first case, the result would be the existence of a population of bacteria with heterogeneous genomes where different members of such a population would have a different number of nucleotides within a repetitive run, as previously described for several occurrences in <it>Campylobacter jejuni </it><abbrgrp><abbr bid="B37">37</abbr></abbrgrp>. We have found several such examples for the group of genes that we classified as 'pseudo pseudogenes' (an example is in Figure <figr fid="F5">5</figr>).</p>
			<p>If a specific run of 9As or 9Ts occurs within a number of homologs and the length of such a run is conserved among all homologs, then it is very likely that this specific run is used for purposeful transcriptional slippage to generate a set of heterogeneous mRNAs. Subsequent translation of such mRNAs will result in the synthesis of more than one protein product from the same gene. An example is shown in Figure <figr fid="F6">6</figr> for IS elements from <it>D. radiodurans</it>. We have not found homologous IS elements that contain insertions or deletions in the run of As. Those shown in Figure <figr fid="F6">6</figr> are the only homologs found.</p>
			<fig id="F6">
				<title>
					<p>Figure 6</p>
				</title>
				<caption>
					<p>Alignment of a portion of <it>Deinococcus radiodurans </it>IS elements containing a run of nine or eight As</p>
				</caption>
				<text>
					<p>Alignment of a portion of <it>Deinococcus radiodurans </it>IS elements containing a run of nine or eight As. Universally conserved residues are in bold, runs of As are in red. The alignment was built using Clustal [54].</p>
				</text>
				<graphic file="gb-2005-6-3-r25-6" hint_layout="double"/>
			</fig>
			<p>In general, a conserved run of As or Ts in several homologs does not imply that replication slippage is impossible on such a run. For example, when insertion of an additional nucleotide is deleterious, there will be selection against sequences with the additional nucleotide. However, in this case such replicational slippage cannot be referred to as being functional.</p>
			<p>The comparative sequence analysis of genes with runs of nine As and Ts from genomes where such repeat bases are slippage-prone revealed <it>S. aureus mapW </it>as a candidate for functional utilization of transcriptional slippage. <it>mapW </it>belongs to a group of <it>map </it>genes encoding MHC class II (major histocompatibility complex class II)-like proteins. <it>mapW </it>consists of two ORFs and it was proposed earlier that they can be expressed together to produce a full-length 'fusion' protein <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>. Perhaps the ability of <it>S. aureus </it>to encode MHC-II-like proteins of variable length can facilitate survival in mammals of varied genetic backgrounds <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>. However, the presence of <it>mapW </it>genes with an uninterrupted ORF in some <it>S. aureus </it>strains suggests that replicational slippage can also be utilized in this case.</p>
			<p>The largest group of functionally utilized transcriptional slippage sites belongs to mobile IS elements. We have found patterns of 9 As in 27 IS elements from the following organisms: <it>Deinococcus radiodurans</it>, <it>Mesorhizobium loti</it>, <it>Nostoc </it>sp. PCC 7120, <it>Streptococcus pyogenes </it>and <it>Sulfolobus solfataricus</it>. Interestingly, some homologous IS elements from <it>D. radiodurans </it>and <it>Nostoc </it>sp. PCC 7120 have 8 As instead of 9 As in the same location. This suggests that in these organisms, transcriptional slippage is productive even on eight As. Figure <figr fid="F6">6</figr> illustrates codon alignment of homologous IS elements from <it>D. radiodurans</it>. It is clear that the stretch of As is evolutionarily preserved among these IS elements (although its length varies, there are no deletions or insertions) and their ORF organization suggests that runs of As are used to produce ORF fusions. (A high-resolution FTICR mass spectrometric analysis of numerous tryptic peptides from <it>D. radiodurans </it>has been performed by Smith and colleagues <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>. A preliminary analysis of these data is revealing products of IS element mRNAs synthesized via transcriptional slippage (R. Smith, P.V.B., A.W.H., J.Z., R.F.G. and J.F.A., unpublished results).) Alignment of IS elements from <it>Nostoc </it>is not shown, as all its elements are identical except for the length of the poly(A) run, which varies from 8 to 10 As. Many IS elements encode their transposase in two overlapping ORFs, <it>orfA </it>and <it>orfB</it>. Synthesis of a fused ORFA-ORFB product is required for transposition. The most common known mechanism for synthesis of the ORFA-ORFB fusion is -1 ribosomal frameshifting (see <abbrgrp><abbr bid="B41">41</abbr><abbr bid="B42">42</abbr><abbr bid="B43">43</abbr><abbr bid="B44">44</abbr></abbrgrp> for reviews). Transcriptional slippage has, however, been proposed previously as an alternative mechanism for one IS element <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. The present study has identified a number of IS elements utilizing transcriptional slippage for synthesis of their ORFA-ORFB fusion. Therefore, transcriptional slippage can be considered a common mechanism for IS element expression.</p>
			<p>In addition, we have found a set of 'pseudo pseudogenes', in which what would normally be considered a frameshift mutation extends a non-slippery run of 8 As to the slippage-prone sequence of 9 As. As a result, such a frameshift mutation does not fully inactivate a gene that would normally be annotated as a pseudogene, because a normal functional product is still produced. The advantage of the unusual decoding of these genes by transcriptional slippage, compared to standard decoding of their wild-type counterparts, is uncertain. It is clear that such cases were generated by single mutations, and they may or may not be present in different isolates of the same species. Transcriptional slippage can, however, be considered functionally utilized here, since if such genes are transcribed, a proportion of the mRNA synthesized should contain the intact coding information. This important consideration needs to be taken into account in genome annotation.</p>
			<p>Although organism-specific utilization of transcriptional slippage cannot be ruled out, we have identified a large number of genes for which comparative analysis reveals no apparent functional role for transcriptional slippage. This result parallels our previous analysis of frameshift-inducing sequences in the <it>E. coli </it>K12 genome <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. That analysis showed that a significant level of frameshifting errors occurs in many <it>E. coli </it>genes containing A_AAA_AAG sequences (codons are separated by underscores), but no such sequences were found in highly expressed genes <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. Similar considerations can be applied here to transcriptional slippage. When erroneous nonstandard decoding occurs in genes that are not highly expressed, the cellular load is modest because the level of aberrant product is low compared to the total protein mass. Such situations may be easily tolerated.</p>
			<p>Transcriptional slippage motifs were found in many ORFans, but their functional role, if any, could not be assessed in the present study. We found runs of 9 As or 9 Ts in 48 ORFans. The origin(s) of ORFans remain unclear. While some of them are likely to be 'coincidental ORFs' or 'junk ORFans' that do not produce proteins under any conditions <abbrgrp><abbr bid="B45">45</abbr><abbr bid="B46">46</abbr><abbr bid="B47">47</abbr></abbrgrp>, many ORFans are likely to be real genes <abbrgrp><abbr bid="B36">36</abbr><abbr bid="B48">48</abbr><abbr bid="B49">49</abbr></abbrgrp>.</p>
			<p>The analysis of transcriptional slippage in this study was limited to that occurring on runs of 9 As and 9 Ts. It is clear, however, that the efficiency of transcriptional slippage on runs of As and Ts is highly organism-dependent, and there are a number of bacteria in which transcriptional slippage may occur on shorter runs. In addition, transcriptional slippage can occur on other nucleotide repeats. The simplest mechanism that can be proposed for transcriptional slippage is dissociation of the growing RNA chain from its DNA template while inside an open RNA polymerase complex, followed by re-association with the DNA template at a new location (Figure <figr fid="F7">7</figr>). On this basis, other low-complexity repeat patterns are likely to result in transcriptional slippage. For example, (AT)<it>n </it>may result in insertion of additional non-templated ATs. Transcriptional slippage sites can also be formed by a combination of two relatively short homopolymeric patterns, as in paramyxoviruses <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>.</p>
			<p>Simple sequence repeats (SSRs), also known as microsatellites, occur frequently in virulence genes of different pathogenic bacteria <abbrgrp><abbr bid="B37">37</abbr><abbr bid="B50">50</abbr><abbr bid="B51">51</abbr></abbrgrp>. Because of replicational slippage, they are responsible for hypermutability and phase variation in pathogenic bacteria <abbrgrp><abbr bid="B52">52</abbr></abbrgrp>. The effect of such sequences on transcription and translation has not yet been extensively studied. Such sequences could also result in nonstandard decoding (transcriptional slippage or ribosomal frameshifting) and consequently in the expression of more than one protein product. Expression of multiple products encoded by virulence genes may be beneficial for pathogens as a strategy for evading the host immune response. Statistical, experimental and functional analysis of such sequences in relation to transcription and translation will hopefully be the subject of further investigation.</p>
			<fig id="F7">
				<title>
					<p>Figure 7</p>
				</title>
				<caption>
					<p>A model of transcriptional slippage</p>
				</caption>
				<text>
					<p>A model of transcriptional slippage.</p>
				</text>
				<graphic file="gb-2005-6-3-r25-7" hint_layout="single"/>
			</fig>
		</sec>
		<sec>
			<st>
				<p>Materials and methods</p>
			</st>
			<sec>
				<st>
					<p>Analysis of A and T repeat distribution in bacterial genomes</p>
				</st>
				<p>FASTA files containing nucleotide sequences of entire bacterial genomes and nucleotide sequences of coding regions were downloaded from the National Center for Biotechnology Information ftp site <abbrgrp><abbr bid="B53">53</abbr></abbrgrp> on 25 March 2003. Occurrences of A and T runs of different lengths were counted for each genome in the file containing genomic sequences (accession_number.fna) and in the file containing nucleotide sequences of coding ORFs (accession_number.ffn). The ratios of occurrences of A and T runs between .fna files and .ffn files were calculated for every accession number, and the data are summarized in Additional data file <supplr sid="S1">1</supplr>.</p>
				<p>Random genomes were generated for representative genomes as described in <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. In addition, we applied the DiShuffle and DiCodonShuffle programs provided by C. Burge <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. The ratios between occurrences of A and T runs in real genomes and the mean values for A and T runs in random genomes were then calculated.</p>
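				<p>As an illustration only, the run counting described in this section can be sketched in Python as follows. This is not the code used in the study. The function names (read_fasta, count_runs, run_ratios) are hypothetical, and the sketch assumes that only maximal runs are counted (a stretch of ten As counts once as a run of length 10, not as two runs of length 9) and that the ratio is taken as genome count divided by coding-sequence count.</p>

```python
import re
from collections import Counter

# Matches a maximal homopolymeric stretch of A or of T.
RUN = re.compile(r"A+|T+")


def read_fasta(text):
    """Yield each sequence in FASTA-formatted text as one uppercase string."""
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            # A header line closes the previous record, if any.
            if chunks:
                yield "".join(chunks).upper()
                chunks = []
        elif line:
            chunks.append(line)
    if chunks:
        yield "".join(chunks).upper()


def count_runs(sequences, min_len=2):
    """Count maximal A and T runs, keyed by (base, run length)."""
    counts = Counter()
    for seq in sequences:
        for match in RUN.finditer(seq):
            run = match.group()
            if len(run) >= min_len:
                counts[(run[0], len(run))] += 1
    return counts


def run_ratios(genome_counts, coding_counts):
    """Ratio of whole-genome to coding-region counts for each run type.

    Run types absent from the coding regions are skipped to avoid
    division by zero.
    """
    return {
        key: genome_counts[key] / coding_counts[key]
        for key in genome_counts
        if coding_counts[key]
    }
```

				<p>In practice the text passed to read_fasta would be the contents of one .fna file and of the matching .ffn file, and run_ratios would be applied to the two resulting count tables.</p>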
			</sec>
			<sec>
				<st>
					<p>Generation of novel protein sequences corresponding to those produced via transcriptional slippage</p>
				</st>
				<p>Runs of 9 As or 9 Ts were sought within coding regions of the genomic sequences of completed bacterial genomes. To generate novel <it>in silico </it>proteins that could be produced by transcriptional slippage, either one or two additional As or Ts were introduced into each run of 9 As or 9 Ts. The length of the resulting ORF in each modified sequence was compared to that of the ORF in the original sequence. Sequences containing ORFs longer than the original were selected for further analysis.</p>
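				<p>The procedure above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used in the study. The function names (orf_length, slippage_variants, extended_orfs) are hypothetical. The input is assumed to be a sense-strand DNA sequence beginning at the first base of the annotated start codon and extending past the annotated stop codon. The standard stop codons TAA, TAG and TGA are assumed, and any run of at least nine identical bases is treated as a candidate site.</p>

```python
import re

# Assumption: standard bacterial stop codons.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_length(seq):
    """Codons read in frame from position 0 before the first stop codon.

    If no in-frame stop codon is present, the number of complete codons
    in the sequence is returned.
    """
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i // 3
    return len(seq) // 3


def slippage_variants(seq, base="A", run_len=9, max_insert=2):
    """Yield (run start, bases inserted, variant sequence).

    For every maximal run of at least run_len copies of base, one variant
    is produced per insertion size from 1 to max_insert, mimicking the
    extra non-templated bases added by transcriptional slippage.
    """
    for match in re.finditer(base + "+", seq):
        if match.end() - match.start() >= run_len:
            for n in range(1, max_insert + 1):
                variant = seq[:match.end()] + base * n + seq[match.end():]
                yield match.start(), n, variant


def extended_orfs(seq, base="A", run_len=9):
    """Return (run start, bases inserted, new ORF length) for each variant
    whose ORF is longer than the ORF of the unmodified sequence."""
    original = orf_length(seq)
    hits = []
    for start, n, variant in slippage_variants(seq, base, run_len):
        length = orf_length(variant)
        if length > original:
            hits.append((start, n, length))
    return hits
```

				<p>Calling extended_orfs with base set to "T" covers runs of Ts in the sense strand. Translating the selected variants into protein sequences would require a codon table, which is omitted here.</p>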
			</sec>
		</sec>
		<sec>
			<st>
				<p>Additional data files</p>
			</st>
			<p>Additional data are available with the online version of this paper. Additional data file <supplr sid="S1">1</supplr> contains the numbers of occurrences of A and T runs in bacterial genomes. Additional data file <supplr sid="S2">2</supplr> contains information about genes in which 9A or 9T patterns were found.</p>
			<suppl id="S1">
				<title>
					<p>Additional File 1</p>
				</title>
				<caption>
					<p>Numbers of occurrences of A and T runs in bacterial genomes</p>
				</caption>
				<text>
					<p>Numbers of occurrences of A and T runs in bacterial genomes. Column A is used for the names of the analyzed files and row 1 indicates the length of the A/T run. Sheet 'whole genomes' corresponds to occurrences in entire genomes, sheet 'coding sequences' corresponds to occurrences in coding regions and sheet 'ATRatio' corresponds to the ratio between these numbers.</p>
				</text>
				<file name="gb-2005-6-3-r25-S1.xls">
					<p>Click here for file</p>
				</file>
			</suppl>
			<suppl id="S2">
				<title>
					<p>Additional File 2</p>
				</title>
				<caption>
					<p>Information about genes where 9A or 9T patterns were found</p>
				</caption>
				<text>
					<p>Information about genes where 9A or 9T patterns were found. These genes correspond to the pool of genes selected for comparative analysis. The table contains representatives from 98 gene groups selected in step 4 of the scheme in Figure <figr fid="F3">3</figr>. Column A is used for accession numbers. Column B is for the coordinates of the corresponding gene. Column C indicates whether the run in the sense strand is of As or Ts. Column D shows the functional status of the gene, which is annotated by text and by color. Red is used for genes with a potential positive role of transcriptional slippage, blue for those where there is no positive functional role, green for genes where transcriptional slippage might restore a disrupted ORF and grey for ORFans, where functional status cannot be assessed. Column E contains nucleotide sequences of mRNAs produced <it>via </it>transcriptional slippage with ORFs longer than those in the original DNA templates. Column F contains the corresponding protein sequences.</p>
				</text>
				<file name="gb-2005-6-3-r25-S2.xls">
					<p>Click here for file</p>
				</file>
			</suppl>
		</sec>
	</bdy>
	<bm>
		<ack>
			<sec>
				<st>
					<p>Acknowledgements</p>
				</st>
				<p>We are grateful to Chris Burge for providing us with the source code for the DiShuffle and DiCodonShuffle programs. We thank Norma Wills for her key role in the background work on which this study is based. The salary of J.F.A. was supported by NIH grant GM48152 and an award from Science Foundation Ireland. The salary of P.V.B. was supported by DOE grant DE-FG03-01ER63132 to R.F.G.</p>
			</sec>
		</ack>
		<refgrp>
			<bibl id="B1">
				<title>
					<p>Deoxyribonucleic acid-directed synthesis of ribonucleic acid by an enzyme from <it>Escherichia coli</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Chamberlin</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Berg</snm>
						<fnm>P</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>1962</pubdate>
				<volume>48</volume>
				<fpage>81</fpage>
				<lpage>94</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">285509</pubid>
						<pubid idtype="pmpid">13877961</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B2">
				<title>
					<p>Transcriptional slippage occurs during elongation at runs of adenine or thymine in <it>Escherichia coli</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Wagner</snm>
						<fnm>LA</fnm>
					</au>
					<au>
						<snm>Weiss</snm>
						<fnm>RB</fnm>
					</au>
					<au>
						<snm>Driscoll</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Dunn</snm>
						<fnm>DS</fnm>
					</au>
					<au>
						<snm>Gesteland</snm>
						<fnm>RF</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>1990</pubdate>
				<volume>18</volume>
				<fpage>3529</fpage>
				<lpage>3535</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">331007</pubid>
						<pubid idtype="pmpid">2194164</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B3">
				<title>
					<p>Regulation of <it>pyrBI </it>operon expression in <it>Escherichia coli </it>by UTP-sensitive reiterative RNA synthesis during transcriptional initiation.</p>
				</title>
				<aug>
					<au>
						<snm>Liu</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Heath</snm>
						<fnm>LS</fnm>
					</au>
					<au>
						<snm>Turnbough</snm>
						<fnm>CL</fnm>
						<suf>Jr</suf>
					</au>
				</aug>
				<source>Genes Dev</source>
				<pubdate>1994</pubdate>
				<volume>8</volume>
				<fpage>2904</fpage>
				<lpage>2912</lpage>
				<xrefbib>
					<pubid idtype="pmpid">7527789</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B4">
				<title>
					<p>Regulation of <it>codBA </it>operon expression in <it>Escherichia coli </it>by UTP-dependent reiterative transcription and UTP-sensitive transcriptional start site switching.</p>
				</title>
				<aug>
					<au>
						<snm>Qi</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Turnbough</snm>
						<fnm>CL</fnm>
						<suf>Jr</suf>
					</au>
				</aug>
				<source>J Mol Biol</source>
				<pubdate>1995</pubdate>
				<volume>254</volume>
				<fpage>552</fpage>
				<lpage>565</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1006/jmbi.1995.0638</pubid>
						<pubid idtype="pmpid" link="fulltext">7500333</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B5">
				<title>
					<p>Frameshift mutants of beta amyloid precursor protein and ubiquitin-B in Alzheimer's and Down patients.</p>
				</title>
				<aug>
					<au>
						<snm>van Leeuwen</snm>
						<fnm>FW</fnm>
					</au>
					<au>
						<snm>de Kleijn</snm>
						<fnm>DP</fnm>
					</au>
					<au>
						<snm>van den Hurk</snm>
						<fnm>HH</fnm>
					</au>
					<au>
						<snm>Neubauer</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Sonnemans</snm>
						<fnm>MA</fnm>
					</au>
					<au>
						<snm>Sluijs</snm>
						<fnm>JA</fnm>
					</au>
					<au>
						<snm>Koycu</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Ramdjielal</snm>
						<fnm>RD</fnm>
					</au>
					<au>
						<snm>Salehi</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Martens</snm>
						<fnm>GJ</fnm>
					</au>
					<etal/>
				</aug>
				<source>Science</source>
				<pubdate>1998</pubdate>
				<volume>279</volume>
				<fpage>242</fpage>
				<lpage>247</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.279.5348.242</pubid>
						<pubid idtype="pmpid" link="fulltext">9422699</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B6">
				<title>
					<p>Transcriptional infidelity in aging cells and its relevance for the Orgel hypothesis.</p>
				</title>
				<aug>
					<au>
						<snm>Martin</snm>
						<fnm>GM</fnm>
					</au>
					<au>
						<snm>Bressler</snm>
						<fnm>SL</fnm>
					</au>
				</aug>
				<source>Neurobiol Aging</source>
				<pubdate>2000</pubdate>
				<volume>21</volume>
				<fpage>897</fpage>
				<lpage>900</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0197-4580(00)00193-7</pubid>
						<pubid idtype="pmpid" link="fulltext">11124438</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B7">
				<title>
					<p>Molecular misreading: a new type of transcript mutation expressed during aging.</p>
				</title>
				<aug>
					<au>
						<snm>van Leeuwen</snm>
						<fnm>FW</fnm>
					</au>
					<au>
						<snm>Fischer</snm>
						<fnm>DF</fnm>
					</au>
					<au>
						<snm>Kamel</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Sluijs</snm>
						<fnm>JA</fnm>
					</au>
					<au>
						<snm>Sonnemans</snm>
						<fnm>MA</fnm>
					</au>
					<au>
						<snm>Benne</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Swaab</snm>
						<fnm>DF</fnm>
					</au>
					<au>
						<snm>Salehi</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Hol</snm>
						<fnm>EM</fnm>
					</au>
				</aug>
				<source>Neurobiol Aging</source>
				<pubdate>2000</pubdate>
				<volume>21</volume>
				<fpage>879</fpage>
				<lpage>891</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0197-4580(00)00151-2</pubid>
						<pubid idtype="pmpid" link="fulltext">11124436</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B8">
				<title>
					<p>Neuropeptide research discloses part of the secrets of Alzheimer's disease neuropathogenesis: state of the art 2004.</p>
				</title>
				<aug>
					<au>
						<snm>van Leeuwen</snm>
						<fnm>FW</fnm>
					</au>
				</aug>
				<source>Neurosci Lett</source>
				<pubdate>2004</pubdate>
				<volume>361</volume>
				<fpage>124</fpage>
				<lpage>127</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/j.neulet.2003.12.050</pubid>
						<pubid idtype="pmpid" link="fulltext">15135909</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B9">
				<title>
					<p>Familial colorectal cancer in Ashkenazim due to a hypermutable tract in APC.</p>
				</title>
				<aug>
					<au>
						<snm>Laken</snm>
						<fnm>SJ</fnm>
					</au>
					<au>
						<snm>Petersen</snm>
						<fnm>GM</fnm>
					</au>
					<au>
						<snm>Gruber</snm>
						<fnm>SB</fnm>
					</au>
					<au>
						<snm>Oddoux</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Ostrer</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Giardiello</snm>
						<fnm>FM</fnm>
					</au>
					<au>
						<snm>Hamilton</snm>
						<fnm>SR</fnm>
					</au>
					<au>
						<snm>Hampel</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Markowitz</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Klimstra</snm>
						<fnm>D</fnm>
					</au>
					<etal/>
				</aug>
				<source>Nat Genet</source>
				<pubdate>1997</pubdate>
				<volume>17</volume>
				<fpage>79</fpage>
				<lpage>83</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/ng0997-79</pubid>
						<pubid idtype="pmpid" link="fulltext">9288102</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B10">
				<title>
					<p>Long runs of adenines and human mutations.</p>
				</title>
				<aug>
					<au>
						<snm>Raabe</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Linton</snm>
						<fnm>MF</fnm>
					</au>
					<au>
						<snm>Young</snm>
						<fnm>SG</fnm>
					</au>
				</aug>
				<source>Am J Med Genet</source>
				<pubdate>1998</pubdate>
				<volume>76</volume>
				<fpage>101</fpage>
				<lpage>102</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1002/(SICI)1096-8628(19980226)76:1&lt;101::AID-AJMG19&gt;3.0.CO;2-P</pubid>
						<pubid idtype="pmpid" link="fulltext">9508075</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B11">
				<title>
					<p>Reading-frame restoration with an apolipoprotein B gene frameshift mutation.</p>
				</title>
				<aug>
					<au>
						<snm>Linton</snm>
						<fnm>MF</fnm>
					</au>
					<au>
						<snm>Pierotti</snm>
						<fnm>V</fnm>
					</au>
					<au>
						<snm>Young</snm>
						<fnm>SG</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>1992</pubdate>
				<volume>89</volume>
				<fpage>11431</fpage>
				<lpage>11435</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">50565</pubid>
						<pubid idtype="pmpid" link="fulltext">1454832</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B12">
				<title>
					<p>Reading-frame restoration by transcriptional slippage at long stretches of adenine residues in mammalian cells.</p>
				</title>
				<aug>
					<au>
						<snm>Linton</snm>
						<fnm>MF</fnm>
					</au>
					<au>
						<snm>Raabe</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Pierotti</snm>
						<fnm>V</fnm>
					</au>
					<au>
						<snm>Young</snm>
						<fnm>SG</fnm>
					</au>
				</aug>
				<source>J Biol Chem</source>
				<pubdate>1997</pubdate>
				<volume>272</volume>
				<fpage>14127</fpage>
				<lpage>14132</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1074/jbc.272.22.14127</pubid>
						<pubid idtype="pmpid" link="fulltext">9162040</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B13">
				<title>
					<p>Paradoxical homozygous expression from heterozygotes and heterozygous expression from homozygotes as a consequence of transcriptional infidelity through a polyadenine tract in the AP3B1 gene responsible for canine cyclic neutropenia.</p>
				</title>
				<aug>
					<au>
						<snm>Benson</snm>
						<fnm>KF</fnm>
					</au>
					<au>
						<snm>Person</snm>
						<fnm>RE</fnm>
					</au>
					<au>
						<snm>Li</snm>
						<fnm>FQ</fnm>
					</au>
					<au>
						<snm>Williams</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Horwitz</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2004</pubdate>
				<volume>32</volume>
				<fpage>6327</fpage>
				<lpage>6333</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">535682</pubid>
						<pubid idtype="pmpid" link="fulltext">15576359</pubid>
						<pubid idtype="doi">10.1093/nar/gkh974</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B14">
				<title>
					<p>The versatility of paramyxovirus RNA polymerase stuttering.</p>
				</title>
				<aug>
					<au>
						<snm>Hausmann</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Garcin</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Delenda</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Kolakofsky</snm>
						<fnm>D</fnm>
					</au>
				</aug>
				<source>J Virol</source>
				<pubdate>1999</pubdate>
				<volume>73</volume>
				<fpage>5568</fpage>
				<lpage>5576</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">112614</pubid>
						<pubid idtype="pmpid" link="fulltext">10364305</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B15">
				<title>
					<p>Chemical modification of nucleotide bases and mRNA editing depend on hexamer or nucleoprotein phase in Sendai virus nucleocapsids.</p>
				</title>
				<aug>
					<au>
						<snm>Iseni</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Baudin</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Garcin</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Marq</snm>
						<fnm>JB</fnm>
					</au>
					<au>
						<snm>Ruigrok</snm>
						<fnm>RW</fnm>
					</au>
					<au>
						<snm>Kolakofsky</snm>
						<fnm>D</fnm>
					</au>
				</aug>
				<source>RNA</source>
				<pubdate>2002</pubdate>
				<volume>8</volume>
				<fpage>1056</fpage>
				<lpage>1067</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1017/S1355838202029977</pubid>
						<pubid idtype="pmpid" link="fulltext">12212849</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B16">
				<title>
					<p>GP mRNA of Ebola virus is edited by the Ebola virus polymerase and by T7 and vaccinia virus polymerases.</p>
				</title>
				<aug>
					<au>
						<snm>Volchkov</snm>
						<fnm>VE</fnm>
					</au>
					<au>
						<snm>Becker</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Volchkova</snm>
						<fnm>VA</fnm>
					</au>
					<au>
						<snm>Ternovoj</snm>
						<fnm>VA</fnm>
					</au>
					<au>
						<snm>Kotov</snm>
						<fnm>AN</fnm>
					</au>
					<au>
						<snm>Netesov</snm>
						<fnm>SV</fnm>
					</au>
					<au>
						<snm>Klenk</snm>
						<fnm>HD</fnm>
					</au>
				</aug>
				<source>Virology</source>
				<pubdate>1995</pubdate>
				<volume>214</volume>
				<fpage>421</fpage>
				<lpage>430</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1006/viro.1995.0052</pubid>
						<pubid idtype="pmpid" link="fulltext">8553543</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B17">
				<title>
					<p>The virion glycoproteins of Ebola viruses are encoded in two reading frames and are expressed through transcriptional editing.</p>
				</title>
				<aug>
					<au>
						<snm>Sanchez</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Trappier</snm>
						<fnm>SG</fnm>
					</au>
					<au>
						<snm>Mahy</snm>
						<fnm>BW</fnm>
					</au>
					<au>
						<snm>Peters</snm>
						<fnm>CJ</fnm>
					</au>
					<au>
						<snm>Nichol</snm>
						<fnm>ST</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>1996</pubdate>
				<volume>93</volume>
				<fpage>3602</fpage>
				<lpage>3607</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">39657</pubid>
						<pubid idtype="pmpid" link="fulltext">8622982</pubid>
						<pubid idtype="doi">10.1073/pnas.93.8.3602</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B18">
				<title>
					<p>Recovery of infectious Ebola virus from complementary DNA: RNA editing of the GP gene and viral cytotoxicity.</p>
				</title>
				<aug>
					<au>
						<snm>Volchkov</snm>
						<fnm>VE</fnm>
					</au>
					<au>
						<snm>Volchkova</snm>
						<fnm>VA</fnm>
					</au>
					<au>
						<snm>Muhlberger</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Kolesnikova</snm>
						<fnm>LV</fnm>
					</au>
					<au>
						<snm>Weik</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Dolnik</snm>
						<fnm>O</fnm>
					</au>
					<au>
						<snm>Klenk</snm>
						<fnm>HD</fnm>
					</au>
				</aug>
				<source>Science</source>
				<pubdate>2001</pubdate>
				<volume>291</volume>
				<fpage>1965</fpage>
				<lpage>1969</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.1057269</pubid>
						<pubid idtype="pmpid" link="fulltext">11239157</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B19">
				<title>
					<p>Nonlinearity in genetic decoding: homologous DNA replicase genes use alternatives of transcriptional slippage or translational frameshifting.</p>
				</title>
				<aug>
					<au>
						<snm>Larsen</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Wills</snm>
						<fnm>NM</fnm>
					</au>
					<au>
						<snm>Nelson</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Atkins</snm>
						<fnm>JF</fnm>
					</au>
					<au>
						<snm>Gesteland</snm>
						<fnm>RF</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>2000</pubdate>
				<volume>97</volume>
				<fpage>1683</fpage>
				<lpage>1688</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">26496</pubid>
						<pubid idtype="pmpid" link="fulltext">10677518</pubid>
						<pubid idtype="doi">10.1073/pnas.97.4.1683</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B20">
				<title>
					<p><it>Thermus thermophilis dnaX </it>homolog encoding gamma- and tau-like proteins of the chromosomal replicase.</p>
				</title>
				<aug>
					<au>
						<snm>Yurieva</snm>
						<fnm>O</fnm>
					</au>
					<au>
						<snm>Skangalis</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Kuriyan</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>O'Donnell</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>J Biol Chem</source>
				<pubdate>1997</pubdate>
				<volume>272</volume>
				<fpage>27131</fpage>
				<lpage>27139</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1074/jbc.272.43.27131</pubid>
						<pubid idtype="pmpid" link="fulltext">9341154</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B21">
				<title>
					<p>DNA polymerase III holoenzyme from <it>Thermus thermophilus </it>identification, expression, purification of components, and use to reconstitute a processive replicase.</p>
				</title>
				<aug>
					<au>
						<snm>Bullard</snm>
						<fnm>JM</fnm>
					</au>
					<au>
						<snm>Williams</snm>
						<fnm>JC</fnm>
					</au>
					<au>
						<snm>Acker</snm>
						<fnm>WK</fnm>
					</au>
					<au>
						<snm>Jacobi</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Janjic</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>McHenry</snm>
						<fnm>CS</fnm>
					</au>
				</aug>
				<source>J Biol Chem</source>
				<pubdate>2002</pubdate>
				<volume>277</volume>
				<fpage>13401</fpage>
				<lpage>13408</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1074/jbc.M110833200</pubid>
						<pubid idtype="pmpid" link="fulltext">11823461</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B22">
				<title>
					<p>The gamma subunit of DNA polymerase III holoenzyme of <it>Escherichia coli </it>is produced by ribosomal frameshifting.</p>
				</title>
				<aug>
					<au>
						<snm>Flower</snm>
						<fnm>AM</fnm>
					</au>
					<au>
						<snm>McHenry</snm>
						<fnm>CS</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>1990</pubdate>
				<volume>87</volume>
				<fpage>3713</fpage>
				<lpage>3717</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">53973</pubid>
						<pubid idtype="pmpid" link="fulltext">2187190</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B23">
				<title>
					<p>Translational frameshifting generates the gamma subunit of DNA polymerase III holoenzyme.</p>
				</title>
				<aug>
					<au>
						<snm>Tsuchihashi</snm>
						<fnm>Z</fnm>
					</au>
					<au>
						<snm>Kornberg</snm>
						<fnm>A</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>1990</pubdate>
				<volume>87</volume>
				<fpage>2516</fpage>
				<lpage>2520</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">53720</pubid>
						<pubid idtype="pmpid" link="fulltext">2181440</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B24">
				<title>
					<p>Programmed ribosomal frameshifting generates the <it>Escherichia coli </it>DNA polymerase III gamma subunit from within the tau subunit reading frame.</p>
				</title>
				<aug>
					<au>
						<snm>Blinkowa</snm>
						<fnm>AL</fnm>
					</au>
					<au>
						<snm>Walker</snm>
						<fnm>JR</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>1990</pubdate>
				<volume>18</volume>
				<fpage>1725</fpage>
				<lpage>1729</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">330589</pubid>
						<pubid idtype="pmpid">2186364</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B25">
				<title>
					<p>Programmed ribosomal frameshifting generates the <it>Escherichia coli </it>DNA polymerase III gamma subunit from within the tau subunit reading frame.</p>
				</title>
				<aug>
					<au>
						<snm>Blinkowa</snm>
						<fnm>AL</fnm>
					</au>
					<au>
						<snm>Walker</snm>
						<fnm>JR</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>1990</pubdate>
				<volume>18</volume>
				<fpage>1725</fpage>
				<lpage>1729</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">330589</pubid>
						<pubid idtype="pmpid">2186364</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B26">
				<title>
					<p>Recoding: translational bifurcations in gene expression.</p>
				</title>
				<aug>
					<au>
						<snm>Baranov</snm>
						<fnm>PV</fnm>
					</au>
					<au>
						<snm>Gesteland</snm>
						<fnm>RF</fnm>
					</au>
					<au>
						<snm>Atkins</snm>
						<fnm>JF</fnm>
					</au>
				</aug>
				<source>Gene</source>
				<pubdate>2002</pubdate>
				<volume>286</volume>
				<fpage>187</fpage>
				<lpage>201</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0378-1119(02)00423-7</pubid>
						<pubid idtype="pmpid" link="fulltext">11943474</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B27">
				<title>
					<p>Frameshifting by transcriptional slippage is involved in production of MxiE, the transcription activator regulated by the activity of the type III secretion apparatus in <it>Shigella flexneri</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Penno</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Sansonetti</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Parsot</snm>
						<fnm>C</fnm>
					</au>
				</aug>
				<source>Mol Microbiol</source>
				<pubdate>2005</pubdate>
				<inpress/>
			</bibl>
			<bibl id="B28">
				<title>
					<p>Database resources of the National Center for Biotechnology Information: update.</p>
				</title>
				<aug>
					<au>
						<snm>Wheeler</snm>
						<fnm>DL</fnm>
					</au>
					<au>
						<snm>Church</snm>
						<fnm>DM</fnm>
					</au>
					<au>
						<snm>Edgar</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Federhen</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Helmberg</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Madden</snm>
						<fnm>TL</fnm>
					</au>
					<au>
						<snm>Pontius</snm>
						<fnm>JU</fnm>
					</au>
					<au>
						<snm>Schuler</snm>
						<fnm>GD</fnm>
					</au>
					<au>
						<snm>Schriml</snm>
						<fnm>LM</fnm>
					</au>
					<au>
						<snm>Sequeira</snm>
						<fnm>E</fnm>
					</au>
					<etal/>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2004</pubdate>
				<volume>32</volume>
				<fpage>D35</fpage>
				<lpage>D40</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">308807</pubid>
						<pubid idtype="pmpid" link="fulltext">14681353</pubid>
						<pubid idtype="doi">10.1093/nar/gkh073</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B29">
				<title>
					<p>Abundant microsatellite polymorphism in <it>Saccharomyces cerevisiae</it>, and the different distributions of microsatellites in eight prokaryotes and <it>S. cerevisiae</it>, result from strong mutation pressures and a variety of selective forces.</p>
				</title>
				<aug>
					<au>
						<snm>Field</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Wills</snm>
						<fnm>C</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>1998</pubdate>
				<volume>95</volume>
				<issue>4</issue>
				<fpage>1647</fpage>
				<lpage>1652</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">19132</pubid>
						<pubid idtype="pmpid" link="fulltext">9465070</pubid>
						<pubid idtype="doi">10.1073/pnas.95.4.1647</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B30">
				<title>
					<p>Transcription termination at intrinsic terminators: the role of the RNA hairpin.</p>
				</title>
				<aug>
					<au>
						<snm>Wilson</snm>
						<fnm>KS</fnm>
					</au>
					<au>
						<snm>von Hippel</snm>
						<fnm>PH</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>1995</pubdate>
				<volume>92</volume>
				<fpage>8793</fpage>
				<lpage>8797</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">41053</pubid>
						<pubid idtype="pmpid" link="fulltext">7568019</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B31">
				<title>
					<p>Sequences that direct significant levels of frameshifting are frequent in coding regions of <it>Escherichia coli</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Gurvich</snm>
						<fnm>OL</fnm>
					</au>
					<au>
						<snm>Baranov</snm>
						<fnm>PV</fnm>
					</au>
					<au>
						<snm>Zhou</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Hammer</snm>
						<fnm>AW</fnm>
					</au>
					<au>
						<snm>Gesteland</snm>
						<fnm>RF</fnm>
					</au>
					<au>
						<snm>Atkins</snm>
						<fnm>JF</fnm>
					</au>
				</aug>
				<source>EMBO J</source>
				<pubdate>2003</pubdate>
				<volume>22</volume>
				<fpage>5941</fpage>
				<lpage>5950</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">275418</pubid>
						<pubid idtype="pmpid" link="fulltext">14592990</pubid>
						<pubid idtype="doi">10.1093/emboj/cdg561</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B32">
				<title>
					<p>Widespread selection for local RNA secondary structure in coding regions of bacterial genes.</p>
				</title>
				<aug>
					<au>
						<snm>Katz</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Burge</snm>
						<fnm>CB</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>2003</pubdate>
				<volume>13</volume>
				<fpage>2042</fpage>
				<lpage>2051</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">403678</pubid>
						<pubid idtype="pmpid" link="fulltext">12952875</pubid>
						<pubid idtype="doi">10.1101/gr.1257503</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B33">
				<title>
					<p>Basic local alignment search tool.</p>
				</title>
				<aug>
					<au>
						<snm>Altschul</snm>
						<fnm>SF</fnm>
					</au>
					<au>
						<snm>Gish</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Miller</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Myers</snm>
						<fnm>EW</fnm>
					</au>
					<au>
						<snm>Lipman</snm>
						<fnm>DJ</fnm>
					</au>
				</aug>
				<source>J Mol Biol</source>
				<pubdate>1990</pubdate>
				<volume>215</volume>
				<fpage>403</fpage>
				<lpage>410</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1006/jmbi.1990.9999</pubid>
						<pubid idtype="pmpid" link="fulltext">2231712</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B34">
				<title>
					<p>Finding families for genomic ORFans.</p>
				</title>
				<aug>
					<au>
						<snm>Fischer</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Eisenberg</snm>
						<fnm>D</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>1999</pubdate>
				<volume>15</volume>
				<fpage>759</fpage>
				<lpage>762</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/15.9.759</pubid>
						<pubid idtype="pmpid" link="fulltext">10498776</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B35">
				<title>
					<p>The ORFanage: an ORFan database.</p>
				</title>
				<aug>
					<au>
						<snm>Siew</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Azaria</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Fischer</snm>
						<fnm>D</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2004</pubdate>
				<volume>32</volume>
				<fpage>D281</fpage>
				<lpage>D283</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">308850</pubid>
						<pubid idtype="pmpid" link="fulltext">14681413</pubid>
						<pubid idtype="doi">10.1093/nar/gkh116</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B36">
				<title>
					<p>Selection against frameshift mutations limits microsatellite expansion in coding DNA.</p>
				</title>
				<aug>
					<au>
						<snm>Metzgar</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Bytof</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Wills</snm>
						<fnm>C</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>2000</pubdate>
				<volume>10</volume>
				<fpage>72</fpage>
				<lpage>80</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">310501</pubid>
						<pubid idtype="pmpid" link="fulltext">10645952</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B37">
				<title>
					<p>The genome sequence of the food-borne pathogen <it>Campylobacter jejuni </it>reveals hypervariable sequences.</p>
				</title>
				<aug>
					<au>
						<snm>Parkhill</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Wren</snm>
						<fnm>BW</fnm>
					</au>
					<au>
						<snm>Mungall</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Ketley</snm>
						<fnm>JM</fnm>
					</au>
					<au>
						<snm>Churcher</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Basham</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Chillingworth</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Davies</snm>
						<fnm>RM</fnm>
					</au>
					<au>
						<snm>Feltwell</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Holroyd</snm>
						<fnm>S</fnm>
					</au>
					<etal/>
				</aug>
				<source>Nature</source>
				<pubdate>2000</pubdate>
				<volume>403</volume>
				<fpage>665</fpage>
				<lpage>668</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/35001088</pubid>
						<pubid idtype="pmpid" link="fulltext">10688204</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B38">
				<title>
					<p>Whole genome sequencing of meticillin-resistant <it>Staphylococcus aureus</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Kuroda</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Ohta</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Uchiyama</snm>
						<fnm>I</fnm>
					</au>
					<au>
						<snm>Baba</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Yuzawa</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Kobayashi</snm>
						<fnm>I</fnm>
					</au>
					<au>
						<snm>Cui</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Oguchi</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Aoki</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Nagai</snm>
						<fnm>Y</fnm>
					</au>
					<etal/>
				</aug>
				<source>Lancet</source>
				<pubdate>2001</pubdate>
				<volume>357</volume>
				<fpage>1225</fpage>
				<lpage>1240</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0140-6736(00)04403-2</pubid>
						<pubid idtype="pmpid" link="fulltext">11418146</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B39">
				<title>
					<p>The <it>Staphylococcus aureus </it>Map protein is an immunomodulator that interferes with T cell-mediated responses.</p>
				</title>
				<aug>
					<au>
						<snm>Lee</snm>
						<fnm>LY</fnm>
					</au>
					<au>
						<snm>Miyamoto</snm>
						<fnm>YJ</fnm>
					</au>
					<au>
						<snm>McIntyre</snm>
						<fnm>BW</fnm>
					</au>
					<au>
						<snm>Hook</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>McCrea</snm>
						<fnm>KW</fnm>
					</au>
					<au>
						<snm>McDevitt</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Brown</snm>
						<fnm>EL</fnm>
					</au>
				</aug>
				<source>J Clin Invest</source>
				<pubdate>2002</pubdate>
				<volume>110</volume>
				<fpage>1461</fpage>
				<lpage>1471</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">151818</pubid>
						<pubid idtype="pmpid" link="fulltext">12438444</pubid>
						<pubid idtype="doi">10.1172/JCI200216318</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B40">
				<title>
					<p>Global analysis of the <it>Deinococcus radiodurans </it>proteome by using accurate mass tags.</p>
				</title>
				<aug>
					<au>
						<snm>Lipton</snm>
						<fnm>MS</fnm>
					</au>
					<au>
						<snm>Pasa-Tolic</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Anderson</snm>
						<fnm>GA</fnm>
					</au>
					<au>
						<snm>Anderson</snm>
						<fnm>DJ</fnm>
					</au>
					<au>
						<snm>Auberry</snm>
						<fnm>DL</fnm>
					</au>
					<au>
						<snm>Battista</snm>
						<fnm>JR</fnm>
					</au>
					<au>
						<snm>Daly</snm>
						<fnm>MJ</fnm>
					</au>
					<au>
						<snm>Fredrickson</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Hixson</snm>
						<fnm>KK</fnm>
					</au>
					<au>
						<snm>Kostandarithes</snm>
						<fnm>H</fnm>
					</au>
					<etal/>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>2002</pubdate>
				<volume>99</volume>
				<fpage>11049</fpage>
				<lpage>11054</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">129300</pubid>
						<pubid idtype="pmpid" link="fulltext">12177431</pubid>
						<pubid idtype="doi">10.1073/pnas.172170199</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B41">
				<title>
					<p>Translational frameshifting in the control of transposition in bacteria.</p>
				</title>
				<aug>
					<au>
						<snm>Chandler</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Fayet</snm>
						<fnm>O</fnm>
					</au>
				</aug>
				<source>Mol Microbiol</source>
				<pubdate>1993</pubdate>
				<volume>7</volume>
				<fpage>497</fpage>
				<lpage>503</lpage>
				<xrefbib>
					<pubid idtype="pmpid">8384687</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B42">
				<title>
					<p>Bacterial insertion sequences.</p>
				</title>
				<aug>
					<au>
						<snm>Ohtsubo</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Sekine</snm>
						<fnm>Y</fnm>
					</au>
				</aug>
				<source>Curr Top Microbiol Immunol</source>
				<pubdate>1996</pubdate>
				<volume>204</volume>
				<fpage>1</fpage>
				<lpage>26</lpage>
				<xrefbib>
					<pubid idtype="pmpid">8556862</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B43">
				<title>
					<p>Programmed Alternative Reading of the Genetic Code</p>
				</title>
				<aug>
					<au>
						<snm>Farabaugh</snm>
						<fnm>PJ</fnm>
					</au>
				</aug>
				<publisher>Georgetown, TX: R.G. Landes Co</publisher>
				<pubdate>1997</pubdate>
			</bibl>
			<bibl id="B44">
				<title>
					<p>Insertion sequences.</p>
				</title>
				<aug>
					<au>
						<snm>Mahillon</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Chandler</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>Microbiol Mol Biol Rev</source>
				<pubdate>1998</pubdate>
				<volume>62</volume>
				<fpage>725</fpage>
				<lpage>774</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">98933</pubid>
						<pubid idtype="pmpid" link="fulltext">9729608</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B45">
				<title>
					<p>Complete DNA sequence of yeast chromosome XI.</p>
				</title>
				<aug>
					<au>
						<snm>Dujon</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Alexandraki</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Andre</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Ansorge</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Baladron</snm>
						<fnm>V</fnm>
					</au>
					<au>
						<snm>Ballesta</snm>
						<fnm>JP</fnm>
					</au>
					<au>
						<snm>Banrevi</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Bolle</snm>
						<fnm>PA</fnm>
					</au>
					<au>
						<snm>Bolotin-Fukuhara</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Bossier</snm>
						<fnm>P</fnm>
					</au>
					<etal/>
				</aug>
				<source>Nature</source>
				<pubdate>1994</pubdate>
				<volume>369</volume>
				<fpage>371</fpage>
				<lpage>378</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/369371a0</pubid>
						<pubid idtype="pmpid" link="fulltext">8196765</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B46">
				<title>
					<p>Life with 6000 genes.</p>
				</title>
				<aug>
					<au>
						<snm>Goffeau</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Barrell</snm>
						<fnm>BG</fnm>
					</au>
					<au>
						<snm>Bussey</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Davis</snm>
						<fnm>RW</fnm>
					</au>
					<au>
						<snm>Dujon</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Feldmann</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Galibert</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Hoheisel</snm>
						<fnm>JD</fnm>
					</au>
					<au>
						<snm>Jacq</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Johnston</snm>
						<fnm>M</fnm>
					</au>
					<etal/>
				</aug>
				<source>Science</source>
				<pubdate>1996</pubdate>
				<volume>274</volume>
				<fpage>546, 563</fpage>
				<lpage>567</lpage>
				<xrefbib>
					<pubid idtype="doi">10.1126/science.274.5287.546</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B47">
				<title>
					<p><it>Escherichia coli ykfE </it>ORFan gene encodes a potent inhibitor of C-type lysozyme.</p>
				</title>
				<aug>
					<au>
						<snm>Monchois</snm>
						<fnm>V</fnm>
					</au>
					<au>
						<snm>Abergel</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Sturgis</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Jeudy</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Claverie</snm>
						<fnm>JM</fnm>
					</au>
				</aug>
				<source>J Biol Chem</source>
				<pubdate>2001</pubdate>
				<volume>276</volume>
				<fpage>18437</fpage>
				<lpage>18441</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1074/jbc.M010297200</pubid>
						<pubid idtype="pmpid" link="fulltext">11278658</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B48">
				<title>
					<p>Bacterial genomes as new gene homes: the genealogy of ORFans in <it>E. coli</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Daubin</snm>
						<fnm>V</fnm>
					</au>
					<au>
						<snm>Ochman</snm>
						<fnm>H</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>2004</pubdate>
				<volume>14</volume>
				<fpage>1036</fpage>
				<lpage>1042</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">419781</pubid>
						<pubid idtype="pmpid" link="fulltext">15173110</pubid>
						<pubid idtype="doi">10.1101/gr.2231904</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B49">
				<title>
					<p>Twenty thousand ORFan microbial protein families for the biologist?</p>
				</title>
				<aug>
					<au>
						<snm>Siew</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Fischer</snm>
						<fnm>D</fnm>
					</au>
				</aug>
				<source>Structure</source>
				<pubdate>2003</pubdate>
				<volume>11</volume>
				<fpage>7</fpage>
				<lpage>9</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0969-2126(02)00938-3</pubid>
						<pubid idtype="pmpid" link="fulltext">12517334</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B50">
				<title>
					<p>Tetrameric repeat units associated with virulence factor phase variation in <it>Haemophilus </it>also occur in <it>Neisseria </it>spp. and <it>Moraxella catarrhalis</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Peak</snm>
						<fnm>IR</fnm>
					</au>
					<au>
						<snm>Jennings</snm>
						<fnm>MP</fnm>
					</au>
					<au>
						<snm>Hood</snm>
						<fnm>DW</fnm>
					</au>
					<au>
						<snm>Bisercic</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Moxon</snm>
						<fnm>ER</fnm>
					</au>
				</aug>
				<source>FEMS Microbiol Lett</source>
				<pubdate>1996</pubdate>
				<volume>137</volume>
				<fpage>109</fpage>
				<lpage>114</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/0378-1097(96)00048-1</pubid>
						<pubid idtype="pmpid" link="fulltext">8935664</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B51">
				<title>
					<p>Compositional biases of bacterial genomes and evolutionary implications.</p>
				</title>
				<aug>
					<au>
						<snm>Karlin</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Mrazek</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Campbell</snm>
						<fnm>AM</fnm>
					</au>
				</aug>
				<source>J Bacteriol</source>
				<pubdate>1997</pubdate>
				<volume>179</volume>
				<fpage>3899</fpage>
				<lpage>3913</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">179198</pubid>
						<pubid idtype="pmpid" link="fulltext">9190805</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B52">
				<title>
					<p>Phenotypic variation in <it>Haemophilus influenzae</it>: the interrelationship of colony opacity, capsule and lipopolysaccharide.</p>
				</title>
				<aug>
					<au>
						<snm>Roche</snm>
						<fnm>RJ</fnm>
					</au>
					<au>
						<snm>Moxon</snm>
						<fnm>ER</fnm>
					</au>
				</aug>
				<source>Microb Pathog</source>
				<pubdate>1995</pubdate>
				<volume>18</volume>
				<fpage>129</fpage>
				<lpage>140</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmpid">7643742</pubid>
						<pubid idtype="doi">10.1016/S0882-4010(95)90117-5</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B53">
				<title>
					<p>NCBI ftp site</p>
				</title>
				<url>ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/</url>
			</bibl>
			<bibl id="B54">
				<title>
					<p>The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools.</p>
				</title>
				<aug>
					<au>
						<snm>Thompson</snm>
						<fnm>JD</fnm>
					</au>
					<au>
						<snm>Gibson</snm>
						<fnm>TJ</fnm>
					</au>
					<au>
						<snm>Plewniak</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Jeanmougin</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Higgins</snm>
						<fnm>DG</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>1997</pubdate>
				<volume>25</volume>
				<fpage>4876</fpage>
				<lpage>4882</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">147148</pubid>
						<pubid idtype="pmpid" link="fulltext">9396791</pubid>
						<pubid idtype="doi">10.1093/nar/25.24.4876</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
		</refgrp>
	</bm>
</art>
