<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
	<ui>gb-2004-6-1-r10</ui>
	<ji>GBJ</ji>
	<fm>
		<dochead>Method</dochead>
		<bibl>
			<title>
				<p>Microarray-based resequencing of multiple <it>Bacillus anthracis </it>isolates</p>
			</title>
			<aug>
				<au id="A1" ca="yes">
					<snm>Zwick</snm>
					<mi>E</mi>
					<fnm>Michael</fnm>
					<insr iid="I1"/>
					<insr iid="I2"/>
					<email>mzwick@genetics.emory.edu</email>
				</au>
				<au id="A2">
					<snm>Mcafee</snm>
					<fnm>Farrell</fnm>
					<insr iid="I1"/>
					<email>mcafeef@nmrc.navy.mil</email>
				</au>
				<au id="A3">
					<snm>Cutler</snm>
					<mi>J</mi>
					<fnm>David</fnm>
					<insr iid="I3"/>
					<email>dcutler@jhmi.edu</email>
				</au>
				<au id="A4">
					<snm>Read</snm>
					<mi>D</mi>
					<fnm>Timothy</fnm>
					<insr iid="I1"/>
					<email>readt@nmrc.navy.mil</email>
				</au>
				<au id="A5">
					<snm>Ravel</snm>
					<fnm>Jacques</fnm>
					<insr iid="I4"/>
					<email>jravel@tigr.org</email>
				</au>
				<au id="A6">
					<snm>Bowman</snm>
					<mi>R</mi>
					<fnm>Gregory</fnm>
					<insr iid="I1"/>
					<email>grb22@cornell.edu</email>
				</au>
				<au id="A7">
					<snm>Galloway</snm>
					<mi>R</mi>
					<fnm>Darrell</fnm>
					<insr iid="I1"/>
					<email>gallowayd@nmrc.navy.mil</email>
				</au>
				<au id="A8">
					<snm>Mateczun</snm>
					<fnm>Alfred</fnm>
					<insr iid="I1"/>
					<email>mateczuna@nmrc.navy.mil</email>
				</au>
			</aug>
			<insg>
				<ins id="I1">
					<p>Biological Defense Research Directorate, Naval Medical Research Center, 503 Robert Grant Avenue, Silver Spring, MD 20910, USA</p>
				</ins>
				<ins id="I2">
					<p>Department of Human Genetics, Emory University School of Medicine, Atlanta, GA 30322, USA</p>
				</ins>
				<ins id="I3">
					<p>McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, 733 North Broadway, Baltimore, MD 21205, USA</p>
				</ins>
				<ins id="I4">
					<p>The Institute for Genomic Research, 9712 Medical Center Drive, Rockville, MD 20850, USA</p>
				</ins>
			</insg>
			<source>Genome Biology</source>
			<issn>1465-6906</issn>
			<pubdate>2004</pubdate>
			<volume>6</volume>
			<issue>1</issue>
			<fpage>R10</fpage>
			<url>http://genomebiology.com/2004/6/1/R10</url>
			<xrefbib>
				<pubidlist><pubid idtype="pmpid">15642093</pubid><pubid idtype="doi">10.1186/gb-2004-6-1-r10</pubid>
				</pubidlist></xrefbib>
		</bibl>
		<history>
			<rec>
				<date>
					<day>26</day>
					<month>7</month>
					<year>2004</year>
				</date>
			</rec>
			<revrec>
				<date>
					<day>18</day>
					<month>10</month>
					<year>2004</year>
				</date>
			</revrec>
			<acc>
				<date>
					<day>19</day>
					<month>11</month>
					<year>2004</year>
				</date>
			</acc>
			<pub>
				<date>
					<day>17</day>
					<month>12</month>
					<year>2004</year>
				</date>
			</pub>
		</history>
		<cpyrt>
			<year>2004</year>
			<collab>Zwick et al.; licensee BioMed Central Ltd.</collab>
			<note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
		</cpyrt>
		<shorttitle>
			<p>Resequencing of multiple <it>Bacillus </it>isolates</p>
		</shorttitle>
		<shortabs>
			<p>Custom-designed resequencing arrays were used to generate 3.1 Mb of genomic sequence from a panel of 56 <it>Bacillus anthracis </it>strains. Sequence quality was shown to be very high by replication and by comparison to independently generated shotgun sequence.</p>
		</shortabs>
		<abs>
			<sec>
				<st>
					<p>Abstract</p>
				</st>
				<p>We used custom-designed resequencing arrays to generate 3.1 Mb of genomic sequence from a panel of 56 <it>Bacillus anthracis </it>strains. Sequence quality was shown to be very high by replication (discrepancy rate of 7.4 &#215; 10<sup>-7</sup>) and by comparison to independently generated shotgun sequence (discrepancy rate &lt; 2.5 &#215; 10<sup>-6</sup>). Population genomics studies of microbial pathogens using rapid resequencing technologies such as resequencing arrays are critical for recognizing newly emerging or genetically engineered strains.</p>
			</sec>
		</abs>
	</fm>
	<meta>
		<classifications>
			<classification type="BMC" subtype="man_spc_id" id="30010014">Microbiology and parasitology</classification>
			<classification type="BMC" subtype="man_spc_id" id="30010010">Genome studies</classification>
			<classification type="BMC" subtype="man_spc_id" id="30010013">Methods</classification>
		</classifications>
	</meta>
	<bdy>
		<sec>
			<st>
				<p>Background</p>
			</st>
			<p>Population genomics, the study of genome-wide patterns of genetic variation in a large number of organisms, is emerging as a vigorous new field of study <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr></abbrgrp>. Rapid, accurate and inexpensive resequencing could enable a variety of potential applications and studies. For the biowarfare (BW) pathogen, <it>Bacillus anthracis</it>, genomic sequences from multiple strains and non-pathogenic close relatives could aid studies that definitively identify <it>B. anthracis </it>in environmental and clinical samples, determine forensic attribution and phylogenetic relationships of strains, and uncover the genetic basis of phenotypic variation in traits such as mammalian virulence. Moreover, first recognizing the presence of a novel pathogen, and then attempting the difficult task of discerning between novel naturally occurring pathogenic organisms (for instance <it>Bacillus cereus </it>G9241 <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>) and artificially enhanced bacterial pathogens, requires a thorough knowledge of extant patterns and levels of genetic variation in natural populations. Unusual patterns of genetic variation may serve as evidence aiding the detection of these unusual types of pathogens.</p>
			<p>The current technological model for genome sequencing employs high-throughput shotgun sequencing at large centers. This highly successful enterprise has completed about 200 bacterial genomes with more than 500 ongoing as of July 2004 <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. The genome sequences of the <it>B. anthracis </it>Ames chromosome (5.2 Mb, NC_003997) and plasmids pXO1 (181.6 kilobases (kb), NC_001496) and pXO2 (96.2 kb, NC_002146) have been determined <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr></abbrgrp>, as have the genomes of three near neighbors, <it>B. cereus </it>ATCC 14579 <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>, <it>B. cereus </it>ATCC 10987 <abbrgrp><abbr bid="B10">10</abbr></abbrgrp> and <it>B. cereus </it>G9241 <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. An isolate of the <it>B. anthracis </it>Ames strain recovered from a victim of the autumn 2001 bioterror attack in Florida was also sequenced to a high level of coverage using the random shotgun method and compared to the Ames sequence to identify 60 new markers that included single nucleotide polymorphisms (SNPs), inserted or deleted sequences, and tandem repeats <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>. The success of this effort has led to an extensive phylogeny-based whole-genome shotgun resequencing effort in <it>B. anthracis </it>(reported by <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>). Whole-genome shotgun studies are increasingly being used to explore variation among more closely related bacterial strains <abbrgrp><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr></abbrgrp>. However, the relatively high costs of these efforts have limited the extent of their application.</p>
			<p>Numerous molecular methods for genotyping <it>B. anthracis </it>and near neighbors of the <it>Bacillus cereus sensu lato </it>group <abbrgrp><abbr bid="B16">16</abbr></abbrgrp> have been developed and successfully employed in a wide variety of studies. These include DNA sequence surveys from one or a few loci <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr></abbrgrp>, repetitive element polymorphism-PCR <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr></abbrgrp> and amplified fragment length polymorphisms (AFLP) <abbrgrp><abbr bid="B24">24</abbr><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr><abbr bid="B27">27</abbr></abbrgrp>. However, because of the relative paucity of genetic variation between isolates <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>, the most effective method for subtyping <it>B. anthracis </it>has employed multiple locus variable number of tandem repeats analysis (MLVA) <abbrgrp><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr><abbr bid="B31">31</abbr></abbrgrp>. Similar to the mammalian short tandem repeat methodology, MLVA determines strain phylogenetic relationships based on relatively few, highly variable genomic repeat regions. Although MLVA is relatively rapid and inexpensive, a key limitation lies in its exclusive focus on loci with common alleles that are differentiated by size. Because of the relatively rapid mutational process generating variation at these loci, similarly sized markers may have different evolutionary origins.</p>
			<p>Clearly, a method for rapid, inexpensive genome resequencing of bacterial strains would be of great benefit for genotyping, forensics and studies of the genetic basis of strain phenotypic variation. Developing DNA-based biodetection assays depends upon prior knowledge of patterns of genetic variation within and between bacterial species. An ideal technology would combine the high information content of whole-genome resequencing of strains with the speed and low cost of MLVA, AFLP and multi-locus sequence typing (MLST). Furthermore, while conventional strain typing methodologies have focused on the utility of common variants, rare variants may prove to be especially informative for forensic applications.</p>
			<p>High-density oligonucleotide resequencing microarrays are a highly parallel technology that can enable the rapid identification of DNA sequence variants with minimal laboratory effort and infrastructure <abbrgrp><abbr bid="B32">32</abbr><abbr bid="B33">33</abbr></abbrgrp>. Previous applications of microarrays to bacterial genomes <abbrgrp><abbr bid="B34">34</abbr><abbr bid="B35">35</abbr></abbrgrp> or small eukaryotic genomes such as yeast <abbrgrp><abbr bid="B36">36</abbr><abbr bid="B37">37</abbr></abbrgrp> focused on methods that scanned specific genes or a genomic region for genetic variants. Initial high-throughput microarray applications in the human genome for SNP discovery <abbrgrp><abbr bid="B38">38</abbr><abbr bid="B39">39</abbr><abbr bid="B40">40</abbr></abbrgrp> were successful, but also reported that between 12% and 45% of the detected variants were false. Subsequent experimental improvements and the development of the ABACUS algorithm/software package <abbrgrp><abbr bid="B32">32</abbr></abbrgrp> significantly reduced SNP false-positive ascertainment, radically improved genotype calling and automatically assigned quality scores to each genotype call. These fundamental advances enabled rapid resequencing of 40 human genomic regions <abbrgrp><abbr bid="B32">32</abbr><abbr bid="B41">41</abbr></abbrgrp> and ABACUS is now the standard application for microarray-based resequencing.</p>
			<p>Here we present the first microarray-based high-throughput resequencing of a large collection of <it>B. anthracis </it>isolates. Our study first reaffirms, and then directly demonstrates, that the quality of microarray-generated DNA sequence data is comparable to that produced by conventional shotgun sequencing. We then estimate the levels of genetic variation in the annotated genomic regions we resequenced, characterize the frequency spectrum of DNA sequence variants we observe, and finally explore patterns of linkage disequilibrium and recombination among those variants. Because of the scalability and minimal effort associated with microarray-based resequencing, our work demonstrates the possibility of a rapid and cost-effective method of genome resequencing that could be applied to both environmental and, ultimately, clinical specimens.</p>
		</sec>
		<sec>
			<st>
				<p>Results</p>
			</st>
			<sec>
				<st>
					<p>Resequencing <it>B. anthracis </it>with microarrays</p>
				</st>
				<p>A panel of 56 <it>B. anthracis </it>strains from the Biological Defense Research Directorate's strain collection (see Additional data file 1) was resequenced using Affymetrix resequencing arrays (RAs), and base calls were determined using the ABACUS software package <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. Each RA was capable of resequencing 29,212 base-pairs (bp) or about 0.5% of the <it>B. anthracis </it>genome from a single isolate sample (see Additional data file 2). Long PCR sample preparation and chip processing were conducted for 118 RAs. Analysis of these 118 RAs with the ABACUS software package shows that 115 are successful (97.5%). Experimental failure occurs when fewer than 60% of the total possible bases achieve quality scores exceeding the ABACUS user-defined threshold. For this study, the total threshold was set at 31 and the strand minimum at -2 <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>, as determined from analysis of a replication experiment described below.</p>
				<p>The 115 successful RAs call 92.6% of the possible bases (3,109,539 bp out of a total possible of 3,359,380 bp). Figure <figr fid="F1">1</figr> shows the distribution of quality scores across all 3,359,380 base calls. Amplicon failure, typically arising from long PCR (LPCR) failure, leaves 1.1% of the possible bases uncalled. The remaining base-calling failure (6.3% of the possible bases) consists of features on the RAs that fail to generate quality scores exceeding the experimental threshold.</p>
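				<p>The call-rate arithmetic above can be checked with a short calculation. This is only a sketch: the constant names are illustrative, and the figures are the ones quoted in the text.</p>

```python
# Figures quoted in the text; the variable names are illustrative only.
BASES_PER_RA = 29_212      # bases interrogated by one resequencing array
SUCCESSFUL_RAS = 115       # arrays that passed the ABACUS success criterion
CALLED_BASES = 3_109_539   # bases whose quality score exceeded the threshold

possible_bases = BASES_PER_RA * SUCCESSFUL_RAS
call_rate = CALLED_BASES / possible_bases
uncalled_fraction = (possible_bases - CALLED_BASES) / possible_bases

print(f"possible bases:    {possible_bases:,}")
print(f"call rate:         {call_rate:.1%}")
print(f"uncalled fraction: {uncalled_fraction:.1%}")
```

				<p>The uncalled fraction computed this way is the sum of the amplicon-failure and low-quality-score components reported above.</p>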
				<fig id="F1">
					<title>
						<p>Figure 1</p>
					</title>
					<caption>
						<p>ABACUS quality scores for base calls in <it>B. anthracis</it></p>
					</caption>
					<text>
						<p>ABACUS quality scores for base calls in <it>B. anthracis</it>. A quality score measures the difference, in log<sub>10 </sub>units, between the likelihood support level for the best base-call model and that for the second-best model [32]. Of the bases, 92.6% possess quality scores that exceeded the threshold (31) used for this study.</p>
					</text>
					<graphic file="gb-2004-6-1-r10-1"/>
				</fig>
				<p>Previous results demonstrated that base-calling failure was concentrated among RA oligonucleotide probes containing multiple purines. Purine-rich probes were observed to have lower hybridization intensities at identical positions across multiple RAs. Guanine-rich probes, in particular, showed the greatest reduction in hybridization intensity (see Figure 6 in <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>). Consequently, total quality scores at these sites frequently failed to exceed the quality-score threshold and they remained uncalled. To determine if probe sequence composition, specifically purine and guanine content, contributed to the 6.3% of bases not called, the sequence composition of the oligonucleotide probes at 4,209 sites successfully called on all 115 RAs (484,035 total sites) was compared to that at the 886 sites that failed to be called on any RA (101,890 total sites). These failed sites account for 3.0% of all possible base calls in the experiment, or nearly half of the 6.3% base-calling failure. Uncalled sites are composed of oligonucleotide probes with a significantly higher purine composition (<it>P </it>&lt; 10<sup>-22</sup>). A similar pattern is detected if we limit our analysis to guanine-rich probes (<it>P </it>&lt; 10<sup>-9</sup>). This latter result is surprising given that the <it>B. anthracis </it>genomic sequences we examined have a low G+C content (~34%). Nevertheless, these analyses demonstrate that both purine-rich and guanine-rich oligonucleotide probes are significantly more likely to fail to generate quality scores exceeding the experimental threshold.</p>
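				<p>As a rough illustration of the composition comparison described above, the sketch below extracts a probe-length window around a site and computes its purine and guanine fractions. The 25-base probe length, the use of the reference-strand sequence, the function names and the toy sequence are all assumptions made for illustration; the statistical test used in the study is not reproduced here.</p>

```python
def probe_window(reference, site, length=25):
    """Return the window of `length` bases centred on a 0-based site.

    Returns None when the window would run off either end of the reference.
    The default length of 25 is an assumption, not a value from the text.
    """
    half = length // 2
    start, end = site - half, site + half + 1
    if start >= 0 and len(reference) >= end:
        return reference[start:end]
    return None


def base_fraction(probe, bases):
    """Fraction of positions in `probe` occupied by any base in `bases`."""
    probe = probe.upper()
    return sum(probe.count(base) for base in bases) / len(probe)


# Made-up sequence used only to exercise the functions.
TOY_REFERENCE = "ATGGAGGAGGGAAGCTTTACCTTCTTTCCATTTTAAGGAGGAGA"

for site in (12, 30):
    probe = probe_window(TOY_REFERENCE, site)
    purines = base_fraction(probe, "AG")
    guanines = base_fraction(probe, "G")
    print(site, round(purines, 2), round(guanines, 2))
```

				<p>In a real analysis the two groups of sites (called on every RA versus called on none) would each yield a distribution of such fractions, which could then be compared with a suitable two-sample test.</p>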
			</sec>
			<sec>
				<st>
					<p>Assessing microarray resequencing data quality</p>
				</st>
				<p>Building on the recognition of the importance of automated algorithms to assess data quality <abbrgrp><abbr bid="B42">42</abbr><abbr bid="B43">43</abbr></abbrgrp>, we used two methods to assess the quality of microarray resequencing data <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. The first consisted of a replicate experiment where 51 samples were independently hybridized on 102 RAs. A parameter search that maximized the percentage of called bases while minimizing the number of discrepancies between replicates was then performed. A total of 1,489,812 bases could have been called in each replicate experiment. At the optimal parameter values (total threshold of 31, strand minimum of -2; see Cutler <it>et al. </it><abbrgrp><abbr bid="B32">32</abbr></abbrgrp>), 90.6% (1,349,178) of sites are called in both replicates. Other parameter values provide similar levels of base calling and discrepancy rates. The optimal parameter values are similar to those previously used by Cutler <it>et al. </it><abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. Of the bases called in both replicates, 1,349,177 are called identically. Only one site is called differently. This corresponds to a replication discrepancy rate of 7.4 &#215; 10<sup>-7 </sup>(Table <tblr tid="T1">1</tblr>). If repeatability could be related to accuracy, then this level of repeatability would correspond to a phred score of at least 61 <abbrgrp><abbr bid="B42">42</abbr><abbr bid="B43">43</abbr></abbrgrp>. This calculation assumes that the discrepancy rate corresponds to a binomial error probability of <it>P</it>, where phred = -10 log<sub>10</sub><it>P</it>. These replication levels and discrepancy rates are consistent with those previously reported <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>, providing further evidence for the ability of RAs analyzed with ABACUS to produce highly replicable data.</p>
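				<p>The conversion from a discrepancy rate to a phred-scaled score follows directly from the formula above. The sketch below applies it to the replication counts quoted in the text; the function and variable names are illustrative.</p>

```python
import math


def phred(error_probability):
    """Phred-scaled quality: -10 * log10(P)."""
    return -10 * math.log10(error_probability)


# Replication counts quoted in the text.
called_in_both_replicates = 1_349_178
called_differently = 1

discrepancy_rate = called_differently / called_in_both_replicates
print(f"discrepancy rate: {discrepancy_rate:.1e}")
print(f"phred-scaled score: {phred(discrepancy_rate):.1f}")
```

				<p>The same function can be applied to any other discrepancy rate, provided the rate is interpreted as a per-base error probability.</p>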
				<tbl id="T1" hint_layout="single">
					<title>
						<p>Table 1</p>
					</title>
					<caption>
						<p>Assessing microarray resequencing data quality</p>
					</caption>
					<tblbdy cols="2">
						<r>
							<c ca="left">
								<p>
									<b>Replication experiment</b>
								</p>
							</c>
							<c>
								<p/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Total number of bases called in replicate 1</p>
							</c>
							<c ca="center">
								<p>1,383,229</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Total number of bases called in replicate 2</p>
							</c>
							<c ca="center">
								<p>1,373,905</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Total number of bases called in both replicates</p>
							</c>
							<c ca="center">
								<p>1,349,177</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Total number of bases called differently</p>
							</c>
							<c ca="center">
								<p>1</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Replication experiment discrepancy rate</p>
							</c>
							<c ca="center">
								<p>7.4E-07</p>
							</c>
						</r>
						<r>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>
									<b>Accuracy estimation experiment</b>
								</p>
							</c>
							<c>
								<p/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Total number of bases called identically</p>
							</c>
							<c ca="center">
								<p>398,452</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Total number of bases called differently</p>
							</c>
							<c ca="center">
								<p>15</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Accuracy experiment discrepancy rate</p>
							</c>
							<c ca="center">
								<p>3.8E-05</p>
							</c>
						</r>
					</tblbdy>
				</tbl>
				<p>While RA data is highly replicable, repeated systematic errors would not be detected in a replicate experiment. To obtain an independent estimate of RA sequence accuracy, we compared the sequence data from 30 RAs where the same <it>B. anthracis </it>strain had been sequenced using the random shotgun approach and deposited in GenBank (<it>B. anthracis</it>: strain Ames, NC_003997 <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>; strain Vollum, NZ_AAEP00000000, 4 June 2004 update; strain Australia 94, NZ_AAES00000000, 7 June 2004 update; strain Kruger B, NZ_AAEQ00000000, 7 June 2004 update (J Ravel, DA Rasko, MF Shumway, L Jiang, RZ Cer, NB Federova, M Wilson, S Stanley, S Decker, TD Read, <it>et al.</it>, unpublished work)). In a comparison of 398,467 bp of RA- and shotgun-generated sequence, we observed 15 discrepancies occurring at six sites. This corresponds to a discrepancy rate of 3.8 &#215; 10<sup>-5</sup>. If we make the conservative assumption that all discrepancies lie in the RA-generated sequence, this level of accuracy would correspond to a phred score of at least 44.</p>
				<p>To determine if this conservative assumption is warranted, we examined in greater detail the nature of the RA/shotgun sequence discrepancies. Five of the discrepant sites, accounting for 10 discrepancies total (twofold RA replication at each site), were found in Kruger B strain sequences. The one remaining site, accounting for five discrepancies (fivefold RA replication at this site), was found in Vollum strain sequences. At all 15 discrepancies, the RA called a base identical to the Ames reference sequence <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>, while the Kruger B/Vollum shotgun sequence called a new SNP. The fact that the shotgun sequence called a SNP at every discrepancy was surprising, leading us to examine more closely the level of shotgun coverage and assembly at each discrepant site. The latest shotgun assembly of the Kruger B strain (J Ravel, <it>et al.</it>, unpublished work) agreed with the RA Kruger B base calls at all five sites. The latest Vollum shotgun assembly (J Ravel, <it>et al.</it>, unpublished work) still disagreed at the one site (five discrepancies total), but this discrepancy was based on a single shotgun sequencing read with a phred score of 7 at the discrepant base. Clearly, the shotgun coverage lacks sufficient depth at this site to make a reliable base call and it seems far more likely that the fivefold RA base call is correct. Hence, the RA sequence data has less than one discrepancy per 398,467 bases called, or a discrepancy rate of &lt; 2.5 &#215; 10<sup>-6 </sup>(Table <tblr tid="T1">1</tblr>). This observed level of sequencing accuracy corresponds to a phred score of 56. These data demonstrate that our conservative assumption is not warranted. Resequencing array data quality from a single experiment matches, and in some cases perhaps exceeds, that obtained by multiple DNA sequencing reads using conventional DNA sequencing technologies <abbrgrp><abbr bid="B42">42</abbr><abbr bid="B43">43</abbr></abbrgrp>.</p>
			</sec>
			<sec>
				<st>
					<p>Patterns and levels of genetic variation in <it>B. anthracis</it></p>
				</st>
				<p>We identify 37 SNPs among 56 <it>B. anthracis </it>strains. The SNP location, base-call, and position relative to the respective GenBank reference sequences <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr></abbrgrp> are contained in Additional data file 3. Twenty-four of the 37 SNPs, including two singletons, were independently confirmed in identical strains where whole-genome random shotgun sequence was available (A0039, A4088 and A0442 in Additional data file 1 (J Ravel, <it>et al.</it>, unpublished work)). Of the remaining 13 SNPs not independently verified by The Institute for Genomic Research (TIGR), 11 were seen only once in our collection of strains and two SNPs were seen three times.</p>
				<p>Population genetic inference typically assumes that study samples are selected without prior knowledge of their patterns of genetic variation. For this study, we selected diverse strains from widely distant geographic regions in an attempt to sample the full extent of genetic variation in <it>B. anthracis</it>. The number of SNPs identified, the amount of sequence generated and the nucleotide diversity <abbrgrp><abbr bid="B44">44</abbr></abbrgrp> of the 56 strains are given in Table <tblr tid="T2">2</tblr>. We performed analyses for sequences comprising the total dataset, for each genomic region separately, and for the total dataset with each resequenced base assigned into an annotated SNP class. We report three main findings. First, the total average level of DNA sequence variation in <it>B. anthracis </it>is very low. This finding is in agreement with previous studies <abbrgrp><abbr bid="B11">11</abbr><abbr bid="B28">28</abbr></abbrgrp>. This level of genetic variation is much lower than that seen in commonly studied bacterial species <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>, roughly half of that observed in the human genome and 25-fold lower than that observed in <it>D. melanogaster </it><abbrgrp><abbr bid="B38">38</abbr><abbr bid="B39">39</abbr><abbr bid="B45">45</abbr><abbr bid="B46">46</abbr><abbr bid="B47">47</abbr><abbr bid="B48">48</abbr></abbrgrp>. Second, the <it>B. anthracis </it>chromosome appears less variable than either the pXO1 or pXO2 plasmids, although this difference is not statistically significant. Third, the patterns of genetic variation by SNP class (see Table <tblr tid="T2">2</tblr> and Additional data file 4) are similar to those seen in other well-studied bacterial <abbrgrp><abbr bid="B14">14</abbr></abbrgrp> and eukaryotic genomes <abbrgrp><abbr bid="B45">45</abbr></abbrgrp>. Silent sites, those sites that when mutated do not alter the protein primary structure, are significantly more variable than are amino acid altering replacement sites (<it>P </it>= 0.0011). Intergenic regions are observed to have intermediate levels of genetic variation, whereas replacement sites, those sites that when mutated alter the protein primary structure, are the least variable. Replacement sites are marginally significantly less variable than intergenic sites (<it>P </it>= 0.039) whereas silent sites are not significantly more variable than intergenic sites (<it>P </it>= 0.22).</p>
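				<p>One way to relate SNP counts to per-site diversity is an estimator based on the number of segregating sites, sketched below. Two assumptions are made here and are not stated in the text: that the diversity column of Table 2 is this segregating-sites estimator, and that the "total amount resequenced" column is summed over all 56 strains, so that the per-strain length is that total divided by 56. The function names are illustrative.</p>

```python
SAMPLE_SIZE = 56  # number of strains in the panel


def harmonic_term(n):
    """a_n = sum of 1/i for i = 1 .. n-1."""
    return sum(1.0 / i for i in range(1, n))


def theta_per_site(snps, total_bp, n=SAMPLE_SIZE):
    """Segregating-sites estimate of diversity per site.

    total_bp is assumed to be summed over all n strains, so the
    per-strain sequence length is total_bp / n.
    """
    sites_per_strain = total_bp / n
    return snps / harmonic_term(n) / sites_per_strain


# (number of SNPs, total bp resequenced) copied from Table 2.
table_2 = {
    "Total": (37, 1_544_913),
    "Chromosome": (18, 874_564),
    "pXO1": (9, 325_397),
    "pXO2": (10, 344_952),
    "Silent": (15, 243_481),
    "Replacement": (9, 898_837),
    "Intergenic": (13, 402_595),
}

for label, (snps, total_bp) in table_2.items():
    print(f"{label:12s} {theta_per_site(snps, total_bp) * 1e4:.2f} per 10,000 sites")
```

				<p>Under these assumptions the Total row comes out at about 2.9 per 10,000 sites. Other rows land close to, but not always exactly on, the tabulated values, which would be expected if the number of strains successfully called varies from site to site.</p>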
				<tbl id="T2" hint_layout="double">
					<title>
						<p>Table 2</p>
					</title>
					<caption>
						<p>Observed genetic variation in <it>B. anthracis</it></p>
					</caption>
					<tblbdy cols="5">
						<r>
							<c>
								<p/>
							</c>
							<c ca="center">
								<p>Observed number of SNPs</p>
							</c>
							<c ca="center">
								<p>Total amount resequenced (bp)</p>
							</c>
							<c ca="center">
								<p>Nucleotide diversity (&#215; 10<sup>4</sup>) &#177; 2 SEs</p>
							</c>
							<c ca="center">
								<p>Tajima's D</p>
							</c>
						</r>
						<r>
							<c cspan="5">
								<hr/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>
									<b>Total</b>
								</p>
							</c>
							<c ca="center">
								<p>37</p>
							</c>
							<c ca="center">
								<p>1,544,913</p>
							</c>
							<c ca="center">
								<p>2.9 &#177; 1.3</p>
							</c>
							<c ca="center">
								<p>-0.93</p>
							</c>
						</r>
						<r>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>
									<b>Genomic location</b>
								</p>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Chromosome</p>
							</c>
							<c ca="center">
								<p>18</p>
							</c>
							<c ca="center">
								<p>874,564</p>
							</c>
							<c ca="center">
								<p>2.5 &#177; 1.4</p>
							</c>
							<c ca="center">
								<p>-0.95</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>pXO1</p>
							</c>
							<c ca="center">
								<p>9</p>
							</c>
							<c ca="center">
								<p>325,397</p>
							</c>
							<c ca="center">
								<p>3.3 &#177; 2.4</p>
							</c>
							<c ca="center">
								<p>-0.54</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>pXO2</p>
							</c>
							<c ca="center">
								<p>10</p>
							</c>
							<c ca="center">
								<p>344,952</p>
							</c>
							<c ca="center">
								<p>3.5 &#177; 2.5</p>
							</c>
							<c ca="center">
								<p>-0.73</p>
							</c>
						</r>
						<r>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>
									<b>SNP class</b>
								</p>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Silent</p>
							</c>
							<c ca="center">
								<p>15</p>
							</c>
							<c ca="center">
								<p>243,481</p>
							</c>
							<c ca="center">
								<p>7.5 &#177; 4.3</p>
							</c>
							<c ca="center">
								<p>-0.55</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Replacement</p>
							</c>
							<c ca="center">
								<p>9</p>
							</c>
							<c ca="center">
								<p>898,837</p>
							</c>
							<c ca="center">
								<p>1.2 &#177; 0.80</p>
							</c>
							<c ca="center">
								<p>-0.64</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Intergenic</p>
							</c>
							<c ca="center">
								<p>13</p>
							</c>
							<c ca="center">
								<p>402,595</p>
							</c>
							<c ca="center">
								<p>3.8 &#177; 2.3</p>
							</c>
							<c ca="center">
								<p>-1.09</p>
							</c>
						</r>
					</tblbdy>
				</tbl>
				<p>The neutral theory of molecular evolution predicts a characteristic frequency spectrum of SNPs, or segregating sites, for populations at equilibrium <abbrgrp><abbr bid="B49">49</abbr></abbrgrp>. Deviations from this expected distribution are observed when an experimental population sample contains an excess of low frequency, rare SNPs, or an excess of high frequency, common SNPs, relative to the neutral expectation. These deviations can arise as a consequence of demographic history and/or the action of natural selection <abbrgrp><abbr bid="B50">50</abbr></abbrgrp>. Figure <figr fid="F2">2</figr> compares the observed and expected percentage of SNPs in four allele-frequency classes. The data suggest an observed excess of rare SNPs as compared to that expected under the neutral theory. For example, while the neutral theory predicts that approximately 60% of SNPs should have minor allele frequencies less than or equal to 0.25, we observe that more than 92% of the <it>B. anthracis </it>SNPs we discovered have minor allele frequencies that fall into this class, a statistically significant difference (Figure <figr fid="F2">2</figr>).</p>
				<fig id="F2">
					<title>
						<p>Figure 2</p>
					</title>
					<caption>
						<p><it>B. anthracis </it>SNP frequency spectrum</p>
					</caption>
					<text>
						<p><it>B. anthracis </it>SNP frequency spectrum. An excess of rare SNPs is observed in our sample. Ninety-two percent of the SNPs that we discovered have a minor allele frequency less than or equal to 0.25. This finding (92%) is significantly different from the neutral theory expectation (60%). This excess can arise as a consequence of rapid population expansion from a small founder population and/or the action of natural selection.</p>
					</text>
					<graphic file="gb-2004-6-1-r10-2"/>
				</fig>
				<p>We used Tajima's D statistic <abbrgrp><abbr bid="B50">50</abbr></abbrgrp> to further assess this pattern for the entire dataset, for SNPs from each genomic region and for each SNP class (Table <tblr tid="T2">2</tblr>). Tajima's D is a summary statistic for the site (or SNP) frequency spectrum, whose value is negative when there is an excess of rare variants and positive when there is an excess of common variants, relative to the neutral expectation. The test statistic is calculated from two different estimates of levels of genetic variation, the number of segregating sites <abbrgrp><abbr bid="B44">44</abbr></abbrgrp> and the average number of nucleotide differences estimated from pairwise comparisons <abbrgrp><abbr bid="B50">50</abbr></abbrgrp>. We observe that Tajima's D is negative for SNPs comprising the total dataset, each genomic region and each SNP class. While none of the individual test statistics is statistically significant, they collectively suggest an excess of rare variants in <it>B. anthracis</it>. If we scale our variation estimates drawn from the 0.5% resequenced in 56 <it>B. anthracis </it>genomes to the whole genome, we can estimate a range for the total number of SNPs that one would detect upon sequencing two random <it>B. anthracis </it>isolates, sampled in the same fashion as the isolates in this study. Our results indicate that we should expect to find, on average, between 944 (standard deviation (SD) 454) <abbrgrp><abbr bid="B50">50</abbr></abbrgrp> and 1,586 SNPs (SD 762) <abbrgrp><abbr bid="B44">44</abbr></abbrgrp>. A substantial proportion of these SNPs, probably more than expected under the neutral theory, would be rare.</p>
				<p>Using multiple sequence alignments of 17 genes from <it>B. anthracis </it>(NC_003997, Ames) and <it>B. cereus </it>(NC_004722, ATCC 14579 <abbrgrp><abbr bid="B9">9</abbr></abbrgrp> and NC_003909, ATCC 10987 <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>), the patterns of genetic polymorphism and divergence at silent and replacement sites were assessed. The raw counts are presented in Table <tblr tid="T3">3</tblr>. It is striking that the two <it>B. cereus </it>strains exhibit more polymorphism at silent and replacement sites than divergence from <it>B. anthracis</it>. This result confirms, at the DNA sequence level, previous results suggesting that the <it>B. cereus </it>species group is diverse and polyphyletic in origin. <it>B. anthracis </it>then appears to be a clonal lineage derived from, and nested within, a diverse species. In other words, the species names do not encompass or reflect the evolutionary history of the species <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B51">51</abbr><abbr bid="B52">52</abbr></abbrgrp>.</p>
				<tbl id="T3">
					<title>
						<p>Table 3</p>
					</title>
					<caption>
						<p>Observed patterns of polymorphism/divergence between <it>B. anthracis </it>(Ames) and <it>B. cereus </it>(ATCC 14579, ATCC 10987)</p>
					</caption>
					<tblbdy cols="3">
						<r>
							<c>
								<p/>
							</c>
							<c ca="center">
								<p>Silent sites</p>
							</c>
							<c ca="center">
								<p>Replacement sites</p>
							</c>
						</r>
						<r>
							<c cspan="3">
								<hr/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Polymorphic sites within <it>B. cereus </it>strains</p>
							</c>
							<c ca="center">
								<p>660</p>
							</c>
							<c ca="center">
								<p>136</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Divergent sites between <it>B. anthracis </it>and <it>B. cereus</it></p>
							</c>
							<c ca="center">
								<p>646</p>
							</c>
							<c ca="center">
								<p>125</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Polymorphic sites within <it>B. anthracis </it>strains</p>
							</c>
							<c ca="center">
								<p>11</p>
							</c>
							<c ca="center">
								<p>3</p>
							</c>
						</r>
					</tblbdy>
				</tbl>
			</sec>
			<sec>
				<st>
					<p>No evidence for recombination in <it>B. anthracis </it>chromosome</p>
				</st>
				<p>The 37 SNPs discovered on the <it>B. anthracis </it>chromosome and plasmids pXO1 and pXO2 possess, in total, 636 pairs of sites where two alleles are observed. In principle, the alleles at each pair of sites could form four distinct haplotypes. Plasmid transfer between different <it>B. anthracis </it>strains would affect physically unlinked site pairs, resulting in four distinct haplotypes. Homologous recombination or gene conversion between physically linked site pairs is also expected to produce all four haplotypes. The straightforward counting of the number of haplotypes that one detects in a large population sample, such as the one used in this study, is often referred to as the four-gamete test <abbrgrp><abbr bid="B53">53</abbr></abbrgrp>.</p>
				<p>Among the 636 site pairs in our sample, we observe 26 pairs of sites with two haplotypes, 610 pairs of sites with three haplotypes, and no pairs of sites with four haplotypes. This striking result implies that the value of D', the standardized measure of linkage disequilibrium (LD) <abbrgrp><abbr bid="B54">54</abbr></abbrgrp>, is equal to 1, its maximum value, for all site pairs that we observe. Among the 137 site pairs where we could have detected statistically significant LD at <it>P </it>&lt; 10<sup>-3</sup>, we observe that 52 site pairs exhibit statistically significant LD. Four of the six site pairs showing significant LD on the <it>B. anthracis </it>main chromosome are over 500 kb apart.</p>
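				<p>The haplotype counting behind the four-gamete test can be illustrated with a minimal sketch. It is not the code used in this study: the function names and the toy strains are hypothetical, sites are assumed to be biallelic, and 'N' is assumed to mark an uncalled base.</p>
				<p><![CDATA[
```python
from itertools import combinations


def haplotype_counts(genotypes):
    """Count distinct two-site haplotypes for every pair of variable sites.

    genotypes: equal-length strings, one per strain, one character per
    variable site ('N' marks a missing call and is skipped).
    Returns {(site_i, site_j): number of distinct haplotypes observed}.
    """
    n_sites = len(genotypes[0])
    counts = {}
    for i, j in combinations(range(n_sites), 2):
        observed = {
            (g[i], g[j]) for g in genotypes if g[i] != "N" and g[j] != "N"
        }
        counts[(i, j)] = len(observed)
    return counts


def passes_four_gamete_test(genotypes):
    """True when no pair of sites shows all four possible haplotypes."""
    return all(c < 4 for c in haplotype_counts(genotypes).values())


# Hypothetical toy data: five strains typed at three variable sites.
clonal = ["AAG", "AAG", "TAG", "TCG", "TCT"]
# Adding one strain that combines alleles in a new way creates a
# fourth haplotype at sites 0 and 1.
recombinant = clonal + ["ACG"]

print(haplotype_counts(clonal))
print(passes_four_gamete_test(clonal), passes_four_gamete_test(recombinant))
```
]]></p>
				<p>In the toy data every site pair shows three haplotypes, the pattern expected for purely clonal descent; a single strain carrying a new allele combination is enough to produce the fourth haplotype.</p>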
			</sec>
			<sec>
				<st>
					<p>Correlation of RA resequencing data with MLVA typing</p>
				</st>
				<p>Because of the low level of genetic variation in <it>B. anthracis </it>(<abbrgrp><abbr bid="B28">28</abbr><abbr bid="B29">29</abbr></abbrgrp> and this study), determining the phylogenetic relationship among <it>B. anthracis </it>strains has proven difficult. Twenty-four <it>B. anthracis </it>strains characterized with a single fluorescent AFLP primer combination were reported to be monomorphic <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. One recent MLST study sequenced seven housekeeping genes (approximately 3 kb total) in five <it>B. anthracis </it>strains and reported that the strains were monomorphic at the sites examined. Another recent MLST study sequenced seven genes (approximately 3 kb total) in 11 diverse <it>B. anthracis </it>strains, finding three polymorphic nucleotides <abbrgrp><abbr bid="B55">55</abbr></abbrgrp>. Neither the AFLP nor the MLST studies discovered and genotyped sufficient genetic variation to distinguish between <it>B. anthracis </it>strains.</p>
				<p>The most successful marker-based approach used to date, MLVA, determined the genotypes at eight VNTR loci in 426 <it>B. anthracis </it>isolates, enabling the construction of a phylogenetic tree of <it>B. anthracis </it>strains <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>. We sought to determine whether our resequencing of 0.5% of each of 56 <it>B. anthracis </it>genomes is capable of confirming the major phylogenetic groupings determined by MLVA. To test this, we concatenated the 37 variant positions for all strains in this study, calculated a distance matrix using a simple Kimura substitution model, and generated an Unweighted Pair Group Method with Arithmetic Mean (UPGMA) tree (see Materials and methods <abbrgrp><abbr bid="B56">56</abbr></abbrgrp>; Figure <figr fid="F3">3</figr>). The strains group together in a manner broadly similar to that found by Keim <it>et al. </it><abbrgrp><abbr bid="B29">29</abbr></abbrgrp>, with B strains forming an outgroup and most A strains being found together in the same subgroups (Figure <figr fid="F3">3</figr>). There are exceptions: one group in Figure <figr fid="F3">3</figr> contains a mix of A3a, A1a, A1b and A2 strains. This anomaly is probably due to the relatively few SNPs that effectively distinguish these groups when only 0.5% of the genome is sampled. All <it>B. anthracis </it>Ames strains except ASC394 correctly cluster in an A3b group. <it>B. anthracis </it>ASC394 may be a case of an originally mistyped or mislabeled strain. Nevertheless, our data suggest that limited, random resequencing of 0.5% of the 56 <it>B. anthracis </it>genomes discovers and genotypes sufficient genetic variation to determine the major phylogenetic relationships among <it>B. anthracis </it>strains.</p>
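				<p>The distance-and-clustering step can be sketched as follows. This is an illustration only, not the DNADIST/NEIGHBOR pipeline used for Figure 3: it assumes the Kimura two-parameter distance, a plain average-linkage (UPGMA) merge, and hypothetical strain names and sequence types.</p>
				<p><![CDATA[
```python
import math
from itertools import combinations

PURINES = {"A", "G"}


def k2p_distance(s1, s2):
    """Kimura two-parameter distance between two aligned sequences."""
    pairs = [(a, b) for a, b in zip(s1, s2) if a in "ACGT" and b in "ACGT"]
    diffs = [(a, b) for a, b in pairs if a != b]
    # A difference is a transition when both bases are purines
    # or both are pyrimidines.
    transitions = sum((a in PURINES) == (b in PURINES) for a, b in diffs)
    p = transitions / len(pairs)
    q = (len(diffs) - transitions) / len(pairs)
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def upgma(names, dist):
    """Average-linkage clustering; returns the tree as nested tuples.

    dist maps frozenset({name_a, name_b}) -> distance.
    """
    clusters = {name: (name, 1) for name in names}  # key -> (subtree, size)
    dist = dict(dist)
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda ab: dist[frozenset(ab)])
        (tree_a, size_a), (tree_b, size_b) = clusters.pop(a), clusters.pop(b)
        merged = a + "+" + b
        for c in clusters:
            d_ac, d_bc = dist[frozenset((a, c))], dist[frozenset((b, c))]
            # Size-weighted mean keeps every original strain equally weighted.
            dist[frozenset((merged, c))] = (
                size_a * d_ac + size_b * d_bc
            ) / (size_a + size_b)
        clusters[merged] = ((tree_a, tree_b), size_a + size_b)
    return next(iter(clusters.values()))[0]


# Hypothetical concatenated variant positions for three strains.
seqs = {"A_1": "ACGTACGTAC", "A_2": "ACGTACGTAT", "B_1": "GCGTTCGTAT"}
dist = {
    frozenset((x, y)): k2p_distance(seqs[x], seqs[y])
    for x, y in combinations(seqs, 2)
}
tree = upgma(list(seqs), dist)
print(tree)
```
]]></p>
				<p>With so few variable positions, many strains share identical sequence types and sit at distance zero from one another, which is one reason why closely related groups can be hard to separate.</p>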
				<fig id="F3">
					<title>
						<p>Figure 3</p>
					</title>
					<caption>
						<p>Radial tree showing inferred phylogenetic relationships of <it>B. anthracis </it>strains from this study</p>
					</caption>
					<text>
						<p>Radial tree showing inferred phylogenetic relationships of <it>B. anthracis </it>strains from this study. The 37 variable positions identified in this study were concatenated together to create artificial sequence types. Groups of strains with identical sequence types were A0488 and ASC006; A0039, ASC025, ASC031, ASC070, ASC074 and ASC394; ASC074 and ASC054; A0328, ASC061 and ASC073; A0034, ASC159, ASC165 and ASC398. A DNA distance matrix was created using DNADIST, plotted as a UPGMA tree using NEIGHBOR and the tree plotted using DRAWGRAM [56]. The B1 strain A0465 was used as an outgroup.</p>
					</text>
					<graphic file="gb-2004-6-1-r10-3"/>
				</fig>
			</sec>
		</sec>
		<sec>
			<st>
				<p>Discussion</p>
			</st>
			<p>Population genomics requires the random sampling of genome-wide patterns of DNA sequence variation in a large number of organisms. Such studies require high-throughput, highly accurate, cost-effective resequencing technologies. While the conventional industrial-scale shotgun-sequencing model is clearly the best technology available for <it>de novo </it>generation of genomic sequence, it may not be the best approach for resequencing large numbers of strains. RAs, as originally applied for human genome resequencing <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>, offer one competing technology that can rapidly produce very high-quality data with limited personnel and infrastructure requirements. Our application of RAs to resequence multiple genomic regions in the biowarfare pathogen, <it>Bacillus anthracis</it>, further supports this perspective.</p>
			<p>Studies of DNA sequence variation are most informative when both rare and common variants are identified. While the limited ascertainment of selected common variants can be employed to identify broad evolutionary relationships among bacterial genomes, and in fact underlies most bacterial strain typing methodologies, the ultimate forensic application of resequencing lies in the ascertainment of rare, presumably newly arising variants that may allow more precise determination of a strain's origin. Rare variants may be particularly informative since they are likely to be restricted to specific strains (substrains/isolates). Strain genotyping of common variants provides an incomplete description of genomic patterns of DNA sequence variation, while obtaining most or all of the genomic sequence from multiple strains allows maximally informative analyses of DNA sequence variation, its function, and ultimately, the evolutionary history of the organisms. The ability to rapidly, accurately and inexpensively resequence entire bacterial genomes should also contribute to an understanding of a variety of important phenotypic traits in <it>B. anthracis </it>and other bacterial pathogens <abbrgrp><abbr bid="B57">57</abbr><abbr bid="B58">58</abbr><abbr bid="B59">59</abbr><abbr bid="B60">60</abbr><abbr bid="B61">61</abbr><abbr bid="B62">62</abbr></abbrgrp>.</p>
			<p>Our study demonstrates that microarray-based resequencing is technologically robust and generates highly replicable and accurate data when compared to alternative sequence technologies (Table <tblr tid="T1">1</tblr>). In this experiment, 115 RAs, or 97.5% of the total attempted, were processed successfully obtaining an average high-quality base-calling rate of 92.6%. Called bases are shown to be highly replicable (discrepancy rate of 7.4 &#215; 10<sup>-7</sup>) and accurate when compared to conventional shotgun sequence (discrepancy rate of &lt; 2.5 &#215; 10<sup>-6</sup>). Clearly, RA-generated resequencing data from a single experiment is comparable, in terms of data quality, to DNA sequence generated from multiple shotgun reads by a DNA sequencing center. The major technical challenge facing RA-based resequencing is to increase overall call rates while not compromising data quality. Modifications of RA synthesis, experimental protocols and the ABACUS software algorithm could all contribute to improved base-calling rates. While it is possible to increase call rates while sacrificing data quality, there is a need to focus on generating very high-quality data at virtually all sites. If this is absent, the second-best outcome is to call all bases in an environment in which we understand the nature of probable errors. In diverse fields where RAs might be widely used as a first-stage screening tool, such as BW agent identification or human clinical testing, the imperative is to use highly sensitive technologies that minimize the false-negative rate. False-positive findings could be confirmed later in a second-stage screen with an alternative technology such as conventional dideoxy chain termination sequencing.</p>
			<p>Microarray-based resequencing identifies and genotypes SNPs in a single experiment. No prior knowledge of the variability of a site is required - only a reference genomic sequence. Microarray design and applications are flexible. It is, however, important to note that the use of RAs in this study is not as a SNP typing technology. Thus, problems in interpreting the inferred phylogenetic relationships between strains that arise from SNP typing schema are avoided <abbrgrp><abbr bid="B63">63</abbr></abbrgrp>. RA-based resequencing resembles MLST methodology used for bacterial strains <abbrgrp><abbr bid="B52">52</abbr><abbr bid="B55">55</abbr><abbr bid="B64">64</abbr></abbrgrp>. MLST attempts to choose the most informative genomic regions to resequence, largely because of the costs associated and technological limitations in generating enough DNA sequence data on a large collection of variant strains. While a typical MLST approach might resequence between 3 and 4 kb, in organisms like <it>B. anthracis </it>that have low levels of genetic variation (<abbrgrp><abbr bid="B28">28</abbr><abbr bid="B51">51</abbr><abbr bid="B55">55</abbr></abbrgrp> and this study), this amount of generated sequence is insufficient. Clearly, RAs, such as those used in this study that can resequence approximately 29 kb, could rapidly increase this amount and be used for MLST studies. Furthermore, manufacturing improvements that reduce RA feature sizes enable the resequencing of greater quantities of genomic sequence per microarray. Ongoing work at NMRC/BDRD is evaluating RAs that can resequence 300 kb per chip. At that RA feature density, when combined with whole-genome amplification protocols, a single technician in two days could resequence the entire <it>B. anthracis </it>genome on approximately 15 RAs.</p>
			<p>Our data provide the first population genetic estimation of the levels and patterns of DNA sequence variation in <it>B. anthracis</it>. We report three main findings. First, among <it>B. anthracis </it>isolates sampled in the same fashion as in this study we would expect two randomly selected <it>B. anthracis </it>strains to differ, on average, at between 944 (SD 454) and 1,586 SNPs (SD 762). The variance surrounding these expectations is large, and any two isolates may differ from the expectation. Closely related, nonrandomly sampled isolates, such as those sequenced in <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>, will have far fewer SNPs than expected for samples drawn from a worldwide collection. Nevertheless, our data suggest that were it possible to rapidly resequence entire <it>B. anthracis </it>genomes, sufficient genetic variation is likely to be found to allow very fine-level discrimination of strain collections. Resequencing offers the best chance to identify newly arising, rare, strain-specific variants that will discriminate between very closely related strains, since we expect identical genotypes at the known common genetic variants <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>. We also observe that, as seen in eukaryotic genomes <abbrgrp><abbr bid="B45">45</abbr></abbrgrp>, the amount of silent variation per site within genes is much higher than that seen at replacement sites. Intergenic regions are seen to have intermediate levels of polymorphism. This pattern is expected to arise if noncoding intergenic regions possess variants visible to natural selection. If SNPs in intergenic regions were purely neutral, then we would expect to see levels of polymorphism similar to those at silent sites, which are undoubtedly under less stringent selective forces.</p>
			<p>Second, the neutral theory of molecular evolution predicts that in a population at equilibrium, a significant proportion of the observed genetic variation will consist of rare genetic variants <abbrgrp><abbr bid="B49">49</abbr></abbrgrp>. We observe a significant excess of rare SNPs as compared to that expected under the neutral theory (Table <tblr tid="T2">2</tblr>). This pattern of variation classically has at least two possible causes. The first consists of a recent population expansion from a small founder population. The second consists of the action of natural selection on genetic variants <abbrgrp><abbr bid="B65">65</abbr><abbr bid="B66">66</abbr><abbr bid="B67">67</abbr></abbrgrp>. Resequencing technologies will be of particular use in populations of organisms exhibiting this pattern of genetic variation.</p>
			<p>Finally, we see no evidence for plasmid exchange or recombination altering the patterns of DNA sequence variation among <it>B. anthracis </it>strains in the regions that we resequenced. Some of the regions that we resequenced contain genes whose function influences <it>B. anthracis </it>pathogenicity or surround the bacterial origin of replication. In other bacterial species, these types of regions are the most likely to exhibit recombination <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. The fact that we observe no evidence of plasmid exchange or recombination among physically linked markers in the regions that we resequenced is striking.</p>
			<p>The simplest interpretation of this observation is that the <it>B. anthracis </it>strains that we examined are ultimately derived from a single clonal ancestor and that the exchange of plasmids and recombination between strains during the course of their evolution is either very rare or nonexistent. While models of natural selection could also account for the patterns that we see, we think a simple demographic model of recent, rapid clonal expansion is parsimonious and best supported by our data. Hence, our findings suggest that <it>B. anthracis </it>populations consist of multiple closely related clones whose life histories prevent the opportunity for homologous recombination between different strains. We note, however, that while we resequenced 0.5% of the <it>B. anthracis </it>genome, including regions where we expected to detect recombination, further data collection from multiple genomic regions, or the entire genome, would allow a more thorough analysis of this pattern. Sequencing a larger percentage of the genome in a similar-sized or larger sample of isolates would provide greater power to detect rare recombination events. We are undertaking such a project to test the validity of our inference and to better determine if recombination is rare or absent among <it>B. anthracis </it>strains.</p>
			<p>The absence of recombination in <it>B. anthracis</it>, a potential biowarfare agent, suggests a novel approach to identifying a newly arising or a genetically engineered strain. A recombination event could arise through rare natural genetic exchange or as a consequence of genetic engineering. Irrespective of the cause, the discovery of a <it>B. anthracis </it>strain possessing evidence of genetic recombination would warrant close examination and probably demand immediate further phenotypic and genomic characterization.</p>
			<p>Taken together, the findings of a low number of differences between strains, a preponderance of rare variants, and an absence of recombination all point to a scenario where the current world population of <it>B. anthracis </it>has expanded recently from a single clone derived from, and nested within a diverse species, <it>B. cereus</it>. Other bacterial pathogens, such as the potential biowarfare agent <it>Yersinia pestis</it>, possess a similar recent pattern of rapid expansion <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. However, the patterns of genetic variation in <it>Y. pestis </it>are quite different from that seen in <it>B. anthracis</it>, for instance in the much more active role of insertion sequences in <it>Yersinia</it>. We speculate that the <it>B. anthracis </it>history of clonal expansion could arise as a consequence of the life history of a highly pathogenic sporulating mammalian pathogen. Exploring the population biology of less virulent members of the <it>B. cereus </it>group could directly test this. These population genomics studies could determine if clonal clusters of <it>B. cereus </it>strains exhibit similar population dynamics and patterns of genetic variation, or whether the picture of <it>B. anthracis </it>emerging from studies such as this is as unusual as the level of pathogenicity of the species itself.</p>
		</sec>
		<sec>
			<st>
				<p>Conclusions</p>
			</st>
			<p>Microarray-based resequencing can rapidly generate very high-quality data, enabling population genomics studies in bacteria. We find no evidence for plasmid exchange or recombination altering the patterns of DNA sequence variation among <it>B. anthracis </it>strains in the regions that we resequenced. The patterns of genetic variation in the <it>B. anthracis </it>regions resequenced are consistent with those expected for a bacterial species that has undergone a rapid, historically recent expansion from a single clone. Detecting plasmid exchange or recombination between <it>B. anthracis </it>genetic variants could act as an indicator of a newly emerging or genetically engineered strain.</p>
		</sec>
		<sec>
			<st>
				<p>Materials and methods</p>
			</st>
			<sec>
				<st>
					<p><it>B. anthracis </it>strains surveyed</p>
				</st>
				<p>We selected a geographically diverse panel of 56 <it>B. anthracis </it>strains from the Biological Defense Research Directorate collection (see Additional data file 1). Twenty-four of the strains originated from the Louisiana State University collection <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>. These have been typed by MLVA <abbrgrp><abbr bid="B29">29</abbr></abbrgrp> and, in order to sample diversity, we chose a group that had representatives of the A1a, A1b, A2, A3a, A3b, A3d, A4, B1 and B2 lineages. The remaining 35 strains originated from a UK collection and were chosen to represent geographical variation as well as unusual phenotypes such as gamma phage and penicillin resistance. Six of the UK strains were reisolates of the Ames strain <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>, which allowed us to test the reproducibility of resequencing.</p>
			</sec>
			<sec>
				<st>
					<p>Resequencing array design</p>
				</st>
				<p>Unique genomic sequences were identified using Miropeats <abbrgrp><abbr bid="B68">68</abbr></abbrgrp> at the default thresholds from among the <it>B. anthracis </it>Ames chromosome (5.2 megabase-pair (Mb), NC_003997) and plasmids pXO1 (181.6 kb, NC_001496) and pXO2 (96.2 kb, NC_002146). The genomic regions that we resequenced included at least one gene of interest (pXO1: toxin lethal factor precursor <it>lef</it>, toxin moiety, protective antigen <it>pagA</it>; pXO2: encapsulation protein gene <it>CapC</it>; Ames chromosome: <it>vrrA</it>, DNA-directed RNA polymerase <it>rpoB</it>, <it>yfhp </it>protein), but also included many surrounding loci (see Additional data file 4 for complete listing). The total chip design consisted of 6,191 bp from pXO1, 6,725 bp from pXO2, and 16,584 bp from the Ames chromosome (total submitted sequence 29,500 bp). From these unique sequences, a single 20 &#215; 25 &#956;m RA design capable of resequencing 29,212 bp or 0.5% of the <it>B. anthracis </it>genome was fabricated by Affymetrix (see Additional data file 3). The final sequences submitted for RA design are contained in Additional data file 5.</p>
			</sec>
			<sec>
				<st>
					<p><it>B. anthracis </it>strain genomic DNA isolation</p>
				</st>
				<p>Five milliliters of brain heart infusion (BHI) was inoculated and grown for 12-16 h at 37&#176;C. One-ml aliquots of cells were centrifuged for 10 min at 5,000-7,500<it>g</it>. Pellets were resuspended in 720 &#956;l enzymatic lysis buffer (20 mM Tris-Cl pH 8.0, 2 mM EDTA, 1.2% Triton X-100, 20 mg/ml lysozyme) and incubated at 37&#176;C for 1 h. After incubation, 100 &#956;l of Proteinase K was added along with 800 &#956;l of Qiagen buffer AL, and the mixture was incubated at 70&#176;C for an additional 30 min. Then, 800 &#956;l of 100% ethanol was added and the mixture was split across four Qiagen DNeasy tissue kit columns. The DNA was then washed and eluted according to the Qiagen protocol. After the DNA was eluted, it was passed through a 0.22 &#956;m filter. Sterility was confirmed by plating 10% of the DNA preparation directly on SBA plates, with a second 10% inoculated into a 5 ml broth culture. The plate and the broth were allowed to incubate for 7 days. Two hundred microliters of the broth culture was subcultured onto SBA at day 4. If there was no growth on any of these cultures, the DNA was considered sterile and removed from the BSL-3 lab for subsequent analyses.</p>
			</sec>
			<sec>
				<st>
					<p>Sample preparation and RA hybridization</p>
				</st>
				<p>Genomic DNA was amplified using Long PCR (LPCR) protocols described in Cutler <it>et al</it>. <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. The primers that amplified each RA fragment are shown in Additional data file 3. The primer sequences were:</p>
				<p>ant8 AAAAAGACGAGATGCGTCAACATCCCGTCCCA,</p>
				<p>ant9 TCAACTAAATCCGCACCTAGGGTTGCTGTAAG,</p>
				<p>ant10 ATTACTTTGAGTGGTCCCGTCTTTATCCCCCT,</p>
				<p>ant11 ACATTAGCAGGCAAGGACAGTGGTGTTGGAGA,</p>
				<p>ant14 ATTCACGCTCTCCCACCCAGATATTCCTACAT,</p>
				<p>ant15 GTCCTAATATCGGTGAGCAACGCAGGGTAGTT,</p>
				<p>ant20 GAGAAGAACCCCTACTACACGCATTGATACTG,</p>
				<p>ant21 TTTAGTAGCGAGGGTACAGGCGCGTTTATACC,</p>
				<p>ant26 TGGAAGCAGGCTTCGTAAGTGTAGGCGACGTT,</p>
				<p>ant27 GTTGCATGTTCGCTCCCATAAGTGCGCGGTTA,</p>
				<p>ant32 AATGGGTGTATAGGGGTGATCTGTTGTGATGG,</p>
				<p>ant33 TCCATGTTCGGCCATCTGATTCCGTCACTACT.</p>
				<p>Long PCR product concentration was determined using PicoGreen (Molecular Probes, Inc.) with lambda DNA standards (Invitrogen). The LPCR products were then pooled, DNase digested, biotin end-labelled and hybridized to individual RAs overnight following established protocols <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. Subsequent washes were carried out as described in Cutler <it>et al. </it><abbrgrp><abbr bid="B32">32</abbr></abbrgrp>, except that the RAs were washed only and not antibody stained. RAs were scanned at 570 nm, with a pixel size of 3 &#956;m/pixel averaged over two scans. Automated grid alignment and base calling were performed on the .DAT files on a Mac G5 computer with the ABACUS software suite.</p>
			</sec>
			<sec>
				<st>
					<p>RA sequence determination</p>
				</st>
				<p>An ABACUS parameter search was employed to determine those parameters that called the maximal number of bases while minimizing discrepancies <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. The total experiment consisted of 118 RAs, of which three failed (&lt; 60% base calling). Of the remaining 115 RAs, 8 were used to sequence individual strains once. Of the remaining 107 RAs, 96 were used to hybridize 48 <it>B. anthracis </it>strains in duplicate, while the remaining 11 RAs were used as additional replicates of these same strains. In total, sequence data were generated from 56 unique <it>B. anthracis </it>strains (see Additional data file 1 for strain listing). In order to obtain the most complete data possible, for those strains with replicate RA sequences, a single composite strain sequence was generated for subsequent population genetic analyses. The current version of the ABACUS algorithm is not designed to detect insertion/deletion variation.</p>
				<p>The effect of oligonucleotide probe composition was determined by choosing for each base, the probe with the most purines or the most guanines. The number of times that a given base was called was tabulated across all 115 successful RAs. The mean purine and guanine composition was determined for the classes that were called in all 115 RAs and uncalled in all 115 RAs. A Student's <it>t </it>test with unequal variances was used to test for difference in mean sequence composition (purines/guanines) between the always called and never called classes. The DNA sequence files for the 115 RAs and the original RA image files (.DAT files) are available from the authors and will be made available through the NCBI Trace Archive.</p>
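				<p>The unequal-variance (Welch) form of Student's <it>t </it>test mentioned above can be sketched as follows. The function name and the purine counts are hypothetical and serve only to show the calculation; the sketch returns the test statistic and the Welch-Satterthwaite degrees of freedom, from which a <it>P </it>value would be obtained from the <it>t </it>distribution.</p>
				<p><![CDATA[
```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for a two-sample t test with unequal variances."""
    # Squared standard error contributed by each sample.
    va = variance(sample_a) / len(sample_a)
    vb = variance(sample_b) / len(sample_b)
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (
        va ** 2 / (len(sample_a) - 1) + vb ** 2 / (len(sample_b) - 1)
    )
    return t, df


# Hypothetical purine counts per probe for the two classes of bases.
always_called = [14, 15, 13, 16, 14, 15]
never_called = [10, 11, 9, 12, 10, 11]

t_stat, dof = welch_t(always_called, never_called)
print(round(t_stat, 3), round(dof, 1))
```
]]></p>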
			</sec>
			<sec>
				<st>
					<p>Population genetic analyses</p>
				</st>
				<p>All population genetic analyses were performed using the popgen_fasta2.0.c code (Cutler DJ, unpublished work) on the collection of 56 sets of <it>B. anthracis </it>fasta files. The fasta files were analyzed in total and separately for the main chromosome and plasmids pXO1 and pXO2. The identification of genes was taken from publicly available annotation contained in the relevant GenBank RefSeq files (<it>B. anthracis </it>str. Ames, NC_003997; pXO1, NC_001496; pXO2, NC_002146). The statistical significance of linkage disequilibrium between site pairs was assessed using Fisher's exact test at <it>P </it>&lt; 10<sup>-3</sup><abbrgrp><abbr bid="B69">69</abbr></abbrgrp>.</p>
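				<p>A minimal sketch of Fisher's exact test applied to a two-site haplotype table is given below. It is not taken from popgen_fasta2.0.c; the function names and haplotype counts are hypothetical. The second function illustrates why only some site pairs can reach <it>P </it>&lt; 10<sup>-3</sup>: when both minor alleles are very rare, even perfect association cannot give a small <it>P </it>value.</p>
				<p><![CDATA[
```python
from math import comb


def table_probability(a, row1, col1, n):
    """Hypergeometric probability of a 2x2 table with top-left cell a."""
    return comb(col1, a) * comb(n - col1, row1 - a) / comb(n, row1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    p_obs = table_probability(a, row1, col1, n)
    probs = [table_probability(x, row1, col1, n) for x in range(lo, hi + 1)]
    # Sum all tables that are no more probable than the observed one.
    return sum(p for p in probs if p <= p_obs * (1 + 1e-9))


def min_attainable_p(row1, col1, n):
    """Smallest P value any table with these margins could give."""
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    return min(
        fisher_exact_two_sided(x, row1 - x, col1 - x, n - row1 - col1 + x)
        for x in range(lo, hi + 1)
    )


# Hypothetical example with 56 strains.
# Two singleton SNPs: significance at 1e-3 is unattainable.
print(min_attainable_p(1, 1, 56))
# Two SNPs whose minor alleles occur together in the same 10 strains.
print(fisher_exact_two_sided(10, 0, 0, 46))
```
]]></p>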
			</sec>
			<sec>
				<st>
					<p>Estimating levels of genetic variation</p>
				</st>
				<p>To account for missing data, &#952; is estimated by [&#931;<sub>n</sub>(S<sub>n</sub>/a<sub>n</sub>)]/L, where S<sub>n </sub>is the number of observed segregating sites at positions with exactly n alleles sequenced (n is a maximum of 56, fewer with missing data), a<sub>n </sub>= &#931;<sub>i = 1..<it>n</it>-1 </sub>1/i, and L is the total length of the sequence examined. Var{&#952;} is estimated by [&#931;<sub>n </sub>(L<sub>n</sub>&#952;/a<sub>n </sub>+ (L<sub>n</sub>)<sup>2</sup>b<sub>n</sub>&#952;<sup>2</sup>/(a<sub>n</sub>)<sup>2</sup>)]/L<sup>2</sup>, where L<sub>n </sub>is the number of sites with data from exactly n alleles, and b<sub>n </sub>= &#931;<sub>i = 1..<it>n</it>-1 </sub>1/i<sup>2</sup>. With missing data, &#960; is estimated by [&#931;<sub>i </sub>2p<sub>i</sub>q<sub>i</sub>n<sub>i</sub>/(n<sub>i </sub>- 1)]/L, where the sum is taken over all sites i, p<sub>i </sub>and q<sub>i </sub>are the allele frequencies at site i, and n<sub>i </sub>is the number of alleles sequenced at site i.</p>
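				<p>The two point estimators above can be written out as a short sketch. It is an illustration under stated assumptions rather than the popgen_fasta2.0.c implementation: sites are assumed to be biallelic, each alignment column is assumed to hold only the called bases (missing calls already removed), and the function names and toy columns are hypothetical.</p>
				<p><![CDATA[
```python
from collections import Counter


def a_n(n):
    """a_n = sum of 1/i for i = 1 .. n-1."""
    return sum(1.0 / i for i in range(1, n))


def theta_and_pi(columns):
    """Per-site theta and pi from alignment columns with missing data.

    columns: one string per site containing the called bases only.
    Returns (theta, pi), each divided by L, the number of sites examined.
    """
    length = len(columns)
    theta_sum = 0.0
    pi_sum = 0.0
    for col in columns:
        n = len(col)
        counts = Counter(col)
        if n < 2 or len(counts) < 2:
            continue  # monomorphic or uninformative site
        # Each segregating site with n alleles sequenced adds 1/a_n.
        theta_sum += 1.0 / a_n(n)
        # Heterozygosity 2pq with the n/(n-1) small-sample correction.
        p = counts.most_common(1)[0][1] / n
        pi_sum += 2 * p * (1 - p) * n / (n - 1)
    return theta_sum / length, pi_sum / length


# Hypothetical toy alignment: four sites, the third with one missing call.
columns = ["AAAA", "AAAT", "CCG", "GGGG"]
theta, pi = theta_and_pi(columns)
print(theta, pi)
```
]]></p>
				<p>Grouping sites by the number of alleles actually sequenced, as in the formula for &#952;, is what allows sites with different amounts of missing data to be combined in a single estimate.</p>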
				<p>To determine whether the estimates of &#952; for the SNP types (silent, replacement, intergenic) differ significantly, we used the number of samples sequenced, the number of segregating sites, and the length of the region to find a maximum-likelihood estimate of &#952; per site for each SNP type using equations 11 and 12 in Hudson <abbrgrp><abbr bid="B70">70</abbr></abbrgrp>. We compared all possible pairs of SNP types (silent vs replacement, silent vs intergenic, replacement vs intergenic). For a given pair of SNP types, we first determined the maximum-likelihood estimator of &#952; for each type individually. We then determined the maximum-likelihood estimator of &#952; assuming that both types had an identical &#952; per site. Finally, we used a likelihood ratio test to ask whether the model with a different &#952; for each type fits significantly better than the model with a single &#952;. Reported significances are the <it>p</it>-values from the likelihood ratio test.</p>
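				<p>The structure of this likelihood ratio test can be sketched as follows. Note the assumption: the sketch substitutes a Poisson likelihood for the number of segregating sites, with mean &#952;La<sub>n</sub>, in place of the likelihood from equations 11 and 12 of Hudson, so it shows the logic of the comparison rather than reproducing the reported values. It also assumes that both classes have the same number of samples, at least one segregating site each, and a chi-square reference distribution with one degree of freedom. All names and counts are hypothetical.</p>

```python
import math

def poisson_loglik(seg_sites, mean):
    """Log-probability of observing seg_sites under a Poisson with this mean."""
    return seg_sites * math.log(mean) - mean - math.lgamma(seg_sites + 1)

def theta_lrt(seg_1, len_1, seg_2, len_2, n_samples):
    """Likelihood ratio test of equal per-site theta in two SNP classes."""
    a_n = sum(1.0 / i for i in range(1, n_samples))
    # Separate maximum-likelihood estimates (alternative model)
    theta_1 = seg_1 / (len_1 * a_n)
    theta_2 = seg_2 / (len_2 * a_n)
    # Shared maximum-likelihood estimate (null model)
    theta_0 = (seg_1 + seg_2) / ((len_1 + len_2) * a_n)
    ll_alt = (poisson_loglik(seg_1, theta_1 * len_1 * a_n)
              + poisson_loglik(seg_2, theta_2 * len_2 * a_n))
    ll_null = (poisson_loglik(seg_1, theta_0 * len_1 * a_n)
               + poisson_loglik(seg_2, theta_0 * len_2 * a_n))
    statistic = max(0.0, 2.0 * (ll_alt - ll_null))
    # Chi-square survival function with one degree of freedom
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return theta_1, theta_2, theta_0, statistic, p_value

# Hypothetical counts: 5 vs 30 segregating sites in two classes of equal length
result = theta_lrt(5, 10000, 30, 10000, 56)
```

				<p>The alternative model has one more free parameter than the null model, which is why a single degree of freedom is used for the reference distribution.</p>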
			</sec>
			<sec>
				<st>
					<p>Site frequency spectrum</p>
				</st>
				<p>Comparing the observed site frequency spectrum with that expected under the neutral theory is a powerful approach to detect unusual patterns of genetic diversity. We employed two different approaches for this analysis. First, we calculated the expected number of sites with minor allele frequency i as &#931;<sub>n</sub>&#952;L<sub>n </sub>[1/i + 1/(n - i)] and from this determined the expected percentage of sites in each class under neutrality. This is directly compared with the observed percentage of SNPs in Figure <figr fid="F2">2</figr>. Confidence intervals for the sample proportion of each SNP minor allele frequency class were calculated as</p>
				<p>
					<graphic file="gb-2004-6-1-r10-i1.gif"/>
				</p>
				<p>where N is the number of SNPs observed for each class, <graphic file="gb-2004-6-1-r10-i2.gif"/> is their observed frequency, and <graphic file="gb-2004-6-1-r10-i3.gif"/>.</p>
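				<p>The expected spectrum described above can be sketched as follows. This is an illustration, not the authors' code. The index i is treated as the count of the minor allele among n sequenced alleles, and the sketch assumes that the class with i = n - i is counted once rather than twice, a case the formula in the text does not spell out. All names are hypothetical.</p>

```python
def expected_minor_allele_spectrum(theta, len_by_n):
    """Expected share of SNPs in each minor allele count class.

    theta is the per-site estimate; len_by_n maps n (alleles sequenced)
    to L_n (number of sites with exactly n alleles sequenced).
    """
    expected = {}
    for n, l_n in len_by_n.items():
        for i in range(1, n // 2 + 1):
            if 2 * i == n:
                # Assumption: the class i = n - i is counted once, not twice
                weight = 1.0 / i
            else:
                weight = 1.0 / i + 1.0 / (n - i)
            expected[i] = expected.get(i, 0.0) + theta * l_n * weight
    total = sum(expected.values())
    return {i: count / total for i, count in sorted(expected.items())}

# Toy case: 100 sites, each with exactly four alleles sequenced
spectrum = expected_minor_allele_spectrum(0.01, {4: 100})
```

				<p>With this convention the class weights for a given n sum to a<sub>n</sub>, so the expected total number of segregating sites agrees with the estimator of &#952;.</p>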
				<p>As a second method, we employed Tajima's D statistic <abbrgrp><abbr bid="B50">50</abbr></abbrgrp>, estimated as (&#960; - &#952;)/[Var(&#960; - &#952;)]<sup>1/2</sup>. Under the neutral model, &#960; and &#952; have the same expectation, hence Tajima's D is expected to be 0. Since &#960; is a function of site heterozygosities and &#952; is a function of the total number of segregating sites, Tajima's D is negative (positive) with an excess (deficit) of rare sites. We used our estimated values of &#960; <abbrgrp><abbr bid="B51">51</abbr></abbrgrp> and &#952; <abbrgrp><abbr bid="B45">45</abbr></abbrgrp>, multiplied by the total <it>B. anthracis </it>genome length (5,505,178), to determine the number of SNPs that we would expect to observe between two <it>B. anthracis </it>strains sampled in the same random fashion as the isolates in this study. Using Equations 6-9 in <abbrgrp><abbr bid="B51">51</abbr></abbrgrp>, we calculated the variances of the <graphic file="gb-2004-6-1-r10-i4.gif"/> and &#952; estimators. The one standard deviation (SD) that we report is the square root of this variance.</p>
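				<p>For reference, the sketch below computes Tajima's D in its standard complete-data form, in which the difference between the two estimators is divided by the square root of its estimated variance. It assumes that every site was sequenced in all n samples, so it does not include the missing-data adjustments described earlier; the function name and the example values are hypothetical.</p>

```python
import math

def tajimas_d(n, seg_sites, pi_total):
    """Tajima's D for n sequences, S segregating sites and summed pairwise diversity."""
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Estimated variance of (pi - S/a1)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi_total - seg_sites / a1) / math.sqrt(variance)

# Four sequences, three segregating sites, every variant a singleton:
# each site contributes 3/6 pairwise differences, so pi_total = 1.5
d_rare = tajimas_d(4, 3, 1.5)
# Same sample with every variant at frequency 2/4: pi_total = 3 * 4/6 = 2.0
d_common = tajimas_d(4, 3, 2.0)
```

				<p>The two toy cases show the sign convention: an excess of rare variants pulls D below zero, and an excess of intermediate-frequency variants pushes it above zero.</p>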
			</sec>
			<sec>
				<st>
					<p>Phylogenetic tree inference</p>
				</st>
				<p>The 37 variable positions identified in this study were concatenated to create artificial sequence types. A DNA distance matrix was created using DNADIST, a UPGMA tree was built from it using NEIGHBOR, and the tree was plotted using DRAWGRAM <abbrgrp><abbr bid="B57">57</abbr></abbrgrp>.</p>
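				<p>The first step, building sequence types and comparing them pairwise, can be sketched as follows. This is an illustration only: the uncorrected proportion of differing sites is used here as a simple stand-in and should not be read as the distance that DNADIST computes. The strain names, positions and base calls are hypothetical.</p>

```python
from itertools import combinations

def sequence_types(calls_by_strain, variable_positions):
    """Concatenate each strain's base calls at the variable positions."""
    return {strain: "".join(calls[pos] for pos in variable_positions)
            for strain, calls in calls_by_strain.items()}

def p_distance(seq_a, seq_b, missing="N"):
    """Proportion of differing sites among positions called in both sequences."""
    compared = [(x, y) for x, y in zip(seq_a, seq_b)
                if x != missing and y != missing]
    if not compared:
        return float("nan")
    return sum(x != y for x, y in compared) / len(compared)

# Hypothetical base calls keyed by reference position
calls = {
    "strain_1": {101: "A", 2050: "G", 7333: "C"},
    "strain_2": {101: "A", 2050: "T", 7333: "C"},
    "strain_3": {101: "G", 2050: "T", 7333: "N"},
}
types = sequence_types(calls, [101, 2050, 7333])
distances = {(a, b): p_distance(types[a], types[b])
             for a, b in combinations(sorted(types), 2)}
```

				<p>A distance matrix of this kind is the input that a clustering method such as UPGMA works from.</p>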
			</sec>
		</sec>
		<sec>
			<st>
				<p>Additional data files</p>
			</st>
			<p>The following additional data are available with the online version of this article. Additional data file <supplr sid="s1">1</supplr> lists <it>B. anthracis </it>strains from the Biological Defense Research Directorate (BDRD) strain collection resequenced in this study. Additional data file <supplr sid="s2">2</supplr> lists the BDRD-01 RA fragment names, the GenBank reference sequence from which they are derived, the length of the unique genomic sequences submitted to RA design, the length of the unique genomic sequences capable of being queried, and the LPCR primer pairs used to amplify the RA fragments. Additional data file <supplr sid="s3">3</supplr> lists the <it>B. anthracis </it>SNPs identified in this study. The data include the BDRD SNP ID, the GenBank reference sequence and RA fragment containing the SNP, the SNP position relative to the GenBank reference sequence and the RA sequence, the SNP frequency, and the listing of the base calls in all strains at sites harboring SNPs. Additional data file <supplr sid="s4">4</supplr> lists the 31 <it>B. anthracis </it>genes partially or wholly resequenced in this study. The observed number of SNPs of each type (silent vs replacement) is provided for each gene. Finally, Additional data file <supplr sid="s5">5</supplr> shows the genomic sequences submitted to RA design for BDRD-01.</p>
			<suppl id="s1">
				<title>
					<p>Additional data file 1</p>
				</title>
				<caption>
					<p><it>B. anthracis </it>strains from the Biological Defense Research Directorate (BDRD) strain collection resequenced in this study</p>
				</caption>
				<text>
					<p><it>B. anthracis </it>strains from the Biological Defense Research Directorate (BDRD) strain collection resequenced in this study</p>
				</text>
				<file name="gb-2004-6-1-r10-s1.pdf">
					<p>Click here for additional data file</p>
				</file>
			</suppl>
			<suppl id="s2">
				<title>
					<p>Additional data file 2</p>
				</title>
				<caption>
					<p>The BDRD-01 RA fragment names, the GenBank reference sequence from which they are derived, the length of the unique genomic sequences submitted to RA design, the length of the unique genomic sequences capable of being queried, and the LPCR primer pairs used to amplify the RA fragments</p>
				</caption>
				<text>
					<p>The BDRD-01 RA fragment names, the GenBank reference sequence from which they are derived, the length of the unique genomic sequences submitted to RA design, the length of the unique genomic sequences capable of being queried, and the LPCR primer pairs used to amplify the RA fragments</p>
				</text>
				<file name="gb-2004-6-1-r10-s2.pdf">
					<p>Click here for additional data file</p>
				</file>
			</suppl>
			<suppl id="s3">
				<title>
					<p>Additional data file 3</p>
				</title>
				<caption>
					<p>The <it>B. anthracis </it>SNPs identified in this study. The data include the BDRD SNP ID, the GenBank reference sequence and RA fragment containing the SNP, the SNP position relative to the GenBank reference sequence and the RA sequence, the SNP frequency, and the listing of the base calls in all strains at sites harboring SNPs</p>
				</caption>
				<text>
					<p>The <it>B. anthracis </it>SNPs identified in this study. The data include the BDRD SNP ID, the GenBank reference sequence and RA fragment containing the SNP, the SNP position relative to the GenBank reference sequence and the RA sequence, the SNP frequency, and the listing of the base calls in all strains at sites harboring SNPs</p>
				</text>
				<file name="gb-2004-6-1-r10-s3.pdf">
					<p>Click here for additional data file</p>
				</file>
			</suppl>
			<suppl id="s4">
				<title>
					<p>Additional data file 4</p>
				</title>
				<caption>
					<p>The 31 <it>B. anthracis </it>genes partially or wholly resequenced in this study</p>
				</caption>
				<text>
					<p>The 31 <it>B. anthracis </it>genes partially or wholly resequenced in this study</p>
				</text>
				<file name="gb-2004-6-1-r10-s4.pdf">
					<p>Click here for additional data file</p>
				</file>
			</suppl>
			<suppl id="s5">
				<title>
					<p>Additional data file 5</p>
				</title>
				<caption>
					<p>The genomic sequences submitted to RA design for BDRD-01</p>
				</caption>
				<text>
					<p>The genomic sequences submitted to RA design for BDRD-01</p>
				</text>
				<file name="gb-2004-6-1-r10-s5.pdf">
					<p>Click here for additional data file</p>
				</file>
			</suppl>
		</sec>
	</bdy>
	<bm>
		<ack>
			<sec>
				<st>
					<p>Acknowledgements</p>
				</st>
				<p>Funding from the Defense Threat Reduction Agency (DTRA) was used to support this study. The authors would like to thank Peter Turnbull for aid in <it>B. anthracis </it>strain selection, Michael Chute for <it>B. anthracis </it>genomic DNA isolation and David Rasko for comments on the manuscript. The views expressed in this paper are those of the authors and do not reflect the official policy or position of the Department of Navy, Department of Defense or US Government.</p>
			</sec>
		</ack>
		<refgrp>
			<bibl id="B1">
				<title>
					<p>Microbial population genomics and ecology.</p>
				</title>
				<aug>
					<au>
						<snm>DeLong</snm>
						<fnm>EF</fnm>
					</au>
				</aug>
				<source>Curr Opin Microbiol</source>
				<pubdate>2002</pubdate>
				<volume>5</volume>
				<fpage>520</fpage>
				<lpage>524</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S1369-5274(02)00353-3</pubid>
						<pubid idtype="pmpid" link="fulltext">12354561</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B2">
				<title>
					<p>Towards microbial systems science: integrating microbial perspective, from genomes to biomes.</p>
				</title>
				<aug>
					<au>
						<snm>DeLong</snm>
						<fnm>EF</fnm>
					</au>
				</aug>
				<source>Environ Microbiol</source>
				<pubdate>2002</pubdate>
				<volume>4</volume>
				<fpage>9</fpage>
				<lpage>10</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1046/j.1462-2920.2002.t01-12-00257.x</pubid>
						<pubid idtype="pmpid" link="fulltext">11966814</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B3">
				<title>
					<p>Redefining bacterial populations: a post-genomic reformation.</p>
				</title>
				<aug>
					<au>
						<snm>Joyce</snm>
						<fnm>EA</fnm>
					</au>
					<au>
						<snm>Chan</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Salama</snm>
						<fnm>NR</fnm>
					</au>
					<au>
						<snm>Falkow</snm>
						<fnm>S</fnm>
					</au>
				</aug>
				<source>Nat Rev Genet</source>
				<pubdate>2002</pubdate>
				<volume>3</volume>
				<fpage>462</fpage>
				<lpage>473</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">12042773</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B4">
				<title>
					<p>Identification of anthrax toxin genes in a <it>Bacillus cereus </it>associated with an illness resembling inhalation anthrax.</p>
				</title>
				<aug>
					<au>
						<snm>Hoffmaster</snm>
						<fnm>AR</fnm>
					</au>
					<au>
						<snm>Ravel</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Rasko</snm>
						<fnm>DA</fnm>
					</au>
					<au>
						<snm>Chapman</snm>
						<fnm>GD</fnm>
					</au>
					<au>
						<snm>Chute</snm>
						<fnm>MD</fnm>
					</au>
					<au>
						<snm>Marston</snm>
						<fnm>CK</fnm>
					</au>
					<au>
						<snm>De</snm>
						<fnm>BK</fnm>
					</au>
					<au>
						<snm>Sacchi</snm>
						<fnm>CT</fnm>
					</au>
					<au>
						<snm>Fitzgerald</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Mayer</snm>
						<fnm>LW</fnm>
					</au>
					<etal/>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>2004</pubdate>
				<volume>101</volume>
				<fpage>8449</fpage>
				<lpage>8454</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">420414</pubid>
						<pubid idtype="pmpid" link="fulltext">15155910</pubid>
						<pubid idtype="doi">10.1073/pnas.0402414101</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B5">
				<title>
					<p>GOLD Genomes OnLine Database</p>
				</title>
				<url>http://www.genomesonline.org</url>
			</bibl>
			<bibl id="B6">
				<title>
					<p>Sequence and organization of pXO1, the large <it>Bacillus anthracis</it> plasmid harboring the anthrax toxin genes.</p>
				</title>
				<aug>
					<au>
						<snm>Okinaka</snm>
						<fnm>RT</fnm>
					</au>
					<au>
						<snm>Cloud</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Hampton</snm>
						<fnm>O</fnm>
					</au>
					<au>
						<snm>Hoffmaster</snm>
						<fnm>AR</fnm>
					</au>
					<au>
						<snm>Hill</snm>
						<fnm>KK</fnm>
					</au>
					<au>
						<snm>Keim</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Koehler</snm>
						<fnm>TM</fnm>
					</au>
					<au>
						<snm>Lamke</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Kumano</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Mahillon</snm>
						<fnm>J</fnm>
					</au>
					<etal/>
				</aug>
				<source>J Bacteriol</source>
				<pubdate>1999</pubdate>
				<volume>181</volume>
				<fpage>6509</fpage>
				<lpage>6515</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">103788</pubid>
						<pubid idtype="pmpid" link="fulltext">10515943</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B7">
				<title>
					<p>Sequence, assembly and analysis of pX01 and pX02.</p>
				</title>
				<aug>
					<au>
						<snm>Okinaka</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Cloud</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Hampton</snm>
						<fnm>O</fnm>
					</au>
					<au>
						<snm>Hoffmaster</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Hill</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Keim</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Koehler</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Lamke</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Kumano</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Manter</snm>
						<fnm>D</fnm>
					</au>
					<etal/>
				</aug>
				<source>J Appl Microbiol</source>
				<pubdate>1999</pubdate>
				<volume>87</volume>
				<fpage>261</fpage>
				<lpage>262</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1046/j.1365-2672.1999.00883.x</pubid>
						<pubid idtype="pmpid" link="fulltext">10475962</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B8">
				<title>
					<p>The genome sequence of <it>Bacillus anthracis </it>Ames and comparison to closely related bacteria.</p>
				</title>
				<aug>
					<au>
						<snm>Read</snm>
						<fnm>TD</fnm>
					</au>
					<au>
						<snm>Peterson</snm>
						<fnm>SN</fnm>
					</au>
					<au>
						<snm>Tourasse</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Baillie</snm>
						<fnm>LW</fnm>
					</au>
					<au>
						<snm>Paulsen</snm>
						<fnm>IT</fnm>
					</au>
					<au>
						<snm>Nelson</snm>
						<fnm>KE</fnm>
					</au>
					<au>
						<snm>Tettelin</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Fouts</snm>
						<fnm>DE</fnm>
					</au>
					<au>
						<snm>Eisen</snm>
						<fnm>JA</fnm>
					</au>
					<au>
						<snm>Gill</snm>
						<fnm>SR</fnm>
					</au>
					<etal/>
				</aug>
				<source>Nature</source>
				<pubdate>2003</pubdate>
				<volume>423</volume>
				<fpage>81</fpage>
				<lpage>86</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/nature01586</pubid>
						<pubid idtype="pmpid" link="fulltext">12721629</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B9">
				<title>
					<p>Genome sequence of <it>Bacillus cereus </it>and comparative analysis with <it>Bacillus anthracis</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Ivanova</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Sorokin</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Anderson</snm>
						<fnm>I</fnm>
					</au>
					<au>
						<snm>Galleron</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Candelon</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Kapatral</snm>
						<fnm>V</fnm>
					</au>
					<au>
						<snm>Bhattacharyya</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Reznik</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Mikhailova</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Lapidus</snm>
						<fnm>A</fnm>
					</au>
					<etal/>
				</aug>
				<source>Nature</source>
				<pubdate>2003</pubdate>
				<volume>423</volume>
				<fpage>87</fpage>
				<lpage>91</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/nature01582</pubid>
						<pubid idtype="pmpid" link="fulltext">12721630</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B10">
				<title>
					<p>The genome sequence of <it>Bacillus cereus </it>ATCC 10987 reveals metabolic adaptations and a large plasmid related to <it>Bacillus anthracis </it>pXO1.</p>
				</title>
				<aug>
					<au>
						<snm>Rasko</snm>
						<fnm>DA</fnm>
					</au>
					<au>
						<snm>Ravel</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Okstad</snm>
						<fnm>OA</fnm>
					</au>
					<au>
						<snm>Helgason</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Cer</snm>
						<fnm>RZ</fnm>
					</au>
					<au>
						<snm>Jiang</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Shores</snm>
						<fnm>KA</fnm>
					</au>
					<au>
						<snm>Fouts</snm>
						<fnm>DE</fnm>
					</au>
					<au>
						<snm>Tourasse</snm>
						<fnm>NJ</fnm>
					</au>
					<au>
						<snm>Angiuoli</snm>
						<fnm>SV</fnm>
					</au>
					<etal/>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2004</pubdate>
				<volume>32</volume>
				<fpage>977</fpage>
				<lpage>988</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">373394</pubid>
						<pubid idtype="pmpid" link="fulltext">14960714</pubid>
						<pubid idtype="doi">10.1093/nar/gkh258</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B11">
				<title>
					<p>Comparative genome sequencing for discovery of novel polymorphisms in <it>Bacillus anthracis</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Read</snm>
						<fnm>TD</fnm>
					</au>
					<au>
						<snm>Salzberg</snm>
						<fnm>SL</fnm>
					</au>
					<au>
						<snm>Pop</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Shumway</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Umayam</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Jiang</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Holtzapple</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Busch</snm>
						<fnm>JD</fnm>
					</au>
					<au>
						<snm>Smith</snm>
						<fnm>KL</fnm>
					</au>
					<au>
						<snm>Schupp</snm>
						<fnm>JM</fnm>
					</au>
					<etal/>
				</aug>
				<source>Science</source>
				<pubdate>2002</pubdate>
				<volume>296</volume>
				<fpage>2028</fpage>
				<lpage>2033</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.1071837</pubid>
						<pubid idtype="pmpid" link="fulltext">12004073</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B12">
				<title>
					<p>Genomics and microbiology. Microbial forensics - "cross-examining pathogens".</p>
				</title>
				<aug>
					<au>
						<snm>Cummings</snm>
						<fnm>CA</fnm>
					</au>
					<au>
						<snm>Relman</snm>
						<fnm>DA</fnm>
					</au>
				</aug>
				<source>Science</source>
				<pubdate>2002</pubdate>
				<volume>296</volume>
				<fpage>1976</fpage>
				<lpage>1979</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.1073125</pubid>
						<pubid idtype="pmpid" link="fulltext">12004075</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B13">
				<title>
					<p>Genomewide pattern of synonymous nucleotide substitution in two complete genomes of <it>Mycobacterium tuberculosis</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Hughes</snm>
						<fnm>AL</fnm>
					</au>
					<au>
						<snm>Friedman</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Murray</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>Emerg Infect Dis</source>
				<pubdate>2002</pubdate>
				<volume>8</volume>
				<fpage>1342</fpage>
				<lpage>1346</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">12453367</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B14">
				<title>
					<p>Inferences from whole-genome sequences of bacterial pathogens.</p>
				</title>
				<aug>
					<au>
						<snm>Whittam</snm>
						<fnm>TS</fnm>
					</au>
					<au>
						<snm>Bumbaugh</snm>
						<fnm>AC</fnm>
					</au>
				</aug>
				<source>Curr Opin Genet Dev</source>
				<pubdate>2002</pubdate>
				<volume>12</volume>
				<fpage>719</fpage>
				<lpage>725</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0959-437X(02)00361-1</pubid>
						<pubid idtype="pmpid" link="fulltext">12433587</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B15">
				<title>
					<p>The yersiniae - a model genus to study the rapid evolution of bacterial pathogens.</p>
				</title>
				<aug>
					<au>
						<snm>Wren</snm>
						<fnm>BW</fnm>
					</au>
				</aug>
				<source>Nat Rev Microbiol</source>
				<pubdate>2003</pubdate>
				<volume>1</volume>
				<fpage>55</fpage>
				<lpage>64</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/nrmicro730</pubid>
						<pubid idtype="pmpid">15040180</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B16">
				<title>
					<p>The hidden lifestyles of <it>Bacillus cereus </it>and relatives.</p>
				</title>
				<aug>
					<au>
						<snm>Jensen</snm>
						<fnm>GB</fnm>
					</au>
					<au>
						<snm>Hansen</snm>
						<fnm>BM</fnm>
					</au>
					<au>
						<snm>Eilenberg</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Mahillon</snm>
						<fnm>J</fnm>
					</au>
				</aug>
				<source>Environ Microbiol</source>
				<pubdate>2003</pubdate>
				<volume>5</volume>
				<fpage>631</fpage>
				<lpage>640</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1046/j.1462-2920.2003.00461.x</pubid>
						<pubid idtype="pmpid" link="fulltext">12871230</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B17">
				<title>
					<p>Genetic variability of <it>Bacillus anthracis </it>and related species.</p>
				</title>
				<aug>
					<au>
						<snm>Harrell</snm>
						<fnm>LJ</fnm>
					</au>
					<au>
						<snm>Andersen</snm>
						<fnm>GL</fnm>
					</au>
					<au>
						<snm>Wilson</snm>
						<fnm>KH</fnm>
					</au>
				</aug>
				<source>J Clin Microbiol</source>
				<pubdate>1995</pubdate>
				<volume>33</volume>
				<fpage>1847</fpage>
				<lpage>1850</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">228283</pubid>
						<pubid idtype="pmpid" link="fulltext">7665658</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B18">
				<title>
					<p>Identification of a region of genetic variability among <it>Bacillus anthracis </it>strains and related species.</p>
				</title>
				<aug>
					<au>
						<snm>Andersen</snm>
						<fnm>GL</fnm>
					</au>
					<au>
						<snm>Simchock</snm>
						<fnm>JM</fnm>
					</au>
					<au>
						<snm>Wilson</snm>
						<fnm>KH</fnm>
					</au>
				</aug>
				<source>J Bacteriol</source>
				<pubdate>1996</pubdate>
				<volume>178</volume>
				<fpage>377</fpage>
				<lpage>384</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">177668</pubid>
						<pubid idtype="pmpid" link="fulltext">8550456</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B19">
				<title>
					<p>Genetic diversity in the protective antigen gene of <it>Bacillus anthracis</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Price</snm>
						<fnm>LB</fnm>
					</au>
					<au>
						<snm>Hugh-Jones</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Jackson</snm>
						<fnm>PJ</fnm>
					</au>
					<au>
						<snm>Keim</snm>
						<fnm>P</fnm>
					</au>
				</aug>
				<source>J Bacteriol</source>
				<pubdate>1999</pubdate>
				<volume>181</volume>
				<fpage>2358</fpage>
				<lpage>2362</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">93658</pubid>
						<pubid idtype="pmpid" link="fulltext">10197996</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B20">
				<title>
					<p><it>vrrB</it>, a hypervariable open reading frame in <it>Bacillus anthracis</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Schupp</snm>
						<fnm>JM</fnm>
					</au>
					<au>
						<snm>Klevytska</snm>
						<fnm>AM</fnm>
					</au>
					<au>
						<snm>Zinser</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Price</snm>
						<fnm>LB</fnm>
					</au>
					<au>
						<snm>Keim</snm>
						<fnm>P</fnm>
					</au>
				</aug>
				<source>J Bacteriol</source>
				<pubdate>2000</pubdate>
				<volume>182</volume>
				<fpage>3989</fpage>
				<lpage>3997</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">94584</pubid>
						<pubid idtype="pmpid" link="fulltext">10869077</pubid>
						<pubid idtype="doi">10.1128/JB.182.14.3989-3997.2000</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B21">
				<title>
					<p>Identification of <it>Bacillus anthracis </it>by <it>rpoB </it>sequence analysis and multiplex PCR.</p>
				</title>
				<aug>
					<au>
						<snm>Ko</snm>
						<fnm>KS</fnm>
					</au>
					<au>
						<snm>Kim</snm>
						<fnm>JM</fnm>
					</au>
					<au>
						<snm>Kim</snm>
						<fnm>JW</fnm>
					</au>
					<au>
						<snm>Jung</snm>
						<fnm>BY</fnm>
					</au>
					<au>
						<snm>Kim</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Kim</snm>
						<fnm>IJ</fnm>
					</au>
					<au>
						<snm>Kook</snm>
						<fnm>YH</fnm>
					</au>
				</aug>
				<source>J Clin Microbiol</source>
				<pubdate>2003</pubdate>
				<volume>41</volume>
				<fpage>2908</fpage>
				<lpage>2914</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">165277</pubid>
						<pubid idtype="pmpid" link="fulltext">12843020</pubid>
						<pubid idtype="doi">10.1128/JCM.41.7.2908-2914.2003</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B22">
				<title>
					<p>Genetic relationship in the 'Bacillus cereus group' by rep-PCR fingerprinting and sequencing of a <it>Bacillus anthracis</it>-specific rep-PCR fragment.</p>
				</title>
				<aug>
					<au>
						<snm>Cherif</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Brusetti</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Borin</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Rizzi</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Boudabous</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Khyami-Horani</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Daffonchio</snm>
						<fnm>D</fnm>
					</au>
				</aug>
				<source>J Appl Microbiol</source>
				<pubdate>2003</pubdate>
				<volume>94</volume>
				<fpage>1108</fpage>
				<lpage>1119</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1046/j.1365-2672.2003.01945.x</pubid>
						<pubid idtype="pmpid" link="fulltext">12752821</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B23">
				<title>
					<p><it>Bacillus anthracis </it>diverges from related clades of the <it>Bacillus cereus </it>group in 16S-23S ribosomal DNA intergenic transcribed spacers containing tRNA genes.</p>
				</title>
				<aug>
					<au>
						<snm>Cherif</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Borin</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Rizzi</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Ouzari</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Boudabous</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Khyami-Horani</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Daffonchio</snm>
						<fnm>D</fnm>
					</au>
				</aug>
				<source>Appl Environ Microbiol</source>
				<pubdate>2003</pubdate>
				<volume>69</volume>
				<fpage>33</fpage>
				<lpage>40</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">152393</pubid>
						<pubid idtype="pmpid" link="fulltext">12513974</pubid>
						<pubid idtype="doi">10.1128/AEM.69.1.33-40.2003</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B24">
				<title>
					<p>Genetic comparison of <it>Bacillus anthracis </it>and its close relatives using amplified fragment length polymorphism and polymerase chain reaction analysis.</p>
				</title>
				<aug>
					<au>
						<snm>Jackson</snm>
						<fnm>PJ</fnm>
					</au>
					<au>
						<snm>Hill</snm>
						<fnm>KK</fnm>
					</au>
					<au>
						<snm>Laker</snm>
						<fnm>MT</fnm>
					</au>
					<au>
						<snm>Ticknor</snm>
						<fnm>LO</fnm>
					</au>
					<au>
						<snm>Keim</snm>
						<fnm>P</fnm>
					</au>
				</aug>
				<source>J Appl Microbiol</source>
				<pubdate>1999</pubdate>
				<volume>87</volume>
				<fpage>263</fpage>
				<lpage>269</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1046/j.1365-2672.1999.00884.x</pubid>
						<pubid idtype="pmpid" link="fulltext">10475963</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B25">
				<title>
					<p>Fluorescent amplified fragment length polymorphism analysis of Norwegian <it>Bacillus cereus </it>and <it>Bacillus thuringiensis </it>soil isolates.</p>
				</title>
				<aug>
					<au>
						<snm>Ticknor</snm>
						<fnm>LO</fnm>
					</au>
					<au>
						<snm>Kolsto</snm>
						<fnm>AB</fnm>
					</au>
					<au>
						<snm>Hill</snm>
						<fnm>KK</fnm>
					</au>
					<au>
						<snm>Keim</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Laker</snm>
						<fnm>MT</fnm>
					</au>
					<au>
						<snm>Tonks</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Jackson</snm>
						<fnm>PJ</fnm>
					</au>
				</aug>
				<source>Appl Environ Microbiol</source>
				<pubdate>2001</pubdate>
				<volume>67</volume>
				<fpage>4863</fpage>
				<lpage>4873</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">93242</pubid>
						<pubid idtype="pmpid" link="fulltext">11571195</pubid>
						<pubid idtype="doi">10.1128/AEM.67.10.4863-4873.2001</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B26">
				<title>
					<p>Genome differences that distinguish <it>Bacillus anthracis </it>from <it>Bacillus cereus </it>and <it>Bacillus thuringiensis</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Radnedge</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Agron</snm>
						<fnm>PG</fnm>
					</au>
					<au>
						<snm>Hill</snm>
						<fnm>KK</fnm>
					</au>
					<au>
						<snm>Jackson</snm>
						<fnm>PJ</fnm>
					</au>
					<au>
						<snm>Ticknor</snm>
						<fnm>LO</fnm>
					</au>
					<au>
						<snm>Keim</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Andersen</snm>
						<fnm>GL</fnm>
					</au>
				</aug>
				<source>Appl Environ Microbiol</source>
				<pubdate>2003</pubdate>
				<volume>69</volume>
				<fpage>2755</fpage>
				<lpage>2764</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">154536</pubid>
						<pubid idtype="pmpid" link="fulltext">12732546</pubid>
						<pubid idtype="doi">10.1128/AEM.69.5.2755-2764.2003</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B27">
				<title>
					<p>Fluorescent amplified fragment length polymorphism analysis of <it>Bacillus anthracis</it>, <it>Bacillus cereus</it>, and <it>Bacillus thuringiensis </it>isolates.</p>
				</title>
				<aug>
					<au>
						<snm>Hill</snm>
						<fnm>KK</fnm>
					</au>
					<au>
						<snm>Ticknor</snm>
						<fnm>LO</fnm>
					</au>
					<au>
						<snm>Okinaka</snm>
						<fnm>RT</fnm>
					</au>
					<au>
						<snm>Asay</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Blair</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Bliss</snm>
						<fnm>KA</fnm>
					</au>
					<au>
						<snm>Laker</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Pardington</snm>
						<fnm>PE</fnm>
					</au>
					<au>
						<snm>Richardson</snm>
						<fnm>AP</fnm>
					</au>
					<au>
						<snm>Tonks</snm>
						<fnm>M</fnm>
					</au>
					<etal/>
				</aug>
				<source>Appl Environ Microbiol</source>
				<pubdate>2004</pubdate>
				<volume>70</volume>
				<fpage>1068</fpage>
				<lpage>1080</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">348840</pubid>
						<pubid idtype="pmpid" link="fulltext">14766590</pubid>
						<pubid idtype="doi">10.1128/AEM.70.2.1068-1080.2004</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B28">
				<title>
					<p>Molecular diversity in <it>Bacillus anthracis</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Keim</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Klevytska</snm>
						<fnm>AM</fnm>
					</au>
					<au>
						<snm>Price</snm>
						<fnm>LB</fnm>
					</au>
					<au>
						<snm>Schupp</snm>
						<fnm>JM</fnm>
					</au>
					<au>
						<snm>Zinser</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Smith</snm>
						<fnm>KL</fnm>
					</au>
					<au>
						<snm>Hugh-Jones</snm>
						<fnm>ME</fnm>
					</au>
					<au>
						<snm>Okinaka</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Hill</snm>
						<fnm>KK</fnm>
					</au>
					<au>
						<snm>Jackson</snm>
						<fnm>PJ</fnm>
					</au>
				</aug>
				<source>J Appl Microbiol</source>
				<pubdate>1999</pubdate>
				<volume>87</volume>
				<fpage>215</fpage>
				<lpage>217</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1046/j.1365-2672.1999.00873.x</pubid>
						<pubid idtype="pmpid" link="fulltext">10475952</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B29">
				<title>
					<p>Multiple-locus variable-number tandem repeat analysis reveals genetic relationships within <it>Bacillus anthracis</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Keim</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Price</snm>
						<fnm>LB</fnm>
					</au>
					<au>
						<snm>Klevytska</snm>
						<fnm>AM</fnm>
					</au>
					<au>
						<snm>Smith</snm>
						<fnm>KL</fnm>
					</au>
					<au>
						<snm>Schupp</snm>
						<fnm>JM</fnm>
					</au>
					<au>
						<snm>Okinaka</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Jackson</snm>
						<fnm>PJ</fnm>
					</au>
					<au>
						<snm>Hugh-Jones</snm>
						<fnm>ME</fnm>
					</au>
				</aug>
				<source>J Bacteriol</source>
				<pubdate>2000</pubdate>
				<volume>182</volume>
				<fpage>2928</fpage>
				<lpage>2936</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">102004</pubid>
						<pubid idtype="pmpid" link="fulltext">10781564</pubid>
						<pubid idtype="doi">10.1128/JB.182.10.2928-2936.2000</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B30">
				<title>
					<p><it>Bacillus anthracis </it>diversity in Kruger National Park.</p>
				</title>
				<aug>
					<au>
						<snm>Smith</snm>
						<fnm>KL</fnm>
					</au>
					<au>
						<snm>DeVos</snm>
						<fnm>V</fnm>
					</au>
					<au>
						<snm>Bryden</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Price</snm>
						<fnm>LB</fnm>
					</au>
					<au>
						<snm>Hugh-Jones</snm>
						<fnm>ME</fnm>
					</au>
					<au>
						<snm>Keim</snm>
						<fnm>P</fnm>
					</au>
				</aug>
				<source>J Clin Microbiol</source>
				<pubdate>2000</pubdate>
				<volume>38</volume>
				<fpage>3780</fpage>
				<lpage>3784</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">87475</pubid>
						<pubid idtype="pmpid" link="fulltext">11015402</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B31">
				<title>
					<p>Diversity among French <it>Bacillus anthracis </it>isolates.</p>
				</title>
				<aug>
					<au>
						<snm>Fouet</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Smith</snm>
						<fnm>KL</fnm>
					</au>
					<au>
						<snm>Keys</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Vaissaire</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Le Doujet</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Levy</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Mock</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Keim</snm>
						<fnm>P</fnm>
					</au>
				</aug>
				<source>J Clin Microbiol</source>
				<pubdate>2002</pubdate>
				<volume>40</volume>
				<fpage>4732</fpage>
				<lpage>4734</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">154597</pubid>
						<pubid idtype="pmpid" link="fulltext">12454180</pubid>
						<pubid idtype="doi">10.1128/JCM.40.12.4732-4734.2002</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B32">
				<title>
					<p>High-throughput variation detection and genotyping using microarrays.</p>
				</title>
				<aug>
					<au>
						<snm>Cutler</snm>
						<fnm>DJ</fnm>
					</au>
					<au>
						<snm>Zwick</snm>
						<fnm>ME</fnm>
					</au>
					<au>
						<snm>Carrasquillo</snm>
						<fnm>MM</fnm>
					</au>
					<au>
						<snm>Yohn</snm>
						<fnm>CT</fnm>
					</au>
					<au>
						<snm>Tobin</snm>
						<fnm>KP</fnm>
					</au>
					<au>
						<snm>Kashuk</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Mathews</snm>
						<fnm>DJ</fnm>
					</au>
					<au>
						<snm>Shah</snm>
						<fnm>NA</fnm>
					</au>
					<au>
						<snm>Eichler</snm>
						<fnm>EE</fnm>
					</au>
					<au>
						<snm>Warrington</snm>
						<fnm>JA</fnm>
					</au>
					<au>
						<snm>Chakravarti</snm>
						<fnm>A</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>2001</pubdate>
				<volume>11</volume>
				<fpage>1913</fpage>
				<lpage>1925</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">11691856</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B33">
				<title>
					<p>The Human MitoChip: a high-throughput sequencing microarray for mitochondrial mutation detection.</p>
				</title>
				<aug>
					<au>
						<snm>Maitra</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Cohen</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Gillespie</snm>
						<fnm>SE</fnm>
					</au>
					<au>
						<snm>Mambo</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Fukushima</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Hoque</snm>
						<fnm>MO</fnm>
					</au>
					<au>
						<snm>Shah</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Goggins</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Califano</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Sidransky</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Chakravarti</snm>
						<fnm>A</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>2004</pubdate>
				<volume>14</volume>
				<fpage>812</fpage>
				<lpage>819</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">479107</pubid>
						<pubid idtype="pmpid" link="fulltext">15123581</pubid>
						<pubid idtype="doi">10.1101/gr.2228504</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B34">
				<title>
					<p>Simultaneous genotyping and species identification using hybridization pattern recognition analysis of generic <it>Mycobacterium </it>DNA arrays.</p>
				</title>
				<aug>
					<au>
						<snm>Gingeras</snm>
						<fnm>TR</fnm>
					</au>
					<au>
						<snm>Ghandour</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Wang</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Berno</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Small</snm>
						<fnm>PM</fnm>
					</au>
					<au>
						<snm>Drobniewski</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Alland</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Desmond</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Holodniy</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Drenkow</snm>
						<fnm>J</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>1998</pubdate>
				<volume>8</volume>
				<fpage>435</fpage>
				<lpage>448</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">9582189</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B35">
				<title>
					<p>High-density microarray of small-subunit ribosomal DNA probes.</p>
				</title>
				<aug>
					<au>
						<snm>Wilson</snm>
						<fnm>KH</fnm>
					</au>
					<au>
						<snm>Wilson</snm>
						<fnm>WJ</fnm>
					</au>
					<au>
						<snm>Radosevich</snm>
						<fnm>JL</fnm>
					</au>
					<au>
						<snm>DeSantis</snm>
						<fnm>TZ</fnm>
					</au>
					<au>
						<snm>Viswanathan</snm>
						<fnm>VS</fnm>
					</au>
					<au>
						<snm>Kuczmarski</snm>
						<fnm>TA</fnm>
					</au>
					<au>
						<snm>Andersen</snm>
						<fnm>GL</fnm>
					</au>
				</aug>
				<source>Appl Environ Microbiol</source>
				<pubdate>2002</pubdate>
				<volume>68</volume>
				<fpage>2535</fpage>
				<lpage>2541</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">127547</pubid>
						<pubid idtype="pmpid" link="fulltext">11976131</pubid>
						<pubid idtype="doi">10.1128/AEM.68.5.2535-2541.2002</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B36">
				<title>
					<p>Direct allelic variation scanning of the yeast genome.</p>
				</title>
				<aug>
					<au>
						<snm>Winzeler</snm>
						<fnm>EA</fnm>
					</au>
					<au>
						<snm>Richards</snm>
						<fnm>DR</fnm>
					</au>
					<au>
						<snm>Conway</snm>
						<fnm>AR</fnm>
					</au>
					<au>
						<snm>Goldstein</snm>
						<fnm>AL</fnm>
					</au>
					<au>
						<snm>Kalman</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>McCullough</snm>
						<fnm>MJ</fnm>
					</au>
					<au>
						<snm>McCusker</snm>
						<fnm>JH</fnm>
					</au>
					<au>
						<snm>Stevens</snm>
						<fnm>DA</fnm>
					</au>
					<au>
						<snm>Wodicka</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Lockhart</snm>
						<fnm>DJ</fnm>
					</au>
					<au>
						<snm>Davis</snm>
						<fnm>RW</fnm>
					</au>
				</aug>
				<source>Science</source>
				<pubdate>1998</pubdate>
				<volume>281</volume>
				<fpage>1194</fpage>
				<lpage>1197</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.281.5380.1194</pubid>
						<pubid idtype="pmpid" link="fulltext">9712584</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B37">
				<title>
					<p>Genetic diversity in yeast assessed with whole-genome oligonucleotide arrays.</p>
				</title>
				<aug>
					<au>
						<snm>Winzeler</snm>
						<fnm>EA</fnm>
					</au>
					<au>
						<snm>Castillo-Davis</snm>
						<fnm>CI</fnm>
					</au>
					<au>
						<snm>Oshiro</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Liang</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Richards</snm>
						<fnm>DR</fnm>
					</au>
					<au>
						<snm>Zhou</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Hartl</snm>
						<fnm>DL</fnm>
					</au>
				</aug>
				<source>Genetics</source>
				<pubdate>2003</pubdate>
				<volume>163</volume>
				<fpage>79</fpage>
				<lpage>89</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">12586698</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B38">
				<title>
					<p>Patterns of single-nucleotide polymorphisms in candidate genes for blood-pressure homeostasis.</p>
				</title>
				<aug>
					<au>
						<snm>Halushka</snm>
						<fnm>MK</fnm>
					</au>
					<au>
						<snm>Fan</snm>
						<fnm>JB</fnm>
					</au>
					<au>
						<snm>Bentley</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Hsie</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Shen</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Weder</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Cooper</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Lipshutz</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Chakravarti</snm>
						<fnm>A</fnm>
					</au>
				</aug>
				<source>Nat Genet</source>
				<pubdate>1999</pubdate>
				<volume>22</volume>
				<fpage>239</fpage>
				<lpage>247</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/10297</pubid>
						<pubid idtype="pmpid" link="fulltext">10391210</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B39">
				<title>
					<p>Characterization of single-nucleotide polymorphisms in coding regions of human genes.</p>
				</title>
				<aug>
					<au>
						<snm>Cargill</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Altshuler</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Ireland</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Sklar</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Ardlie</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Patil</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Shaw</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Lane</snm>
						<fnm>CR</fnm>
					</au>
					<au>
						<snm>Lim</snm>
						<fnm>EP</fnm>
					</au>
					<au>
						<snm>Kalyanaraman</snm>
						<fnm>N</fnm>
					</au>
					<etal/>
				</aug>
				<source>Nat Genet</source>
				<pubdate>1999</pubdate>
				<volume>22</volume>
				<fpage>231</fpage>
				<lpage>238</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/10290</pubid>
						<pubid idtype="pmpid" link="fulltext">10391209</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B40">
				<title>
					<p>Large-scale identification, mapping, and genotyping of single-nucleotide polymorphisms in the human genome.</p>
				</title>
				<aug>
					<au>
						<snm>Wang</snm>
						<fnm>DG</fnm>
					</au>
					<au>
						<snm>Fan</snm>
						<fnm>JB</fnm>
					</au>
					<au>
						<snm>Siao</snm>
						<fnm>CJ</fnm>
					</au>
					<au>
						<snm>Berno</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Young</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Sapolsky</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Ghandour</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Perkins</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Winchester</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Spencer</snm>
						<fnm>J</fnm>
					</au>
					<etal/>
				</aug>
				<source>Science</source>
				<pubdate>1998</pubdate>
				<volume>280</volume>
				<fpage>1077</fpage>
				<lpage>1082</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.280.5366.1077</pubid>
						<pubid idtype="pmpid" link="fulltext">9582121</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B41">
				<title>
					<p>New developments in high-throughput resequencing and variation detection using high density microarrays.</p>
				</title>
				<aug>
					<au>
						<snm>Warrington</snm>
						<fnm>JA</fnm>
					</au>
					<au>
						<snm>Shah</snm>
						<fnm>NA</fnm>
					</au>
					<au>
						<snm>Chen</snm>
						<fnm>X</fnm>
					</au>
					<au>
						<snm>Janis</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Liu</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Kondapalli</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Reyes</snm>
						<fnm>V</fnm>
					</au>
					<au>
						<snm>Savage</snm>
						<fnm>MP</fnm>
					</au>
					<au>
						<snm>Zhang</snm>
						<fnm>Z</fnm>
					</au>
					<au>
						<snm>Watts</snm>
						<fnm>R</fnm>
					</au>
					<etal/>
				</aug>
				<source>Hum Mutat</source>
				<pubdate>2002</pubdate>
				<volume>19</volume>
				<fpage>402</fpage>
				<lpage>409</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1002/humu.10075</pubid>
						<pubid idtype="pmpid" link="fulltext">11933194</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B42">
				<title>
					<p>Base-calling of automated sequencer traces using phred. I. Accuracy assessment.</p>
				</title>
				<aug>
					<au>
						<snm>Ewing</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Hillier</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Wendl</snm>
						<fnm>MC</fnm>
					</au>
					<au>
						<snm>Green</snm>
						<fnm>P</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>1998</pubdate>
				<volume>8</volume>
				<fpage>175</fpage>
				<lpage>185</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">9521921</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B43">
				<title>
					<p>Base-calling of automated sequencer traces using phred. II. Error probabilities.</p>
				</title>
				<aug>
					<au>
						<snm>Ewing</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Green</snm>
						<fnm>P</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>1998</pubdate>
				<volume>8</volume>
				<fpage>186</fpage>
				<lpage>194</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">9521922</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B44">
				<title>
					<p>On the number of segregating sites in genetical models without recombination.</p>
				</title>
				<aug>
					<au>
						<snm>Watterson</snm>
						<fnm>GA</fnm>
					</au>
				</aug>
				<source>Theor Popul Biol</source>
				<pubdate>1975</pubdate>
				<volume>7</volume>
				<fpage>256</fpage>
				<lpage>276</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/0040-5809(75)90020-9</pubid>
						<pubid idtype="pmpid">1145509</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B45">
				<title>
					<p>Patterns of genetic variation in Mendelian and complex traits.</p>
				</title>
				<aug>
					<au>
						<snm>Zwick</snm>
						<fnm>ME</fnm>
					</au>
					<au>
						<snm>Cutler</snm>
						<fnm>DJ</fnm>
					</au>
					<au>
						<snm>Chakravarti</snm>
						<fnm>A</fnm>
					</au>
				</aug>
				<source>Annu Rev Genomics Hum Genet</source>
				<pubdate>2000</pubdate>
				<volume>1</volume>
				<fpage>387</fpage>
				<lpage>407</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1146/annurev.genom.1.1.387</pubid>
						<pubid idtype="pmpid" link="fulltext">11701635</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B46">
				<title>
					<p>Initial sequencing and analysis of the human genome.</p>
				</title>
				<aug>
					<au>
						<cnm>International Human Genome Sequencing Consortium</cnm>
					</au>
				</aug>
				<source>Nature</source>
				<pubdate>2001</pubdate>
				<volume>409</volume>
				<fpage>860</fpage>
				<lpage>921</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/35057062</pubid>
						<pubid idtype="pmpid" link="fulltext">11237011</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B47">
				<title>
					<p>The sequence of the human genome.</p>
				</title>
				<aug>
					<au>
						<snm>Venter</snm>
						<fnm>JC</fnm>
					</au>
					<au>
						<snm>Adams</snm>
						<fnm>MD</fnm>
					</au>
					<au>
						<snm>Myers</snm>
						<fnm>EW</fnm>
					</au>
					<au>
						<snm>Li</snm>
						<fnm>PW</fnm>
					</au>
					<au>
						<snm>Mural</snm>
						<fnm>RJ</fnm>
					</au>
					<au>
						<snm>Sutton</snm>
						<fnm>GG</fnm>
					</au>
					<au>
						<snm>Smith</snm>
						<fnm>HO</fnm>
					</au>
					<au>
						<snm>Yandell</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Evans</snm>
						<fnm>CA</fnm>
					</au>
					<au>
						<snm>Holt</snm>
						<fnm>RA</fnm>
					</au>
					<etal/>
				</aug>
				<source>Science</source>
				<pubdate>2001</pubdate>
				<volume>291</volume>
				<fpage>1304</fpage>
				<lpage>1351</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.1058040</pubid>
						<pubid idtype="pmpid" link="fulltext">11181995</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B48">
				<title>
					<p>Haplotype variation and linkage disequilibrium in 313 human genes.</p>
				</title>
				<aug>
					<au>
						<snm>Stephens</snm>
						<fnm>JC</fnm>
					</au>
					<au>
						<snm>Schneider</snm>
						<fnm>JA</fnm>
					</au>
					<au>
						<snm>Tanguay</snm>
						<fnm>DA</fnm>
					</au>
					<au>
						<snm>Choi</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Acharya</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Stanley</snm>
						<fnm>SE</fnm>
					</au>
					<au>
						<snm>Jiang</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Messer</snm>
						<fnm>CJ</fnm>
					</au>
					<au>
						<snm>Chew</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Han</snm>
						<fnm>JH</fnm>
					</au>
					<etal/>
				</aug>
				<source>Science</source>
				<pubdate>2001</pubdate>
				<volume>293</volume>
				<fpage>489</fpage>
				<lpage>493</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.1059431</pubid>
						<pubid idtype="pmpid" link="fulltext">11452081</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B49">
				<aug>
					<au>
						<snm>Kimura</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>The Neutral Theory of Molecular Evolution</source>
				<publisher>Cambridge: Cambridge University Press</publisher>
				<pubdate>1983</pubdate>
			</bibl>
			<bibl id="B50">
				<title>
					<p>Statistical method for testing the neutral mutation hypothesis by DNA polymorphism.</p>
				</title>
				<aug>
					<au>
						<snm>Tajima</snm>
						<fnm>F</fnm>
					</au>
				</aug>
				<source>Genetics</source>
				<pubdate>1989</pubdate>
				<volume>123</volume>
				<fpage>585</fpage>
				<lpage>595</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">2513255</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B51">
				<title>
					<p><it>Bacillus anthracis</it>, <it>Bacillus cereus</it>, and <it>Bacillus thuringiensis </it>- one species on the basis of genetic evidence.</p>
				</title>
				<aug>
					<au>
						<snm>Helgason</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Okstad</snm>
						<fnm>OA</fnm>
					</au>
					<au>
						<snm>Caugant</snm>
						<fnm>DA</fnm>
					</au>
					<au>
						<snm>Johansen</snm>
						<fnm>HA</fnm>
					</au>
					<au>
						<snm>Fouet</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Mock</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Hegna</snm>
						<fnm>I</fnm>
					</au>
					<au>
						<snm>Kolsto</snm>
						<fnm>AB</fnm>
					</au>
				</aug>
				<source>Appl Environ Microbiol</source>
				<pubdate>2000</pubdate>
				<volume>66</volume>
				<fpage>2627</fpage>
				<lpage>2630</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">110590</pubid>
						<pubid idtype="pmpid" link="fulltext">10831447</pubid>
						<pubid idtype="doi">10.1128/AEM.66.6.2627-2630.2000</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B52">
				<title>
					<p>Multilocus sequence typing scheme for bacteria of the <it>Bacillus cereus </it>group.</p>
				</title>
				<aug>
					<au>
						<snm>Helgason</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Tourasse</snm>
						<fnm>NJ</fnm>
					</au>
					<au>
						<snm>Meisal</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Caugant</snm>
						<fnm>DA</fnm>
					</au>
					<au>
						<snm>Kolsto</snm>
						<fnm>AB</fnm>
					</au>
				</aug>
				<source>Appl Environ Microbiol</source>
				<pubdate>2004</pubdate>
				<volume>70</volume>
				<fpage>191</fpage>
				<lpage>201</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">321270</pubid>
						<pubid idtype="pmpid" link="fulltext">14711642</pubid>
						<pubid idtype="doi">10.1128/AEM.70.1.191-201.2004</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B53">
				<title>
					<p>Statistical properties of the number of recombination events in the history of a sample of DNA sequences.</p>
				</title>
				<aug>
					<au>
						<snm>Hudson</snm>
						<fnm>RR</fnm>
					</au>
					<au>
						<snm>Kaplan</snm>
						<fnm>NL</fnm>
					</au>
				</aug>
				<source>Genetics</source>
				<pubdate>1985</pubdate>
				<volume>111</volume>
				<fpage>147</fpage>
				<lpage>164</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">4029609</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B54">
				<title>
					<p>The interaction of selection and linkage. I. General considerations; heterotic models.</p>
				</title>
				<aug>
					<au>
						<snm>Lewontin</snm>
						<fnm>RC</fnm>
					</au>
				</aug>
				<source>Genetics</source>
				<pubdate>1964</pubdate>
				<volume>49</volume>
				<fpage>49</fpage>
				<lpage>67</lpage>
			</bibl>
			<bibl id="B55">
				<title>
					<p>Population structure and evolution of the <it>Bacillus cereus </it>group.</p>
				</title>
				<aug>
					<au>
						<snm>Priest</snm>
						<fnm>FG</fnm>
					</au>
					<au>
						<snm>Barker</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Baillie</snm>
						<fnm>LWJ</fnm>
					</au>
					<au>
						<snm>Holmes</snm>
						<fnm>EC</fnm>
					</au>
					<au>
						<snm>Maiden</snm>
						<fnm>MCJ</fnm>
					</au>
				</aug>
				<source>J Bacteriol</source>
				<pubdate>2004</pubdate>
				<volume>186</volume>
				<fpage>7959</fpage>
				<lpage>7970</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1128/JB.186.23.7959-7970.2004</pubid>
						<pubid idtype="pmpid" link="fulltext">15547268</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B56">
				<aug>
					<au>
						<snm>Felsenstein</snm>
						<fnm>J</fnm>
					</au>
				</aug>
				<source>PHYLIP (Phylogeny Inference Package) version 3.6</source>
				<publisher>Seattle, WA: Department of Genome Sciences, University of Washington</publisher>
				<pubdate>2004</pubdate>
			</bibl>
			<bibl id="B57">
				<title>
					<p>Search for potential vaccine candidate open reading frames in the <it>Bacillus anthracis </it>virulence plasmid pXO1: <it>in silico </it>and <it>in vitro </it>screening.</p>
				</title>
				<aug>
					<au>
						<snm>Ariel</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Zvi</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Grosfeld</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Gat</snm>
						<fnm>O</fnm>
					</au>
					<au>
						<snm>Inbar</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Velan</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Cohen</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Shafferman</snm>
						<fnm>A</fnm>
					</au>
				</aug>
				<source>Infect Immun</source>
				<pubdate>2002</pubdate>
				<volume>70</volume>
				<fpage>6817</fpage>
				<lpage>6827</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">133087</pubid>
						<pubid idtype="pmpid" link="fulltext">12438358</pubid>
						<pubid idtype="doi">10.1128/IAI.70.12.6817-6827.2002</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B58">
				<title>
					<p>Molecular analysis of rifampin resistance in <it>Bacillus anthracis </it>and <it>Bacillus cereus</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Vogler</snm>
						<fnm>AJ</fnm>
					</au>
					<au>
						<snm>Busch</snm>
						<fnm>JD</fnm>
					</au>
					<au>
						<snm>Percy-Fine</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Tipton-Hunton</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Smith</snm>
						<fnm>KL</fnm>
					</au>
					<au>
						<snm>Keim</snm>
						<fnm>P</fnm>
					</au>
				</aug>
				<source>Antimicrob Agents Chemother</source>
				<pubdate>2002</pubdate>
				<volume>46</volume>
				<fpage>511</fpage>
				<lpage>513</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">127050</pubid>
						<pubid idtype="pmpid" link="fulltext">11796364</pubid>
						<pubid idtype="doi">10.1128/AAC.46.2.511-513.2002</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B59">
				<title>
					<p>Genome-based bioinformatic selection of chromosomal <it>Bacillus anthracis</it> putative vaccine candidates coupled with proteomic identification of surface-associated antigens.</p>
				</title>
				<aug>
					<au>
						<snm>Ariel</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Zvi</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Makarova</snm>
						<fnm>KS</fnm>
					</au>
					<au>
						<snm>Chitlaru</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Elhanany</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Velan</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Cohen</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Friedlander</snm>
						<fnm>AM</fnm>
					</au>
					<au>
						<snm>Shafferman</snm>
						<fnm>A</fnm>
					</au>
				</aug>
				<source>Infect Immun</source>
				<pubdate>2003</pubdate>
				<volume>71</volume>
				<fpage>4563</fpage>
				<lpage>4579</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">165985</pubid>
						<pubid idtype="pmpid" link="fulltext">12874336</pubid>
						<pubid idtype="doi">10.1128/IAI.71.8.4563-4579.2003</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B60">
				<title>
					<p>Enterobacterial adhesins and the case for studying SNPs in bacteria.</p>
				</title>
				<aug>
					<au>
						<snm>Weissman</snm>
						<fnm>SJ</fnm>
					</au>
					<au>
						<snm>Moseley</snm>
						<fnm>SL</fnm>
					</au>
					<au>
						<snm>Dykhuizen</snm>
						<fnm>DE</fnm>
					</au>
					<au>
						<snm>Sokurenko</snm>
						<fnm>EV</fnm>
					</au>
				</aug>
				<source>Trends Microbiol</source>
				<pubdate>2003</pubdate>
				<volume>11</volume>
				<fpage>115</fpage>
				<lpage>117</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0966-842X(03)00010-6</pubid>
						<pubid idtype="pmpid" link="fulltext">12648942</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B61">
				<title>
					<p>Gradual evolution in bacteria: evidence from <it>Bacillus </it>systematics.</p>
				</title>
				<aug>
					<au>
						<snm>Feldgarden</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Byrd</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Cohan</snm>
						<fnm>FM</fnm>
					</au>
				</aug>
				<source>Microbiology</source>
				<pubdate>2003</pubdate>
				<volume>149</volume>
				<fpage>3565</fpage>
				<lpage>3573</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1099/mic.0.26457-0</pubid>
						<pubid idtype="pmpid" link="fulltext">14663088</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B62">
				<title>
					<p>Formation and composition of the <it>Bacillus anthracis </it>endospore.</p>
				</title>
				<aug>
					<au>
						<snm>Liu</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Bergman</snm>
						<fnm>NH</fnm>
					</au>
					<au>
						<snm>Thomason</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Shallom</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Hazen</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Crossno</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Rasko</snm>
						<fnm>DA</fnm>
					</au>
					<au>
						<snm>Ravel</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Read</snm>
						<fnm>TD</fnm>
					</au>
					<au>
						<snm>Peterson</snm>
						<fnm>SN</fnm>
					</au>
					<etal/>
				</aug>
				<source>J Bacteriol</source>
				<pubdate>2004</pubdate>
				<volume>186</volume>
				<fpage>164</fpage>
				<lpage>178</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">303457</pubid>
						<pubid idtype="pmpid" link="fulltext">14679236</pubid>
						<pubid idtype="doi">10.1128/JB.186.1.164-178.2004</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B63">
				<title>
					<p>Modeling bacterial evolution with comparative-genome-based marker systems: application to <it>Mycobacterium tuberculosis </it>evolution and pathogenesis.</p>
				</title>
				<aug>
					<au>
						<snm>Alland</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Whittam</snm>
						<fnm>TS</fnm>
					</au>
					<au>
						<snm>Murray</snm>
						<fnm>MB</fnm>
					</au>
					<au>
						<snm>Cave</snm>
						<fnm>MD</fnm>
					</au>
					<au>
						<snm>Hazbon</snm>
						<fnm>MH</fnm>
					</au>
					<au>
						<snm>Dix</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Kokoris</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Duesterhoeft</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Eisen</snm>
						<fnm>JA</fnm>
					</au>
					<au>
						<snm>Fraser</snm>
						<fnm>CM</fnm>
					</au>
					<au>
						<snm>Fleischmann</snm>
						<fnm>RD</fnm>
					</au>
				</aug>
				<source>J Bacteriol</source>
				<pubdate>2003</pubdate>
				<volume>185</volume>
				<fpage>3392</fpage>
				<lpage>3399</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">155390</pubid>
						<pubid idtype="pmpid" link="fulltext">12754238</pubid>
						<pubid idtype="doi">10.1128/JB.185.11.3392-3399.2003</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B64">
				<title>
					<p>Multilocus sequence typing: a portable approach to the identification of clones within populations of pathogenic microorganisms.</p>
				</title>
				<aug>
					<au>
						<snm>Maiden</snm>
						<fnm>MC</fnm>
					</au>
					<au>
						<snm>Bygraves</snm>
						<fnm>JA</fnm>
					</au>
					<au>
						<snm>Feil</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Morelli</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Russell</snm>
						<fnm>JE</fnm>
					</au>
					<au>
						<snm>Urwin</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Zhang</snm>
						<fnm>Q</fnm>
					</au>
					<au>
						<snm>Zhou</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Zurth</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Caugant</snm>
						<fnm>DA</fnm>
					</au>
					<etal/>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>1998</pubdate>
				<volume>95</volume>
				<fpage>3140</fpage>
				<lpage>3145</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">19708</pubid>
						<pubid idtype="pmpid" link="fulltext">9501229</pubid>
						<pubid idtype="doi">10.1073/pnas.95.6.3140</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B65">
				<title>
					<p>The "hitchhiking effect" revisited.</p>
				</title>
				<aug>
					<au>
						<snm>Kaplan</snm>
						<fnm>NL</fnm>
					</au>
					<au>
						<snm>Hudson</snm>
						<fnm>RR</fnm>
					</au>
					<au>
						<snm>Langley</snm>
						<fnm>CH</fnm>
					</au>
				</aug>
				<source>Genetics</source>
				<pubdate>1989</pubdate>
				<volume>123</volume>
				<fpage>887</fpage>
				<lpage>899</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">2612899</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B66">
				<title>
					<p>The effect of background selection against deleterious mutations on weakly selected, linked variants.</p>
				</title>
				<aug>
					<au>
						<snm>Charlesworth</snm>
						<fnm>B</fnm>
					</au>
				</aug>
				<source>Genet Res</source>
				<pubdate>1994</pubdate>
				<volume>63</volume>
				<fpage>213</fpage>
				<lpage>227</lpage>
				<xrefbib>
					<pubid idtype="pmpid">8082838</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B67">
				<title>
					<p>The pattern of neutral molecular variation under the background selection model.</p>
				</title>
				<aug>
					<au>
						<snm>Charlesworth</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Charlesworth</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Morgan</snm>
						<fnm>MT</fnm>
					</au>
				</aug>
				<source>Genetics</source>
				<pubdate>1995</pubdate>
				<volume>141</volume>
				<fpage>1619</fpage>
				<lpage>1632</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">8601499</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B68">
				<title>
					<p>Miropeats: graphical DNA sequence comparisons.</p>
				</title>
				<aug>
					<au>
						<snm>Parsons</snm>
						<fnm>JD</fnm>
					</au>
				</aug>
				<source>Comput Appl Biosci</source>
				<pubdate>1995</pubdate>
				<volume>11</volume>
				<fpage>615</fpage>
				<lpage>619</lpage>
				<xrefbib>
					<pubid idtype="pmpid">8808577</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B69">
				<aug>
					<au>
						<snm>Weir</snm>
						<fnm>BS</fnm>
					</au>
				</aug>
				<source>Genetic Data Analysis II</source>
				<publisher>Sunderland, MA: Sinauer Associates</publisher>
				<pubdate>1996</pubdate>
			</bibl>
			<bibl id="B70">
				<aug>
					<au>
						<snm>Hudson</snm>
						<fnm>RR</fnm>
					</au>
				</aug>
				<source>Gene genealogies and the coalescent process</source>
				<publisher>Oxford: Oxford University Press</publisher>
				<pubdate>1991</pubdate>
			</bibl>
		</refgrp>
	</bm>
</art>
