<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2105-8-278</ui>
   <ji>1471-2105</ji>
   <fm>
      <dochead>Methodology article</dochead>
      <bibl>
         <title>
            <p>Computer-aided identification of polymorphism sets diagnostic for groups of bacterial and viral genetic variants</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Price</snm>
               <mi>P</mi>
               <fnm>Erin</fnm>
               <insr iid="I1"/>
               <email>esme_epp@hotmail.com</email>
            </au>
            <au id="A2">
               <snm>Inman-Bamber</snm>
               <fnm>John</fnm>
               <insr iid="I1"/>
               <email>john.bamber@optusnet.com.au</email>
            </au>
            <au id="A3">
               <snm>Thiruvenkataswamy</snm>
               <fnm>Venugopal</fnm>
               <insr iid="I1"/>
               <email>vtswamy@tpg.com.au</email>
            </au>
            <au id="A4">
               <snm>Huygens</snm>
               <fnm>Flavia</fnm>
               <insr iid="I1"/>
               <email>f.huygens@qut.edu.au</email>
            </au>
            <au id="A5" ca="yes">
               <snm>Giffard</snm>
               <mi>M</mi>
               <fnm>Philip</fnm>
               <insr iid="I1"/>
               <email>p.giffard@qut.edu.au</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Cooperative Research Centre for Diagnostics, Institute of Health and Biomedical Innovation, Queensland University of Technology, Cnr Blamey St and Musk Ave, Kelvin Grove, Australia</p>
            </ins>
         </insg>
         <source>BMC Bioinformatics</source>
         <issn>1471-2105</issn>
         <pubdate>2007</pubdate>
         <volume>8</volume>
         <issue>1</issue>
         <fpage>278</fpage>
         <url>http://www.biomedcentral.com/1471-2105/8/278</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">17672919</pubid>
               <pubid idtype="doi">10.1186/1471-2105-8-278</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>02</day>
               <month>4</month>
               <year>2007</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>01</day>
               <month>8</month>
               <year>2007</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>01</day>
               <month>8</month>
               <year>2007</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2007</year>
         <collab>Price et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Single nucleotide polymorphisms (SNPs) and genes that exhibit presence/absence variation have provided informative marker sets for bacterial and viral genotyping. Identification of marker sets optimised for these purposes has been based on maximal generalized discriminatory power as measured by Simpson's Index of Diversity, or on the ability to identify specific variants. Here we describe the Not-N algorithm, which is designed to identify small sets of genetic markers diagnostic for user-specified subsets of known genetic variants. The algorithm does not treat the user-specified subset and the remaining genetic variants equally. Rather, Not-N analysis is designed to underpin assays that provide 0% false negatives, a property that is essential for applications such as diagnostic procedures for clinically significant subgroups within microbial species.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>The Not-N algorithm has been incorporated into the "Minimum SNPs" computer program and used to derive genetic markers diagnostic for multilocus sequence typing-defined clonal complexes, hepatitis C virus (HCV) subtypes, and phylogenetic clades defined by comparative genome hybridization (CGH) data for <it>Campylobacter jejuni</it>, <it>Yersinia enterocolitica </it>and <it>Clostridium difficile</it>.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>Not-N analysis is effective for identifying small sets of genetic markers diagnostic for microbial sub-groups. The best results to date have been obtained with CGH data from several bacterial species, and HCV sequence data.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>The last two decades have seen an exponential increase in the generation of comparative genetic data from within bacterial and viral species. Many of the bacterial data sets are derived from electrophoresis-based genotyping methods, such as pulsed-field gel electrophoresis, which has been used to develop the inter-laboratory PulseNet system for real-time monitoring of foodborne bacterial pathogens <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. More recently, databases of defined genetic polymorphisms have become available. Conspicuous examples are multilocus sequence typing (MLST) databases <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr></abbrgrp>, the results of comparative genome hybridization (CGH) studies on bacteria <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr></abbrgrp>, and whole-genome sequence databases for bacteria and viruses <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr></abbrgrp>.</p>
         <p>The extensive knowledge base of comparative genetic information can be exploited to develop rationally-designed genotyping methods for examining epidemiology, or inferring virulence potential, vaccine susceptibility or antimicrobial-antiviral resistance. One approach to discriminating known genotypes within a species is to interrogate every known genetic polymorphism. However, this approach is inefficient due to linkage of alleles, and may also provide more resolving power than is required <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>. Despite considerable improvements in nucleic acid analysis technology in recent years, there remains a need for cost-effective and rapid genotyping methods that interrogate small sets of polymorphisms and provide the required information in an efficient manner <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. Such methods have potential applications in infection control, point-of-care diagnosis, high-throughput public health investigations, food microbiology and biodefense. Suitable emerging technology platforms for such marker sets include real-time PCR <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>, medium-density arrays <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>, and more recently 'lab-on-a-chip' devices <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr></abbrgrp>.</p>
         <p>The considerable volume of comparative genetic information now available renders computerized data-mining the only practical means of identifying sets of polymorphisms optimized for particular genotyping tasks. Our research group has previously developed and described the "Minimum SNPs" computer program, which extracts resolution-optimized single-nucleotide polymorphism (SNP) sets from complex DNA sequence alignments <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. Previously described capabilities of "Minimum SNPs" include the identification of SNP sets that discriminate a single sequence variant from all other known variants (the '%' mode), and the identification of SNP sets that maximize Simpson's Index of Diversity (<it>D</it>), and are therefore optimized with respect to discriminating all variants from each other <abbrgrp><abbr bid="B19">19</abbr><abbr bid="B20">20</abbr></abbrgrp>. The % module has been applied to the identification of highly informative SNPs that define specific <it>Neisseria meningitidis </it>and <it>Staphylococcus aureus </it>variants <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>, while the <it>D </it>module has been used to extract <it>D</it>-maximized SNP sets from <it>S. aureus </it>and <it>Campylobacter jejuni </it>MLST databases <abbrgrp><abbr bid="B21">21</abbr><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr></abbrgrp>, and, more recently, to derive <it>D</it>-maximized binary gene sets (sets of genes that are present in some isolates but not others) from <it>C. jejuni </it>CGH data <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>.</p>
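         <p>As an illustration of what maximizing <it>D </it>means for a candidate SNP set, the following minimal Python sketch computes Simpson's Index of Diversity for the genotypes defined by a chosen set of alignment positions. It assumes the form of the index commonly used in microbial typing, D = 1 - sum(n_j(n_j - 1)) / (N(N - 1)), where N is the number of variants and n_j the number sharing the j-th SNP profile. The function names and the toy alignment are hypothetical and are not taken from "Minimum SNPs".</p>

```python
from collections import Counter


def simpsons_index(genotypes):
    """Simpson's Index of Diversity, assuming the form
    D = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1))."""
    n_total = len(genotypes)
    if n_total in (0, 1):
        raise ValueError("need at least two variants")
    counts = Counter(genotypes).values()
    # Number of ordered pairs of variants that share the same genotype.
    same_pairs = sum(n * (n - 1) for n in counts)
    return 1.0 - same_pairs / (n_total * (n_total - 1))


def snp_profile_index(sequences, positions):
    """D for the genotypes obtained by reading only the given positions."""
    profiles = ["".join(seq[p] for p in positions) for seq in sequences]
    return simpsons_index(profiles)


# Hypothetical toy alignment: four variants, four aligned positions.
toy_alignment = ["ACGT", "ACGA", "TCGA", "TCCA"]
print(snp_profile_index(toy_alignment, [0]))     # one SNP
print(snp_profile_index(toy_alignment, [0, 3]))  # two SNPs
```

         <p>In this toy case the second position set splits the variants into more genotypes and therefore gives a higher <it>D</it>; a <it>D</it>-maximizing search would prefer it.</p>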
         <p>Other bioinformatics programs that carry out similar functions to "Minimum SNPs" include a linkage disequilibrium-selection algorithm, which identifies SNPs diagnostic for haplotype blocks in mammalian genomes <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>, and SNPT, which also incorporates a <it>D </it>maximization module <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. One function notably absent from previous versions of "Minimum SNPs" and similar programs is the ability to identify sets of genetic markers that discriminate a user-defined group of variants from all other known variants. Such a function could underpin genotyping methodologies designed to identify all variants within a species that possess specific traits of interest, such as increased virulence or resistance properties. There are several considerations when designing such an algorithm, such as conversion of the user-defined group of sequence variants into a consensus sequence, and scoring of the resolving power of genetic marker sets. Here we report a novel algorithm for identifying such genetic marker sets, and its application to the analysis of microbial and viral comparative genetic data.</p>
      </sec>
      <sec>
         <st>
            <p>Results and Discussion</p>
         </st>
         <sec>
            <st>
               <p>MLST datasets</p>
            </st>
            <p>The clonal complexes (CCs) defined by MLST data are emerging as important epidemiological or taxonomic units <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B26">26</abbr></abbrgrp>. We therefore investigated whether Not-N analysis could identify diagnostic SNPs for the major CCs of a variety of bacterial species. In <it>E. coli</it>, two independent MLST schemes were examined. Using scheme 1, Not-N analysis identified 15 SNPs that completely differentiated the 10 major CCs (see additional file <supplr sid="S1">1</supplr>). The second <it>E. coli </it>MLST scheme, which contains a larger cohort of isolates than scheme 1, required 24 SNPs to differentiate the 12 major CCs. In this scheme, Not-N analysis was unable to completely differentiate the largest CC, with eight SNPs resolving 98.5% of the out-group from the group of interest. Not-N analysis of the <it>H. influenzae </it>MLST dataset identified 24 SNPs that differentiated the seven major CCs from the remaining ST population (results not shown).</p>
            <suppl id="S1">
               <title>
                  <p>Additional file 1</p>
               </title>
               <text>
                  <p>Single-nucleotide polymorphisms extracted from multilocus sequence typing data using the Not-N module of "Minimum SNPs".</p>
               </text>
               <file name="1471-2105-8-278-S1.doc">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>In contrast to <it>H. influenzae </it>and <it>E. coli</it>, Not-N analysis was unable to identify high-confidence SNPs (&#8805; 98%) for four <it>S. aureus </it>and five <it>C. jejuni </it>CCs. These CCs were the largest in their respective databases. Further investigation showed that the difficulty in identifying SNPs diagnostic for the larger CCs was due to CC members that have diverged from the CC founders by recombination rather than mutation. <it>S. aureus </it>exhibits a low recombination frequency, with approximately 10% of CC members arising by recombination <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. While mutation predominantly generates single novel SNPs, recombinants acquire a pre-existing allele from elsewhere in the species. The large pool of alleles present in the larger CCs increases the probability that one or more alleles per locus have arisen by recombination with a pre-existing allele. Because most <it>S. aureus </it>SNPs are dimorphic, recombination has occurred, and the pool of available SNPs in <it>S. aureus </it>sequence data is small, few SNPs are unique to all members of a CC. Therefore, the probability of finding highly discriminatory SNP sets is low. To test this explanation, the Not-N algorithm was used to find sets of SNPs diagnostic for the 212 methicillin-resistant <it>S. aureus </it>(MRSA) STs in the MLST database <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>. The MRSA dataset is smaller than the complete <it>S. aureus </it>MLST database, and therefore the CCs contain correspondingly fewer recombinants. This analysis yielded 22 SNPs that delineated the ten major MRSA CCs with 100% confidence (results not shown).</p>
            <p>In <it>C. jejuni</it>, the influence of recombination on Not-N performance is similar to that in <it>S. aureus </it>but more extreme. The majority of STs in <it>C. jejuni </it>arise by recombination, at an estimated frequency of 50 times the rate of mutation <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>, resulting in an even smaller probability of identifying CC-specific SNPs. SNP sets diagnostic for <it>S. aureus </it>CCs are more efficiently derived using the <it>D </it>maximization algorithm of "Minimum SNPs", while in <it>C. jejuni</it>, the high recombination rate renders the identification of small numbers of CC-specific SNPs with high discriminatory power and a low false-negative rate potentially impossible by any means. Other researchers have identified small numbers of SNPs characteristic of six major <it>C. jejuni </it>CCs: ST-21, ST-45, ST-48, ST-61, ST-206 and ST-257 <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. However, these SNPs result in a high proportion (between 17 and 54%) of false-negative STs, and may therefore be unsuitable for certain genotyping applications.</p>
            <p>The "Minimum SNPs" software has previously been used to derive <it>D</it>-optimized (diversity maximizing) SNP sets from the <it>S. aureus </it>and <it>C. jejuni </it>databases <abbrgrp><abbr bid="B21">21</abbr><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr></abbrgrp>. In the case of <it>S. aureus</it>, the genotypes defined by approximately eight <it>D</it>-optimized SNPs correspond closely to the population structure as defined by eBURST analysis. Thus, the <it>D</it>-optimized SNP sets appear superior to the Not-N-derived SNP sets for assigning an <it>S. aureus </it>isolate to a CC. In the case of <it>C. jejuni</it>, the correspondence between <it>D</it>-optimized SNP genotypes and population structure is weaker than in <it>S. aureus </it>because of the higher recombination frequency. However, supplementing the SNP-based genotyping with the interrogation of additional loci, such as a hypervariable region (sequencing of the flagellin A short variable region), reduced the incidence of unrelated isolates failing to be discriminated to an insignificant level, thus demonstrating the value of the <it>D</it>-optimized SNP set. In summary, for both species, Not-N-derived SNPs are generally not optimal for assigning isolates to larger CCs, but are highly effective for identifying the smaller CCs.</p>
         </sec>
         <sec>
            <st>
               <p>CGH datasets</p>
            </st>
            <p>CGH allows the large-scale identification of genetic differences across a number of strains. Bayesian-based algorithms applied to the CGH data of <it>C. jejuni</it>, <it>Y. enterocolitica </it>and <it>C. difficile </it>enabled the identification of phylogenetic clades that can predict infection source or pathogenicity traits. Champion et al. <abbrgrp><abbr bid="B5">5</abbr></abbrgrp> identified two distinct clades in <it>C. jejuni </it>predictive of infection source, with one clade containing predominantly livestock isolates and the other non-livestock (environmental) isolates. Isolates from human infection were roughly evenly distributed between the two clades. In <it>Y. enterocolitica</it>, three clades corresponding to non-pathogenic (biotype 1A), low-pathogenicity (biotypes 2&#8211;5) and highly pathogenic (biotype 1B) isolates were identified by the comparative phylogenomics approach <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. For <it>C. difficile</it>, CGH phylogeny identified four clades comprising a hypervirulent clade, a toxin A-B+ clade, and two clades containing human and animal isolates <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>.</p>
            <p>The three CGH studies identified binary genes specific to particular clades using MacClade, parsimony-based software that is used for reconstructing phylogeny and interpreting patterns of character evolution <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. MacClade 4 was used to identify 33 coding sequences (CDS) in <it>C. jejuni </it>that were characteristic of the livestock clade, including the gene cluster <it>Cj1321 </it>to <it>Cj1326 </it>within the O-linked flagellin glycosylation locus. However, none of the 33 CDS identified by MacClade 4 analysis were 100% specific to the livestock clade <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. In contrast, Not-N analysis identified two binary genes from the <it>C. jejuni </it>CGH data that separated, with 100% confidence, isolates in the livestock clade from those in the non-livestock clade (Table <tblr tid="T1">1</tblr>).</p>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Not-N analysis-derived binary gene targets from CGH data of <it>Campylobacter jejuni</it>, <it>Yersinia enterocolitica </it>and <it>Clostridium difficile</it>.</p>
               </caption>
               <tblbdy cols="6">
                  <r>
                     <c ca="center">
                        <p>
                           <b>Bacterium</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Clade<sup>a</sup></b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Gene 1 (%)</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Gene 2 (%)</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Gene 3 (%)</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>No. of Pathways<sup>b</sup></b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>C. jejuni</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>Livestock</p>
                     </c>
                     <c ca="center">
                        <p><it>Cj0818</it>, present (85.7)</p>
                     </c>
                     <c ca="center">
                        <p><it>Cj0424</it>, present (100)</p>
                     </c>
                     <c ca="center">
                        <p>---</p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>Non-livestock</p>
                     </c>
                     <c ca="center">
                        <p>---</p>
                     </c>
                     <c ca="center">
                        <p>---</p>
                     </c>
                     <c ca="center">
                        <p>---</p>
                     </c>
                     <c ca="center">
                        <p>n/a</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>Y. enterocolitica</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>Non-pathogenic</p>
                     </c>
                     <c ca="center">
                        <p>Ye8081-4002, absent (100)</p>
                     </c>
                     <c ca="center">
                        <p>---</p>
                     </c>
                     <c ca="center">
                        <p>---</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>Low pathogenicity</p>
                     </c>
                     <c ca="center">
                        <p>Ye8081-0306, absent (100)</p>
                     </c>
                     <c ca="center">
                        <p>---</p>
                     </c>
                     <c ca="center">
                        <p>---</p>
                     </c>
                     <c ca="center">
                        <p>15</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>Highly pathogenic</p>
                     </c>
                     <c ca="center">
                        <p>Ye8081-0113, present (100)</p>
                     </c>
                     <c ca="center">
                        <p>---</p>
                     </c>
                     <c ca="center">
                        <p>---</p>
                     </c>
                     <c ca="center">
                        <p>50</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>C. difficile</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>HY</p>
                     </c>
                     <c ca="center">
                        <p>CD2669, absent (98.1)</p>
                     </c>
                     <c ca="center">
                        <p>CD2983, present (100)</p>
                     </c>
                     <c ca="center">
                        <p>---</p>
                     </c>
                     <c ca="center">
                        <p>13</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>A-B+</p>
                     </c>
                     <c ca="center">
                        <p>CD2983, absent (100)</p>
                     </c>
                     <c ca="center">
                        <p>---</p>
                     </c>
                     <c ca="center">
                        <p>---</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>HA1</p>
                     </c>
                     <c ca="center">
                        <p>CD2570, present (46.4)</p>
                     </c>
                     <c ca="center">
                        <p>CD2669, present (83.9)</p>
                     </c>
                     <c ca="center">
                        <p>CD2983, present (100)</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>HA2</p>
                     </c>
                     <c ca="center">
                        <p>CD0265, absent (100)</p>
                     </c>
                     <c ca="center">
                        <p>---</p>
                     </c>
                     <c ca="center">
                        <p>---</p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p><sup>a </sup>Clades defined for <it>Y. enterocolitica</it>, <it>C. jejuni </it>and <it>C. difficile </it>by references [4-6] respectively.</p>
                  <p><sup>b </sup>Corresponds to the number of alternate outputs provided by Not-N analysis that are not shown in the table. n/a, not applicable.</p>
               </tblfn>
            </tbl>
            <p>Within the <it>Y. enterocolitica </it>CGH data, MacClade 4 analysis identified several CDS that were 100% specific to each of the three pathogenicity clades <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. Some of these genes, such as YE1820 (characteristic of the non-pathogenic clade), were classed as divergent based on the array signal but considered present for the purposes of the MacClade analysis. CGH cannot reliably detect small differences in hybridization caused by moderate gene divergence <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. For this reason, only genes that were clearly present or absent were analyzed by Not-N, to avoid miscalled binary gene status. Three binary genes were identified that enabled 100% discrimination between the three pathogenicity clades of <it>Y. enterocolitica</it>. In <it>C. difficile</it>, MacClade analysis did not identify binary gene sets specific for isolates within the four clades, and many of the identified genes were divergent <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. In comparison, Not-N analysis identified four binary genes that separated the four distinct clades described by Stabler et al. <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> with 100% confidence. These results indicate that Not-N analysis of CGH data was more efficient than the MacClade analyses and required fewer binary targets to define each clade (Table <tblr tid="T1">1</tblr>).</p>
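            <p>The kind of selection summarized in Table <tblr tid="T1">1</tblr> can be illustrated with a short Python sketch. This is not the implementation used in "Minimum SNPs"; it is a greedy approximation written under two assumptions: a marker is usable only if every isolate in the group of interest shares the same state (so that no member of the group is missed), and markers are added in order of how many remaining out-group isolates they exclude. The function name, the string encoding of presence (1) and absence (0), and the toy isolates are all hypothetical.</p>

```python
def not_n_sketch(profiles, in_group, max_markers=3):
    """Greedy sketch: pick binary markers whose state is shared by every
    in-group isolate (so no in-group isolate is missed) and that exclude
    as many out-group isolates as possible."""
    group = [profiles[name] for name in in_group]
    remaining = {name for name in profiles if name not in in_group}
    n_out = len(remaining)
    n_markers = len(group[0])
    # Markers on which the whole in-group agrees, with the shared state.
    usable = {m: group[0][m] for m in range(n_markers)
              if len({p[m] for p in group}) == 1}
    chosen = []
    while remaining and usable and len(chosen) != max_markers:
        # Out-group isolates that each usable marker would exclude.
        excluded = {m: {name for name in remaining
                        if profiles[name][m] != state}
                    for m, state in usable.items()}
        best = max(excluded, key=lambda m: len(excluded[m]))
        if not excluded[best]:
            break  # no marker separates the remaining out-group isolates
        remaining -= excluded[best]
        state = usable.pop(best)
        # Cumulative percentage of the out-group excluded so far.
        percent = 100.0 * (n_out - len(remaining)) / n_out
        chosen.append((best, state, percent))
    return chosen


# Hypothetical presence (1) / absence (0) calls for three genes.
toy = {
    "iso1": "110",
    "iso2": "111",
    "iso3": "010",
    "iso4": "100",
    "iso5": "001",
}
result = not_n_sketch(toy, in_group=["iso1", "iso2"])
print(result)
```

            <p>Each tuple in the result holds a marker index, the state shared by the group of interest, and the cumulative percentage of out-group isolates excluded, which mirrors the "Gene 1 (%)" and "Gene 2 (%)" layout of the table.</p>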
         </sec>
         <sec>
            <st>
               <p>Viral sequence datasets</p>
            </st>
            <p>A large number of complete genome sequences are currently available for both HCV (188 genomes) and HIV-1 (1507 genomes) <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B10">10</abbr></abbrgrp>, providing an ideal resource to examine the performance of Not-N analysis on well-characterized loci within these viruses. Despite examining several loci, Not-N analysis was unable to identify SNPs diagnostic for any of the HIV-1 clades. This is likely due to the exceptionally high degree of recombination between HIV-1 variants that has resulted in the emergence of circulating recombinant forms (CRFs) <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. Given these results, and the acceptance of sequencing as the appropriate HIV-1 genotyping approach, it was concluded that Not-N-derived SNPs are not suitable for HIV-1 genotyping. In contrast, Not-N analysis identified 15 HCV SNPs that delineate, with 100% confidence, the 13 predominant subtypes of this virus (Table <tblr tid="T2">2</tblr>). Interestingly, Not-N analysis failed to identify comparably informative SNP sets for the six major genotype groups (1&#8211;6) of HCV, possibly as a consequence of the high level of divergence between subtypes within each genotype group.</p>
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Single-nucleotide polymorphisms identified by Not-N analysis for the major subtypes of hepatitis C virus.</p>
               </caption>
               <tblbdy cols="6">
                  <r>
                     <c ca="center">
                        <p>
                           <b>HCV subtype</b>
                           <sup>a</sup>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>No. genotypes</b>
                        </p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>
                           <b>SNP 1</b>
                        </p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>
                           <b>SNP 2</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>Position<sup>b</sup></p>
                     </c>
                     <c ca="center">
                        <p>Discrimination (%)</p>
                     </c>
                     <c ca="center">
                        <p>Position</p>
                     </c>
                     <c ca="center">
                        <p>Discrimination (%)</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>1a</p>
                     </c>
                     <c ca="center">
                        <p>117</p>
                     </c>
                     <c ca="center">
                        <p>126* (C or T)</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>1b</p>
                     </c>
                     <c ca="center">
                        <p>382</p>
                     </c>
                     <c ca="center">
                        <p>103 (C)</p>
                     </c>
                     <c ca="center">
                        <p>72.1</p>
                     </c>
                     <c ca="center">
                        <p>194* (G)/238* (C)</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>2a</p>
                     </c>
                     <c ca="center">
                        <p>38</p>
                     </c>
                     <c ca="center">
                        <p>258 (A or G)</p>
                     </c>
                     <c ca="center">
                        <p>99.7</p>
                     </c>
                     <c ca="center">
                        <p>182* (T)</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>2b</p>
                     </c>
                     <c ca="center">
                        <p>53</p>
                     </c>
                     <c ca="center">
                        <p>314 (A)</p>
                     </c>
                     <c ca="center">
                        <p>99.7</p>
                     </c>
                     <c ca="center">
                        <p>127 (A)</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>2c</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>50 (T)</p>
                     </c>
                     <c ca="center">
                        <p>99.7</p>
                     </c>
                     <c ca="center">
                        <p>39* (G)</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>3a</p>
                     </c>
                     <c ca="center">
                        <p>49</p>
                     </c>
                     <c ca="center">
                        <p>295 (G)</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>3b</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>307 (T)</p>
                     </c>
                     <c ca="center">
                        <p>99.8</p>
                     </c>
                     <c ca="center">
                        <p>126* (G)</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>4a</p>
                     </c>
                     <c ca="center">
                        <p>26</p>
                     </c>
                     <c ca="center">
                        <p>182* (A)</p>
                     </c>
                     <c ca="center">
                        <p>97.8</p>
                     </c>
                     <c ca="center">
                        <p>325* (A)</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>4d</p>
                     </c>
                     <c ca="center">
                        <p>17</p>
                     </c>
                     <c ca="center">
                        <p>154 (A)</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>4f</p>
                     </c>
                     <c ca="center">
                        <p>21</p>
                     </c>
                     <c ca="center">
                        <p>325* (C)</p>
                     </c>
                     <c ca="center">
                        <p>99.5</p>
                     </c>
                     <c ca="center">
                        <p>194* (T)/238* (A)</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>4t</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>338/339 (C)</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>5a</p>
                     </c>
                     <c ca="center">
                        <p>18</p>
                     </c>
                     <c ca="center">
                        <p>100 (C)</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>6a</p>
                     </c>
                     <c ca="center">
                        <p>34</p>
                     </c>
                     <c ca="center">
                        <p>39* (C or T)</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p><sup>a</sup>Subtypes containing fewer than four confirmed sequences were not included in the analysis. Sequences were downloaded from the hepatitis C virus (HCV) sequence database [10].</p>
                  <p><sup>b</sup>The single-nucleotide polymorphism (SNP) position refers to a 340 bp fragment of the RNA-dependent RNA polymerase NS5B spanning nucleotides 8276 to 8615 (GenBank accession <ext-link ext-link-type="gen" ext-link-id="AF009606">AF009606</ext-link> [48]). NS5B is used to construct phylogenetic trees for HCV, which form the basis of the genotype and subtype nomenclature [40].</p>
                  <p>*SNP discriminates multiple subtypes.</p>
               </tblfn>
            </tbl>
            <p>Current HCV genotyping methods such as the line probe assay (INNO-LiPA) <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>, the real-time PCR-based Abbott HCV analyte-specific reagent (ASR) and COBAS TaqMan48 HCV tests <abbrgrp><abbr bid="B34">34</abbr><abbr bid="B35">35</abbr></abbrgrp>, restriction fragment length polymorphism analysis <abbrgrp><abbr bid="B36">36</abbr></abbrgrp> and primer extension methods <abbrgrp><abbr bid="B37">37</abbr></abbrgrp> primarily target the 5' non-coding region (5'-NCR). A drawback of targeting the 5'-NCR is that, because this region is highly conserved, some subtype pairs, such as 1a and 1b, 1b and 6a, or 2a and 2c, cannot be distinguished in a small number of cases <abbrgrp><abbr bid="B38">38</abbr><abbr bid="B39">39</abbr></abbrgrp>. In contrast, the 15 HCV SNPs identified by Not-N analysis were derived from the RNA-dependent RNA polymerase NS5B rather than the 5'-NCR and, unlike 5'-NCR-derived SNPs, are 100% specific for each of the 13 subtypes of HCV. This finding is significant because the correlation between HCV genotype and clinical outcome is well documented. Genotype-specific differences between HCV variants inform the clinical management of infection, with genotypes 1 and 4 more resistant than genotypes 2 and 3 to interferon-&#945;-based therapy. In addition, HCV variants show distinct geographic distributions; for example, subtype 1a is widespread throughout the USA and Northern Europe <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>.</p>
            <p>To our knowledge, this is the first set of genotyping targets that enables the specific and accurate discrimination of the 13 major subtypes of HCV. Real-time PCR-based methods, such as allele-specific real-time PCR <abbrgrp><abbr bid="B41">41</abbr></abbrgrp> or high-resolution melt analysis <abbrgrp><abbr bid="B42">42</abbr></abbrgrp>, are promising candidates for interrogating the 15 SNPs because they can accurately resolve polymorphisms in highly diverse sequence, such as that found within the NS5B region of HCV.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>This study has shown that the Not-N algorithm provides a practical tool for identifying diagnostic polymorphisms that discriminate bacterial or viral populations of interest. Not-N analysis was particularly valuable with bacterial CGH and HCV genome sequence data, where the software identified genetic markers that outperform polymorphisms in current use. The ability of the algorithm to select SNPs diagnostic for MLST-defined CCs depended on CC size, with large numbers of SNPs required to delineate the larger CCs that have undergone extensive recombination. The purpose of the Not-N algorithm is conceptually similar to the identification of canonical phylogenetic SNPs, such as those previously described for <it>Bacillus anthracis </it><abbrgrp><abbr bid="B43">43</abbr></abbrgrp>, and the algorithm would indeed be valuable for identifying canonical SNPs in other clonal populations. Not-N analysis is likely to become increasingly useful as comparative databases expand, and as more is uncovered about the relationships between pathogen genotype, infection epidemiology and clinical outcomes. This approach may also be applied to identifying discriminatory sets of genetic polymorphisms that have direct biological significance, rather than being simply diagnostic markers. In such instances, fulfilling the "0% false negative" criterion may be less critical. The analysis can then be carried out twice, with each of the two groups of variants defined in turn as the "group of interest", which increases the probability that informative sets of polymorphisms will be identified.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>Bacterial and viral databases</p>
            </st>
            <p>The sequence type (ST) and allele files for <it>C. jejuni</it>, <it>S. aureus</it>, and <it>Haemophilus influenzae </it>were downloaded from the MLST databases for these organisms <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>. For <it>Escherichia coli</it>, data were obtained from two MLST schemes using seven loci <abbrgrp><abbr bid="B44">44</abbr><abbr bid="B45">45</abbr><abbr bid="B46">46</abbr></abbrgrp>. eBURST v3 was used to assign STs to clonal complexes (CCs) <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. For <it>S. aureus </it>and <it>C. jejuni</it>, CCs were defined as STs sharing 6/7 loci with the ancestral clone, whereas for <it>H. influenzae </it>and <it>E. coli</it>, this parameter was set to 5/7 loci and 4/7 loci, respectively.</p>
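            <p>The locus-sharing thresholds above can be illustrated with a minimal Python sketch. It is not eBURST itself, which also infers the ancestral clone; it only applies a shared-locus cut-off to a given founder profile. The allele profiles and the names <it>shared_loci</it>, <it>in_clonal_complex</it> and <it>MIN_SHARED_LOCI</it> are hypothetical and introduced purely for illustration.</p>

```python
# Illustrative sketch only: a simple shared-locus threshold, not eBURST itself.
# Each profile is a tuple of allele numbers at seven MLST loci (hypothetical data).

# Minimum number of loci (out of seven) shared with the ancestral clone,
# as stated in the Methods text.
MIN_SHARED_LOCI = {"S. aureus": 6, "C. jejuni": 6, "H. influenzae": 5, "E. coli": 4}


def shared_loci(profile, founder):
    """Count loci at which two allelic profiles carry the same allele number."""
    return sum(1 for a, b in zip(profile, founder) if a == b)


def in_clonal_complex(profile, founder, species):
    """True if the profile shares at least the species threshold of loci with the founder."""
    return shared_loci(profile, founder) >= MIN_SHARED_LOCI[species]


founder = (1, 1, 1, 1, 1, 1, 1)               # hypothetical ancestral ST
single_locus_variant = (1, 1, 1, 1, 1, 1, 9)  # shares 6/7 loci with the founder
triple_locus_variant = (1, 1, 1, 1, 7, 8, 9)  # shares 4/7 loci with the founder

print(in_clonal_complex(single_locus_variant, founder, "S. aureus"))
print(in_clonal_complex(triple_locus_variant, founder, "E. coli"))
```

            <p>Under these assumptions, a profile that differs from the founder at three loci would fall inside an <it>E. coli </it>CC (4/7 threshold) but outside a <it>S. aureus </it>CC (6/7 threshold).</p>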
            <p>Sequence data for HIV-1 and HCV were downloaded from the respective databases <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B10">10</abbr></abbrgrp>. The region spanning nucleotides 8276 to 8615 of the HCV genome, corresponding to a partial sequence of the RNA-dependent RNA polymerase NS5B, was chosen for SNP analysis as this region is used to construct phylogenetic trees for HCV <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>. In total, 770 NS5B sequences were analyzed. The thirteen confirmed HCV subtypes were examined: 1a, 1b, 2a, 2b, 2c, 3a, 3b, 4a, 4d, 4f, 4t, 5a and 6a. For HIV-1, we tested the ability of the Not-N algorithm to select SNPs that would identify the genotype M group as this genotype comprises over 99% of human HIV-1 infections <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>.</p>
            <p>CGH array data for <it>C. jejuni</it>, <it>Y. enterocolitica </it>and <it>C. difficile </it>were downloaded from B&#956;G@s <abbrgrp><abbr bid="B48">48</abbr></abbrgrp> (accessions E-BUGS-22, E-BUGS-36 and E-BUGS-41). CGH data were filtered to exclude genes considered divergent in one or more strains, and 'flagged' genes (data missing in one or more strains, due to a poor array signal). Based on these criteria, 696, 1080 and 785 genes from the available 111 <it>C. jejuni</it>, 93 <it>Y. enterocolitica </it>and 74 <it>C. difficile </it>strains <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr></abbrgrp> were analyzed using the Not-N module of "Minimum SNPs". Gene presence or absence was converted to nucleotide format to enable "Minimum SNPs" analysis as previously described <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>. Isolates were grouped for Not-N analysis according to previous CGH phylogeny, with the exception of <it>Y. enterocolitica </it>strain 237_02, which was shown to group with the non-pathogenic clade following ClustalX phylogenetic analysis <abbrgrp><abbr bid="B49">49</abbr></abbrgrp> of the filtered dataset and visualization using TreeView 1.6.6 <abbrgrp><abbr bid="B50">50</abbr></abbrgrp>.</p>
         </sec>
         <sec>
            <st>
               <p>The Not-N algorithm and its implementation</p>
            </st>
            <p>The Not-N algorithm is designed to derive, from sequence alignments, sets of SNPs or binary genes that discriminate a user-defined subset of the isolates (the group of interest) from all the other genotypes in the alignment (the out-group). The fundamental principle of this algorithm is that it does not treat the group of interest and the out-group equally. A position in the alignment is only considered informative if one or more bases that are present at that position in the out-group are not present in any of the sequences in the group of interest. The resolving power of a position is the proportion of out-group sequences whose base at that position is not found in any sequence of the group of interest. The rationale for the algorithm design is twofold. Firstly, the derived SNP sets cannot give rise to false negatives, since SNPs are specifically selected to identify all members of a group of interest. This lack of false negatives is important if the SNP sets are to form the basis of, for example, diagnostic procedures for identifying virulent subgroups within a species. Secondly, Not-N can accommodate polymorphisms within the group of interest; that is, the algorithm is not reliant on identifying only invariant sites within the group of interest. Therefore, Not-N efficiently uses the available sequence data. The algorithm is demonstrated in Table <tblr tid="T3">3</tblr>. STs 1, 2, and 3 are the group of interest whilst the remaining STs are the out-group. A consensus for the group of interest is assembled by recording, at each position, the bases that are absent from the group (for example, 'Not-AC'). The out-group sequences are subsequently scored as a match (+) or mismatch (-) relative to this consensus.</p>
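            <p>The per-position scoring described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the Minimum SNPs source code: the function names (<it>allowed_bases</it>, <it>not_n_label</it>, <it>excluded</it>, <it>discrimination</it>) and the dictionary layout are assumptions made for this example, and the input is the hypothetical alignment of Table <tblr tid="T3">3</tblr>A. A position scores an out-group sequence as discriminated when that sequence carries a base absent from every group-of-interest sequence; the score of a SNP set is the percentage of out-group sequences discriminated by at least one of its positions.</p>

```python
# Illustrative sketch only; names and data layout are assumptions,
# not the Minimum SNPs implementation.
BASES = "ACGT"

# Hypothetical alignment from Table 3A: one character per SNP position.
GROUP_OF_INTEREST = {"ST1": "GGGATT", "ST2": "AGTTTG", "ST3": "GGGCTG"}
OUT_GROUP = {"ST4": "AGAGAT", "ST5": "AAAGTT", "ST6": "AGTCAA", "ST7": "AAAGTG"}


def allowed_bases(group, pos):
    """Bases seen at pos (0-based) in any group-of-interest sequence."""
    return {seq[pos] for seq in group.values()}


def not_n_label(group, out_group, pos):
    """Return a label such as 'Not-AC', or None if the position is not informative."""
    allowed = allowed_bases(group, pos)
    out_bases = {seq[pos] for seq in out_group.values()}
    if out_bases.issubset(allowed):
        return None  # every out-group base also occurs in the group of interest
    return "Not-" + "".join(b for b in BASES if b not in allowed)


def excluded(group, out_group, pos):
    """Out-group IDs whose base at pos is absent from the group of interest."""
    allowed = allowed_bases(group, pos)
    return {name for name, seq in out_group.items() if seq[pos] not in allowed}


def discrimination(group, out_group, positions):
    """Percentage of out-group sequences excluded by at least one position."""
    hit = set()
    for pos in positions:
        hit |= excluded(group, out_group, pos)
    return 100.0 * len(hit) / len(out_group)


for pos in range(6):
    label = not_n_label(GROUP_OF_INTEREST, OUT_GROUP, pos)
    score = discrimination(GROUP_OF_INTEREST, OUT_GROUP, [pos])
    print(f"SNP {pos + 1}: {label or 'not informative'}, {score:.0f}%")
```

            <p>Because a position only ever excludes bases that are absent from the whole group of interest, no member of that group can be rejected, which is the "no false negatives" property described above. Passing several positions to <it>discrimination</it> gives a cumulative score for a SNP set; how Minimum SNPs searches for such sets is not modelled here.</p>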
            <tbl id="T3">
               <title>
                  <p>Table 3</p>
               </title>
               <caption>
                  <p>The Not-N algorithm and its implementation by the Minimum SNPs computer program. A. Data for seven hypothetical sequence types (STs) at six single-nucleotide polymorphisms (SNPs). B. Not-N analysis output of the alignment at A. Four sets of two SNPs are identified, all of which reach 100% discrimination. C. Result obtained if positions 3 and 4 are excluded.</p>
               </caption>
               <tblbdy cols="7">
                  <r>
                     <c ca="center">
                        <p>A.</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Sequence ID</p>
                     </c>
                     <c ca="center">
                        <p>SNP 1</p>
                     </c>
                     <c ca="center">
                        <p>SNP 2</p>
                     </c>
                     <c ca="center">
                        <p>SNP 3</p>
                     </c>
                     <c ca="center">
                        <p>SNP 4</p>
                     </c>
                     <c ca="center">
                        <p>SNP 5</p>
                     </c>
                     <c ca="center">
                        <p>SNP 6</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>ST 1*</p>
                     </c>
                     <c ca="center">
                        <p>G</p>
                     </c>
                     <c ca="center">
                        <p>G</p>
                     </c>
                     <c ca="center">
                        <p>G</p>
                     </c>
                     <c ca="center">
                        <p>A</p>
                     </c>
                     <c ca="center">
                        <p>T</p>
                     </c>
                     <c ca="center">
                        <p>T</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>ST 2*</p>
                     </c>
                     <c ca="center">
                        <p>A</p>
                     </c>
                     <c ca="center">
                        <p>G</p>
                     </c>
                     <c ca="center">
                        <p>T</p>
                     </c>
                     <c ca="center">
                        <p>T</p>
                     </c>
                     <c ca="center">
                        <p>T</p>
                     </c>
                     <c ca="center">
                        <p>G</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>ST 3*</p>
                     </c>
                     <c ca="center">
                        <p>G</p>
                     </c>
                     <c ca="center">
                        <p>G</p>
                     </c>
                     <c ca="center">
                        <p>G</p>
                     </c>
                     <c ca="center">
                        <p>C</p>
                     </c>
                     <c ca="center">
                        <p>T</p>
                     </c>
                     <c ca="center">
                        <p>G</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>ST 4</p>
                     </c>
                     <c ca="center">
                        <p>A</p>
                     </c>
                     <c ca="center">
                        <p>G</p>
                     </c>
                     <c ca="center">
                        <p>A</p>
                     </c>
                     <c ca="center">
                        <p>G</p>
                     </c>
                     <c ca="center">
                        <p>A</p>
                     </c>
                     <c ca="center">
                        <p>T</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>ST 5</p>
                     </c>
                     <c ca="center">
                        <p>A</p>
                     </c>
                     <c ca="center">
                        <p>A</p>
                     </c>
                     <c ca="center">
                        <p>A</p>
                     </c>
                     <c ca="center">
                        <p>G</p>
                     </c>
                     <c ca="center">
                        <p>T</p>
                     </c>
                     <c ca="center">
                        <p>T</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>ST 6</p>
                     </c>
                     <c ca="center">
                        <p>A</p>
                     </c>
                     <c ca="center">
                        <p>G</p>
                     </c>
                     <c ca="center">
                        <p>T</p>
                     </c>
                     <c ca="center">
                        <p>C</p>
                     </c>
                     <c ca="center">
                        <p>A</p>
                     </c>
                     <c ca="center">
                        <p>A</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>ST 7</p>
                     </c>
                     <c ca="center">
                        <p>A</p>
                     </c>
                     <c ca="center">
                        <p>A</p>
                     </c>
                     <c ca="center">
                        <p>A</p>
                     </c>
                     <c ca="center">
                        <p>G</p>
                     </c>
                     <c ca="center">
                        <p>T</p>
                     </c>
                     <c ca="center">
                        <p>G</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Consensus (STs 1, 2 &amp; 3)</p>
                     </c>
                     <c ca="center">
                        <p>Not informative</p>
                     </c>
                     <c ca="center">
                        <p>Not-ACT</p>
                     </c>
                     <c ca="center">
                        <p>Not-AC</p>
                     </c>
                     <c ca="center">
                        <p>Not-G</p>
                     </c>
                     <c ca="center">
                        <p>Not-ACG</p>
                     </c>
                     <c ca="center">
                        <p>Not-AC</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>>ST4</p>
                     </c>
                     <c ca="center">
                        <p>n/a</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>>ST5</p>
                     </c>
                     <c ca="center">
                        <p>n/a</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>>ST6</p>
                     </c>
                     <c ca="center">
                        <p>n/a</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>>ST7</p>
                     </c>
                     <c ca="center">
                        <p>n/a</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                     <c ca="center">
                        <p>+</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Confidence (%)</p>
                     </c>
                     <c ca="center">
                        <p>Position not used</p>
                     </c>
                     <c ca="center">
                        <p>50</p>
                     </c>
                     <c ca="center">
                        <p>75</p>
                     </c>
                     <c ca="center">
                        <p>75</p>
                     </c>
                     <c ca="center">
                        <p>50</p>
                     </c>
                     <c ca="center">
                        <p>25</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="7" ca="left">
                        <p>*"Group of interest" sequences</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>B.</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>SNP set</p>
                     </c>
                     <c ca="center">
                        <p>SNP 1 position and consensus</p>
                     </c>
                     <c ca="center">
                        <p>Cumulative discriminatory power (%)</p>
                     </c>
                     <c ca="center">
                        <p>SNP 2 position and consensus</p>
                     </c>
                     <c ca="center">
                        <p>Cumulative discriminatory power (%)</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>3, NOT AC</p>
                     </c>
                     <c ca="center">
                        <p>75</p>
                     </c>
                     <c ca="center">
                        <p>5, NOT ACG</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>3, NOT AC</p>
                     </c>
                     <c ca="center">
                        <p>75</p>
                     </c>
                     <c ca="center">
                        <p>6, NOT AC</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>4, NOT G</p>
                     </c>
                     <c ca="center">
                        <p>75</p>
                     </c>
                     <c ca="center">
                        <p>5, NOT ACG</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>4, NOT G</p>
                     </c>
                     <c ca="center">
                        <p>75</p>
                     </c>
                     <c ca="center">
                        <p>6, NOT AC</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>C.</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>SNP set</p>
                     </c>
                     <c ca="center">
                        <p>SNP 1 position and consensus</p>
                     </c>
                     <c ca="center">
                        <p>Cumulative discriminatory power (%)</p>
                     </c>
                     <c ca="center">
                        <p>SNP 2 position and consensus</p>
                     </c>
                     <c ca="center">
                        <p>Cumulative discriminatory power (%)</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>2, NOT ACT</p>
                     </c>
                     <c ca="center">
                        <p>50</p>
                     </c>
                     <c ca="center">
                        <p>5, NOT ACG</p>
                     </c>
                     <c ca="center">
                        <p>100</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <p>The Not-N function has been incorporated into the "Minimum SNPs" version 2.043 software <abbrgrp><abbr bid="B19">19</abbr><abbr bid="B23">23</abbr></abbrgrp>. Previous versions of this software identified SNP sets on the basis of two user-selectable performance criteria: maximization of <it>D</it>, or maximization of the power to discriminate one user-selected sequence from all known sequences. The Not-N algorithm represents a third user-selectable performance criterion for SNP set assembly. The SNP sets are assembled one SNP at a time: the SNP giving the highest informative power is identified first, followed by the SNP that gives the highest informative power in combination with the previous SNP, and so forth. Where different SNPs have identical informative power, multiple SNP combinations are assembled. Assembly continues until a pre-set level of discrimination, a pre-set number of SNPs, or 100% discrimination is reached. The software incorporates 'include' and 'exclude' functions that allow the operator to force the inclusion of one or more SNPs in the output SNP set, or to remove one or more SNPs from the analysis. This provides considerable flexibility in SNP set assembly, which can be of benefit when optimizing actual assays, and provides a means of protecting against "local minima" (SNP sets with non-optimal resolving power due to pathway constraints imposed by the first identified SNP). In the example shown in Table <tblr tid="T3">3</tblr>, SNP 1 is classed as non-informative because the group of interest is not deficient in any bases in comparison to the out-group at that position, whilst SNPs 2 to 7, either alone or in combination, discriminate the group of interest from the out-group with different levels of confidence. Use of the exclude function to remove SNPs 3 and 4 yields a new SNP set that reaches 100% discrimination (Table <tblr tid="T3">3C</tblr>) as efficiently as the SNP sets in Table <tblr tid="T3">3B</tblr>.</p>
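            <p>The stepwise assembly described above can be illustrated with a minimal sketch. It is not the "Minimum SNPs" source code. It assumes that an out-group sequence counts as discriminated at a position when it carries a base that no group-of-interest sequence has there, and that cumulative discriminatory power is the percentage of out-group sequences discriminated by at least one SNP in the set. The function name not_n_greedy, its parameters, and the toy alignment are hypothetical and are not taken from Table 3.</p>
```python
def not_n_greedy(group, outgroup, max_snps=None, target=100.0,
                 include=(), exclude=()):
    """Greedy Not-N style SNP selection (illustrative sketch only).

    group, outgroup: lists of aligned, equal-length sequences (strings).
    Positions are 1-based. Returns a list of tuples
    (position, bases absent from the group of interest, cumulative %).
    """
    length = len(group[0])
    # Bases observed in the group of interest at each alignment position.
    present = [set(seq[i] for seq in group) for i in range(length)]

    def hits(pos):
        # Out-group sequences carrying a base the group of interest lacks.
        return {k for k, seq in enumerate(outgroup)
                if seq[pos - 1] not in present[pos - 1]}

    candidates = [p for p in range(1, length + 1) if p not in exclude]
    chosen, covered = [], set()

    def add(pos):
        covered.update(hits(pos))
        absent = "".join(sorted(set("ACGT") - present[pos - 1]))
        chosen.append((pos, absent, 100.0 * len(covered) / len(outgroup)))

    for pos in include:  # forced SNPs enter the set first
        add(pos)

    while True:
        if chosen and chosen[-1][2] >= target:
            break  # pre-set discrimination level reached
        if max_snps is not None and len(chosen) >= max_snps:
            break  # pre-set number of SNPs reached
        used = {c[0] for c in chosen}
        remaining = [p for p in candidates if p not in used]
        if not remaining:
            break
        # Ties are resolved by taking the lowest position (an assumption).
        best = max(remaining, key=lambda p: len(hits(p) - covered))
        if not hits(best) - covered:
            break  # no remaining SNP adds any discrimination
        add(best)
    return chosen


# Hypothetical toy alignment, invented for this example.
group = ["AGTA", "AGTC"]
outgroup = ["ACTA", "ACTG", "AGCA", "AGTG"]
result = not_n_greedy(group, outgroup)
print(result)  # [(2, 'ACT', 50.0), (3, 'ACG', 75.0), (4, 'GT', 100.0)]
```
            <p>In this toy alignment, position 1 is never selected because the group of interest is not deficient in any base found in the out-group at that position. Unlike the software described above, which assembles multiple SNP combinations when SNPs have identical informative power, the sketch follows only one path by taking the lowest-numbered position on a tie. Passing exclude={2} removes position 2 from the analysis and changes the resulting set.</p>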
         </sec>
      </sec>
      <sec>
         <st>
            <p>Availability and Requirements</p>
         </st>
         <p>"Minimum SNPs" version 2.043, together with documentation, may be obtained from <url>http://dev-www.ihbi.qut.edu.au/research/cells_tissue/phil_giffard/</url>. Users must agree to a click-wrap license that covers non-commercial use only. The software is written for the Java Runtime Environment, which makes it essentially platform independent. Users need to have the Java Runtime Environment installed on their computer. This is freeware that can be obtained from <url>http://www.java.com/en/download/manual.jsp</url>. Downloading it also requires agreeing to a license.</p>
      </sec>
      <sec>
         <st>
            <p>Competing interests</p>
         </st>
         <p>The authors of this manuscript are inventors on patent applications describing this algorithm and its applications. They may in consequence be eligible for financial benefit if these patent applications are commercialised.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>EPP carried out most of the data analysis and drafted the manuscript. JI-B and FH carried out additional data analysis. VT wrote the source code of the Not-N algorithm for integration into the "Minimum SNPs" software. PG conceived of the study, participated in its design and coordination, and helped draft the manuscript. All authors read and approved the final manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>The authors thank all those who curate, fund and contribute data to the on-line databases that were used in this study. The authors thank Talima Pearson and Jeffrey Foster for critical reading of the manuscript and Bill Lott for assistance with the HCV sequence database. This work was funded by the Cooperative Research Centres Program of the Australian Federal Government. EPP is in receipt of a research studentship from the Institute of Health and Biomedical Innovation, Queensland University of Technology.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>PulseNet: the molecular subtyping network for foodborne bacterial disease surveillance, United States</p>
            </title>
            <aug>
               <au>
                  <snm>Swaminathan</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Barrett</snm>
                  <fnm>TJ</fnm>
               </au>
               <au>
                  <snm>Hunter</snm>
                  <fnm>SB</fnm>
               </au>
               <au>
                  <snm>Tauxe</snm>
                  <fnm>RV</fnm>
               </au>
               <au>
                  <cnm>CDC PulseNet Task Force</cnm>
               </au>
            </aug>
            <source>Emerg Infect Dis</source>
            <pubdate>2001</pubdate>
            <volume>7</volume>
            <fpage>382</fpage>
            <lpage>389</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11384513</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Multilocus sequence typing: a portable approach to the identification of clones within populations of pathogenic microorganisms</p>
            </title>
            <aug>
               <au>
                  <snm>Maiden</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Bygraves</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Feil</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Morelli</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Russell</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Urwin</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Zhou</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Zurth</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Caugant</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Feavers</snm>
                  <fnm>IM</fnm>
               </au>
               <au>
                  <snm>Achtman</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Spratt</snm>
                  <fnm>BG</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1998</pubdate>
            <volume>95</volume>
            <fpage>3140</fpage>
            <lpage>3145</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">19708</pubid>
                  <pubid idtype="pmpid" link="fulltext">9501229</pubid>
                  <pubid idtype="doi">10.1073/pnas.95.6.3140</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Database-driven multi locus sequence typing (MLST) of bacterial pathogens</p>
            </title>
            <aug>
               <au>
                  <snm>Chan</snm>
                  <fnm>M-S</fnm>
               </au>
               <au>
                  <snm>Maiden</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Spratt</snm>
                  <fnm>BG</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2001</pubdate>
            <volume>17</volume>
            <fpage>1077</fpage>
            <lpage>1083</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/17.11.1077</pubid>
                  <pubid idtype="pmpid" link="fulltext">11724739</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Application of comparative phylogenomics to study the evolution of <it>Yersinia enterocolitica </it>and to identify genetic differences relating to pathogenicity</p>
            </title>
            <aug>
               <au>
                  <snm>Howard</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Gaunt</snm>
                  <fnm>MW</fnm>
               </au>
               <au>
                  <snm>Hinds</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Witney</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Stabler</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Wren</snm>
                  <fnm>BW</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>2006</pubdate>
            <volume>188</volume>
            <fpage>3645</fpage>
            <lpage>3653</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1482848</pubid>
                  <pubid idtype="pmpid" link="fulltext">16672618</pubid>
                  <pubid idtype="doi">10.1128/JB.188.10.3645-3653.2006</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Comparative phylogenomics of the food-borne pathogen <it>Campylobacter jejuni </it>reveals genetic markers predictive of infection source</p>
            </title>
            <aug>
               <au>
                  <snm>Champion</snm>
                  <fnm>OL</fnm>
               </au>
               <au>
                  <snm>Gaunt</snm>
                  <fnm>MW</fnm>
               </au>
               <au>
                  <snm>Gundogdu</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Elmi</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Witney</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Hinds</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Dorrell</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Wren</snm>
                  <fnm>BW</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2005</pubdate>
            <volume>102</volume>
            <fpage>16043</fpage>
            <lpage>16048</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1276044</pubid>
                  <pubid idtype="pmpid" link="fulltext">16230626</pubid>
                  <pubid idtype="doi">10.1073/pnas.0503252102</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Comparative phylogenomics of <it>Clostridium difficile </it>reveals clade specificity and microevolution of hypervirulent strains</p>
            </title>
            <aug>
               <au>
                  <snm>Stabler</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Gerding</snm>
                  <fnm>DN</fnm>
               </au>
               <au>
                  <snm>Songer</snm>
                  <fnm>JG</fnm>
               </au>
               <au>
                  <snm>Drudy</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Brazier</snm>
                  <fnm>JS</fnm>
               </au>
               <au>
                  <snm>Trinh</snm>
                  <fnm>HT</fnm>
               </au>
               <au>
                  <snm>Witney</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Hinds</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Wren</snm>
                  <fnm>BW</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>2006</pubdate>
            <volume>188</volume>
            <fpage>7297</fpage>
            <lpage>7305</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1636221</pubid>
                  <pubid idtype="pmpid" link="fulltext">17015669</pubid>
                  <pubid idtype="doi">10.1128/JB.00664-06</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Comparative genomics of <it>Neisseria meningitidis</it>: core genome, islands of horizontal gene transfer and pathogen-specific genes</p>
            </title>
            <aug>
               <au>
                  <snm>Hotopp</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Grifantini</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Kumar</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Tzeng</snm>
                  <fnm>YL</fnm>
               </au>
               <au>
                  <snm>Fouts</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Frigimelica</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Draghi</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Giuliani</snm>
                  <fnm>MM</fnm>
               </au>
               <au>
                  <snm>Rappuoli</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Stephens</snm>
                  <fnm>DS</fnm>
               </au>
               <au>
                  <snm>Grandi</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Tettelin</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Microbiology</source>
            <pubdate>2006</pubdate>
            <volume>152</volume>
            <fpage>3733</fpage>
            <lpage>3749</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1099/mic.0.29261-0</pubid>
                  <pubid idtype="pmpid" link="fulltext">17159225</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>HIV sequence databases</p>
            </title>
            <aug>
               <au>
                  <snm>Kuiken</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Korber</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Shafer</snm>
                  <fnm>RW</fnm>
               </au>
            </aug>
            <source>AIDS Rev</source>
            <pubdate>2003</pubdate>
            <volume>5</volume>
            <fpage>52</fpage>
            <lpage>61</lpage>
            <url>http://hiv-web.lanl.gov/components/hiv-db/combined_search_s_tree/search.html</url>
            <note>Accessed 17 Jan 2007</note>
            <xrefbib>
               <pubid idtype="pmpid">12875108</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p><it>coli</it>BASE: an online database for <it>Escherichia coli</it>, <it>Shigella </it>and <it>Salmonella </it>comparative genomics</p>
            </title>
            <aug>
               <au>
                  <snm>Chaudhuri</snm>
                  <fnm>RR</fnm>
               </au>
               <au>
                  <snm>Khan</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Pallen</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2004</pubdate>
            <volume>32</volume>
            <fpage>D296</fpage>
            <lpage>D299</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">308765</pubid>
                  <pubid idtype="pmpid" link="fulltext">14681417</pubid>
                  <pubid idtype="doi">10.1093/nar/gkh031</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>The Los Alamos hepatitis C sequence database</p>
            </title>
            <aug>
               <au>
                  <snm>Kuiken</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Yusim</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Boykin</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Richardson</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>21</volume>
            <fpage>379</fpage>
            <lpage>384</lpage>
            <url>http://hcv.lanl.gov/components/hcv-db/combined_search/searchi.html</url>
            <note>Accessed 17 Jan 2007</note>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bth485</pubid>
                  <pubid idtype="pmpid" link="fulltext">15377502</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p><it>x</it>BASE, a collection of online databases for bacterial comparative genomics</p>
            </title>
            <aug>
               <au>
                  <snm>Chaudhuri</snm>
                  <fnm>RR</fnm>
               </au>
               <au>
                  <snm>Pallen</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <volume>34</volume>
            <fpage>D335</fpage>
            <lpage>D337</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1347502</pubid>
                  <pubid idtype="pmpid" link="fulltext">16381881</pubid>
                  <pubid idtype="doi">10.1093/nar/gkj140</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Genome Information Broker for Viruses (GIB-V): database for comparative analysis of virus genomes</p>
            </title>
            <aug>
               <au>
                  <snm>Hirahata</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Abe</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Tanaka</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Kuwana</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Shigemoto</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Miyazaki</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Suzuki</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Sugawara</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2007</pubdate>
            <volume>35</volume>
            <fpage>D339</fpage>
            <lpage>D342</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1781101</pubid>
                  <pubid idtype="pmpid" link="fulltext">17158166</pubid>
                  <pubid idtype="doi">10.1093/nar/gkl1004</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Selecting a maximally informative set of single-nucleotide polymorphisms for association analyses using linkage disequilibrium</p>
            </title>
            <aug>
               <au>
                  <snm>Carlson</snm>
                  <fnm>CS</fnm>
               </au>
               <au>
                  <snm>Eberle</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Rieder</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Yi</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Kruglyak</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Nickerson</snm>
                  <fnm>DA</fnm>
               </au>
            </aug>
            <source>Am J Hum Genet</source>
            <pubdate>2004</pubdate>
            <volume>74</volume>
            <fpage>106</fpage>
            <lpage>120</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1181897</pubid>
                  <pubid idtype="pmpid" link="fulltext">14681826</pubid>
                  <pubid idtype="doi">10.1086/381000</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Real-time single-nucleotide polymorphism profiling using TaqMan technology for rapid recognition of <it>Campylobacter jejuni </it>clonal complexes</p>
            </title>
            <aug>
               <au>
                  <snm>Best</snm>
                  <fnm>EL</fnm>
               </au>
               <au>
                  <snm>Fox</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Frost</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Bolton</snm>
                  <fnm>FJ</fnm>
               </au>
            </aug>
            <source>J Med Microbiol</source>
            <pubdate>2005</pubdate>
            <volume>54</volume>
            <fpage>919</fpage>
            <lpage>925</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">16157544</pubid>
                  <pubid idtype="doi">10.1099/jmm.0.45971-0</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Real time quantitative PCR</p>
            </title>
            <aug>
               <au>
                  <snm>Heid</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Stevens</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Livak</snm>
                  <fnm>KJ</fnm>
               </au>
               <au>
                  <snm>Williams</snm>
                  <fnm>PM</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>1996</pubdate>
            <volume>6</volume>
            <fpage>986</fpage>
            <lpage>994</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gr.6.10.986</pubid>
                  <pubid idtype="pmpid" link="fulltext">8908518</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Enabling large-scale pharmacogenetic studies by high-throughput mutation detection and genotyping technologies</p>
            </title>
            <aug>
               <au>
                  <snm>Shi</snm>
                  <fnm>MM</fnm>
               </au>
            </aug>
            <source>Clin Chem</source>
            <pubdate>2001</pubdate>
            <volume>47</volume>
            <fpage>164</fpage>
            <lpage>172</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11159763</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Genotyping with microfluidic devices</p>
            </title>
            <aug>
               <au>
                  <snm>Sz&#225;ntai</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Guttman</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Electrophoresis</source>
            <pubdate>2006</pubdate>
            <volume>27</volume>
            <fpage>4896</fpage>
            <lpage>4903</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/elps.200600568</pubid>
                  <pubid idtype="pmpid" link="fulltext">17117382</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Towards a portable microchip system with integrated thermal control and polymer waveguides for real-time PCR</p>
            </title>
            <aug>
               <au>
                  <snm>Wang</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Sekulovic</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kutter</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Bang</snm>
                  <fnm>DD</fnm>
               </au>
               <au>
                  <snm>Wolff</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Electrophoresis</source>
            <pubdate>2006</pubdate>
            <volume>27</volume>
            <fpage>5051</fpage>
            <lpage>5058</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/elps.200600355</pubid>
                  <pubid idtype="pmpid" link="fulltext">17124710</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Identification and interrogation of highly informative single nucleotide polymorphism sets defined by bacterial multilocus sequence typing databases</p>
            </title>
            <aug>
               <au>
                  <snm>Robertson</snm>
                  <fnm>GA</fnm>
               </au>
               <au>
                  <snm>Thiruvenkataswamy</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Shilling</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Price</snm>
                  <fnm>EP</fnm>
               </au>
               <au>
                  <snm>Huygens</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Henskens</snm>
                  <fnm>FA</fnm>
               </au>
               <au>
                  <snm>Giffard</snm>
                  <fnm>PM</fnm>
               </au>
            </aug>
            <source>J Med Microbiol</source>
            <pubdate>2004</pubdate>
            <volume>53</volume>
            <fpage>35</fpage>
            <lpage>45</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1099/jmm.0.05365-0</pubid>
                  <pubid idtype="pmpid" link="fulltext">14663103</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Numerical index of the discriminatory ability of typing systems: an application of Simpson's index of diversity</p>
            </title>
            <aug>
               <au>
                  <snm>Hunter</snm>
                  <fnm>PR</fnm>
               </au>
               <au>
                  <snm>Gaston</snm>
                  <fnm>MA</fnm>
               </au>
            </aug>
            <source>J Clin Microbiol</source>
            <pubdate>1988</pubdate>
            <volume>26</volume>
            <fpage>2465</fpage>
            <lpage>2466</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">266921</pubid>
                  <pubid idtype="pmpid" link="fulltext">3069867</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p><it>Staphylococcus aureus </it>genotyping using novel real-time PCR formats</p>
            </title>
            <aug>
               <au>
                  <snm>Huygens</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Inman-Bamber</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Nimmo</snm>
                  <fnm>GR</fnm>
               </au>
               <au>
                  <snm>Munckhof</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Schooneveldt</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Harrison</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>McMahon</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Giffard</snm>
                  <fnm>PM</fnm>
               </au>
            </aug>
            <source>J Clin Microbiol</source>
            <pubdate>2006</pubdate>
            <volume>44</volume>
            <fpage>3712</fpage>
            <lpage>3719</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1594813</pubid>
                  <pubid idtype="pmpid" link="fulltext">17021101</pubid>
                  <pubid idtype="doi">10.1128/JCM.00843-06</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Methicillin-resistant <it>Staphylococcus aureus </it>genotyping using a small set of polymorphisms</p>
            </title>
            <aug>
               <au>
                  <snm>Stephens</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Huygens</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Inman-Bamber</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Price</snm>
                  <fnm>EP</fnm>
               </au>
               <au>
                  <snm>Nimmo</snm>
                  <fnm>GR</fnm>
               </au>
               <au>
                  <snm>Schooneveldt</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Munckhof</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Giffard</snm>
                  <fnm>PM</fnm>
               </au>
            </aug>
            <source>J Med Microbiol</source>
            <pubdate>2006</pubdate>
            <volume>55</volume>
            <fpage>43</fpage>
            <lpage>51</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1099/jmm.0.46157-0</pubid>
                  <pubid idtype="pmpid" link="fulltext">16388029</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Genotyping of <it>Campylobacter jejuni </it>using seven single-nucleotide polymorphisms in combination with <it>flaA </it>Short Variable Region Sequencing</p>
            </title>
            <aug>
               <au>
                  <snm>Price</snm>
                  <fnm>EP</fnm>
               </au>
               <au>
                  <snm>Thiruvenkataswamy</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Mickan</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Unicomb</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Rios</snm>
                  <fnm>RE</fnm>
               </au>
               <au>
                  <snm>Huygens</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Giffard</snm>
                  <fnm>PM</fnm>
               </au>
            </aug>
            <source>J Med Microbiol</source>
            <pubdate>2006</pubdate>
            <volume>55</volume>
            <fpage>1061</fpage>
            <lpage>1070</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1099/jmm.0.46460-0</pubid>
                  <pubid idtype="pmpid" link="fulltext">16849726</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Fingerprinting of <it>Campylobacter jejuni </it>using resolution-optimized binary gene targets derived from Comparative Genome Hybridization studies</p>
            </title>
            <aug>
               <au>
                  <snm>Price</snm>
                  <fnm>EP</fnm>
               </au>
               <au>
                  <snm>Huygens</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Giffard</snm>
                  <fnm>PM</fnm>
               </au>
            </aug>
            <source>Appl Environ Microbiol</source>
            <pubdate>2006</pubdate>
            <volume>72</volume>
            <fpage>7793</fpage>
            <lpage>7803</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1694235</pubid>
                  <pubid idtype="pmpid" link="fulltext">16997982</pubid>
                  <pubid idtype="doi">10.1128/AEM.01338-06</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Global phylogeny of <it>Mycobacterium tuberculosis </it>based on single nucleotide polymorphism (SNP) analysis: insights into tuberculosis evolution, phylogenetic accuracy of other DNA fingerprinting systems, and recommendations for a minimal standard SNP set</p>
            </title>
            <aug>
               <au>
                  <snm>Filliol</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Motiwala</snm>
                  <fnm>AS</fnm>
               </au>
               <au>
                  <snm>Cavatore</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Qi</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Hazbon</snm>
                  <fnm>MH</fnm>
               </au>
               <au>
                  <snm>Bobadilla del Valle</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Fyfe</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Garcia-Garcia</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Rastogi</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Sola</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Zozio</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Guerrero</snm>
                  <fnm>MI</fnm>
               </au>
               <au>
                  <snm>Leon</snm>
                  <fnm>CI</fnm>
               </au>
               <au>
                  <snm>Crabtree</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Angiuoli</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Eisenach</snm>
                  <fnm>KD</fnm>
               </au>
               <au>
                  <snm>Durmaz</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Joloba</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Rendon</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Sifuentes-Osornio</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Ponce de Leon</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Cave</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Fleischmann</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Whittam</snm>
                  <fnm>TS</fnm>
               </au>
               <au>
                  <snm>Alland</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>2006</pubdate>
            <volume>188</volume>
            <fpage>759</fpage>
            <lpage>772</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1347298</pubid>
                  <pubid idtype="pmpid" link="fulltext">16385065</pubid>
                  <pubid idtype="doi">10.1128/JB.188.2.759-772.2006</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Multilocus sequence typing scheme for <it>Enterococcus faecalis </it>reveals hospital-adapted genetic complexes in a background of high rates of recombination</p>
            </title>
            <aug>
               <au>
                  <snm>Ruiz-Garbajosa</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Bonten</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Robinson</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Top</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Nallapareddy</snm>
                  <fnm>SR</fnm>
               </au>
               <au>
                  <snm>Torres</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Coque</snm>
                  <fnm>TM</fnm>
               </au>
               <au>
                  <snm>Canton</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Baquero</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Murray</snm>
                  <fnm>BE</fnm>
               </au>
               <au>
                  <snm>del Campo</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Willems</snm>
                  <fnm>RJ</fnm>
               </au>
            </aug>
            <source>J Clin Microbiol</source>
            <pubdate>2006</pubdate>
            <volume>44</volume>
            <fpage>2220</fpage>
            <lpage>2228</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1489431</pubid>
                  <pubid idtype="pmpid" link="fulltext">16757624</pubid>
                  <pubid idtype="doi">10.1128/JCM.02596-05</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>How clonal is <it>Staphylococcus aureus</it>?</p>
            </title>
            <aug>
               <au>
                  <snm>Feil</snm>
                  <fnm>EJ</fnm>
               </au>
               <au>
                  <snm>Cooper</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Grundmann</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Robinson</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Enright</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Berendt</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Peacock</snm>
                  <fnm>SJ</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Murphy</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Spratt</snm>
                  <fnm>BG</fnm>
               </au>
               <au>
                  <snm>Moore</snm>
                  <fnm>CE</fnm>
               </au>
               <au>
                  <snm>Day</snm>
                  <fnm>NP</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>2003</pubdate>
            <volume>185</volume>
            <fpage>3307</fpage>
            <lpage>3316</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">155367</pubid>
                  <pubid idtype="pmpid" link="fulltext">12754228</pubid>
                  <pubid idtype="doi">10.1128/JB.185.11.3307-3316.2003</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>mlstdbNet &#8211; distributed multi-locus sequence typing (MLST) databases</p>
            </title>
            <aug>
               <au>
                  <snm>Jolley</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Chan</snm>
                  <fnm>M-S</fnm>
               </au>
               <au>
                  <snm>Maiden</snm>
                  <fnm>MC</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <fpage>86</fpage>
            <url>http://www.mlst.net/</url>
            <note>Accessed 18 January 2007</note>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">459211</pubid>
                  <pubid idtype="pmpid" link="fulltext">15222901</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-5-86</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Comparative genotyping of <it>Campylobacter jejuni </it>by amplified fragment length polymorphism, multilocus sequence typing, and short repeat sequencing: strain diversity, host range, and recombination</p>
            </title>
            <aug>
               <au>
                  <snm>Schouls</snm>
                  <fnm>LM</fnm>
               </au>
               <au>
                  <snm>Reulen</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Duim</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Wagenaar</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Willems</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Dingle</snm>
                  <fnm>KE</fnm>
               </au>
               <au>
                  <snm>Colles</snm>
                  <fnm>FM</fnm>
               </au>
               <au>
                  <snm>Van Embden</snm>
                  <fnm>JD</fnm>
               </au>
            </aug>
            <source>J Clin Microbiol</source>
            <pubdate>2003</pubdate>
            <volume>41</volume>
            <fpage>15</fpage>
            <lpage>26</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">149617</pubid>
                  <pubid idtype="pmpid" link="fulltext">12517820</pubid>
                  <pubid idtype="doi">10.1128/JCM.41.1.15-26.2003</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Interactive analysis of phylogeny and character evolution using the computer program MacClade</p>
            </title>
            <aug>
               <au>
                  <snm>Maddison</snm>
                  <fnm>WP</fnm>
               </au>
               <au>
                  <snm>Maddison</snm>
                  <fnm>DR</fnm>
               </au>
            </aug>
            <source>Folia Primatol</source>
            <pubdate>1989</pubdate>
            <volume>53</volume>
            <fpage>190</fpage>
            <lpage>202</lpage>
            <xrefbib>
               <pubid idtype="pmpid">2606395</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Large-scale comparative genomics meta-analysis of <it>Campylobacter jejuni </it>isolates reveals low level of genome plasticity</p>
            </title>
            <aug>
               <au>
                  <snm>Taboada</snm>
                  <fnm>EN</fnm>
               </au>
               <au>
                  <snm>Acedillo</snm>
                  <fnm>RR</fnm>
               </au>
               <au>
                  <snm>Carrillo</snm>
                  <fnm>CD</fnm>
               </au>
               <au>
                  <snm>Findlay</snm>
                  <fnm>WA</fnm>
               </au>
               <au>
                  <snm>Medeiros</snm>
                  <fnm>DT</fnm>
               </au>
               <au>
                  <snm>Mykytczuk</snm>
                  <fnm>OL</fnm>
               </au>
               <au>
                  <snm>Roberts</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Valencia</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Farber</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Nash</snm>
                  <fnm>JH</fnm>
               </au>
            </aug>
            <source>J Clin Microbiol</source>
            <pubdate>2004</pubdate>
            <volume>42</volume>
            <fpage>4566</fpage>
            <lpage>4576</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">522315</pubid>
                  <pubid idtype="pmpid" link="fulltext">15472310</pubid>
                  <pubid idtype="doi">10.1128/JCM.42.10.4566-4576.2004</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Molecular epidemiology of HIV</p>
            </title>
            <aug>
               <au>
                  <snm>Kandathil</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Ramalingam</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kannangai</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>David</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Sridharan</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Indian J Med Res</source>
            <pubdate>2005</pubdate>
            <volume>121</volume>
            <fpage>333</fpage>
            <lpage>344</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15817947</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Typing of hepatitis C virus isolates and characterization of new subtypes using a line probe assay</p>
            </title>
            <aug>
               <au>
                  <snm>Stuyver</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Rossau</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Wyseur</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Duhamel</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Vanderborght</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Van Heuverswyn</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Maertens</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>J Gen Virol</source>
            <pubdate>1993</pubdate>
            <volume>74</volume>
            <fpage>1093</fpage>
            <lpage>1102</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">8389799</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>Evaluation of the COBAS TaqMan HCV test with automated sample processing using the MagNA pure LC instrument</p>
            </title>
            <aug>
               <au>
                  <snm>Germer</snm>
                  <fnm>JJ</fnm>
               </au>
               <au>
                  <snm>Harmsen</snm>
                  <fnm>WS</fnm>
               </au>
               <au>
                  <snm>Mandrekar</snm>
                  <fnm>JN</fnm>
               </au>
               <au>
                  <snm>Mitchell</snm>
                  <fnm>PS</fnm>
               </au>
               <au>
                  <snm>Yao</snm>
                  <fnm>JD</fnm>
               </au>
            </aug>
            <source>J Clin Microbiol</source>
            <pubdate>2005</pubdate>
            <volume>43</volume>
            <fpage>293</fpage>
            <lpage>298</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">540141</pubid>
                  <pubid idtype="pmpid" link="fulltext">15634985</pubid>
                  <pubid idtype="doi">10.1128/JCM.43.1.293-298.2005</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Multiplex real-time reverse transcription-PCR assay for determination of hepatitis C genotypes</p>
            </title>
            <aug>
               <au>
                  <snm>Cook</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Sullivan</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Krantz</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Bagabag</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Jerome</snm>
                  <fnm>KR</fnm>
               </au>
            </aug>
            <source>J Clin Microbiol</source>
            <pubdate>2006</pubdate>
            <volume>44</volume>
            <fpage>4149</fpage>
            <lpage>4156</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1698294</pubid>
                  <pubid idtype="pmpid" link="fulltext">16988019</pubid>
                  <pubid idtype="doi">10.1128/JCM.01230-06</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>Survey of major genotypes and subtypes of hepatitis C virus using RFLP of sequences amplified from the 5' non-coding region</p>
            </title>
            <aug>
               <au>
                  <snm>Davidson</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Simmonds</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Ferguson</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Jarvis</snm>
                  <fnm>LM</fnm>
               </au>
               <au>
                  <snm>Dow</snm>
                  <fnm>BC</fnm>
               </au>
               <au>
                  <snm>Follett</snm>
                  <fnm>EA</fnm>
               </au>
               <au>
                  <snm>Seed</snm>
                  <fnm>CR</fnm>
               </au>
               <au>
                  <snm>Krusius</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Lin</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Medgyesi</snm>
                  <fnm>GA</fnm>
               </au>
               <au>
                  <snm>Kiyokawa</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Olim</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Duraisamy</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Cuypers</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Saeed</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Teo</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Conradie</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Kew</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Lin</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Nuchaprayoon</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Ndimbie</snm>
                  <fnm>OK</fnm>
               </au>
               <au>
                  <snm>Yap</snm>
                  <fnm>PL</fnm>
               </au>
            </aug>
            <source>J Gen Virol</source>
            <pubdate>1995</pubdate>
            <volume>76</volume>
            <fpage>1197</fpage>
            <lpage>1204</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">7730804</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>Rapid genotyping of hepatitis C virus by primer-specific extension analysis</p>
            </title>
            <aug>
               <au>
                  <snm>Antonishyn</snm>
                  <fnm>NA</fnm>
               </au>
               <au>
                  <snm>Ast</snm>
                  <fnm>VM</fnm>
               </au>
               <au>
                  <snm>McDonald</snm>
                  <fnm>RR</fnm>
               </au>
               <au>
                  <snm>Chaudhary</snm>
                  <fnm>RK</fnm>
               </au>
               <au>
                  <snm>Lin</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Andonov</snm>
                  <fnm>AP</fnm>
               </au>
               <au>
                  <snm>Horsman</snm>
                  <fnm>GB</fnm>
               </au>
            </aug>
            <source>J Clin Microbiol</source>
            <pubdate>2005</pubdate>
            <volume>43</volume>
            <fpage>5158</fpage>
            <lpage>5163</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1248436</pubid>
                  <pubid idtype="pmpid" link="fulltext">16207978</pubid>
                  <pubid idtype="doi">10.1128/JCM.43.10.5158-5163.2005</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>Clinical significance of hepatitis C virus genotypes</p>
            </title>
            <aug>
               <au>
                  <snm>Zein</snm>
                  <fnm>NN</fnm>
               </au>
            </aug>
            <source>Clin Microbiol Rev</source>
            <pubdate>2000</pubdate>
            <volume>13</volume>
            <fpage>223</fpage>
            <lpage>235</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">100152</pubid>
                  <pubid idtype="pmpid" link="fulltext">10755999</pubid>
                  <pubid idtype="doi">10.1128/CMR.13.2.223-235.2000</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>Comparative study of different methods to genotype hepatitis C virus type 6 variants</p>
            </title>
            <aug>
               <au>
                  <snm>Chinchai</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Labout</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Noppornpanth</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Theamboonlers</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Haagmans</snm>
                  <fnm>BL</fnm>
               </au>
               <au>
                  <snm>Osterhaus</snm>
                  <fnm>AD</fnm>
               </au>
               <au>
                  <snm>Poovorawan</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>J Virol Methods</source>
            <pubdate>2003</pubdate>
            <volume>109</volume>
            <fpage>195</fpage>
            <lpage>201</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0166-0934(03)00071-5</pubid>
                  <pubid idtype="pmpid" link="fulltext">12711063</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>Consensus proposals for a unified system of nomenclature of hepatitis C virus genotypes</p>
            </title>
            <aug>
               <au>
                  <snm>Simmonds</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Bukh</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Combet</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Deleage</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Enomoto</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Feinstone</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Halfon</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Inchauspe</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Kuiken</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Maertens</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Mizokami</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Murphy</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Okamoto</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Pawlotsky</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Penin</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Sablon</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Shin-I</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Stuyver</snm>
                  <fnm>LJ</fnm>
               </au>
               <au>
                  <snm>Thiel</snm>
                  <fnm>HJ</fnm>
               </au>
               <au>
                  <snm>Viazov</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Weiner</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Widell</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Hepatology</source>
            <pubdate>2005</pubdate>
            <volume>42</volume>
            <issue>4</issue>
            <fpage>962</fpage>
            <lpage>973</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">16149085</pubid>
                  <pubid idtype="doi">10.1002/hep.20819</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <title>
               <p>Single-tube genotyping without oligonucleotide probes</p>
            </title>
            <aug>
               <au>
                  <snm>Germer</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Higuchi</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>1999</pubdate>
            <volume>9</volume>
            <fpage>72</fpage>
            <lpage>78</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">310703</pubid>
                  <pubid idtype="pmpid" link="fulltext">9927486</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B42">
            <title>
               <p>High resolution melting analysis for the rapid and sensitive detection of mutations in clinical samples: <it>KRAS </it>codon 12 and 13 mutations in non-small cell lung cancer</p>
            </title>
            <aug>
               <au>
                  <snm>Krypuy</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Newnham</snm>
                  <fnm>GM</fnm>
               </au>
               <au>
                  <snm>Thomas</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Conron</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Dobrovic</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>BMC Cancer</source>
            <pubdate>2006</pubdate>
            <volume>6</volume>
            <fpage>295</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1769510</pubid>
                  <pubid idtype="pmpid" link="fulltext">17184525</pubid>
                  <pubid idtype="doi">10.1186/1471-2407-6-295</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B43">
            <title>
               <p>Anthrax molecular epidemiology and forensics: using the appropriate marker for different evolutionary scales</p>
            </title>
            <aug>
               <au>
                  <snm>Keim</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Van Ert</snm>
                  <fnm>MN</fnm>
               </au>
               <au>
                  <snm>Pearson</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Vogler</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Huynh</snm>
                  <fnm>LY</fnm>
               </au>
               <au>
                  <snm>Wagner</snm>
                  <fnm>DM</fnm>
               </au>
            </aug>
            <source>Infect Genet Evol</source>
            <pubdate>2004</pubdate>
            <volume>4</volume>
            <fpage>205</fpage>
            <lpage>213</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.meegid.2004.02.005</pubid>
                  <pubid idtype="pmpid" link="fulltext">15450200</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B44">
            <title>
               <p>A multilocus sequence typing database system for pathogenic <it>E. coli</it></p>
            </title>
            <aug>
               <au>
                  <cnm>EcMLST</cnm>
               </au>
            </aug>
            <url>http://www.shigatox.net/cgi-bin/mlst7/index</url>
         </bibl>
         <bibl id="B45">
            <source>Escherichia coli MLST database</source>
            <url>http://web.mpiib-berlin.mpg.de/mlst/dbs/Ecoli/</url>
         </bibl>
         <bibl id="B46">
            <title>
               <p>Sex and virulence in <it>Escherichia coli</it>, an evolutionary perspective</p>
            </title>
            <aug>
               <au>
                  <snm>Wirth</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Falush</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Lan</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Colles</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Mensa</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Wieler</snm>
                  <fnm>LH</fnm>
               </au>
               <au>
                  <snm>Karch</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Reeves</snm>
                  <fnm>PR</fnm>
               </au>
               <au>
                  <snm>Maiden</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Ochman</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Achtman</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Mol Microbiol</source>
            <pubdate>2006</pubdate>
            <volume>60</volume>
            <fpage>1136</fpage>
            <lpage>1151</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1557465</pubid>
                  <pubid idtype="pmpid" link="fulltext">16689791</pubid>
                  <pubid idtype="doi">10.1111/j.1365-2958.2006.05172.x</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B47">
            <title>
               <p>eBURST: inferring patterns of evolutionary descent among clusters of related bacterial genotypes from multilocus sequence typing data</p>
            </title>
            <aug>
               <au>
                  <snm>Feil</snm>
                  <fnm>EJ</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>BC</fnm>
               </au>
               <au>
                  <snm>Aanensen</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Hanage</snm>
                  <fnm>WP</fnm>
               </au>
               <au>
                  <snm>Spratt</snm>
                  <fnm>BG</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>2004</pubdate>
            <volume>186</volume>
            <fpage>1518</fpage>
            <lpage>1530</lpage>
            <url>http://eBURST.mlst.net/</url>
            <note>Accessed 21 December 2006</note>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">344416</pubid>
                  <pubid idtype="pmpid" link="fulltext">14973027</pubid>
                  <pubid idtype="doi">10.1128/JB.186.5.1518-1530.2004</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B48">
            <title>
               <p>B&#956;g@s</p>
            </title>
            <url>http://bugs.sgul.ac.uk/</url>
         </bibl>
         <bibl id="B49">
            <title>
               <p>The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools</p>
            </title>
            <aug>
               <au>
                  <snm>Thompson</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Gibson</snm>
                  <fnm>TJ</fnm>
               </au>
               <au>
                  <snm>Plewniak</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Jeanmougin</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Higgins</snm>
                  <fnm>DG</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1997</pubdate>
            <volume>25</volume>
            <fpage>4876</fpage>
            <lpage>4882</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">147148</pubid>
                  <pubid idtype="pmpid" link="fulltext">9396791</pubid>
                  <pubid idtype="doi">10.1093/nar/25.24.4876</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B50">
            <title>
               <p>TreeView: an application to display phylogenetic trees on personal computers</p>
            </title>
            <aug>
               <au>
                  <snm>Page</snm>
                  <fnm>RD</fnm>
               </au>
            </aug>
            <source>Comput Appl Biosci</source>
            <pubdate>1996</pubdate>
            <volume>12</volume>
            <fpage>357</fpage>
            <lpage>358</lpage>
            <xrefbib>
               <pubid idtype="pmpid">8902363</pubid>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
