<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2105-7-S5-S4</ui>
   <ji>1471-2105</ji>
   <fm>
      <dochead>Proceedings</dochead>
      <bibl>
         <title>
            <p>Large-scale analysis of antigenic diversity of T-cell epitopes in dengue virus</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Khan</snm>
               <mi>M</mi>
               <fnm>Asif</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
               <email>g0501159@nus.edu.sg</email>
            </au>
            <au id="A2">
               <snm>Heiny</snm>
               <fnm>AT</fnm>
               <insr iid="I1"/>
               <insr iid="I3"/>
               <email>heiny@nus.edu.sg</email>
            </au>
            <au id="A3">
               <snm>Lee</snm>
               <mi>X</mi>
               <fnm>Kenneth</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
               <email>leexunjian@gmail.com</email>
            </au>
            <au id="A4">
               <snm>Srinivasan</snm>
               <fnm>KN</fnm>
               <insr iid="I1"/>
               <insr iid="I4"/>
               <email>srinikn@jhmi.edu</email>
            </au>
            <au id="A5">
               <snm>Tan</snm>
               <mnm>Wee</mnm>
               <fnm>Tin</fnm>
               <insr iid="I3"/>
               <email>bchtantw@nus.edu.sg</email>
            </au>
            <au id="A6">
               <snm>August</snm>
               <fnm>J Thomas</fnm>
               <insr iid="I1"/>
               <insr iid="I4"/>
               <email>taugust@bs.jhmi.edu</email>
            </au>
            <au id="A7" ca="yes">
               <snm>Brusic</snm>
               <fnm>Vladimir</fnm>
               <insr iid="I2"/>
               <insr iid="I5"/>
               <email>v.brusic@uq.edu.au</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>The Division of Biomedical Sciences, Johns Hopkins Singapore, 31 Biopolis Way, #02-01 The Nanos, Singapore 138669, Singapore</p>
            </ins>
            <ins id="I2">
               <p>Department of Microbiology, The Yong Loo Lin School of Medicine, National University of Singapore, 5 Science Drive 2, Singapore 117597, Singapore</p>
            </ins>
            <ins id="I3">
               <p>Department of Biochemistry, The Yong Loo Lin School of Medicine, National University of Singapore, 5 Science Drive 2, Singapore 117597, Singapore</p>
            </ins>
            <ins id="I4">
               <p>Department of Pharmacology and Molecular Sciences, The Johns Hopkins University School of Medicine, 725 North Wolfe Street, Baltimore, MD 21205, USA</p>
            </ins>
            <ins id="I5">
               <p>School of Land and Food Sciences, and Institute for Molecular Biosciences, University of Queensland, Brisbane, QLD 4072, Australia</p>
            </ins>
         </insg>
         <source>BMC Bioinformatics</source>
         <supplement>
            <title>
               <p>APBioNet &#8211; Fifth International Conference on Bioinformatics (InCoB2006)</p>
            </title>
            <editor>Shoba Ranganathan, Martti Tammi, Michael Gribskov, Tin Wee Tan</editor>
            <note>Proceedings</note>
         </supplement>
         <conference>
            <title>
               <p>International Conference on Bioinformatics &#8211; InCoB2006</p>
            </title>
            <location>New Delhi, India</location>
            <date-range>18&#8211;20 December 2006</date-range>
         </conference>
         <issn>1471-2105</issn>
         <pubdate>2006</pubdate>
         <volume>7</volume>
         <issue>Suppl 5</issue>
         <fpage>S4</fpage>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">17254309</pubid>
               <pubid idtype="doi">10.1186/1471-2105-7-S5-S4</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <pub>
            <date>
               <day>18</day>
               <month>12</month>
               <year>2006</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2006</year>
         <collab>Khan et al; licensee BioMed Central Ltd</collab>
         <note>This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Antigenic diversity in dengue virus strains has been studied, but large-scale and detailed systematic analyses have not been reported. In this study, we report a bioinformatics method for analyzing viral antigenic diversity in the context of T-cell mediated immune responses. We applied this method to study the relationship between short-peptide antigenic diversity and protein sequence diversity of dengue virus. We also studied the effects of sequence determinants on viral antigenic diversity. Short peptides, principally 9-mers, were studied because they represent the predominant length of binding cores of T-cell epitopes, which are important for formulation of vaccines.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>Our analysis showed that the number of unique protein sequences required to represent the complete antigenic diversity of short peptides in dengue virus is significantly smaller than that required to represent complete protein sequence diversity. Short-peptide antigenic diversity shows an asymptotic relationship to the number of unique protein sequences, indicating that for large sequence sets (~200) the addition of new protein sequences has only a marginal effect on antigenic diversity. A near-linear relationship was observed between the extent of antigenic diversity and the length of protein sequences, suggesting that, for the practical purpose of vaccine development, antigenic diversity of short peptides from dengue virus can be represented by short regions of sequences (fewer than ~100 aa) within viral antigens that are specific targets of immune responses (such as T-cell epitopes specific to particular human leukocyte antigen alleles).</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>This study provides evidence that there are limited numbers of antigenic combinations in protein sequence variants of a viral species and that short regions of the viral protein are sufficient to capture antigenic diversity of T-cell epitopes. The approach described herein has direct application to the analysis of other viruses, in particular those that show high diversity and/or rapid evolution, such as influenza A virus and human immunodeficiency virus (HIV).</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Dengue virus has four serotypes (DV1, DV2, DV3 and DV4) that show substantial genetic diversity both within and between serotypes. Sequence comparison studies showed a 30&#8211;40% difference in amino acid sequences between serotypes <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr></abbrgrp>. The amino acid differences within each serotype are smaller, but the observed intra-serotype diversity is sufficiently large to warrant the definition of clusters of dengue virus variants <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp>. Studies of genetic diversity have focused on clade diversity and replacement <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>, mutation spectra <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>, conserved regions <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and implications for clinical manifestations <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. Several studies have analyzed the antigenic diversity (diversity of targets of immune responses in protein sequences) of dengue virus; these studies focused on experimental mapping of antibody recognition sites <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp> and T-cell epitopes <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr></abbrgrp> and subsequent analysis of their diversity. Recently, Simmons <it>et al</it>. <abbrgrp><abbr bid="B15">15</abbr></abbrgrp> analyzed the T-cell responses of individuals infected with DV2 by ELISpot assay and identified 34 peptides of several dengue proteins as potential novel T-cell epitopes.</p>
         <p>Generally, there is a correspondence between genetic and antigenic evolution of viruses, but genetic changes may result in disproportionately large antigenic changes <abbrgrp><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr></abbrgrp>. While genetic and antigenic diversity in dengue virus strains have become evident <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>, large-scale and detailed systematic analyses that explore their relationship have not been reported. Understanding this relationship is important for vaccine development, especially for rapidly mutating viruses. In this paper, we focus on protein sequence diversity, and thus consider only genetic variations that affect the protein sequences.</p>
         <p>Biological studies of antigenic diversity require great experimental effort, even for a single viral protein. Consequently, most research groups focus on studying a small number of viral sequences. The rapid accumulation of sequence data from both classical and genomic/proteomic approaches makes the experimental study of antigenic diversity difficult and time-consuming. A bioinformatics approach is necessary to support large-scale antigenic analysis of viral diversity, which can complement laboratory experiments.</p>
         <p>In this study, we developed a bioinformatics method to analyze antigenic diversity in the context of T-cell mediated immune responses. We studied the antigenic diversity of more than 9000 dengue virus protein sequences reported in the NCBI Entrez protein database <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. The study aimed to identify a minimal set of sequences that encodes the complete antigenic diversity of short peptides from all known sequences in dengue virus serotypes. Short peptides, principally 9-mers, were studied because they represent the predominant length of binding cores of T-cell epitopes. We analyzed the relationship between short-peptide antigenic diversity and protein sequence diversity of dengue virus; the analysis was performed at two time points to help understand the effects of the accumulation of sequence data on this relationship. We also analyzed the effects of sequence determinants on antigenic diversity of short peptides. This study provides a framework for large-scale, systematic analysis of antigenic diversity for the protein sequences of any virus. We did not analyze the antigenic diversity of B-cell epitopes because of their complex conformational nature. Although linear B-cell epitopes exist and our method can be used to study them, they very often also show conformational preferences and dependence on the context of a protein antigen <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. Further, only approximately 10% of B-cell epitopes from native proteins are linear <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>.</p>
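         <p>The idea of a minimal set of sequences that covers all 9-mers can be illustrated with a short Python sketch. It is not the authors' implementation: it assumes a greedy set-cover heuristic, which gives a small covering set but not a guaranteed minimum. The function names kmers and greedy_cover and the toy sequences are hypothetical and serve only as illustration.</p>
         <p>
```python
def kmers(seq, k=9):
    """Return the set of overlapping k-mers (9-mers by default) in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def greedy_cover(sequences, k=9):
    """Select sequences until every k-mer seen in the input is represented.

    Assumption: a greedy set-cover heuristic. At each step the sequence that
    adds the most not-yet-covered k-mers is chosen (ties go to input order).
    """
    unique = list(dict.fromkeys(sequences))      # drop exact duplicates
    pools = {s: kmers(s, k) for s in unique}     # k-mer set per sequence
    uncovered = set().union(*pools.values())     # all k-mers still to cover
    chosen = []
    while uncovered:
        best = max(unique, key=lambda s: len(pools[s].intersection(uncovered)))
        chosen.append(best)
        uncovered.difference_update(pools[best])
    return chosen


# Toy, made-up sequences (not dengue data): the third duplicates the first,
# and the second adds no 9-mer that the first lacks.
toy = ["MNNQRKKARNT", "NNQRKKARNT", "MNNQRKKARNT", "MNNQRKKAKNT"]
subset = greedy_cover(toy)
```
         </p>
         <p>In this toy case two of the four input sequences are enough to represent every 9-mer, which mirrors the observation that far fewer sequences are needed to capture short-peptide diversity than to capture full sequence diversity.</p>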
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <sec>
            <st>
               <p>Dengue serotype protein datasets</p>
            </st>
            <p>Data of June 2004 (Table <tblr tid="T1">1</tblr>), collected from the NCBI Entrez protein database, contained a total of 3699 sequences representing the ten proteins encoded by the genomes of the four serotypes (Table <tblr tid="T2">2</tblr>). The number of reported sequences increased about 2.6-fold during the following 18 months (9512 sequences; see Table <tblr tid="T1">1</tblr>). The removal of duplicates (identical protein sequences) reduced the collected sequences to 1318 (2004) and 2419 (2005) unique sequences (Table <tblr tid="T1">1</tblr>). More than 64% of the sequences collected in 2004 were redundant duplicates, and this redundancy rose by approximately 10 percentage points, to 75%, in 2005. The number of reported unique sequences varied greatly among the proteins, ranging from 69 NS4a to 998 E sequences in the 2005 set (Table <tblr tid="T3">3</tblr>). Minor annotation errors, mainly of the cleavage sites, were corrected prior to analysis for 17 sequences (see <supplr sid="S1">additional file 1</supplr>: Table S1.pdf).</p>
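            <p>The redundancy figures quoted above follow directly from the totals in Table 1. The Python sketch below reproduces that arithmetic and shows one way to remove duplicates, treating a duplicate as an exact string match ("identical protein sequences" in the text). The function names deduplicate and redundancy_percent are hypothetical.</p>
            <p>
```python
def deduplicate(sequences):
    """Keep the first copy of each distinct sequence string, in input order."""
    return list(dict.fromkeys(sequences))


def redundancy_percent(collected, unique):
    """Percentage of collected records that duplicate another record."""
    return 100.0 * (collected - unique) / collected


# Totals reported in Table 1 (collected, unique).
r2004 = redundancy_percent(3699, 1318)   # about 64.4
r2005 = redundancy_percent(9512, 2419)   # about 74.6
fold_increase = 9512 / 3699              # about 2.6
```
            </p>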
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Collected and unique protein sequences for each dengue serotype in 2004 and 2005 and the corresponding increase in data between the two time points.</p>
               </caption>
               <tblbdy cols="7">
                  <r>
                     <c ca="center">
                        <p>Dengue serotype</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>Data retrieved in 2004 (#)</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>Data retrieved in 2005 (#)</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>Increase (#)</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>Collected sequences</p>
                     </c>
                     <c ca="center">
                        <p>Unique sequences</p>
                     </c>
                     <c ca="center">
                        <p>Collected sequences</p>
                     </c>
                     <c ca="center">
                        <p>Unique sequences</p>
                     </c>
                     <c ca="center">
                        <p>Collected sequences</p>
                     </c>
                     <c ca="center">
                        <p>Unique sequences</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>DV1</p>
                     </c>
                     <c ca="center">
                        <p>744</p>
                     </c>
                     <c ca="center">
                        <p>359</p>
                     </c>
                     <c ca="center">
                        <p>2318</p>
                     </c>
                     <c ca="center">
                        <p>724</p>
                     </c>
                     <c ca="center">
                        <p>1574</p>
                     </c>
                     <c ca="center">
                        <p>365</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>DV2</p>
                     </c>
                     <c ca="center">
                        <p>1426</p>
                     </c>
                     <c ca="center">
                        <p>507</p>
                     </c>
                     <c ca="center">
                        <p>3351</p>
                     </c>
                     <c ca="center">
                        <p>697</p>
                     </c>
                     <c ca="center">
                        <p>1925</p>
                     </c>
                     <c ca="center">
                        <p>190</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>DV3</p>
                     </c>
                     <c ca="center">
                        <p>597</p>
                     </c>
                     <c ca="center">
                        <p>230</p>
                     </c>
                     <c ca="center">
                        <p>2520</p>
                     </c>
                     <c ca="center">
                        <p>678</p>
                     </c>
                     <c ca="center">
                        <p>1923</p>
                     </c>
                     <c ca="center">
                        <p>448</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>DV4</p>
                     </c>
                     <c ca="center">
                        <p>932</p>
                     </c>
                     <c ca="center">
                        <p>222</p>
                     </c>
                     <c ca="center">
                        <p>1323</p>
                     </c>
                     <c ca="center">
                        <p>320</p>
                     </c>
                     <c ca="center">
                        <p>391</p>
                     </c>
                     <c ca="center">
                        <p>98</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>Total</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>3699</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>1318</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>9512</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>2419</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>5813</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>1101</it>
                        </p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Proteins of a representative dengue virus serotype 2 polyprotein entry (P14340 of 3391 amino acids) in the NCBI Entrez protein database.</p>
               </caption>
               <tblbdy cols="2">
                  <r>
                     <c ca="center">
                        <p>Protein</p>
                     </c>
                     <c ca="center">
                        <p>Length (amino acids)</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="2">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Capsid (C)</p>
                     </c>
                     <c ca="center">
                        <p>114</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Precursor membrane (pM)</p>
                     </c>
                     <c ca="center">
                        <p>166</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Envelope (E)</p>
                     </c>
                     <c ca="center">
                        <p>495</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Nonstructural protein 1 (NS1)</p>
                     </c>
                     <c ca="center">
                        <p>352</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Nonstructural protein 2a (NS2a)</p>
                     </c>
                     <c ca="center">
                        <p>218</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Nonstructural protein 2b (NS2b)</p>
                     </c>
                     <c ca="center">
                        <p>130</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Nonstructural protein 3 (NS3)</p>
                     </c>
                     <c ca="center">
                        <p>618</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Nonstructural protein 4a (NS4a)</p>
                     </c>
                     <c ca="center">
                        <p>150</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Nonstructural protein 4b (NS4b)</p>
                     </c>
                     <c ca="center">
                        <p>248</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Nonstructural protein 5 (NS5)</p>
                     </c>
                     <c ca="center">
                        <p>900</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <tbl id="T3">
               <title>
                  <p>Table 3</p>
               </title>
               <caption>
                  <p>Unique sequences for the proteins of the four serotypes in 2004 and 2005.</p>
               </caption>
               <tblbdy cols="3">
                  <r>
                     <c ca="center">
                        <p>Protein</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>No. of unique sequences (all four serotypes)</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>2004</p>
                     </c>
                     <c ca="center">
                        <p>2005</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>C</p>
                     </c>
                     <c ca="center">
                        <p>107</p>
                     </c>
                     <c ca="center">
                        <p>196</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>pM</p>
                     </c>
                     <c ca="center">
                        <p>126</p>
                     </c>
                     <c ca="center">
                        <p>220</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>E</p>
                     </c>
                     <c ca="center">
                        <p>495</p>
                     </c>
                     <c ca="center">
                        <p>998</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>NS1</p>
                     </c>
                     <c ca="center">
                        <p>150</p>
                     </c>
                     <c ca="center">
                        <p>224</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>NS2a</p>
                     </c>
                     <c ca="center">
                        <p>95</p>
                     </c>
                     <c ca="center">
                        <p>142</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>NS2b</p>
                     </c>
                     <c ca="center">
                        <p>59</p>
                     </c>
                     <c ca="center">
                        <p>78</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>NS3</p>
                     </c>
                     <c ca="center">
                        <p>80</p>
                     </c>
                     <c ca="center">
                        <p>164</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>NS4a</p>
                     </c>
                     <c ca="center">
                        <p>37</p>
                     </c>
                     <c ca="center">
                        <p>69</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>NS4b</p>
                     </c>
                     <c ca="center">
                        <p>57</p>
                     </c>
                     <c ca="center">
                        <p>88</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>NS5</p>
                     </c>
                     <c ca="center">
                        <p>112</p>
                     </c>
                     <c ca="center">
                        <p>240</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>Total</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>1318</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>2419</it>
                        </p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <suppl id="S1">
               <title>
                  <p>Additional file 1</p>
               </title>
               <text>
                  <p>Errors and discrepancies found in the data entries of each dengue serotype (DV1, DV2, DV3 and DV4) collected from the NCBI Entrez protein database.</p>
               </text>
               <file name="1471-2105-7-S5-S4-S1.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
         </sec>
         <sec>
            <st>
               <p>Intra- and inter-serotype amino acid sequence identities of dengue proteins</p>
            </st>
            <p>Earlier studies of dengue proteins, mainly E and NS1 <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr><abbr bid="B25">25</abbr></abbrgrp>, have shown substantial amino acid sequence diversity both within and between the serotypes. In our study, we surveyed the extent of amino acid variation and conservation among dengue viruses by calculating the pairwise percentage amino acid identity of the unique sequences of each dengue protein, both intra- and inter-serotype, using the larger 2005 data set. The intra- and inter-serotype percentage sequence identities (PSI) for all dengue proteins are shown in Table <tblr tid="T4">4</tblr>.</p>
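            <p>A pairwise identity range of the kind reported in Table 4 could be computed along the following lines. This is a minimal Python sketch under stated assumptions, since the text does not say how identity was calculated: the sequences are assumed to be already aligned to equal length with '-' marking gaps, and columns containing a gap are skipped. The function names percent_identity and identity_range are hypothetical.</p>
            <p>
```python
from itertools import combinations


def percent_identity(a, b):
    """Percentage of identical residues between two aligned sequences.

    Assumes a and b come from the same alignment (equal length, '-' for
    gaps); columns where either sequence has a gap are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def identity_range(aligned):
    """Minimum and maximum pairwise identity over a list of aligned sequences."""
    values = [percent_identity(a, b) for a, b in combinations(aligned, 2)]
    if not values:
        raise ValueError("need at least two sequences")
    return min(values), max(values)
```
            </p>
            <p>Applied to all unique sequences of one protein within a serotype, identity_range would give an intra-serotype range; applied to pairs drawn from two different serotypes, the same identity function would give an inter-serotype range.</p>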
            <tbl id="T4">
               <title>
                  <p>Table 4</p>
               </title>
               <caption>
                  <p>Range (minimum&#8211;maximum) of percentage sequence identity (PSI) for each dengue protein, intra- and inter-serotype.</p>
               </caption>
               <tblbdy cols="14">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>DV1</p>
                     </c>
                     <c ca="left">
                        <p>DV2</p>
                     </c>
                     <c ca="left">
                        <p>DV3</p>
                     </c>
                     <c ca="left">
                        <p>DV4</p>
                     </c>
                     <c ca="center">
                        <p>Average PSI</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>DV1</p>
                     </c>
                     <c ca="left">
                        <p>DV2</p>
                     </c>
                     <c ca="left">
                        <p>DV3</p>
                     </c>
                     <c ca="left">
                        <p>DV4</p>
                     </c>
                     <c ca="left">
                        <p>Average PSI</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="14">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>C</p>
                     </c>
                     <c ca="left">
                        <p>DV1</p>
                     </c>
                     <c ca="left">
                        <p>88&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>65</p>
                     </c>
                     <c ca="left">
                        <p>pM</p>
                     </c>
                     <c ca="left">
                        <p>DV1</p>
                     </c>
                     <c ca="left">
                        <p>92&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>68</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>DV2</p>
                     </c>
                     <c ca="left">
                        <p>56&#8211;75</p>
                     </c>
                     <c ca="left">
                        <p>81&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>DV2</p>
                     </c>
                     <c ca="left">
                        <p>62&#8211;75</p>
                     </c>
                     <c ca="left">
                        <p>79&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>DV3</p>
                     </c>
                     <c ca="left">
                        <p>75&#8211;84</p>
                     </c>
                     <c ca="left">
                        <p>53&#8211;66</p>
                     </c>
                     <c ca="left">
                        <p>91&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>DV3</p>
                     </c>
                     <c ca="left">
                        <p>75&#8211;82</p>
                     </c>
                     <c ca="left">
                        <p>60&#8211;72</p>
                     </c>
                     <c ca="left">
                        <p>93&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>DV4</p>
                     </c>
                     <c ca="left">
                        <p>61&#8211;68</p>
                     </c>
                     <c ca="left">
                        <p>57&#8211;69</p>
                     </c>
                     <c ca="left">
                        <p>54&#8211;60</p>
                     </c>
                     <c ca="left">
                        <p>94&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>DV4</p>
                     </c>
                     <c ca="left">
                        <p>62&#8211;67</p>
                     </c>
                     <c ca="left">
                        <p>60&#8211;71</p>
                     </c>
                     <c ca="left">
                        <p>64&#8211;70</p>
                     </c>
                     <c ca="left">
                        <p>96&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c cspan="14">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>E</p>
                     </c>
                     <c ca="left">
                        <p>DV1</p>
                     </c>
                     <c ca="left">
                        <p>89&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>65</p>
                     </c>
                     <c ca="left">
                        <p>NS1</p>
                     </c>
                     <c ca="left">
                        <p>DV1</p>
                     </c>
                     <c ca="left">
                        <p>93&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>72</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>DV2</p>
                     </c>
                     <c ca="left">
                        <p>58&#8211;70</p>
                     </c>
                     <c ca="left">
                        <p>80&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>DV2</p>
                     </c>
                     <c ca="left">
                        <p>68&#8211;75</p>
                     </c>
                     <c ca="left">
                        <p>85&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>DV3</p>
                     </c>
                     <c ca="left">
                        <p>72&#8211;79</p>
                     </c>
                     <c ca="left">
                        <p>60&#8211;69</p>
                     </c>
                     <c ca="left">
                        <p>92&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>DV3</p>
                     </c>
                     <c ca="left">
                        <p>77&#8211;80</p>
                     </c>
                     <c ca="left">
                        <p>69&#8211;75</p>
                     </c>
                     <c ca="left">
                        <p>94&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>DV4</p>
                     </c>
                     <c ca="left">
                        <p>58&#8211;66</p>
                     </c>
                     <c ca="left">
                        <p>55&#8211;65</p>
                     </c>
                     <c ca="left">
                        <p>61&#8211;64</p>
                     </c>
                     <c ca="left">
                        <p>94&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>DV4</p>
                     </c>
                     <c ca="left">
                        <p>67&#8211;70</p>
                     </c>
                     <c ca="left">
                        <p>68&#8211;73</p>
                     </c>
                     <c ca="left">
                        <p>70&#8211;74</p>
                     </c>
                     <c ca="left">
                        <p>93&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c cspan="14">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>NS2a</p>
                     </c>
                     <c ca="left">
                        <p>DV1</p>
                     </c>
                     <c ca="left">
                        <p>90&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>39</p>
                     </c>
                     <c ca="left">
                        <p>NS2b</p>
                     </c>
                     <c ca="left">
                        <p>DV1</p>
                     </c>
                     <c ca="left">
                        <p>93&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>60</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>DV2</p>
                     </c>
                     <c ca="left">
                        <p>36&#8211;40</p>
                     </c>
                     <c ca="left">
                        <p>93&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>DV2</p>
                     </c>
                     <c ca="left">
                        <p>56&#8211;62</p>
                     </c>
                     <c ca="left">
                        <p>95&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>DV3</p>
                     </c>
                     <c ca="left">
                        <p>43&#8211;48</p>
                     </c>
                     <c ca="left">
                        <p>35&#8211;40</p>
                     </c>
                     <c ca="left">
                        <p>93&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>DV3</p>
                     </c>
                     <c ca="left">
                        <p>66&#8211;70</p>
                     </c>
                     <c ca="left">
                        <p>58&#8211;63</p>
                     </c>
                     <c ca="left">
                        <p>96&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>DV4</p>
                     </c>
                     <c ca="left">
                        <p>35&#8211;39</p>
                     </c>
                     <c ca="left">
                        <p>33&#8211;36</p>
                     </c>
                     <c ca="left">
                        <p>36&#8211;41</p>
                     </c>
                     <c ca="left">
                        <p>89&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>DV4</p>
                     </c>
                     <c ca="left">
                        <p>56&#8211;62</p>
                     </c>
                     <c ca="left">
                        <p>54&#8211;59</p>
                     </c>
                     <c ca="left">
                        <p>56&#8211;59</p>
                     </c>
                     <c ca="left">
                        <p>94&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c cspan="14">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>NS3</p>
                     </c>
                     <c ca="left">
                        <p>DV1</p>
                     </c>
                     <c ca="left">
                        <p>97&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>79</p>
                     </c>
                     <c ca="left">
                        <p>NS4a</p>
                     </c>
                     <c ca="left">
                        <p>DV1</p>
                     </c>
                     <c ca="left">
                        <p>92&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>60</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>DV2</p>
                     </c>
                     <c ca="left">
                        <p>78&#8211;80</p>
                     </c>
                     <c ca="left">
                        <p>96&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>DV2</p>
                     </c>
                     <c ca="left">
                        <p>56&#8211;61</p>
                     </c>
                     <c ca="left">
                        <p>96&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>DV3</p>
                     </c>
                     <c ca="left">
                        <p>84&#8211;86</p>
                     </c>
                     <c ca="left">
                        <p>79&#8211;81</p>
                     </c>
                     <c ca="left">
                        <p>97&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>DV3</p>
                     </c>
                     <c ca="left">
                        <p>63&#8211;68</p>
                     </c>
                     <c ca="left">
                        <p>56&#8211;63</p>
                     </c>
                     <c ca="left">
                        <p>92&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>DV4</p>
                     </c>
                     <c ca="left">
                        <p>75&#8211;77</p>
                     </c>
                     <c ca="left">
                        <p>75&#8211;77</p>
                     </c>
                     <c ca="left">
                        <p>77&#8211;79</p>
                     </c>
                     <c ca="left">
                        <p>97&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>DV4</p>
                     </c>
                     <c ca="left">
                        <p>56&#8211;60</p>
                     </c>
                     <c ca="left">
                        <p>59&#8211;64</p>
                     </c>
                     <c ca="left">
                        <p>56&#8211;62</p>
                     </c>
                     <c ca="left">
                        <p>94&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c cspan="14">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>NS4b</p>
                     </c>
                     <c ca="left">
                        <p>DV1</p>
                     </c>
                     <c ca="left">
                        <p>95&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>78</p>
                     </c>
                     <c ca="left">
                        <p>NS5</p>
                     </c>
                     <c ca="left">
                        <p>DV1</p>
                     </c>
                     <c ca="left">
                        <p>96&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>77</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>DV2</p>
                     </c>
                     <c ca="left">
                        <p>75&#8211;79</p>
                     </c>
                     <c ca="left">
                        <p>95&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>DV2</p>
                     </c>
                     <c ca="left">
                        <p>77&#8211;79</p>
                     </c>
                     <c ca="left">
                        <p>95&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>DV3</p>
                     </c>
                     <c ca="left">
                        <p>81&#8211;85</p>
                     </c>
                     <c ca="left">
                        <p>75&#8211;79</p>
                     </c>
                     <c ca="left">
                        <p>97&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>DV3</p>
                     </c>
                     <c ca="left">
                        <p>80&#8211;82</p>
                     </c>
                     <c ca="left">
                        <p>77&#8211;79</p>
                     </c>
                     <c ca="left">
                        <p>96&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>DV4</p>
                     </c>
                     <c ca="left">
                        <p>75&#8211;78</p>
                     </c>
                     <c ca="left">
                        <p>77&#8211;81</p>
                     </c>
                     <c ca="left">
                        <p>76&#8211;79</p>
                     </c>
                     <c ca="left">
                        <p>97&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>DV4</p>
                     </c>
                     <c ca="left">
                        <p>73&#8211;76</p>
                     </c>
                     <c ca="left">
                        <p>72&#8211;75</p>
                     </c>
                     <c ca="left">
                        <p>74&#8211;77</p>
                     </c>
                     <c ca="left">
                        <p>95&#8211;99</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>The average percentage sequence identities (PSI) are shown for inter-serotype comparisons.</p>
               </tblfn>
            </tbl>
            <p>The intra-serotype percentage sequence identity was mostly between 92% and 99%. The main exceptions were C, pM, E and NS1 of DV2, with minimum sequence identities of 79&#8211;85%; C and E of DV1, C of DV3, and NS2a of DV1 and DV4 had minima of 88&#8211;91% (Table <tblr tid="T4">4</tblr>). In contrast, the average inter-serotype percentage sequence identity was in the range of 60&#8211;79%, except for NS2a. The NS3, NS4b and NS5 proteins are highly conserved across the serotypes, with average sequence identities in the range of 77&#8211;79%, probably because of their involvement in forming the RNA replication complex <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>. The NS2a protein is the most diverse across the serotypes (average PSI of 39%), although it is highly conserved within each serotype. The inter-serotype diversity observed for NS2a is comparable to the inter-<it>Flavivirus </it>diversity of the envelope protein, which shows approximately 40% amino acid identity <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>.</p>
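            <p>As an illustration of how the minimum and maximum pairwise identities reported above can be derived, the following Python sketch computes percentage identity over sequences that are assumed to be already aligned to equal length (gaps written as "-"). The function names, the gap handling and the toy sequences are assumptions made for this example; they are not taken from the analysis pipeline used in this study.</p>

```python
from itertools import combinations, product


def percent_identity(a, b):
    """Percentage of identical residues between two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: columns where both sequences carry a gap are ignored;
    # a gap opposite a residue counts as a mismatch.
    columns = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def identity_range(group_a, group_b=None):
    """Return (min, max) identity within one group or between two groups."""
    if group_b is None:
        pairs = combinations(group_a, 2)   # intra-group comparisons
    else:
        pairs = product(group_a, group_b)  # inter-group comparisons
    values = [percent_identity(a, b) for a, b in pairs]
    return min(values), max(values)


# Hypothetical toy data, not real dengue sequences.
serotype_a = ["ACDEFGHIKL", "ACDEFGHIKV"]
serotype_b = ["ACDEYGHIRV"]

print(identity_range(serotype_a))              # intra-group range
print(identity_range(serotype_a, serotype_b))  # inter-group range
```

            <p>Calling the range function with one group gives an intra-serotype range and with two groups an inter-serotype range, which corresponds to the diagonal and off-diagonal cells of Table 4 respectively.</p>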
         </sec>
         <sec>
            <st>
               <p>Minimal sequence sets representing dengue virus antigenic diversity</p>
            </st>
            <p>In addition to identical protein sequences, another source of sequence redundancy relevant to this study is the presence of antigenically redundant sequences. These sequences exist because many amino acid residues are identical among the individually unique protein sequences (see Results section sub-heading: <it>Intra- and inter-serotype amino acid sequence identities of dengue proteins</it>), so that targets of T-cell mediated immune responses (T-cell epitopes) are identical among viral variants. Antigenically redundant sequences can be removed without loss of information on antigenic diversity among the sequence sets. For example, if in a dataset of three sequences, all of the overlapping 9-mers in one sequence have a match in at least one of the other two sequences, the antigenic diversity of this sequence can be covered by the other two sequences combined, thus rendering the first sequence antigenically redundant (Figure <figr fid="F1">1</figr>).</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Definition of antigenically redundant sequence</p>
               </caption>
               <text>
                  <p><b>Definition of antigenically redundant sequence</b>. A) The three sequences (NCBI GI no.: 1854039, 17129648 and 37963458) are each unique, and residues that vary among them are shown. B) Overlapping 9-mers generated from the three unique sequences represent all the inherent antigenic variations, with respect to potential 9-mer T-cell epitopes. Although the three sequences are each unique, they share identical 9-mers. 9-mers shown in uppercase are those with an identical match in two of the unique sequences analyzed, while those in bold uppercase have an identical match in all three sequences; unique 9-mers are shown in lowercase. All the 9-mers in sequence 1854039 have a match in at least one of the other two sequences; thus, the antigenic diversity of this sequence can be covered by the other two sequences combined, rendering the sequence 1854039 antigenically redundant. Hence, the minimal number of sequences required to represent antigenic diversity for this dataset is two.</p>
               </text>
               <graphic file="1471-2105-7-S5-S4-1"/>
            </fig>
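            <p>The definition illustrated in Figure 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the bioinformatics method described in the Methods section: the function names are hypothetical, the toy sequences are invented, and the greedy removal order is a choice made for this example. The greedy pass preserves every k-mer of the input and leaves no removable sequence behind, but it is not guaranteed to find the smallest possible set.</p>

```python
def kmers(sequence, k=9):
    """Set of all overlapping k-mers in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def is_redundant(index, sequences, k=9):
    """True if every k-mer of sequences[index] occurs in another sequence."""
    covered = set()
    for j, other in enumerate(sequences):
        if j != index:
            covered |= kmers(other, k)
    return kmers(sequences[index], k).issubset(covered)


def reduce_set(sequences, k=9):
    """Greedily drop antigenically redundant sequences (illustrative only)."""
    kept = list(sequences)
    i = 0
    while i != len(kept):
        # Never remove the last remaining sequence.
        if len(kept) != 1 and is_redundant(i, kept, k):
            del kept[i]   # its k-mers are all present in the other sequences
        else:
            i += 1
    return kept


# Hypothetical toy data; k=3 is used only to keep the example short.
seqs = ["MKTAYL", "MKTAWL", "QKTAYL"]
print(is_redundant(0, seqs, k=3))  # first sequence is covered by the other two
print(reduce_set(seqs, k=3))       # two sequences remain
```

            <p>In this toy case every 3-mer of the first sequence occurs in the second or third sequence, so two sequences suffice to represent the diversity of the set, mirroring the three-sequence example in Figure 1.</p>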
            <p>The removal of antigenically redundant sequences using our bioinformatics method (see Methods section sub-heading: <it>Protein sequence and antigenic diversity analysis of dengue virus</it>) further reduced the number of unique dengue sequences to a total of 969 (2004 set) or 1684 (2005 set). These two sets represent the complete antigenic diversity of short peptides for all four dengue serotypes (Table <tblr tid="T5">5</tblr>). The increase from 2004 to 2005 in the number of unique sequences required to represent this diversity indicates that the new sequences accumulated in the database contained additional short-peptide antigenic diversity. However, the percentage of unique sequences required to represent the complete short-peptide antigenic diversity of all four dengue serotypes decreased (from 74% in 2004 to 70% in 2005) because of an increase in antigenic redundancy. This observation indicates that the increase in the number of unique protein sequences (representing protein sequence diversity) deposited in public databases is generally accompanied by a slower increase in short-peptide antigenic diversity.</p>
            <tbl id="T5">
               <title>
                  <p>Table 5</p>
               </title>
               <caption>
                  <p>Reduction of the number of unique dengue sequences by removal of antigenically redundant sequences.</p>
               </caption>
               <tblbdy cols="7">
                  <r>
                     <c ca="center">
                        <p>Dengue serotype</p>
                     </c>
                     <c cspan="3" ca="center">
                        <p>Data retrieved in 2004</p>
                     </c>
                     <c cspan="3" ca="center">
                        <p>Data retrieved in 2005</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>Unique sequences (#)</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>Minimal antigenic set</p>
                     </c>
                     <c ca="center">
                        <p>Unique sequences (#)</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>Minimal antigenic set</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>Unique sequences (#)*</p>
                     </c>
                     <c ca="center">
                        <p>Percentage of unique sequences (%)**</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>Unique sequences (#)*</p>
                     </c>
                     <c ca="center">
                        <p>Percentage of unique sequences (%)**</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>DV1</p>
                     </c>
                     <c ca="center">
                        <p>359</p>
                     </c>
                     <c ca="center">
                        <p>244</p>
                     </c>
                     <c ca="center">
                        <p>68%</p>
                     </c>
                     <c ca="center">
                        <p>724</p>
                     </c>
                     <c ca="center">
                        <p>493</p>
                     </c>
                     <c ca="center">
                        <p>68%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>DV2</p>
                     </c>
                     <c ca="center">
                        <p>507</p>
                     </c>
                     <c ca="center">
                        <p>368</p>
                     </c>
                     <c ca="center">
                        <p>73%</p>
                     </c>
                     <c ca="center">
                        <p>697</p>
                     </c>
                     <c ca="center">
                        <p>466</p>
                     </c>
                     <c ca="center">
                        <p>67%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>DV3</p>
                     </c>
                     <c ca="center">
                        <p>230</p>
                     </c>
                     <c ca="center">
                        <p>180</p>
                     </c>
                     <c ca="center">
                        <p>78%</p>
                     </c>
                     <c ca="center">
                        <p>678</p>
                     </c>
                     <c ca="center">
                        <p>482</p>
                     </c>
                     <c ca="center">
                        <p>71%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>DV4</p>
                     </c>
                     <c ca="center">
                        <p>222</p>
                     </c>
                     <c ca="center">
                        <p>177</p>
                     </c>
                     <c ca="center">
                        <p>80%</p>
                     </c>
                     <c ca="center">
                        <p>320</p>
                     </c>
                     <c ca="center">
                        <p>243</p>
                     </c>
                     <c ca="center">
                        <p>76%</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>Total</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>1318</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>969</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>74%</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>2419</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>1684</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>70%</it>
                        </p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>*Minimal no. of unique sequences that represent complete short-peptide (9-mer) antigenic diversity of dengue unique sequences reported in the NCBI Entrez protein database. **Percentage of unique sequences that represent complete short-peptide (9-mer) antigenic diversity of dengue unique sequences reported in the NCBI Entrez protein database.</p>
               </tblfn>
            </tbl>
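            <p>The totals and percentages in Table 5 can be recomputed from the per-serotype counts. The short Python check below is only an illustration; the names table5 and summarize are introduced here and are not part of the original analysis.</p>

```python
# Counts copied from Table 5: (unique sequences, minimal antigenic set).
table5 = {
    "2004": {"DV1": (359, 244), "DV2": (507, 368),
             "DV3": (230, 180), "DV4": (222, 177)},
    "2005": {"DV1": (724, 493), "DV2": (697, 466),
             "DV3": (678, 482), "DV4": (320, 243)},
}


def summarize(rows):
    """Return (total unique, total minimal set, rounded percentage)."""
    unique = sum(u for u, _ in rows.values())
    minimal = sum(m for _, m in rows.values())
    return unique, minimal, round(100 * minimal / unique)


summary_2004 = summarize(table5["2004"])
summary_2005 = summarize(table5["2005"])
```

            <p>The recomputed totals agree with the final row of the table: 969 of 1318 sequences (74%) for 2004 and 1684 of 2419 sequences (70%) for 2005.</p>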
         </sec>
         <sec>
            <st>
               <p>Characterization and application of sequence variables that affect antigenic diversity</p>
            </st>
            <p>We examined the effects of sequence determinants, such as the number and length of sequences, on the short-peptide antigenic diversity of dengue virus. These analyses used test datasets with different numbers of sequences (20, 40, 60, 80, 100, 120 and 140 sequences) and different sequence lengths (23, 46, 92, 138, 276 and 460 aa), randomly selected from a set of DV2 envelope protein sequences, with sampling repeated 20 times. Antigenic diversity analysis of each test dataset was performed to identify a minimal set of sequences that represents the complete short-peptide antigenic diversity for each dataset. These minimal sets were used to analyze the effects of the sequence determinants on antigenic diversity.</p>
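            <p>The repeated-sampling design can be outlined in a few lines of Python. This is a sketch under stated assumptions: the function names, the fixed random seed and the use of the sample standard deviation divided by the square root of the number of repeats as the standard error are choices made for the example, and the statistic passed in (here a simple count of distinct 9-mers) stands in for the minimal-set size used in the study.</p>

```python
import math
import random
import statistics


def repeated_sampling(pool, n, statistic, repeats=20, seed=0):
    """Mean and standard error of `statistic` over random subsets of size n."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    values = [statistic(rng.sample(pool, n)) for _ in range(repeats)]
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(repeats)
    return mean, se


def distinct_kmers(seqs, k=9):
    """Count distinct overlapping k-mers across a collection of sequences."""
    return len({s[i:i + k] for s in seqs for i in range(len(s) - k + 1)})
```

            <p>For example, calling repeated_sampling(sequences, 20, distinct_kmers) draws 20 random subsets of 20 sequences from a list named sequences (a placeholder) and reports the mean and standard error of the statistic.</p>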
            <sec>
               <st>
                  <p>Effects of number of sequences on short-peptide antigenic diversity</p>
               </st>
               <p>An increase in the number of unique sequences in a dataset reduces the fraction required to represent the complete short-peptide antigenic diversity (Table <tblr tid="T6">6</tblr>). This observation reflects an asymptotic relationship between the number of unique sequences and the percentage of the complete short-peptide antigenic diversity that is covered (Figure <figr fid="F2">2</figr>). Asymptotic curves were observed for all proteins of the four dengue serotypes (data not shown). The shape of the curve indicates that a single sequence will cover only a small proportion of the total short-peptide antigenic diversity and that for proteins with a large number of unique sequences, the addition of a single new variant sequence has little effect on the overall antigenic diversity.</p>
               <tbl id="T6">
                  <title>
                     <p>Table 6</p>
                  </title>
                  <caption>
                     <p>Effects of number of unique dengue virus serotype 2 (DV2) envelope sequences (N) on short-peptide (9-mer) antigenic diversity.</p>
                  </caption>
                  <tblbdy cols="8">
                     <r>
                        <c ca="left">
                           <p>Number of unique sequences (N)</p>
                        </c>
                        <c ca="center">
                           <p>20</p>
                        </c>
                        <c ca="center">
                           <p>40</p>
                        </c>
                        <c ca="center">
                           <p>60</p>
                        </c>
                        <c ca="center">
                           <p>80</p>
                        </c>
                        <c ca="center">
                           <p>100</p>
                        </c>
                        <c ca="center">
                           <p>120</p>
                        </c>
                        <c ca="center">
                           <p>140</p>
                        </c>
                     </r>
                     <r>
                        <c cspan="8">
                           <hr/>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>Length of sequences</p>
                        </c>
                        <c ca="center">
                           <p>460 aa</p>
                        </c>
                        <c ca="center">
                           <p>460 aa</p>
                        </c>
                        <c ca="center">
                           <p>460 aa</p>
                        </c>
                        <c ca="center">
                           <p>460 aa</p>
                        </c>
                        <c ca="center">
                           <p>460 aa</p>
                        </c>
                        <c ca="center">
                           <p>460 aa</p>
                        </c>
                        <c ca="center">
                           <p>460 aa</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>Minimal number of unique sequences that represent complete short-peptide antigenic diversity (Mean &#177; SE)</p>
                        </c>
                        <c ca="center">
                           <p>18 &#177; 0.30</p>
                        </c>
                        <c ca="center">
                           <p>32 &#177; 0.54</p>
                        </c>
                        <c ca="center">
                           <p>46 &#177; 0.70</p>
                        </c>
                        <c ca="center">
                           <p>58 &#177; 0.87</p>
                        </c>
                        <c ca="center">
                           <p>70 &#177; 0.87</p>
                        </c>
                        <c ca="center">
                           <p>80 &#177; 0.87</p>
                        </c>
                        <c ca="center">
                           <p>90 &#177; 0.71</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>Percentage of unique sequences that represent complete short-peptide antigenic diversity (%) (Mean &#177; SE)</p>
                        </c>
                        <c ca="center">
                           <p>90 &#177; 1.5</p>
                        </c>
                        <c ca="center">
                           <p>80 &#177; 1.35</p>
                        </c>
                        <c ca="center">
                           <p>77 &#177; 1.17</p>
                        </c>
                        <c ca="center">
                           <p>73 &#177; 1.09</p>
                        </c>
                        <c ca="center">
                           <p>70 &#177; 0.87</p>
                        </c>
                        <c ca="center">
                           <p>67 &#177; 0.73</p>
                        </c>
                        <c ca="center">
                           <p>64 &#177; 0.51</p>
                        </c>
                     </r>
                  </tblbdy>
                  <tblfn>
                     <p>The mean and standard error (SE) values are shown for random sampling repeated 20 times.</p>
                  </tblfn>
               </tbl>
               <fig id="F2">
                  <title>
                     <p>Figure 2</p>
                  </title>
                  <caption>
                     <p>Short-peptide (9-mer) antigenic diversity as a function of number of sequences</p>
                  </caption>
                  <text>
                     <p><b>Short-peptide (9-mer) antigenic diversity as a function of number of sequences</b>. Short-peptide antigenic diversity has an asymptotic relationship to the number of unique dengue virus serotype 2 (DV2) envelope sequences (N). Each curve shows the cumulative percentage coverage of short-peptide antigenic diversity. Vertical bars represent the standard error for random sampling repeated 20 times.</p>
                  </text>
                  <graphic file="1471-2105-7-S5-S4-2"/>
               </fig>
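               <p>A cumulative coverage curve of the kind plotted in Figure 2 can be computed as sketched below. The function name coverage_curve and the toy input are assumptions for illustration, and the order in which sequences are added is simply the input order; the sketch assumes that at least one sequence is as long as k.</p>

```python
def coverage_curve(seqs, k=9):
    """Cumulative % of all distinct k-mers covered after adding each sequence."""
    def kmers(s):
        return {s[i:i + k] for i in range(len(s) - k + 1)}

    total = set().union(*(kmers(s) for s in seqs))
    seen, curve = set(), []
    for s in seqs:
        seen |= kmers(s)
        curve.append(100.0 * len(seen) / len(total))
    return curve


# Toy input with k = 3: 4 of 7, then 6 of 7, then 7 of 7 distinct 3-mers.
curve = coverage_curve(["MKTAYI", "MKTAGG", "LTAYI"], k=3)
```

               <p>Each added sequence contributes only the k-mers not already seen, which is why the curve flattens as more sequences are included.</p>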
            </sec>
            <sec>
               <st>
                  <p>Effects of length of sequences on short-peptide antigenic diversity</p>
               </st>
               <p>A decrease in the length of the sequences in a dataset reduces the fraction required to represent the complete short-peptide antigenic diversity of the dataset (Table <tblr tid="T7">7</tblr>). This reduction was achieved by removal of two types of redundancy: identical fragments and antigenically redundant fragments. The number of identical fragments increases significantly as fragment length decreases because shorter fragments have limited variability. Hence, the effect of sequence length is significant, especially for very short fragments (23 aa), for which only ~7% of the fragments were required to represent their complete antigenic diversity (a reduction of ~93%). Overall, the results indicate that short-peptide antigenic diversity has a near-linear relationship to sequence length (Figure <figr fid="F3">3</figr>).</p>
               <tbl id="T7">
                  <title>
                     <p>Table 7</p>
                  </title>
                  <caption>
                     <p>Effect of length of dengue virus serotype 2 (DV2) envelope protein sequences on short-peptide (9-mer) antigenic diversity.</p>
                  </caption>
                  <tblbdy cols="7">
                     <r>
                        <c ca="left">
                           <p>Length of fragments</p>
                        </c>
                        <c ca="center">
                           <p>100% (460 aa)</p>
                        </c>
                        <c ca="center">
                           <p>60% (276 aa)</p>
                        </c>
                        <c ca="center">
                           <p>30% (138 aa)</p>
                        </c>
                        <c ca="center">
                           <p>20% (92 aa)</p>
                        </c>
                        <c ca="center">
                           <p>10% (46 aa)</p>
                        </c>
                        <c ca="center">
                           <p>5% (23 aa)</p>
                        </c>
                     </r>
                     <r>
                        <c cspan="7">
                           <hr/>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>Number of fragments</p>
                        </c>
                        <c ca="center">
                           <p>187</p>
                        </c>
                        <c ca="center">
                           <p>187</p>
                        </c>
                        <c ca="center">
                           <p>187</p>
                        </c>
                        <c ca="center">
                           <p>187</p>
                        </c>
                        <c ca="center">
                           <p>187</p>
                        </c>
                        <c ca="center">
                           <p>187</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>Number of unique fragments</p>
                        </c>
                        <c ca="center">
                           <p>187</p>
                        </c>
                        <c ca="center">
                           <p>131</p>
                        </c>
                        <c ca="center">
                           <p>82</p>
                        </c>
                        <c ca="center">
                           <p>58</p>
                        </c>
                        <c ca="center">
                           <p>27</p>
                        </c>
                        <c ca="center">
                           <p>17</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>Minimal number of fragments that represent complete short-peptide antigenic diversity (Mean &#177; SE)</p>
                        </c>
                        <c ca="center">
                           <p>111 &#177; 0.11</p>
                        </c>
                        <c ca="center">
                           <p>74 &#177; 0.11</p>
                        </c>
                        <c ca="center">
                           <p>48 &#177; 0.17</p>
                        </c>
                        <c ca="center">
                           <p>38 &#177; 0.10</p>
                        </c>
                        <c ca="center">
                           <p>24 &#177; 0.10</p>
                        </c>
                        <c ca="center">
                           <p>14 &#177; 0.10</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>Percentage of fragments that represent complete short-peptide antigenic diversity (%) (Mean &#177; SE)</p>
                        </c>
                        <c ca="center">
                           <p>59 &#177; 0.06</p>
                        </c>
                        <c ca="center">
                           <p>40 &#177; 0.06</p>
                        </c>
                        <c ca="center">
                           <p>26 &#177; 0.09</p>
                        </c>
                        <c ca="center">
                           <p>20 &#177; 0.05</p>
                        </c>
                        <c ca="center">
                           <p>13 &#177; 0.05</p>
                        </c>
                        <c ca="center">
                           <p>7 &#177; 0.05</p>
                        </c>
                     </r>
                  </tblbdy>
                  <tblfn>
                     <p>The mean and standard error (SE) values are shown for random sampling repeated 20 times.</p>
                  </tblfn>
               </tbl>
               <fig id="F3">
                  <title>
                     <p>Figure 3</p>
                  </title>
                  <caption>
                     <p>Short-peptide (9-mer) antigenic diversity as a function of length of sequences</p>
                  </caption>
                  <text>
                     <p><b>Short-peptide (9-mer) antigenic diversity as a function of length of sequences</b>. Short-peptide antigenic diversity shows a linear relationship to the sequence length of dengue virus serotype 2 (DV2) envelope protein. Vertical bars represent the standard error for random sampling repeated 20 times.</p>
                  </text>
                  <graphic file="1471-2105-7-S5-S4-3"/>
               </fig>
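               <p>The loss of variability in shorter fragments can be illustrated with a small Python sketch. How the fragments were positioned within the envelope protein is not stated in this section, so cutting the same window from every sequence is an assumption, as are the function name unique_fragments and the toy sequences.</p>

```python
def unique_fragments(seqs, start, length):
    """Cut the same window from every sequence and return the distinct fragments."""
    fragments = [s[start:start + length] for s in seqs]
    return sorted(set(fragments))


toy = ["MKTAYIAK", "MKTAYIGG", "MKTAGGAK"]
full_length = unique_fragments(toy, 0, 8)  # all three sequences differ
short = unique_fragments(toy, 0, 4)        # the first four residues are shared
```

               <p>In the toy input, the three full-length sequences are all distinct, but their first four residues are identical, so the shorter window yields a single unique fragment.</p>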
            </sec>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <p>In this study, we applied a systematic bioinformatics approach to collect, clean, organize and analyze the antigenic diversity of short peptides in reported protein sequence data of dengue virus. We developed a computational method for the analysis of antigenic diversity in the context of T-cell mediated immune responses. The method was applied to the short-peptide antigenic diversity of dengue virus to determine a minimal sequence set that encodes the complete antigenic diversity of linear epitopes within each dengue virus serotype. We studied the relationship between short-peptide antigenic diversity and protein sequence diversity of DV and also explored the effects of sequence determinants on viral antigenic diversity. Our analysis showed that the minimal number of unique sequences required to represent complete antigenic diversity of linear epitopes in dengue virus is significantly smaller than that required to represent complete protein sequence diversity. Short-peptide antigenic diversity shows an asymptotic relationship to the number of unique sequences and a linear relationship to the length of protein antigens.</p>
         <p>The minimal sequence set that encodes the complete short-peptide antigenic diversity for each dengue virus serotype was derived through removal of identical sequences and antigenically redundant sequences (Table <tblr tid="T5">5</tblr> and Figure <figr fid="F4">4</figr>). Both reductions occurred without any loss of information on antigenic diversity among the sequences. The largest reduction was accomplished through the removal of identical sequences, since only 36% (year 2004) or 25% (year 2005) of the sequences were unique. The identical sequences originated from dengue virus strains that were unique variants with respect to the whole polyprotein, but were identical to other dengue strains with respect to individual proteins, resulting in many duplicate protein sequences. The removal of antigenically redundant sequences also involved a significant proportion of the sequences, roughly one-quarter to one-third of all unique sequences (2004: 26%; 2005: 30%), reflecting the high antigenic redundancy among the dengue virus variants, which often differed by only a few amino acids. Despite the substantial reduction achieved by reducing the collected sequences to minimal sets, a large number of protein sequences, 969 in 2004 and 1684 in 2005, were still required to represent the complete short-peptide antigenic diversity of dengue virus.</p>
         <fig id="F4">
            <title>
               <p>Figure 4</p>
            </title>
            <caption>
               <p>Flowchart summarizing the steps undertaken to identify the antigenically relevant unique sequences for dengue virus</p>
            </caption>
            <text>
               <p>Flowchart summarizing the steps undertaken to identify the antigenically relevant unique sequences for dengue virus.</p>
            </text>
            <graphic file="1471-2105-7-S5-S4-4"/>
         </fig>
         <p>It is clear that antigenic diversity in the reported dengue sequences is large. With many asymptomatic human and animal carriers of dengue viruses representing a huge reservoir for the emergence of new strains <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B24">24</abbr><abbr bid="B28">28</abbr></abbrgrp>, the diversity is expected to increase, although at a progressively slower pace. This is because antigenic redundancy increases as the number of sequences increases; we observed that once the dataset for a particular protein reaches approximately 200 sequences, adding new sequences increases antigenic diversity only marginally.</p>
         <p>Our study of the factors that affect antigenic diversity provides insight into how to deal with increasing T-cell epitope antigenic diversity in the context of vaccine development. Sequence length had the largest effect on short-peptide antigenic diversity, whereas antigenic diversity increased asymptotically with the number of sequence variants. For practical purposes of vaccine formulation, antigenic diversity cannot be represented by whole protein sequences because it is not feasible to use these sequences for systematic experimental analysis: they are long and their number is increasing rapidly. The implication is that conventional vaccination strategies, which utilize whole attenuated pathogen with little knowledge of the specificity of the immune responses they elicit, may not be suitable for providing protection from multiple variants of viruses. Furthermore, it may be difficult to optimize such a vaccine according to the human leukocyte antigen (HLA) profile of the population receiving the vaccine <abbrgrp><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr></abbrgrp>, as neither the identities of the HLA molecules that bind T-cell epitopes nor the epitopes themselves are known.</p>
         <p>The more effective vaccine strategy that we propose is to focus on short segments of proteins (~&lt;100 aa) that are known to be specific targets of immune responses (such as T-cell epitopes specific to particular HLA alleles), particularly those that have a high concentration of T-cell epitopes <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. By combining selected sets of short antigen fragments that represent T-cell epitope antigenic diversity, complete sets of viral targets can be covered in a "divide-and-conquer" approach. This may provide a promising basis for multivalent peptide-based vaccines against dengue virus. However, this strategy does not address the dengue virus-specific problem of protection versus immunopathology during secondary infections with a different serotype <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>.</p>
         <p>Several caveats need to be considered in a study such as this. First, it is well known that not all HLA-restricted epitopes are 9-mers <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. Our results were based only on 9-mers and hence may not give a true representation of dengue T-cell epitope antigenic diversity. We selected 9-mers because they represent the typical size of HLA class I T-cell epitopes, as well as the binding core of HLA class II T-cell epitopes <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. We performed the same analysis with 8-mer and 10-mer peptides. The results showed no significant difference from the analysis of 9-mers (data not shown).</p>
         <p>The second caveat is the sampling bias in dengue virus sequences reported to the public databases. Only dengue sequences that have been studied are reported, and viruses collected in accessible locations, associated with notable disease outbreaks or of known immunological properties are preferentially studied. Consequently, certain dengue proteins have been studied intensively, while others remain largely unstudied. For example, sequences of the envelope protein, known to be important for immunological activity and viral entry into the host <abbrgrp><abbr bid="B26">26</abbr><abbr bid="B33">33</abbr></abbrgrp>, were the most abundant in our dataset (3183 sequences for all four serotypes), while those of NS4a, whose immunological activity is relatively unknown, were under-represented. In addition, for the majority of the proteins, a large portion of the reported sequences were incomplete in length. For example, 95% of the collected DV2 NS5 sequences were incomplete in length (data not shown). However, the data used in this study were the most representative available, and the large sample size for the majority of the proteins helps to decrease the margin of error due to sampling bias. In addition, the reported sequences represent highly pathogenic strains isolated during dengue outbreaks.</p>
         <p>There has been no significant increase in the number of unique dengue virus sequences since the last analysis (December 2005). The September 2006 data set contained a total of 2661 unique dengue sequences (793 DV1, 784 DV2, 759 DV3 and 325 DV4), an increase of 242 unique sequences over the 2005 data set. This increase of approximately 10% was not expected to significantly affect the results observed for the 2005 data set. Therefore, we did not perform the analysis of antigenic diversity on the 2006 data set.</p>
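         <p>The figures quoted for the 2006 data set can be checked with simple arithmetic. The Python snippet below is an illustration only; the variable names are introduced here, and the 2005 total is taken from Table 5.</p>

```python
# Unique sequences per serotype in the September 2006 data set (from the text).
counts_2006 = {"DV1": 793, "DV2": 784, "DV3": 759, "DV4": 325}
total_2005 = 2419  # total unique sequences in the 2005 set (Table 5)

total_2006 = sum(counts_2006.values())
increase = total_2006 - total_2005
percent_increase = 100 * increase / total_2005
```

         <p>The per-serotype counts sum to 2661, which is 242 more than the 2005 total and corresponds to an increase of about 10%.</p>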
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>This study has provided evidence that there are limited numbers of antigenic combinations in variant protein sequences of a viral species and that short regions of the viral proteins are sufficient to capture antigenic diversity of T-cell epitopes. The approach described herein has direct application to the analysis of other viruses, in particular those that show high diversity and/or rapid evolution, such as influenza A virus and human immunodeficiency virus (HIV).</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>Data collection</p>
            </st>
            <p>All dengue virus protein sequence entries present in the NCBI Entrez protein database <abbrgrp><abbr bid="B34">34</abbr></abbrgrp> were collected in June 2004 and then again in December 2005. Data retrieval was performed through the NCBI taxonomy browser <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>; the taxonomy IDs for the dengue serotypes DV1-4 are 11053, 11060, 11069 and 11070, respectively. The collected entries for both time points were processed separately using identical procedures.</p>
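            <p>Readers who wish to repeat a comparable retrieval programmatically could query the NCBI E-utilities ESearch service with the same taxonomy IDs. This is an alternative to the taxonomy browser used in the study, not the authors' procedure, and current record counts will differ from those of 2004 and 2005. The Python sketch below only builds the query URLs and performs no network request; the names DENGUE_TAXIDS, ESEARCH and esearch_url are assumptions for the example.</p>

```python
from urllib.parse import urlencode

# Taxonomy IDs quoted in the text above.
DENGUE_TAXIDS = {"DV1": 11053, "DV2": 11060, "DV3": 11069, "DV4": 11070}
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(taxid, retmax=0):
    """Build an ESearch URL for protein records under one taxonomy node."""
    query = {
        "db": "protein",
        "term": "txid{}[Organism:exp]".format(taxid),
        "retmax": retmax,  # 0 asks for the record count only
    }
    return ESEARCH + "?" + urlencode(query)


urls = {name: esearch_url(taxid) for name, taxid in DENGUE_TAXIDS.items()}
```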
         </sec>
         <sec>
            <st>
               <p>Data processing: cleaning and grouping</p>
            </st>
            <p>The dengue virus RNA genome is translated into a single polyprotein (~3390 aa) that is cleaved by proteases to yield 10 dengue proteins: the C protein; the M protein, which is synthesized as a larger precursor protein pM; the major E glycoprotein; and seven nonstructural (NS) proteins, NS1, NS2a, NS2b, NS3, NS4a, NS4b and NS5 (Table <tblr tid="T2">2</tblr>). Individual protein sequences were extracted from collected entries for each DV serotype and grouped according to the 10 dengue proteins for analysis. The protein sequence extraction was done by sequence alignments and identification of known cleavage sites for dengue proteins. The cleavage sites were obtained from the annotation of the GenPept <abbrgrp><abbr bid="B19">19</abbr></abbrgrp> reference polyprotein sequence for each dengue serotype (DV1: AAF59976; DV2: P14340; DV3: AAM51537; DV4: AAG45437) and the literature <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>. The grouping of the extracted sequences for proteins of each serotype was facilitated by local sequence alignment using the BLAST algorithm <abbrgrp><abbr bid="B36">36</abbr></abbrgrp> (parameters: filter &#8211; no; expect &#8211; 100; descriptions &amp; alignments &#8211; 1000), followed by multiple sequence alignment using ClustalX 1.83 <abbrgrp><abbr bid="B37">37</abbr></abbrgrp> with default parameters, followed by manual inspection. Duplicate or identical sequences for proteins within each serotype were removed, and the unique sequences were retained for further analysis. Both full-length and partial unique sequences of each dengue serotype protein were used for the analysis, unless indicated otherwise. Data compiled from public databases are prone to errors and discrepancies <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>, which may affect the analysis. Therefore, we inspected the collected DV entries and corrected errors and discrepancies (see <supplr sid="S1">additional file 1</supplr>: Table S1.pdf).</p>
         </sec>
         <sec>
            <st>
               <p>Protein sequence and antigenic diversity analysis of dengue virus</p>
            </st>
            <p>In the context of this study, protein sequence diversity of a dengue protein was defined as the total number of unique sequences reported in the database for the protein. Sequences that differed from each other by at least one amino acid were considered unique. We calculated the pairwise percentage amino acid identity of the full-length unique sequences of each dengue protein, intra- and inter-serotype, using ClustalW 1.83 <abbrgrp><abbr bid="B39">39</abbr></abbrgrp> with default parameters, followed by manual inspection. This was done to survey the extent of amino acid variation and conservation in the latest, comprehensive dengue data of 2005.</p>
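            <p>ClustalW derives percentage identity from its own pairwise alignments. The Python sketch below only illustrates the quantity itself for sequences that are already aligned to equal length. Ignoring columns that contain a gap is an assumed convention, and the function name and toy sequences are hypothetical.</p>
            <p><![CDATA[
```python
from itertools import combinations


def percent_identity(a, b):
    """Percentage of identical residues over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only alignment columns without a gap character in either sequence.
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


# Toy aligned sequences, made up for illustration only.
aligned = {
    "seq1": "MNNQRKKA-RNT",
    "seq2": "MNNQRKKAKRNT",
    "seq3": "MNDQRKKAKRST",
}
for (n1, s1), (n2, s2) in combinations(aligned.items(), 2):
    print(n1, n2, round(percent_identity(s1, s2), 1))
```
]]></p>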
            <p>Antigenic diversity of a dengue protein was defined in this study as the minimal set of unique sequences required to represent the complete set of overlapping 9-mer peptides encoded by all unique sequences reported in the database for the protein. We developed a bioinformatics method that performs an exhaustive search to determine the minimal set for a given protein. The method comprises two steps: (a) generation of a set of overlapping 9-mers from the entire length of all unique sequences reported in the database for the protein, followed by (b) identification of a minimal set of unique sequences that represents all the unique 9-mers. The union of such sets for all ten proteins of a dengue serotype represents the antigenic diversity of the proteins for the serotype as defined in this study. The computer program for the method was written in Perl and C.</p>
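            <p>As an illustration only, the two steps can be sketched in Python. The sketch uses a greedy set-cover heuristic rather than an exhaustive search, so it returns a small set of sequences that covers every 9-mer but is not guaranteed to be minimal. The function names and toy sequences are hypothetical and are not taken from the original Perl and C program. 9-mers containing the unknown residue "X" are skipped, in line with the exclusion described in this section.</p>
            <p><![CDATA[
```python
def nine_mers(seq, k=9):
    """Step (a): set of overlapping k-mers in seq, skipping any that contain 'X'."""
    return {
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if "X" not in seq[i:i + k]
    }


def greedy_cover(sequences, k=9):
    """Step (b), approximated: pick sequences until every k-mer is represented.

    Greedy heuristic: at each step take the sequence that contributes the most
    still-uncovered k-mers. This approximates, but does not guarantee, the
    minimal set that an exhaustive search would find.
    """
    kmers = {s: nine_mers(s, k) for s in set(sequences)}
    uncovered = set()
    for peptide_set in kmers.values():
        uncovered |= peptide_set
    chosen = []
    while uncovered:
        # Iterating in sorted order makes tie-breaking deterministic.
        best = max(sorted(kmers), key=lambda s: len(kmers[s] & uncovered))
        chosen.append(best)
        uncovered -= kmers[best]
    return chosen


# Toy input, made up for illustration: the third sequence adds no new 9-mers.
toy = ["MNNQRKKARNT", "MNNQRKKAKNT", "NNQRKKARN"]
print(greedy_cover(toy))
```
]]></p>
            <p>In this toy input, the two 11-residue sequences are selected and the 9-residue fragment is dropped, because its only 9-mer is already represented.</p>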
            <p>In the first step of the method, we generated overlapping 9-mers from the entire length of each unique sequence because the whole length was assumed to contain potential targets of T-cell mediated immune responses (T-cell epitopes) <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>. This assumption was based, firstly, on the estimate that from a complete set of overlapping peptides (9- or 10-mers) spanning a protein, on average, 0.1&#8211;5% of the peptides will bind to any particular HLA molecule <abbrgrp><abbr bid="B41">41</abbr></abbrgrp>. Secondly, given the large number of HLA molecules (more than 2532 known as of September 2006; <abbrgrp><abbr bid="B42">42</abbr></abbrgrp>), the vast majority of the complete set of overlapping peptides are highly likely to bind at least one molecule from the total HLA pool. Thus, each overlapping peptide is a potential T-cell epitope. This assumption ensures the capture of all possible candidate 9-mer T-cell epitopes that can be present across the entire length of the unique sequence. We focused our antigenic diversity study on 9-mers because they represent the predominant length of HLA class I T-cell epitopes, as well as the binding core of HLA class II T-cell epitopes <abbrgrp><abbr bid="B32">32</abbr><abbr bid="B40">40</abbr></abbrgrp>. Furthermore, our preliminary analysis using 8-mers and 10-mers did not produce results notably different from those of the 9-mer analysis (data not shown). A small number of 9-mers derived from the unique sequences contained unknown residues (denoted by "X") and were therefore excluded from the analysis because they were antigenically non-informative.</p>
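            <p>The reasoning about HLA binding can be made concrete with a back-of-the-envelope calculation. The sketch assumes, purely for illustration, that binding to each HLA molecule is an independent event with the same probability; HLA molecules have overlapping binding specificities, so this idealized model probably overstates the coverage. It is not a calculation performed in the study, and the function name is hypothetical.</p>
            <p><![CDATA[
```python
def p_binds_at_least_one(p_single, n_hla):
    """Probability that a peptide binds at least one of n_hla molecules.

    Assumes independent binding events with equal probability p_single,
    which is a simplification made only for this illustration.
    """
    return 1.0 - (1.0 - p_single) ** n_hla


# Lower and upper bounds of the 0.1-5% per-molecule estimate, 2532 HLA molecules.
for p_single in (0.001, 0.05):
    print(p_single, round(p_binds_at_least_one(p_single, 2532), 3))
```
]]></p>
            <p>Under these assumptions, even the lower bound of 0.1% per molecule gives a probability of roughly 0.92 that a peptide binds at least one HLA molecule.</p>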
         </sec>
         <sec>
            <st>
               <p>Determining the effects of sequence determinants on antigenic diversity</p>
            </st>
            <p>We studied the effects of two sequence determinants on antigenic diversity: the number of viral sequences in the studied set and the length of the protein antigens. The study was performed on unique sequences of the DV2 envelope protein (retrieved in 2005) because they provided a sufficiently large and well-defined dataset (198 full-length sequences). Test datasets with different numbers of sequences (20, 40, 60, 80, 100, 120 and 140 sequences) and different lengths (23, 46, 128, 138, 276 and 460 aa) were randomly derived from the envelope dataset with repeated sampling (20 repeats). Any duplicate sequences were removed from the test datasets. The minimal set of sequences that represents the complete short-peptide antigenic diversity was determined for each dataset. These minimal sets were used to analyze the effects of the sequence determinants on antigenic diversity.</p>
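            <p>One way to derive such test datasets is sketched below in Python. The sampling scheme is an assumption, since the text does not specify how the subsets or fragments were chosen: sequences are drawn uniformly at random without replacement for the sequence-count series, and a randomly placed window of fixed length is cut from every sequence for the length series. All function names and the toy data are hypothetical.</p>
            <p><![CDATA[
```python
import random


def sample_by_count(sequences, n, rng):
    """Draw n sequences at random without replacement."""
    return rng.sample(sequences, n)


def sample_by_length(sequences, length, rng):
    """Cut one randomly placed window of `length` residues from every sequence.

    Duplicate fragments are removed, as duplicates were removed from the
    test datasets.
    """
    shortest = min(len(s) for s in sequences)
    start = rng.randrange(shortest - length + 1)
    return sorted({s[start:start + length] for s in sequences})


def repeated(sampler, sequences, value, repeats=20, seed=0):
    """Apply a sampler `repeats` times with a seeded generator for reproducibility."""
    rng = random.Random(seed)
    return [sampler(sequences, value, rng) for _ in range(repeats)]


# Toy stand-in for the envelope dataset: 30 random 40-residue sequences.
alphabet = "ACDEFGHIKLMNPQRSTVWY"
toy_rng = random.Random(1)
toy = sorted({"".join(toy_rng.choice(alphabet) for _ in range(40)) for _ in range(30)})

count_sets = repeated(sample_by_count, toy, 20)
length_sets = repeated(sample_by_length, toy, 23)
print(len(count_sets), len(length_sets))
```
]]></p>
            <p>The minimal set of sequences would then be determined for each of the resulting datasets and compared across sequence counts and lengths.</p>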
         </sec>
      </sec>
      <sec>
         <st>
            <p>List of abbreviations used</p>
         </st>
         <p>DV- Dengue Virus; DV1- Dengue Virus Serotype 1; DV2- Dengue Virus Serotype 2; DV3- Dengue Virus Serotype 3; DV4- Dengue Virus Serotype 4; aa- amino acids; NCBI- National Center for Biotechnology Information; HIV- Human Immunodeficiency Virus; HLA- Human Leukocyte Antigen.</p>
      </sec>
      <sec>
         <st>
            <p>Competing interests</p>
         </st>
         <p>The author(s) declare that they have no competing interests.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>AMK performed the <it>in silico </it>experiments and drafted the manuscript. ATH, KXL, KNS, TWT and JTA participated in the design of the study. VB conceived the study, participated in its design and coordination and helped to draft the manuscript. JTA and TWT critically reviewed the manuscript. All authors read and approved the final manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>The authors thank Seng Hong Seah, Zhang Guanglan, Judice Koh and Olivo Miotto for their help and valuable suggestions. We also thank Dr. Deborah McClellan for editorial review of the manuscript. This project has been funded in part with Federal funds from the National Institute of Allergy and Infectious Diseases, National Institutes of Health, Department of Health and Human Services, USA, under Grant No. 5 U19 AI56541 and Contract No. HHSN266200400085C.</p>
            <p>This article has been published as part of <it>BMC Bioinformatics </it>Volume 7, Supplement 5, 2006: APBioNet &#8211; Fifth International Conference on Bioinformatics (InCoB2006). The full contents of the supplement are available online at <url>http://www.biomedcentral.com/1471-2105/7?issue=S5</url>.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Full-length cDNA sequence of dengue type 1 virus (Singapore strain S275/90)</p>
            </title>
            <aug>
               <au>
                  <snm>Fu</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Tan</snm>
                  <fnm>BH</fnm>
               </au>
               <au>
                  <snm>Yap</snm>
                  <fnm>EH</fnm>
               </au>
               <au>
                  <snm>Chan</snm>
                  <fnm>YC</fnm>
               </au>
               <au>
                  <snm>Tan</snm>
                  <fnm>YH</fnm>
               </au>
            </aug>
            <source>Virology</source>
            <pubdate>1992</pubdate>
            <volume>188</volume>
            <issue>2</issue>
            <fpage>953</fpage>
            <lpage>958</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0042-6822(92)90560-C</pubid>
                  <pubid idtype="pmpid" link="fulltext">1585663</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Original antigenic sin and apoptosis in the pathogenesis of dengue hemorrhagic fever</p>
            </title>
            <aug>
               <au>
                  <snm>Mongkolsapaya</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Dejnirattisai</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Xu</snm>
                  <fnm>XN</fnm>
               </au>
               <au>
                  <snm>Vasanawathana</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Tangthawornchaikul</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Chairunsri</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Sawasdivorn</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Duangchinda</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Dong</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Rowland-Jones</snm>
                  <fnm>S</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nat Med</source>
            <pubdate>2003</pubdate>
            <volume>9</volume>
            <issue>7</issue>
            <fpage>921</fpage>
            <lpage>927</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nm887</pubid>
                  <pubid idtype="pmpid" link="fulltext">12808447</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>The causes and consequences of genetic variation in dengue virus</p>
            </title>
            <aug>
               <au>
                  <snm>Holmes</snm>
                  <fnm>EC</fnm>
               </au>
               <au>
                  <snm>Burch</snm>
                  <fnm>SS</fnm>
               </au>
            </aug>
            <source>Trends Microbiol</source>
            <pubdate>2000</pubdate>
            <volume>8</volume>
            <issue>2</issue>
            <fpage>74</fpage>
            <lpage>77</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0966-842X(99)01669-8</pubid>
                  <pubid idtype="pmpid" link="fulltext">10664600</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Clade replacements in dengue virus serotypes 1 and 3 are associated with changing serotype prevalence</p>
            </title>
            <aug>
               <au>
                  <snm>Zhang</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Mammen</snm>
                  <fnm>MP</fnm>
                  <suf>Jr</suf>
               </au>
               <au>
                  <snm>Chinnawirotpisan</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Klungthong</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Rodpradit</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Monkongdee</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Nimmannitya</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kalayanarooj</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Holmes</snm>
                  <fnm>EC</fnm>
               </au>
            </aug>
            <source>J Virol</source>
            <pubdate>2005</pubdate>
            <volume>79</volume>
            <issue>24</issue>
            <fpage>15123</fpage>
            <lpage>15130</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1316048</pubid>
                  <pubid idtype="pmpid" link="fulltext">16306584</pubid>
                  <pubid idtype="doi">10.1128/JVI.79.24.15123-15130.2005</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Strategically examining the full-genome of dengue virus type 3 in clinical isolates reveals its mutation spectra</p>
            </title>
            <aug>
               <au>
                  <snm>Chao</snm>
                  <fnm>DY</fnm>
               </au>
               <au>
                  <snm>King</snm>
                  <fnm>CC</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>WK</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>WJ</fnm>
               </au>
               <au>
                  <snm>Wu</snm>
                  <fnm>HL</fnm>
               </au>
               <au>
                  <snm>Chang</snm>
                  <fnm>GJ</fnm>
               </au>
            </aug>
            <source>Virol J</source>
            <pubdate>2005</pubdate>
            <volume>2</volume>
            <fpage>72</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1208963</pubid>
                  <pubid idtype="pmpid" link="fulltext">16120221</pubid>
                  <pubid idtype="doi">10.1186/1743-422X-2-72</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Stereophysicochemical variability plots highlight conserved antigenic areas in Flaviviruses</p>
            </title>
            <aug>
               <au>
                  <snm>Schein</snm>
                  <fnm>CH</fnm>
               </au>
               <au>
                  <snm>Zhou</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Braun</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>Virol J</source>
            <pubdate>2005</pubdate>
            <volume>2</volume>
            <fpage>40</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1112618</pubid>
                  <pubid idtype="pmpid" link="fulltext">15845145</pubid>
                  <pubid idtype="doi">10.1186/1743-422X-2-40</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Antigenic analysis of dengue virus using monoclonal antibodies</p>
            </title>
            <aug>
               <au>
                  <snm>Young</snm>
                  <fnm>PR</fnm>
               </au>
            </aug>
            <source>Southeast Asian J Trop Med Public Health</source>
            <pubdate>1990</pubdate>
            <volume>21</volume>
            <issue>4</issue>
            <fpage>646</fpage>
            <lpage>651</lpage>
            <xrefbib>
               <pubid idtype="pmpid">2098930</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Applications of polymerase chain reaction for identification of dengue viruses isolated from patient sera</p>
            </title>
            <aug>
               <au>
                  <snm>Maneekarn</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Morita</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Tanaka</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Igarashi</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Usawattanakul</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Sirisanthana</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Innis</snm>
                  <fnm>BL</fnm>
               </au>
               <au>
                  <snm>Sittisombut</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Nisalak</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Nimmanitya</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Microbiol Immunol</source>
            <pubdate>1993</pubdate>
            <volume>37</volume>
            <issue>1</issue>
            <fpage>41</fpage>
            <lpage>47</lpage>
            <xrefbib>
               <pubid idtype="pmpid">8474356</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Possible occurrence of a genetic bottleneck in dengue serotype 2 viruses between the 1980 and 1987 epidemic seasons in Bangkok, Thailand</p>
            </title>
            <aug>
               <au>
                  <snm>Sittisombut</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Sistayanarain</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Cardosa</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Salminen</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Damrongdachakul</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kalayanarooj</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Rojanasuphot</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Supawadee</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Maneekarn</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>Am J Trop Med Hyg</source>
            <pubdate>1997</pubdate>
            <volume>57</volume>
            <issue>1</issue>
            <fpage>100</fpage>
            <lpage>108</lpage>
            <xrefbib>
               <pubid idtype="pmpid">9242328</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Antigenic relatedness of selected flaviviruses: study with homologous and heterologous immune mouse ascitic fluids</p>
            </title>
            <aug>
               <au>
                  <snm>Baba</snm>
                  <fnm>SS</fnm>
               </au>
               <au>
                  <snm>Fagbami</snm>
                  <fnm>AH</fnm>
               </au>
               <au>
                  <snm>Olaleye</snm>
                  <fnm>OD</fnm>
               </au>
            </aug>
            <source>Rev Inst Med Trop Sao Paulo</source>
            <pubdate>1998</pubdate>
            <volume>40</volume>
            <issue>6</issue>
            <fpage>343</fpage>
            <lpage>349</lpage>
            <xrefbib>
               <pubid idtype="pmpid">10436653</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Antibody responses to Asian and American genotypes of dengue 2 virus in immunized mice</p>
            </title>
            <aug>
               <au>
                  <snm>Bernardo</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Yndart</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Vazquez</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Morier</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Guzman</snm>
                  <fnm>MG</fnm>
               </au>
            </aug>
            <source>Clin Diagn Lab Immunol</source>
            <pubdate>2005</pubdate>
            <volume>12</volume>
            <issue>2</issue>
            <fpage>361</fpage>
            <lpage>362</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">549307</pubid>
                  <pubid idtype="pmpid" link="fulltext">15699435</pubid>
                  <pubid idtype="doi">10.1128/CDLI.12.2.361-362.2005</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Identification of amino acids involved in recognition by dengue virus NS3-specific, HLA-DR15-restricted cytotoxic CD4+ T-cell clones</p>
            </title>
            <aug>
               <au>
                  <snm>Zeng</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Kurane</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Okamoto</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Ennis</snm>
                  <fnm>FA</fnm>
               </au>
               <au>
                  <snm>Brinton</snm>
                  <fnm>MA</fnm>
               </au>
            </aug>
            <source>J Virol</source>
            <pubdate>1996</pubdate>
            <volume>70</volume>
            <issue>5</issue>
            <fpage>3108</fpage>
            <lpage>3117</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">190173</pubid>
                  <pubid idtype="pmpid" link="fulltext">8627790</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Definition of an epitope on NS3 recognized by human CD4+ cytotoxic T lymphocyte clones cross-reactive for dengue virus types 2, 3, and 4</p>
            </title>
            <aug>
               <au>
                  <snm>Kurane</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Zeng</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Brinton</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Ennis</snm>
                  <fnm>FA</fnm>
               </au>
            </aug>
            <source>Virology</source>
            <pubdate>1998</pubdate>
            <volume>240</volume>
            <issue>2</issue>
            <fpage>169</fpage>
            <lpage>174</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/viro.1997.8925</pubid>
                  <pubid idtype="pmpid" link="fulltext">9454689</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Strong HLA class I &#8211; restricted T cell responses in dengue hemorrhagic fever: a double-edged sword?</p>
            </title>
            <aug>
               <au>
                  <snm>Loke</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Bethell</snm>
                  <fnm>DB</fnm>
               </au>
               <au>
                  <snm>Phuong</snm>
                  <fnm>CX</fnm>
               </au>
               <au>
                  <snm>Dung</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Schneider</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>NJ</fnm>
               </au>
               <au>
                  <snm>Day</snm>
                  <fnm>NP</fnm>
               </au>
               <au>
                  <snm>Farrar</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Hill</snm>
                  <fnm>AV</fnm>
               </au>
            </aug>
            <source>J Infect Dis</source>
            <pubdate>2001</pubdate>
            <volume>184</volume>
            <issue>11</issue>
            <fpage>1369</fpage>
            <lpage>1373</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1086/324320</pubid>
                  <pubid idtype="pmpid" link="fulltext">11709777</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Early T-cell responses to dengue virus epitopes in Vietnamese adults with secondary dengue virus infections</p>
            </title>
            <aug>
               <au>
                  <snm>Simmons</snm>
                  <fnm>CP</fnm>
               </au>
               <au>
                  <snm>Dong</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Chau</snm>
                  <fnm>NV</fnm>
               </au>
               <au>
                  <snm>Dung</snm>
                  <fnm>NT</fnm>
               </au>
               <au>
                  <snm>Chau</snm>
                  <fnm>TN</fnm>
               </au>
               <au>
                  <snm>Thao le</snm>
                  <fnm>TT</fnm>
               </au>
               <au>
                  <snm>Dung</snm>
                  <fnm>NT</fnm>
               </au>
               <au>
                  <snm>Hien</snm>
                  <fnm>TT</fnm>
               </au>
               <au>
                  <snm>Rowland-Jones</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Farrar</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>J Virol</source>
            <pubdate>2005</pubdate>
            <volume>79</volume>
            <issue>9</issue>
            <fpage>5665</fpage>
            <lpage>5675</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1082776</pubid>
                  <pubid idtype="pmpid" link="fulltext">15827181</pubid>
                  <pubid idtype="doi">10.1128/JVI.79.9.5665-5675.2005</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Antigenic variations in West Nile virus strains isolated in Madagascar since 1978</p>
            </title>
            <aug>
               <au>
                  <snm>Morvan</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Besselaar</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Fontenille</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Coulanges</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Res Virol</source>
            <pubdate>1990</pubdate>
            <volume>141</volume>
            <issue>6</issue>
            <fpage>667</fpage>
            <lpage>676</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0923-2516(90)90039-L</pubid>
                  <pubid idtype="pmpid">1982372</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Mapping the antigenic and genetic evolution of influenza virus</p>
            </title>
            <aug>
               <au>
                  <snm>Smith</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Lapedes</snm>
                  <fnm>AS</fnm>
               </au>
               <au>
                  <snm>de Jong</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Bestebroer</snm>
                  <fnm>TM</fnm>
               </au>
               <au>
                  <snm>Rimmelzwaan</snm>
                  <fnm>GF</fnm>
               </au>
               <au>
                  <snm>Osterhaus</snm>
                  <fnm>AD</fnm>
               </au>
               <au>
                  <snm>Fouchier</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2004</pubdate>
            <volume>305</volume>
            <issue>5682</issue>
            <fpage>371</fpage>
            <lpage>376</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1097211</pubid>
                  <pubid idtype="pmpid" link="fulltext">15218094</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Microevolution and virulence of dengue viruses</p>
            </title>
            <aug>
               <au>
                  <snm>Rico-Hesse</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Adv Virus Res</source>
            <pubdate>2003</pubdate>
            <volume>59</volume>
            <fpage>315</fpage>
            <lpage>341</lpage>
            <xrefbib>
               <pubid idtype="pmpid">14696333</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Database resources of the National Center for Biotechnology Information</p>
            </title>
            <aug>
               <au>
                  <snm>Wheeler</snm>
                  <fnm>DL</fnm>
               </au>
               <au>
                  <snm>Barrett</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Benson</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Bryant</snm>
                  <fnm>SH</fnm>
               </au>
               <au>
                  <snm>Canese</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>DiCuccio</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Edgar</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Federhen</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Helmberg</snm>
                  <fnm>W</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <issue>Database</issue>
            <fpage>D39</fpage>
            <lpage>D45</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">540016</pubid>
                  <pubid idtype="pmpid" link="fulltext">15608222</pubid>
                  <pubid idtype="doi">10.1093/nar/gki062</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Construction of recombinant targeting immunogens incorporating an HIV-1 neutralizing epitope into sites of differing conformational constraint</p>
            </title>
            <aug>
               <au>
                  <snm>Ho</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>MacDonald</snm>
                  <fnm>KS</fnm>
               </au>
               <au>
                  <snm>Barber</snm>
                  <fnm>BH</fnm>
               </au>
            </aug>
            <source>Vaccine</source>
            <pubdate>2002</pubdate>
            <volume>20</volume>
            <issue>7&#8211;8</issue>
            <fpage>1169</fpage>
            <lpage>1180</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0264-410X(01)00441-8</pubid>
                  <pubid idtype="pmpid" link="fulltext">11803079</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>CED: a conformational epitope database</p>
            </title>
            <aug>
               <au>
                  <snm>Huang</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Honda</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>BMC Immunol</source>
            <pubdate>2006</pubdate>
            <volume>7</volume>
            <fpage>7</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1513601</pubid>
                  <pubid idtype="pmpid" link="fulltext">16603068</pubid>
                  <pubid idtype="doi">10.1186/1471-2172-7-7</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Molecular evolution and distribution of dengue viruses type 1 and 2 in nature</p>
            </title>
            <aug>
               <au>
                  <snm>Rico-Hesse</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Virology</source>
            <pubdate>1990</pubdate>
            <volume>174</volume>
            <issue>2</issue>
            <fpage>479</fpage>
            <lpage>493</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0042-6822(90)90102-W</pubid>
                  <pubid idtype="pmpid">2129562</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Phylogenetic relationships and differential selection pressures among genotypes of dengue-2 virus</p>
            </title>
            <aug>
               <au>
                  <snm>Twiddy</snm>
                  <fnm>SS</fnm>
               </au>
               <au>
                  <snm>Farrar</snm>
                  <fnm>JJ</fnm>
               </au>
               <au>
                  <snm>Vinh Chau</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Wills</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Gould</snm>
                  <fnm>EA</fnm>
               </au>
               <au>
                  <snm>Gritsun</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Lloyd</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Holmes</snm>
                  <fnm>EC</fnm>
               </au>
            </aug>
            <source>Virology</source>
            <pubdate>2002</pubdate>
            <volume>298</volume>
            <issue>1</issue>
            <fpage>63</fpage>
            <lpage>72</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/viro.2002.1447</pubid>
                  <pubid idtype="pmpid" link="fulltext">12093174</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>The origin, emergence and evolutionary genetics of dengue virus</p>
            </title>
            <aug>
               <au>
                  <snm>Holmes</snm>
                  <fnm>EC</fnm>
               </au>
               <au>
                  <snm>Twiddy</snm>
                  <fnm>SS</fnm>
               </au>
            </aug>
            <source>Infect Genet Evol</source>
            <pubdate>2003</pubdate>
            <volume>3</volume>
            <issue>1</issue>
            <fpage>19</fpage>
            <lpage>28</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S1567-1348(03)00004-2</pubid>
                  <pubid idtype="pmpid" link="fulltext">12797969</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Inferring the rate and time-scale of dengue virus evolution</p>
            </title>
            <aug>
               <au>
                  <snm>Twiddy</snm>
                  <fnm>SS</fnm>
               </au>
               <au>
                  <snm>Holmes</snm>
                  <fnm>EC</fnm>
               </au>
               <au>
                  <snm>Rambaut</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2003</pubdate>
            <volume>20</volume>
            <issue>1</issue>
            <fpage>122</fpage>
            <lpage>129</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/molbev/msg010</pubid>
                  <pubid idtype="pmpid" link="fulltext">12519914</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Processing of nonstructural proteins NS4A and NS4B of dengue 2 virus in vitro and in vivo</p>
            </title>
            <aug>
               <au>
                  <snm>Preugschat</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Strauss</snm>
                  <fnm>JH</fnm>
               </au>
            </aug>
            <source>Virology</source>
            <pubdate>1991</pubdate>
            <volume>185</volume>
            <issue>2</issue>
            <fpage>689</fpage>
            <lpage>697</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0042-6822(91)90540-R</pubid>
                  <pubid idtype="pmpid" link="fulltext">1683727</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>A structural perspective of the flavivirus life cycle</p>
            </title>
            <aug>
               <au>
                  <snm>Mukhopadhyay</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kuhn</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Rossmann</snm>
                  <fnm>MG</fnm>
               </au>
            </aug>
            <source>Nat Rev Microbiol</source>
            <pubdate>2005</pubdate>
            <volume>3</volume>
            <issue>1</issue>
            <fpage>13</fpage>
            <lpage>22</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nrmicro1067</pubid>
                  <pubid idtype="pmpid" link="fulltext">15608696</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>The future of dengue vaccines</p>
            </title>
            <aug>
               <au>
                  <snm>Halstead</snm>
                  <fnm>SB</fnm>
               </au>
               <au>
                  <snm>Deen</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Lancet</source>
            <pubdate>2002</pubdate>
            <volume>360</volume>
            <issue>9341</issue>
            <fpage>1243</fpage>
            <lpage>1245</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0140-6736(02)11276-1</pubid>
                  <pubid idtype="pmpid" link="fulltext">12401270</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>The changing field of vaccine development in the genomics era</p>
            </title>
            <aug>
               <au>
                  <snm>Brusic</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>August</snm>
                  <fnm>JT</fnm>
               </au>
            </aug>
            <source>Pharmacogenomics</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <issue>6</issue>
            <fpage>597</fpage>
            <lpage>600</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1517/14622416.5.6.597</pubid>
                  <pubid idtype="pmpid" link="fulltext">15335280</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Variation in vaccine response in normal populations</p>
            </title>
            <aug>
               <au>
                  <snm>Ovsyannikova</snm>
                  <fnm>IG</fnm>
               </au>
               <au>
                  <snm>Jacobson</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>Poland</snm>
                  <fnm>GA</fnm>
               </au>
            </aug>
            <source>Pharmacogenomics</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <issue>4</issue>
            <fpage>417</fpage>
            <lpage>427</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1517/14622416.5.4.417</pubid>
                  <pubid idtype="pmpid" link="fulltext">15165177</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Prediction of class I T-cell epitopes: evidence of presence of immunological hot spots inside antigens</p>
            </title>
            <aug>
               <au>
                  <snm>Srinivasan</snm>
                  <fnm>KN</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>GL</fnm>
               </au>
               <au>
                  <snm>Khan</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>August</snm>
                  <fnm>JT</fnm>
               </au>
               <au>
                  <snm>Brusic</snm>
                  <fnm>V</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>20</volume>
            <issue>Suppl 1</issue>
            <fpage>I297</fpage>
            <lpage>I302</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bth943</pubid>
                  <pubid idtype="pmpid" link="fulltext">15262812</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Chemistry of peptides associated with MHC class I and class II molecules</p>
            </title>
            <aug>
               <au>
                  <snm>Rammensee</snm>
                  <fnm>HG</fnm>
               </au>
            </aug>
            <source>Curr Opin Immunol</source>
            <pubdate>1995</pubdate>
            <volume>7</volume>
            <issue>1</issue>
            <fpage>85</fpage>
            <lpage>96</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0952-7915(95)80033-6</pubid>
                  <pubid idtype="pmpid" link="fulltext">7772286</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Immune mediated and inherited defences against flaviviruses</p>
            </title>
            <aug>
               <au>
                  <snm>Brinton</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Kurane</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Mathew</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Zeng</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Shi</snm>
                  <fnm>PY</fnm>
               </au>
               <au>
                  <snm>Rothman</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Ennis</snm>
                  <fnm>FA</fnm>
               </au>
            </aug>
            <source>Clin Diagn Virol</source>
            <pubdate>1998</pubdate>
            <volume>10</volume>
            <issue>2&#8211;3</issue>
            <fpage>129</fpage>
            <lpage>139</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0928-0197(98)00039-7</pubid>
                  <pubid idtype="pmpid">9741638</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>NCBI Entrez protein database</p>
            </title>
            <url>http://www.ncbi.nlm.nih.gov/entrez</url>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Complete nucleotide sequence of dengue type 3 virus genome RNA</p>
            </title>
            <aug>
               <au>
                  <snm>Osatomi</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Sumiyoshi</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Virology</source>
            <pubdate>1990</pubdate>
            <volume>176</volume>
            <issue>2</issue>
            <fpage>643</fpage>
            <lpage>647</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0042-6822(90)90037-R</pubid>
                  <pubid idtype="pmpid">2345967</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>BLAST: at the core of a powerful and diverse set of sequence analysis tools</p>
            </title>
            <aug>
               <au>
                  <snm>McGinnis</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Madden</snm>
                  <fnm>TL</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2004</pubdate>
            <volume>32</volume>
            <issue>Web Server</issue>
            <fpage>W20</fpage>
            <lpage>W25</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">441573</pubid>
                  <pubid idtype="pmpid" link="fulltext">15215342</pubid>
                  <pubid idtype="doi">10.1093/nar/gkh435</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools</p>
            </title>
            <aug>
               <au>
                  <snm>Thompson</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Gibson</snm>
                  <fnm>TJ</fnm>
               </au>
               <au>
                  <snm>Plewniak</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Jeanmougin</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Higgins</snm>
                  <fnm>DG</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1997</pubdate>
            <volume>25</volume>
            <issue>24</issue>
            <fpage>4876</fpage>
            <lpage>4882</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">147148</pubid>
                  <pubid idtype="pmpid" link="fulltext">9396791</pubid>
                  <pubid idtype="doi">10.1093/nar/25.24.4876</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>SCORPION, a molecular database of scorpion toxins</p>
            </title>
            <aug>
               <au>
                  <snm>Srinivasan</snm>
                  <fnm>KN</fnm>
               </au>
               <au>
                  <snm>Gopalakrishnakone</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Tan</snm>
                  <fnm>PT</fnm>
               </au>
               <au>
                  <snm>Chew</snm>
                  <fnm>KC</fnm>
               </au>
               <au>
                  <snm>Cheng</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Kini</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>Koh</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Seah</snm>
                  <fnm>SH</fnm>
               </au>
               <au>
                  <snm>Brusic</snm>
                  <fnm>V</fnm>
               </au>
            </aug>
            <source>Toxicon</source>
            <pubdate>2002</pubdate>
            <volume>40</volume>
            <issue>1</issue>
            <fpage>23</fpage>
            <lpage>31</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0041-0101(01)00182-9</pubid>
                  <pubid idtype="pmpid" link="fulltext">11602275</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice</p>
            </title>
            <aug>
               <au>
                  <snm>Thompson</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Higgins</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Gibson</snm>
                  <fnm>TJ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1994</pubdate>
            <volume>22</volume>
            <issue>22</issue>
            <fpage>4673</fpage>
            <lpage>4680</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">308517</pubid>
                  <pubid idtype="pmpid" link="fulltext">7984417</pubid>
                  <pubid idtype="doi">10.1093/nar/22.22.4673</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>Peptide selection for human immunodeficiency virus type 1 CTL-based vaccine evaluation</p>
            </title>
            <aug>
               <au>
                  <snm>Li</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Malhotra</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Gilbert</snm>
                  <fnm>PB</fnm>
               </au>
               <au>
                  <snm>Hawkins</snm>
                  <fnm>NR</fnm>
               </au>
               <au>
                  <snm>Duerr</snm>
                  <fnm>AC</fnm>
               </au>
               <au>
                  <snm>McElrath</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Corey</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Self</snm>
                  <fnm>SG</fnm>
               </au>
            </aug>
            <source>Vaccine</source>
            <pubdate>2006</pubdate>
            <volume>24</volume>
            <fpage>6893</fpage>
            <lpage>6904</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.vaccine.2006.06.009</pubid>
                  <pubid idtype="pmpid" link="fulltext">16890329</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <title>
               <p>Computational binding assays of antigenic peptides</p>
            </title>
            <aug>
               <au>
                  <snm>Brusic</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Zeleznikow</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Lett Pept Sci</source>
            <pubdate>1999</pubdate>
            <volume>6</volume>
            <fpage>313</fpage>
            <lpage>324</lpage>
         </bibl>
         <bibl id="B42">
            <title>
               <p>HLA Informatics Group</p>
            </title>
            <url>http://www.anthonynolan.org.uk/HIG</url>
         </bibl>
      </refgrp>
   </bm>
</art>
