<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2105-7-362</ui>
   <ji>1471-2105</ji>
   <fm>
      <dochead>Software</dochead>
      <bibl>
         <title>
            <p>A fast SCOP fold classification system using content-based E-Predict algorithm</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Chi</snm>
               <fnm>Pin-Hao</fnm>
               <insr iid="I1"/>
               <email>pinhao@diglib1.cecs.missouri.edu</email>
            </au>
            <au id="A2" ca="yes">
               <snm>Shyu</snm>
               <fnm>Chi-Ren</fnm>
               <insr iid="I1"/>
               <email>shyuc@missouri.edu</email>
            </au>
            <au id="A3">
               <snm>Xu</snm>
               <fnm>Dong</fnm>
               <insr iid="I2"/>
               <email>xudong@missouri.edu</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Medical and Biological Digital Library Research Lab, Department of Computer Science, University of Missouri, Columbia, MO 65211, USA</p>
            </ins>
            <ins id="I2">
               <p>Digital Biology Laboratory, Department of Computer Science and Life Sciences Center, University of Missouri, Columbia, MO 65211, USA</p>
            </ins>
         </insg>
         <source>BMC Bioinformatics</source>
         <issn>1471-2105</issn>
         <pubdate>2006</pubdate>
         <volume>7</volume>
         <issue>1</issue>
         <fpage>362</fpage>
         <url>http://www.biomedcentral.com/1471-2105/7/362</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">16872501</pubid>
               <pubid idtype="doi">10.1186/1471-2105-7-362</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>29</day>
               <month>12</month>
               <year>2005</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>26</day>
               <month>7</month>
               <year>2006</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>26</day>
               <month>7</month>
               <year>2006</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2006</year>
         <collab>Chi et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Domain experts manually construct the Structural Classification of Proteins (SCOP) database to categorize and compare protein structures. Although SCOP is believed to be more reliable than classifications produced by other methods, constructing it is labor intensive. To mimic the human classification process, we develop an automatic SCOP fold classification system that assigns known SCOP folds to newly-discovered proteins and recognizes novel folds.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>With a sufficient amount of ground truth data, our system assigns the known folds for newly-discovered proteins in the latest SCOP <it>v</it>1.69 release with 92.17% accuracy. It also recognizes the novel folds with 89.27% accuracy under 10-fold cross-validation. On average, classifying proteins with 500 and 1409 amino acids takes 4.1 and 17.4 seconds, respectively. Compared with several structural alignment algorithms, our approach achieves both higher classification accuracy and higher efficiency.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>In this paper, we build an advanced, non-parametric classifier to accelerate the manual classification processes of SCOP. With satisfactory ground truth data from the SCOP database, our approach identifies relevant domain knowledge and yields reasonably accurate classifications. Our system is publicly accessible at <url>http://ProteinDBS.rnet.missouri.edu/E-Predict.php</url>.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Protein structure classification is an important research topic in computational and molecular biology. Structural classification lets life science researchers and biologists study evolutionary evidence from similar proteins that have been conserved in multiple species. In addition, similar 3-D conformations of enzyme active sites and binding sites may correlate with biochemical functions <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. In recent years, structural genomics projects <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr></abbrgrp> have aimed to link protein sequences to possible functions via high-throughput techniques, such as X-ray crystallography and nuclear magnetic resonance (NMR), that determine 3-D protein structures. With a large-scale set of newly-discovered structures, a system that classifies similar protein structures with high efficiency and accuracy becomes indispensable to the study of structure-to-function relationships.</p>
         <p>Several classification systems categorize proteins based on structural similarities. The Class, Architecture, Topology, Homologous Superfamily (CATH) database <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> is constructed by applying the Sequential Structure Alignment Program (SSAP) <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>, which uses a double dynamic programming technique to find the optimal structural alignment of two proteins. The Fold Classification based on Structure-Structure Alignment of Proteins (FSSP) database <abbrgrp><abbr bid="B8">8</abbr></abbrgrp> is built on the Distance matrix Alignment (DALI) <abbrgrp><abbr bid="B9">9</abbr></abbrgrp> algorithm, which applies Monte Carlo heuristics to compare structural similarities from 2-D distance matrices mapped from 3-D protein structures. Generally, these systems rely on structural alignment algorithms to measure the similarity of two proteins, a problem known to be NP-hard <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. To reduce the computational effort of scanning large-scale protein databases, these algorithms must apply heuristics, and the resulting trade-offs may yield divergent results for the same query protein. At present, the Structural Classification of Proteins (SCOP) database <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>, which is manually constructed by human experts, is believed to contain the most accurate structural classifications. In the SCOP database, proteins with similar domain structures are usually clustered into the same fold hierarchy. Even though manual classification provides reliable results, it is labor intensive. As of May 30, 2006, 10864 newly-discovered proteins deposited in the Protein Data Bank (PDB) <abbrgrp><abbr bid="B12">12</abbr></abbrgrp> have not been classified in the latest SCOP <it>v</it>1.69 release, and the number of newly-discovered proteins is increasing continuously.</p>
         <p>Recent studies <abbrgrp><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr></abbrgrp> apply a consensus scheme to classify the SCOP folds of newly-discovered proteins by intersecting the classification results of classical structural alignment algorithms such as DALI <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>, Combinatorial Extension (CE) <abbrgrp><abbr bid="B15">15</abbr></abbrgrp> and VAST <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. These consensus approaches yield higher classification accuracies than any individual method. However, combining structural alignment algorithms is computationally expensive. To accelerate the manual classification process of SCOP, there is an urgent need for a fast, automated SCOP fold classification system with reasonably high accuracy. Extending our recent work on the real-time tertiary structure retrieval system ProteinDBS <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr></abbrgrp>, we previously studied an efficient association rule (AR) mining model that identifies relevant structural patterns in proteins for SCOP domain and fold classifications <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. In this paper, we further develop a non-parametric classifier that conducts SCOP fold classification with better accuracy and efficiency. Our contribution is a real-time classification model, <it>E-Predict</it>, that applies the <it>E_Measure</it> metric <abbrgrp><abbr bid="B21">21</abbr></abbrgrp> from the Information Retrieval (<it>IR</it>) field to assign the known SCOP folds and recognize the novel folds for newly-discovered proteins. In the past, a number of systems have been developed to assign a protein structure to an existing fold or recognize it as a novel fold. For example, DALI <abbrgrp><abbr bid="B22">22</abbr></abbrgrp> uses the Z-score of the best structural match to assign a structure either to a known fold (Z &gt; 2) or to a novel fold (Z &#8804; 2). Other programs, such as CE <abbrgrp><abbr bid="B15">15</abbr></abbrgrp> and VAST <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>, can perform similar tasks. However, the computational effort associated with those methods prevents a user from exploring the protein structure database in real time.</p>
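         <p>As a minimal sketch of the Z-score rule described above, the following Python function selects the fold with the highest Z-score and accepts it only when that score exceeds 2. The function name, the dictionary input and the default threshold argument are hypothetical illustrations; they are not part of the DALI software or of our system.</p>

```python
def classify_by_best_z(z_score_by_fold, threshold=2.0):
    """Return the best-matching fold id, or None to flag a novel fold.

    z_score_by_fold: dict mapping a fold id to the Z-score of the best
    structural match found in that fold (assumed input format).
    """
    if not z_score_by_fold:
        # No structural match at all: treat the query as a novel fold.
        return None
    # Fold whose best structural match has the highest Z-score.
    best_fold = max(z_score_by_fold, key=z_score_by_fold.get)
    if z_score_by_fold[best_fold] > threshold:
        return best_fold  # known fold (Z greater than 2)
    return None  # novel fold (Z at most 2)
```

         <p>The cost of such a rule lies in obtaining the Z-scores, since each one requires a structural alignment against a database entry; the threshold test itself is trivial.</p>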
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <p>There are two important tasks for the SCOP fold classifications. 1) <it>Known SCOP Fold Assignments</it>: the algorithm assigns newly-discovered protein structures into the known SCOP folds. 2) <it>Novel SCOP Fold Recognitions</it>: the algorithm detects whether or not newly-discovered protein structures should be categorized into the novel folds. Given two SCOP database releases <it>v</it><sub>1 </sub>and <it>v</it><sub>2 </sub>(<it>v</it><sub>1 </sub>&#8834; <it>v</it><sub>2</sub>), <m:math name="1471-2105-7-362-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:msub><m:mi>v</m:mi><m:mn>1</m:mn></m:msub></m:mrow><m:mrow><m:msub><m:mi>v</m:mi><m:mn>2</m:mn></m:msub></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODay3aaSbaaWqaaiabigdaXaqabaaaleaacqWG2bGDdaWgaaadbaGaeGOmaidabeaaaaaaaa@3370@</m:annotation></m:semantics></m:math> denotes a set of newly-discovered proteins in <it>v</it><sub>2 </sub>that have not been identified in <it>v</it><sub>1</sub>. The proteins from <m:math name="1471-2105-7-362-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:msub><m:mi>v</m:mi><m:mn>1</m:mn></m:msub></m:mrow><m:mrow><m:msub><m:mi>v</m:mi><m:mn>2</m:mn></m:msub></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODay3aaSbaaWqaaiabigdaXaqabaaaleaacqWG2bGDdaWgaaadbaGaeGOmaidabeaaaaaaaa@3370@</m:annotation></m:semantics></m:math> will be partitioned into either the known SCOP folds of <it>v</it><sub>1 </sub>(<m:math name="1471-2105-7-362-i2" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:msub><m:mi>v</m:mi><m:mn>1</m:mn></m:msub></m:mrow><m:mrow><m:msub><m:mi>v</m:mi><m:mn>2</m:mn></m:msub><m:mo>,</m:mo><m:mi>k</m:mi><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>w</m:mi><m:mi>n</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODay3aaSbaaWqaaiabigdaXaqabaaaleaacqWG2bGDdaWgaaadbaGaeGOmaidabeaaliabcYcaSiabdUgaRjabd6gaUjabd+gaVjabdEha3jabd6gaUbaaaaa@3B62@</m:annotation></m:semantics></m:math>), or the novel folds that have not been determined prior to <it>v</it><sub>2 </sub>(<m:math name="1471-2105-7-362-i3" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:msub><m:mi>v</m:mi><m:mn>1</m:mn></m:msub></m:mrow><m:mrow><m:msub><m:mi>v</m:mi><m:mn>2</m:mn></m:msub><m:mo>,</m:mo><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>v</m:mi><m:mi>e</m:mi><m:mi>l</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODay3aaSbaaWqaaiabigdaXaqabaaaleaacqWG2bGDdaWgaaadbaGaeGOmaidabeaaliabcYcaSiabd6gaUjabd+gaVjabdAha2jabdwgaLjabdYgaSbaaaaa@3B50@</m:annotation></m:semantics></m:math>), where <m:math name="1471-2105-7-362-i4" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:msub><m:mi>v</m:mi><m:mn>1</m:mn></m:msub></m:mrow><m:mrow><m:msub><m:mi>v</m:mi><m:mn>2</m:mn></m:msub><m:mo>,</m:mo><m:mi>k</m:mi><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>w</m:mi><m:mi>n</m:mi></m:mrow></m:msubsup><m:mstyle displaystyle="true"><m:mo>&#8746;</m:mo><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:msub><m:mi>v</m:mi><m:mn>1</m:mn></m:msub></m:mrow><m:mrow><m:msub><m:mi>v</m:mi><m:mn>2</m:mn></m:msub><m:mo>,</m:mo><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>v</m:mi><m:mi>e</m:mi><m:mi>l</m:mi></m:mrow></m:msubsup></m:mrow></m:mstyle><m:mo>=</m:mo><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:msub><m:mi>v</m:mi><m:mn>1</m:mn></m:msub></m:mrow><m:mrow><m:msub><m:mi>v</m:mi><m:mn>2</m:mn></m:msub></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODay3aaSbaaWqaaiabigdaXaqabaaaleaacqWG2bGDdaWgaaadbaGaeGOmaidabeaaliabcYcaSiabdUgaRjabd6gaUjabd+gaVjabdEha3jabd6gaUbaakmaataaabaGaeuiLdq0aa0baaSqaaiabdAha2naaBaaameaacqaIXaqmaeqaaaWcbaGaemODay3aaSbaaWqaaiabikdaYaqabaWccqGGSaalcqWGUbGBcqWGVbWBcqWG2bGDcqWGLbqzcqWGSbaBaaaabeqab0GaeSOkIufakiabg2da9iabfs5aenaaDaaaleaacqWG2bGDdaWgaaadbaGaeGymaedabeaaaSqaaiabdAha2naaBaaameaacqaIYaGmaeqaaaaaaaa@533E@</m:annotation></m:semantics></m:math>. In our experiments, we measure the classification accuracy for proteins from <m:math name="1471-2105-7-362-i2" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:msub><m:mi>v</m:mi><m:mn>1</m:mn></m:msub></m:mrow><m:mrow><m:msub><m:mi>v</m:mi><m:mn>2</m:mn></m:msub><m:mo>,</m:mo><m:mi>k</m:mi><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>w</m:mi><m:mi>n</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODay3aaSbaaWqaaiabigdaXaqabaaaleaacqWG2bGDdaWgaaadbaGaeGOmaidabeaaliabcYcaSiabdUgaRjabd6gaUjabd+gaVjabdEha3jabd6gaUbaaaaa@3B62@</m:annotation></m:semantics></m:math>, and then we gauge the accuracy for classifying proteins from <m:math name="1471-2105-7-362-i3" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:msub><m:mi>v</m:mi><m:mn>1</m:mn></m:msub></m:mrow><m:mrow><m:msub><m:mi>v</m:mi><m:mn>2</m:mn></m:msub><m:mo>,</m:mo><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>v</m:mi><m:mi>e</m:mi><m:mi>l</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODay3aaSbaaWqaaiabigdaXaqabaaaleaacqWG2bGDdaWgaaadbaGaeGOmaidabeaaliabcYcaSiabd6gaUjabd+gaVjabdAha2jabdwgaLjabdYgaSbaaaaa@3B50@</m:annotation></m:semantics></m:math>. Finally, we report the efficiency of SCOP fold classifications.</p>
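         <p>The partition defined above can be illustrated with a short Python sketch. It assumes that each SCOP release is available as a dictionary mapping a protein identifier to its fold identifier; this in-memory representation and the function name are hypothetical and serve only to make the set definitions concrete.</p>

```python
def partition_new_proteins(fold_of_v1, fold_of_v2):
    """Split the proteins that are new in v2 into known-fold and novel-fold sets.

    fold_of_v1, fold_of_v2: dicts mapping a protein id to its fold id in
    the older and the newer release (assumed representation).
    """
    # Folds that already exist in the older release v1.
    known_folds = set(fold_of_v1.values())
    # Delta: proteins in v2 that have not been identified in v1.
    delta = {p for p in fold_of_v2 if p not in fold_of_v1}
    # Newly-discovered proteins whose v2 fold already exists in v1.
    known = {p for p in delta if fold_of_v2[p] in known_folds}
    # The remaining new proteins belong to folds first determined in v2.
    novel = delta - known
    return known, novel
```

         <p>By construction the two returned sets are disjoint and their union is the full set of newly-discovered proteins, matching the definition given above.</p>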
         <sec>
            <st>
               <p>Assigning newly-discovered proteins to the known folds</p>
            </st>
            <p>We conduct three experiments on classifying newly-discovered proteins into the known folds. The first experiment compares our classification model, <it>E-Predict</it>, with several methods reported in a recent work <abbrgrp><abbr bid="B13">13</abbr></abbrgrp> such as CE, DALI, VAST and CBOOST. Our test data, shown in Table <tblr tid="T1">1</tblr>, is the same test set used in that work; all-against-all pairwise alignment with the <it>EMBOSS-Align</it> <abbrgrp><abbr bid="B24">24</abbr></abbrgrp> algorithm gives an average sequence identity of 16.88% and an average sequence similarity of 20.76%. As in that work, the ground truth data comprise the proteins of the entire SCOP <it>v</it>1.59 release. To evaluate the accuracy, we use a general metric, the <it>Correct Classification Rate</it> (<it>CCR</it>), which is defined as follows:</p>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>A test set that contains 37 protein chains from <m:math name="1471-2105-7-362-i5" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:mi>v</m:mi><m:mn>1.59</m:mn></m:mrow><m:mrow><m:mi>v</m:mi><m:mn>1.61</m:mn><m:mo>,</m:mo><m:mi>k</m:mi><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>w</m:mi><m:mi>n</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGynauJaeGyoaKdabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGymaeJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@409A@</m:annotation></m:semantics></m:math> [13].</p>
               </caption>
               <tblbdy cols="10">
                  <r>
                     <c ca="center">
                        <p><it>pdb</it>_<it>id</it></p>
                     </c>
                     <c ca="center">
                        <p><it>fold</it>_<it>id</it></p>
                     </c>
                     <c ca="center">
                        <p><it>pdb</it>_<it>id</it></p>
                     </c>
                     <c ca="center">
                        <p><it>fold</it>_<it>id</it></p>
                     </c>
                     <c ca="center">
                        <p><it>pdb</it>_<it>id</it></p>
                     </c>
                     <c ca="center">
                        <p><it>fold</it>_<it>id</it></p>
                     </c>
                     <c ca="center">
                        <p><it>pdb</it>_<it>id</it></p>
                     </c>
                     <c ca="center">
                        <p><it>fold</it>_<it>id</it></p>
                     </c>
                     <c ca="center">
                        <p><it>pdb</it>_<it>id</it></p>
                     </c>
                     <c ca="center">
                        <p><it>fold</it>_<it>id</it></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="10">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>1<it>gyz</it>_<it>A</it></p>
                     </c>
                     <c ca="center">
                        <p>63569</p>
                     </c>
                     <c ca="center">
                        <p>1<it>key</it>_<it>A</it></p>
                     </c>
                     <c ca="center">
                        <p>48370</p>
                     </c>
                     <c ca="center">
                        <p>1<it>key</it>_<it>B</it></p>
                     </c>
                     <c ca="center">
                        <p>48370</p>
                     </c>
                     <c ca="center">
                        <p>1<it>key</it>_<it>C</it></p>
                     </c>
                     <c ca="center">
                        <p>48370</p>
                     </c>
                     <c ca="center">
                        <p>1<it>key</it>_<it>D</it></p>
                     </c>
                     <c ca="center">
                        <p>48370</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>1<it>lkv</it>_<it>X</it></p>
                     </c>
                     <c ca="center">
                        <p>48370</p>
                     </c>
                     <c ca="center">
                        <p>1<it>ldk</it>_<it>A</it></p>
                     </c>
                     <c ca="center">
                        <p>48370</p>
                     </c>
                     <c ca="center">
                        <p>1<it>ifr</it>_<it>A</it></p>
                     </c>
                     <c ca="center">
                        <p>48725</p>
                     </c>
                     <c ca="center">
                        <p>1<it>ivt</it>_<it>A</it></p>
                     </c>
                     <c ca="center">
                        <p>48725</p>
                     </c>
                     <c ca="center">
                        <p>1<it>gyv</it>_<it>A</it></p>
                     </c>
                     <c ca="center">
                        <p>48725</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>1<it>gyu</it>_<it>A</it></p>
                     </c>
                     <c ca="center">
                        <p>48725</p>
                     </c>
                     <c ca="center">
                        <p>1<it>iu1</it>_<it>A</it></p>
                     </c>
                     <c ca="center">
                        <p>48725</p>
                     </c>
                     <c ca="center">
                        <p>1<it>iu1</it>_<it>B</it></p>
                     </c>
                     <c ca="center">
                        <p>48725</p>
                     </c>
                     <c ca="center">
                        <p>1<it>gyw</it>_<it>A</it></p>
                     </c>
                     <c ca="center">
                        <p>48725</p>
                     </c>
                     <c ca="center">
                        <p>1<it>gyw</it>_<it>B</it></p>
                     </c>
                     <c ca="center">
                        <p>48725</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>1<it>l6p</it>_<it>A</it></p>
                     </c>
                     <c ca="center">
                        <p>48725</p>
                     </c>
                     <c ca="center">
                        <p>1<it>lpl</it>_<it>A</it></p>
                     </c>
                     <c ca="center">
                        <p>50036</p>
                     </c>
                     <c ca="center">
                        <p>1<it>k</it>3<it>b</it>_<it>A</it></p>
                     </c>
                     <c ca="center">
                        <p>50875</p>
                     </c>
                     <c ca="center">
                        <p>1<it>gyh</it>_<it>A</it></p>
                     </c>
                     <c ca="center">
                        <p>50933</p>
                     </c>
                     <c ca="center">
                        <p>1<it>gyh</it>_<it>B</it></p>
                     </c>
                     <c ca="center">
                        <p>50933</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>1<it>gyh</it>_<it>C</it></p>
                     </c>
                     <c ca="center">
                        <p>50933</p>
                     </c>
                     <c ca="center">
                        <p>1<it>gyh</it>_<it>D</it></p>
                     </c>
                     <c ca="center">
                        <p>50933</p>
                     </c>
                     <c ca="center">
                        <p>1<it>gyh</it>_<it>E</it></p>
                     </c>
                     <c ca="center">
                        <p>50933</p>
                     </c>
                     <c ca="center">
                        <p>1<it>gyh</it>_<it>F</it></p>
                     </c>
                     <c ca="center">
                        <p>50933</p>
                     </c>
                     <c ca="center">
                        <p>1<it>gyd</it>_<it>B</it></p>
                     </c>
                     <c ca="center">
                        <p>50933</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>1<it>gye</it>_<it>B</it></p>
                     </c>
                     <c ca="center">
                        <p>50933</p>
                     </c>
                     <c ca="center">
                        <p>1<it>jof</it>_<it>A</it></p>
                     </c>
                     <c ca="center">
                        <p>50964</p>
                     </c>
                     <c ca="center">
                        <p>1<it>jof</it>_<it>B</it></p>
                     </c>
                     <c ca="center">
                        <p>50964</p>
                     </c>
                     <c ca="center">
                        <p>1<it>jof</it>_<it>C</it></p>
                     </c>
                     <c ca="center">
                        <p>50964</p>
                     </c>
                     <c ca="center">
                        <p>1<it>jof</it>_<it>D</it></p>
                     </c>
                     <c ca="center">
                        <p>50964</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>1<it>jof</it>_<it>E</it></p>
                     </c>
                     <c ca="center">
                        <p>50964</p>
                     </c>
                     <c ca="center">
                        <p>1<it>jof</it>_<it>F</it></p>
                     </c>
                     <c ca="center">
                        <p>50964</p>
                     </c>
                     <c ca="center">
                        <p>1<it>jof</it>_<it>G</it></p>
                     </c>
                     <c ca="center">
                        <p>50964</p>
                     </c>
                     <c ca="center">
                        <p>1<it>jof</it>_<it>H</it></p>
                     </c>
                     <c ca="center">
                        <p>50964</p>
                     </c>
                     <c ca="center">
                        <p>1<it>l</it>2<it>q</it>_<it>A</it></p>
                     </c>
                     <c ca="center">
                        <p>51350</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>1<it>ln</it>4_<it>A</it></p>
                     </c>
                     <c ca="center">
                        <p>55199</p>
                     </c>
                     <c ca="center">
                        <p>1<it>kuu</it>_<it>A</it></p>
                     </c>
                     <c ca="center">
                        <p>56234</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <p>
               <m:math name="1471-2105-7-362-i6" xmlns:m="http://www.w3.org/1998/Math/MathML">
                  <m:semantics>
                     <m:mrow>
                        <m:mi>CCR</m:mi>
                        <m:mo>=</m:mo>
                        <m:mfrac>
                           <m:mtext>The number of correctly classified proteins</m:mtext>
                           <m:mtext>The total number of test proteins</m:mtext>
                        </m:mfrac>
                        <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                        <m:mrow>
                           <m:mo>(</m:mo>
                           <m:mn>1</m:mn>
                           <m:mo>)</m:mo>
                        </m:mrow>
                     </m:mrow>
                     <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGdbWqcqWGdbWqcqWGsbGucqGH9aqpdaWcaaqaaiabdsfaujabdIgaOjabdwgaLjabbccaGiabd6gaUjabdwha1jabd2gaTjabdkgaIjabdwgaLjabdkhaYjabbccaGiabd+gaVjabdAgaMjabbccaGiabdogaJjabd+gaVjabdkhaYjabdkhaYjabdwgaLjabdogaJjabdsha0jabdYgaSjabdMha5jabbccaGiabdogaJjabdYgaSjabdggaHjabdohaZjabdohaZjabdMgaPjabdAgaMjabdMgaPjabdwgaLjabdsgaKjabbccaGiabdchaWjabdkhaYjabd+gaVjabdsha0jabdwgaLjabdMgaPjabd6gaUjabdohaZbqaaiabdsfaujabdIgaOjabdwgaLjabbccaGiabdsha0jabd+gaVjabdsha0jabdggaHHqaciab=XgaSjabbccaGiab=5gaUjab=vha1jabd2gaTjabdkgaIjabdwgaLjabdkhaYjabbccaGiabd+gaVjabdAgaMjabbccaGiabdsha0jabdwgaLjabdohaZjabdsha0jabbccaGiabdchaWjabdkhaYjabd+gaVjabdsha0jabdwgaLjabdMgaPjabd6gaUjabdohaZbaacaWLjaGaaCzcamaabmGabaGaeGymaedacaGLOaGaayzkaaaaaa@9753@</m:annotation>
                  </m:semantics>
               </m:math>
            </p>
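            <p>As a minimal sketch of Equation (1), the Correct Classification Rate can be computed by comparing predicted fold labels with the known SCOP folds. The function name, the dictionary layout, and the protein and fold identifiers below are hypothetical and are used only for illustration; they are not part of the E-Predict implementation.</p>

```python
from typing import Mapping


def correct_classification_rate(
    predicted: Mapping[str, str], actual: Mapping[str, str]
) -> float:
    """Fraction of test proteins whose predicted fold equals the known fold."""
    if not actual:
        raise ValueError("the test set is empty")
    # A protein with no prediction counts as misclassified.
    correct = sum(1 for pid, fold in actual.items() if predicted.get(pid) == fold)
    return correct / len(actual)


# Hypothetical protein identifiers and fold labels, for illustration only.
actual = {"p1": "a.1", "p2": "b.1", "p3": "c.1", "p4": "c.1"}
predicted = {"p1": "a.1", "p2": "b.2", "p3": "c.1", "p4": "c.1"}

ccr = correct_classification_rate(predicted, actual)
print(f"CCR = {ccr:.2%}")  # prints "CCR = 75.00%"
```

            <p>In this sketch the denominator is the full test set, so a test protein that receives no prediction lowers the rate rather than being ignored.</p>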
            <p>Figure <figr fid="F1">1</figr> shows that <it>E-Predict </it>outperforms DALI, CE, and VAST, with an accuracy of 64.86%. Can <it>et al</it>. <abbrgrp><abbr bid="B13">13</abbr></abbrgrp> proposed CBOOST, a method that uses a decision tree to integrate DALI, CE, and VAST and achieves the same accuracy of 64.86%. However, because CBOOST depends on computationally expensive structural alignment algorithms, it may not be able to efficiently classify the large number of newly discovered proteins generated by ongoing high-throughput structure determination projects.</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>The <it>Correct Classification Rate </it>of assigning the known folds for test proteins in Table 1.</p>
               </caption>
               <text>
                  <p>The <it>Correct Classification Rate </it>of assigning the known folds for test proteins in Table 1.</p>
               </text>
               <graphic file="1471-2105-7-362-1"/>
            </fig>
            <p>The second experiment exhaustively evaluates the accuracy of <it>E-Predict </it>on several general test sets from <m:math name="1471-2105-7-362-i7" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:mi>v</m:mi><m:mn>1.55</m:mn><m:mo>,</m:mo><m:mi>g</m:mi><m:mi>e</m:mi><m:mi>n</m:mi><m:mi>e</m:mi><m:mi>r</m:mi><m:mi>a</m:mi><m:mi>l</m:mi></m:mrow><m:mrow><m:mi>v</m:mi><m:mn>1.57</m:mn><m:mo>,</m:mo><m:mi>k</m:mi><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>w</m:mi><m:mi>n</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGynauJaeGynauJaeiilaWIaem4zaCMaemyzauMaemOBa4MaemyzauMaemOCaiNaemyyaeMaemiBaWgabaGaemODayNaeGymaeJaeiOla4IaeGynauJaeG4naCJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@4AF7@</m:annotation></m:semantics></m:math> to <m:math name="1471-2105-7-362-i8" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:mi>v</m:mi><m:mn>1.67</m:mn><m:mo>,</m:mo><m:mi>g</m:mi><m:mi>e</m:mi><m:mi>n</m:mi><m:mi>e</m:mi><m:mi>r</m:mi><m:mi>a</m:mi><m:mi>l</m:mi></m:mrow><m:mrow><m:mi>v</m:mi><m:mn>1.69</m:mn><m:mo>,</m:mo><m:mi>k</m:mi><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>w</m:mi><m:mi>n</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeG4naCJaeiilaWIaem4zaCMaemyzauMaemOBa4MaemyzauMaemOCaiNaemyyaeMaemiBaWgabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGyoaKJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@4B03@</m:annotation></m:semantics></m:math>. In Table <tblr tid="T2">2</tblr> and Table <tblr tid="T3">3</tblr>, our test proteins in <m:math name="1471-2105-7-362-i2" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:msub><m:mi>v</m:mi><m:mn>1</m:mn></m:msub></m:mrow><m:mrow><m:msub><m:mi>v</m:mi><m:mn>2</m:mn></m:msub><m:mo>,</m:mo><m:mi>k</m:mi><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>w</m:mi><m:mi>n</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODay3aaSbaaWqaaiabigdaXaqabaaaleaacqWG2bGDdaWgaaadbaGaeGOmaidabeaaliabcYcaSiabdUgaRjabd6gaUjabd+gaVjabdEha3jabd6gaUbaaaaa@3B62@</m:annotation></m:semantics></m:math> are selected from the known SCOP folds of <it>v</it><sub>2</sub> that also contain at least one protein chain and at least 10 proteins in <it>v</it><sub>1</sub>, respectively. Figure <figr fid="F2">2(a)</figr> shows that <it>E-Predict </it>achieves classification accuracies of 72% to 82% for the general test sets of seven SCOP releases. Figure <figr fid="F3">3</figr> shows that a large number of SCOP folds are small. When a newly discovered protein belongs to a small fold, only a limited amount of ground truth data is available, and classifiers in machine learning usually require sufficient ground truth data to be accurate. Figure <figr fid="F2">2(b)</figr> demonstrates that <it>E-Predict </it>achieves much higher accuracies, 90% to 96%, for the general test sets of seven SCOP releases with more than 10 ground truth proteins. As newly discovered protein structures are categorized into these small SCOP folds in the future, the accuracy of <it>E-Predict </it>could improve further.</p>
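            <p>The selection rule described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it assumes each SCOP release is available as a mapping from an entry identifier to its fold, and that a test set consists of entries added in the newer release whose fold already has at least a given number of members in the older release. The function name, the <it>min_fold_size </it>parameter, and all identifiers are hypothetical.</p>

```python
from collections import Counter
from typing import Dict, Mapping


def select_known_fold_test_set(
    old_release: Mapping[str, str],
    new_release: Mapping[str, str],
    min_fold_size: int = 1,
) -> Dict[str, str]:
    """Pick entries added in the new release whose fold already has
    at least min_fold_size members in the old release (assumed rule)."""
    # Counter returns 0 for folds that do not occur in the old release.
    fold_sizes = Counter(old_release.values())
    return {
        pid: fold
        for pid, fold in new_release.items()
        if pid not in old_release and fold_sizes[fold] >= min_fold_size
    }


# Hypothetical entries and fold labels, for illustration only.
v1 = {"d1": "a.1", "d2": "a.1", "d3": "b.1"}
v2 = {**v1, "d4": "a.1", "d5": "b.1", "d6": "c.1"}

# Fold c.1 is absent from v1, so d6 is excluded from both sets.
at_least_one = select_known_fold_test_set(v1, v2, min_fold_size=1)
at_least_two = select_known_fold_test_set(v1, v2, min_fold_size=2)
print(sorted(at_least_one))  # prints ['d4', 'd5']
print(sorted(at_least_two))  # prints ['d4']
```

            <p>Raising the threshold shrinks the test set to folds with more ground truth data, which is the setting in which the higher accuracies in Figure <figr fid="F2">2(b)</figr> are reported.</p>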
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>The number of proteins in the test set of novel folds and in the general and non-redundant test sets in <m:math name="1471-2105-7-362-i2" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:msub><m:mi>v</m:mi><m:mn>1</m:mn></m:msub></m:mrow><m:mrow><m:msub><m:mi>v</m:mi><m:mn>2</m:mn></m:msub><m:mo>,</m:mo><m:mi>k</m:mi><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>w</m:mi><m:mi>n</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODay3aaSbaaWqaaiabigdaXaqabaaaleaacqWG2bGDdaWgaaadbaGaeGOmaidabeaaliabcYcaSiabdUgaRjabd6gaUjabd+gaVjabdEha3jabd6gaUbaaaaa@3B62@</m:annotation></m:semantics></m:math>, which are selected from the known SCOP folds of <it>v</it><sub>2</sub> with at least one protein chain in <it>v</it><sub>1</sub>.</p>
               </caption>
               <tblbdy cols="6">
                  <r>
                     <c ca="center">
                        <p>test set</p>
                     </c>
                     <c ca="center">
                        <p>size <it>(#proteins)</it></p>
                     </c>
                     <c ca="center">
                        <p>test set</p>
                     </c>
                     <c ca="center">
                        <p>size <it>(#proteins)</it></p>
                     </c>
                     <c ca="center">
                        <p>test set</p>
                     </c>
                     <c ca="center">
                        <p>size <it>(#proteins)</it></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i7" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.55</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>g</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>r</m:mi>
                                          <m:mi>a</m:mi>
                                          <m:mi>l</m:mi>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.57</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>k</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>w</m:mi>
                                          <m:mi>n</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGynauJaeGynauJaeiilaWIaem4zaCMaemyzauMaemOBa4MaemyzauMaemOCaiNaemyyaeMaemiBaWgabaGaemODayNaeGymaeJaeiOla4IaeGynauJaeG4naCJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@4AF7@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>4192</p>
                     </c>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i9" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.55</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mo>&#8722;</m:mo>
                                          <m:mi>r</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>d</m:mi>
                                          <m:mi>u</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>d</m:mi>
                                          <m:mi>a</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>t</m:mi>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.57</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>k</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>w</m:mi>
                                          <m:mi>n</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGynauJaeGynauJaeiilaWIaemOBa4Maem4Ba8MaemOBa4MaeyOeI0IaemOCaiNaemyzauMaemizaqMaemyDauNaemOBa4MaemizaqMaemyyaeMaemOBa4MaemiDaqhabaGaemODayNaeGymaeJaeiOla4IaeGynauJaeG4naCJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@52F5@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>442</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i10" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.57</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>g</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>r</m:mi>
                                          <m:mi>a</m:mi>
                                          <m:mi>l</m:mi>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.59</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>k</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>w</m:mi>
                                          <m:mi>n</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGynauJaeG4naCJaeiilaWIaem4zaCMaemyzauMaemOBa4MaemyzauMaemOCaiNaemyyaeMaemiBaWgabaGaemODayNaeGymaeJaeiOla4IaeGynauJaeGyoaKJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@4AFF@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>4047</p>
                     </c>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i11" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.57</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mo>&#8722;</m:mo>
                                          <m:mi>r</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>d</m:mi>
                                          <m:mi>u</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>d</m:mi>
                                          <m:mi>a</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>t</m:mi>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.59</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>k</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>w</m:mi>
                                          <m:mi>n</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGynauJaeG4naCJaeiilaWIaemOBa4Maem4Ba8MaemOBa4MaeyOeI0IaemOCaiNaemyzauMaemizaqMaemyDauNaemOBa4MaemizaqMaemyyaeMaemOBa4MaemiDaqhabaGaemODayNaeGymaeJaeiOla4IaeGynauJaeGyoaKJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@52FD@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>431</p>
                     </c>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i12" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.57</m:mn>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.59</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>v</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>l</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGynauJaeG4naCdabaGaemODayNaeGymaeJaeiOla4IaeGynauJaeGyoaKJaeiilaWIaemOBa4Maem4Ba8MaemODayNaemyzauMaemiBaWgaaaaa@4092@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>94</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i13" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.59</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>g</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>r</m:mi>
                                          <m:mi>a</m:mi>
                                          <m:mi>l</m:mi>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.61</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>k</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>w</m:mi>
                                          <m:mi>n</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGynauJaeGyoaKJaeiilaWIaem4zaCMaemyzauMaemOBa4MaemyzauMaemOCaiNaemyyaeMaemiBaWgabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGymaeJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@4AF5@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>4547</p>
                     </c>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i14" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.59</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mo>&#8722;</m:mo>
                                          <m:mi>r</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>d</m:mi>
                                          <m:mi>u</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>d</m:mi>
                                          <m:mi>a</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>t</m:mi>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.61</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>k</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>w</m:mi>
                                          <m:mi>n</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGynauJaeGyoaKJaeiilaWIaemOBa4Maem4Ba8MaemOBa4MaeyOeI0IaemOCaiNaemyzauMaemizaqMaemyDauNaemOBa4MaemizaqMaemyyaeMaemOBa4MaemiDaqhabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGymaeJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@52F3@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>468</p>
                     </c>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i15" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.59</m:mn>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.61</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>v</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>l</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGynauJaeGyoaKdabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGymaeJaeiilaWIaemOBa4Maem4Ba8MaemODayNaemyzauMaemiBaWgaaaaa@4088@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i16" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.61</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>g</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>r</m:mi>
                                          <m:mi>a</m:mi>
                                          <m:mi>l</m:mi>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.63</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>k</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>w</m:mi>
                                          <m:mi>n</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGymaeJaeiilaWIaem4zaCMaemyzauMaemOBa4MaemyzauMaemOCaiNaemyyaeMaemiBaWgabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeG4mamJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@4AEB@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>5226</p>
                     </c>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i17" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.61</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mo>&#8722;</m:mo>
                                          <m:mi>r</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>d</m:mi>
                                          <m:mi>u</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>d</m:mi>
                                          <m:mi>a</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>t</m:mi>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.63</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>k</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>w</m:mi>
                                          <m:mi>n</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGymaeJaeiilaWIaemOBa4Maem4Ba8MaemOBa4MaeyOeI0IaemOCaiNaemyzauMaemizaqMaemyDauNaemOBa4MaemizaqMaemyyaeMaemOBa4MaemiDaqhabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeG4mamJaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@5209@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>491</p>
                     </c>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i18" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.61</m:mn>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.63</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>v</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>l</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGymaedabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeG4mamJaeiilaWIaemOBa4Maem4Ba8MaemODayNaemyzauMaemiBaWgaaaaa@407E@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>190</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i19" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.63</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>g</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>r</m:mi>
                                          <m:mi>a</m:mi>
                                          <m:mi>l</m:mi>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.65</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>k</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>w</m:mi>
                                          <m:mi>n</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeG4mamJaeiilaWIaem4zaCMaemyzauMaemOBa4MaemyzauMaemOCaiNaemyyaeMaemiBaWgabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGynauJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@4AF3@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>5445</p>
                     </c>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i20" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.63</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mo>&#8722;</m:mo>
                                          <m:mi>r</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>d</m:mi>
                                          <m:mi>u</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>d</m:mi>
                                          <m:mi>a</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>t</m:mi>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.65</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>k</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>w</m:mi>
                                          <m:mi>n</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeG4mamJaeiilaWIaemOBa4Maem4Ba8MaemOBa4MaeyOeI0IaemOCaiNaemyzauMaemizaqMaemyDauNaemOBa4MaemizaqMaemyyaeMaemOBa4MaemiDaqhabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGynauJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@52F1@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>494</p>
                     </c>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i21" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.63</m:mn>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.65</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>v</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>l</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeG4mamdabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGynauJaeiilaWIaemOBa4Maem4Ba8MaemODayNaemyzauMaemiBaWgaaaaa@4086@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>48</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i22" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.65</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>g</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>r</m:mi>
                                          <m:mi>a</m:mi>
                                          <m:mi>l</m:mi>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.67</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>k</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>w</m:mi>
                                          <m:mi>n</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGynauJaeiilaWIaem4zaCMaemyzauMaemOBa4MaemyzauMaemOCaiNaemyyaeMaemiBaWgabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeG4naCJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@4AFB@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>10521</p>
                     </c>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i23" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.65</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mo>&#8722;</m:mo>
                                          <m:mi>r</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>d</m:mi>
                                          <m:mi>u</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>d</m:mi>
                                          <m:mi>a</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>t</m:mi>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.67</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>k</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>w</m:mi>
                                          <m:mi>n</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGynauJaeiilaWIaemOBa4Maem4Ba8MaemOBa4MaeyOeI0IaemOCaiNaemyzauMaemizaqMaemyDauNaemOBa4MaemizaqMaemyyaeMaemOBa4MaemiDaqhabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeG4naCJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@52F9@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>736</p>
                     </c>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i24" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.65</m:mn>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.67</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>v</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>l</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGynaudabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeG4naCJaeiilaWIaemOBa4Maem4Ba8MaemODayNaemyzauMaemiBaWgaaaaa@408E@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>215</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i8" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.67</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>g</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>r</m:mi>
                                          <m:mi>a</m:mi>
                                          <m:mi>l</m:mi>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.69</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>k</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>w</m:mi>
                                          <m:mi>n</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeG4naCJaeiilaWIaem4zaCMaemyzauMaemOBa4MaemyzauMaemOCaiNaemyyaeMaemiBaWgabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGyoaKJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@4B03@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>5604</p>
                     </c>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i25" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.67</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mo>&#8722;</m:mo>
                                          <m:mi>r</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>d</m:mi>
                                          <m:mi>u</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>d</m:mi>
                                          <m:mi>a</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>t</m:mi>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.69</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>k</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>w</m:mi>
                                          <m:mi>n</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeG4naCJaeiilaWIaemOBa4Maem4Ba8MaemOBa4MaeyOeI0IaemOCaiNaemyzauMaemizaqMaemyDauNaemOBa4MaemizaqMaemyyaeMaemOBa4MaemiDaqhabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGyoaKJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@5301@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>585</p>
                     </c>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i26" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.67</m:mn>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.69</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>v</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>l</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeG4naCdabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGyoaKJaeiilaWIaemOBa4Maem4Ba8MaemODayNaemyzauMaemiBaWgaaaaa@4096@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>86</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <tbl id="T3">
               <title>
                  <p>Table 3</p>
               </title>
               <caption>
                  <p>The number of proteins in general and non-redundant test sets in <m:math name="1471-2105-7-362-i2" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:msub><m:mi>v</m:mi><m:mn>1</m:mn></m:msub></m:mrow><m:mrow><m:msub><m:mi>v</m:mi><m:mn>2</m:mn></m:msub><m:mo>,</m:mo><m:mi>k</m:mi><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>w</m:mi><m:mi>n</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODay3aaSbaaWqaaiabigdaXaqabaaaleaacqWG2bGDdaWgaaadbaGaeGOmaidabeaaliabcYcaSiabdUgaRjabd6gaUjabd+gaVjabdEha3jabd6gaUbaaaaa@3B62@</m:annotation></m:semantics></m:math>, which are selected from the known SCOP folds of <it>v</it><sub>2</sub> with at least 10 protein chains in <it>v</it><sub>1</sub>.</p>
               </caption>
               <tblbdy cols="4">
                  <r>
                     <c ca="center">
                        <p>test set</p>
                     </c>
                     <c ca="center">
                        <p>size <it>(#proteins)</it></p>
                     </c>
                     <c ca="center">
                        <p>test set</p>
                     </c>
                     <c ca="center">
                        <p>size <it>(#proteins)</it></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i7" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.55</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>g</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>r</m:mi>
                                          <m:mi>a</m:mi>
                                          <m:mi>l</m:mi>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.57</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>k</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>w</m:mi>
                                          <m:mi>n</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGynauJaeGynauJaeiilaWIaem4zaCMaemyzauMaemOBa4MaemyzauMaemOCaiNaemyyaeMaemiBaWgabaGaemODayNaeGymaeJaeiOla4IaeGynauJaeG4naCJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@4AF7@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1832</p>
                     </c>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i9" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.55</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mo>&#8722;</m:mo>
                                          <m:mi>r</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>d</m:mi>
                                          <m:mi>u</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>d</m:mi>
                                          <m:mi>a</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>t</m:mi>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.57</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>k</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>w</m:mi>
                                          <m:mi>n</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGynauJaeGynauJaeiilaWIaemOBa4Maem4Ba8MaemOBa4MaeyOeI0IaemOCaiNaemyzauMaemizaqMaemyDauNaemOBa4MaemizaqMaemyyaeMaemOBa4MaemiDaqhabaGaemODayNaeGymaeJaeiOla4IaeGynauJaeG4naCJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@52F5@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>158</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i10" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.57</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>g</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>r</m:mi>
                                          <m:mi>a</m:mi>
                                          <m:mi>l</m:mi>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.59</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>k</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>w</m:mi>
                                          <m:mi>n</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGynauJaeG4naCJaeiilaWIaem4zaCMaemyzauMaemOBa4MaemyzauMaemOCaiNaemyyaeMaemiBaWgabaGaemODayNaeGymaeJaeiOla4IaeGynauJaeGyoaKJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@4AFF@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1901</p>
                     </c>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i11" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.57</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mo>&#8722;</m:mo>
                                          <m:mi>r</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>d</m:mi>
                                          <m:mi>u</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>d</m:mi>
                                          <m:mi>a</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>t</m:mi>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.59</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>k</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>w</m:mi>
                                          <m:mi>n</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGynauJaeG4naCJaeiilaWIaemOBa4Maem4Ba8MaemOBa4MaeyOeI0IaemOCaiNaemyzauMaemizaqMaemyDauNaemOBa4MaemizaqMaemyyaeMaemOBa4MaemiDaqhabaGaemODayNaeGymaeJaeiOla4IaeGynauJaeGyoaKJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@52FD@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>168</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i13" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.59</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>g</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>r</m:mi>
                                          <m:mi>a</m:mi>
                                          <m:mi>l</m:mi>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.61</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>k</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>w</m:mi>
                                          <m:mi>n</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGynauJaeGyoaKJaeiilaWIaem4zaCMaemyzauMaemOBa4MaemyzauMaemOCaiNaemyyaeMaemiBaWgabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGymaeJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@4AF5@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2136</p>
                     </c>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i14" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.59</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mo>&#8722;</m:mo>
                                          <m:mi>r</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>d</m:mi>
                                          <m:mi>u</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>d</m:mi>
                                          <m:mi>a</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>t</m:mi>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.61</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>k</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>w</m:mi>
                                          <m:mi>n</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGynauJaeGyoaKJaeiilaWIaemOBa4Maem4Ba8MaemOBa4MaeyOeI0IaemOCaiNaemyzauMaemizaqMaemyDauNaemOBa4MaemizaqMaemyyaeMaemOBa4MaemiDaqhabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGymaeJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@52F3@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>166</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i16" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.61</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>g</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>r</m:mi>
                                          <m:mi>a</m:mi>
                                          <m:mi>l</m:mi>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.63</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>k</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>w</m:mi>
                                          <m:mi>n</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGymaeJaeiilaWIaem4zaCMaemyzauMaemOBa4MaemyzauMaemOCaiNaemyyaeMaemiBaWgabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeG4mamJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@4AEB@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1947</p>
                     </c>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i17" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.61</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mo>&#8722;</m:mo>
                                          <m:mi>r</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>d</m:mi>
                                          <m:mi>u</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>d</m:mi>
                                          <m:mi>a</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>t</m:mi>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.63</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>k</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>w</m:mi>
                                          <m:mi>n</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGymaeJaeiilaWIaemOBa4Maem4Ba8MaemOBa4MaeyOeI0IaemOCaiNaemyzauMaemizaqMaemyDauNaemOBa4MaemizaqMaemyyaeMaemOBa4MaemiDaqhabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeG4mamJaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@5209@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>189</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i19" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.63</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>g</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>r</m:mi>
                                          <m:mi>a</m:mi>
                                          <m:mi>l</m:mi>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.65</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>k</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>w</m:mi>
                                          <m:mi>n</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeG4mamJaeiilaWIaem4zaCMaemyzauMaemOBa4MaemyzauMaemOCaiNaemyyaeMaemiBaWgabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGynauJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@4AF3@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2062</p>
                     </c>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i20" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.63</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mo>&#8722;</m:mo>
                                          <m:mi>r</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>d</m:mi>
                                          <m:mi>u</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>d</m:mi>
                                          <m:mi>a</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>t</m:mi>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.65</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>k</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>w</m:mi>
                                          <m:mi>n</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeG4mamJaeiilaWIaemOBa4Maem4Ba8MaemOBa4MaeyOeI0IaemOCaiNaemyzauMaemizaqMaemyDauNaemOBa4MaemizaqMaemyyaeMaemOBa4MaemiDaqhabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGynauJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@52F1@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>198</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i22" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.65</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>g</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>r</m:mi>
                                          <m:mi>a</m:mi>
                                          <m:mi>l</m:mi>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.67</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>k</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>w</m:mi>
                                          <m:mi>n</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGynauJaeiilaWIaem4zaCMaemyzauMaemOBa4MaemyzauMaemOCaiNaemyyaeMaemiBaWgabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeG4naCJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@4AFB@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>4735</p>
                     </c>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i23" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.65</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mo>&#8722;</m:mo>
                                          <m:mi>r</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>d</m:mi>
                                          <m:mi>u</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>d</m:mi>
                                          <m:mi>a</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>t</m:mi>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.67</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>k</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>w</m:mi>
                                          <m:mi>n</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGynauJaeiilaWIaemOBa4Maem4Ba8MaemOBa4MaeyOeI0IaemOCaiNaemyzauMaemizaqMaemyDauNaemOBa4MaemizaqMaemyyaeMaemOBa4MaemiDaqhabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeG4naCJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@52F9@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>302</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i8" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.67</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>g</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>r</m:mi>
                                          <m:mi>a</m:mi>
                                          <m:mi>l</m:mi>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.69</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>k</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>w</m:mi>
                                          <m:mi>n</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeG4naCJaeiilaWIaem4zaCMaemyzauMaemOBa4MaemyzauMaemOCaiNaemyyaeMaemiBaWgabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGyoaKJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@4B03@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2298</p>
                     </c>
                     <c ca="center">
                        <p>
                           <m:math name="1471-2105-7-362-i25" xmlns:m="http://www.w3.org/1998/Math/MathML">
                              <m:semantics>
                                 <m:mrow>
                                    <m:msubsup>
                                       <m:mi>&#916;</m:mi>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.67</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mo>&#8722;</m:mo>
                                          <m:mi>r</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>d</m:mi>
                                          <m:mi>u</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>d</m:mi>
                                          <m:mi>a</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>t</m:mi>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>v</m:mi>
                                          <m:mn>1.69</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>k</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>w</m:mi>
                                          <m:mi>n</m:mi>
                                       </m:mrow>
                                    </m:msubsup>
                                 </m:mrow>
                                 <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeG4naCJaeiilaWIaemOBa4Maem4Ba8MaemOBa4MaeyOeI0IaemOCaiNaemyzauMaemizaqMaemyDauNaemOBa4MaemizaqMaemyyaeMaemOBa4MaemiDaqhabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGyoaKJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@5301@</m:annotation>
                              </m:semantics>
                           </m:math>
                        </p>
                     </c>
                     <c ca="center">
                        <p>263</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>The <it>Correct Classification Rate </it>of assigning the known folds for various SCOP releases using <it>E-Predict </it>on (a) general and non-redundant test set in <m:math name="1471-2105-7-362-i2" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:msub><m:mi>v</m:mi><m:mn>1</m:mn></m:msub></m:mrow><m:mrow><m:msub><m:mi>v</m:mi><m:mn>2</m:mn></m:msub><m:mo>,</m:mo><m:mi>k</m:mi><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>w</m:mi><m:mi>n</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODay3aaSbaaWqaaiabigdaXaqabaaaleaacqWG2bGDdaWgaaadbaGaeGOmaidabeaaliabcYcaSiabdUgaRjabd6gaUjabd+gaVjabdEha3jabd6gaUbaaaaa@3B62@</m:annotation></m:semantics></m:math> which are selected from the known SCOP folds of <it>v</it><sub>2 </sub>with at least one protein chain in <it>v</it><sub>1 </sub>(Table 2) (b) general and non-redundant test set in <m:math name="1471-2105-7-362-i2" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:msub><m:mi>v</m:mi><m:mn>1</m:mn></m:msub></m:mrow><m:mrow><m:msub><m:mi>v</m:mi><m:mn>2</m:mn></m:msub><m:mo>,</m:mo><m:mi>k</m:mi><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>w</m:mi><m:mi>n</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODay3aaSbaaWqaaiabigdaXaqabaaaleaacqWG2bGDdaWgaaadbaGaeGOmaidabeaaliabcYcaSiabdUgaRjabd6gaUjabd+gaVjabdEha3jabd6gaUbaaaaa@3B62@</m:annotation></m:semantics></m:math> which are selected from the known SCOP folds of <it>v</it><sub>2 </sub>with at least 10 protein chains in <it>v</it><sub>1 </sub>(Table 3)</p>
               </caption>
               <text>
                  <p>The <it>Correct Classification Rate </it>of assigning the known folds for various SCOP releases using <it>E-Predict </it>on (a) general and non-redundant test set in <m:math name="1471-2105-7-362-i2" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:msub><m:mi>v</m:mi><m:mn>1</m:mn></m:msub></m:mrow><m:mrow><m:msub><m:mi>v</m:mi><m:mn>2</m:mn></m:msub><m:mo>,</m:mo><m:mi>k</m:mi><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>w</m:mi><m:mi>n</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODay3aaSbaaWqaaiabigdaXaqabaaaleaacqWG2bGDdaWgaaadbaGaeGOmaidabeaaliabcYcaSiabdUgaRjabd6gaUjabd+gaVjabdEha3jabd6gaUbaaaaa@3B62@</m:annotation></m:semantics></m:math> which are selected from the known SCOP folds of <it>v</it><sub>2 </sub>with at least one protein chain in <it>v</it><sub>1 </sub>(Table 2) (b) general and non-redundant test set in <m:math name="1471-2105-7-362-i2" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:msub><m:mi>v</m:mi><m:mn>1</m:mn></m:msub></m:mrow><m:mrow><m:msub><m:mi>v</m:mi><m:mn>2</m:mn></m:msub><m:mo>,</m:mo><m:mi>k</m:mi><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>w</m:mi><m:mi>n</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODay3aaSbaaWqaaiabigdaXaqabaaaleaacqWG2bGDdaWgaaadbaGaeGOmaidabeaaliabcYcaSiabdUgaRjabd6gaUjabd+gaVjabdEha3jabd6gaUbaaaaa@3B62@</m:annotation></m:semantics></m:math> which are selected from the known SCOP folds of <it>v</it><sub>2 </sub>with at least 10 protein chains in <it>v</it><sub>1 </sub>(Table 3).</p>
               </text>
               <graphic file="1471-2105-7-362-2"/>
            </fig>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>The number of proteins per fold plotted against the number of SCOP folds in the SCOP <it>v</it>1.69 release</p>
               </caption>
               <text>
                  <p>The number of proteins per fold plotted against the number of SCOP folds in the SCOP <it>v</it>1.69 release.</p>
               </text>
               <graphic file="1471-2105-7-362-3"/>
            </fig>
            <p>The third experiment evaluates the accuracy of <it>E-Predict </it>on <it>non-redundant </it>test sets, which are obtained by randomly sampling one protein chain from each SCOP superfamily. In Table <tblr tid="T2">2</tblr> and Table <tblr tid="T3">3</tblr>, a <it>non-redundant </it>test set <m:math name="1471-2105-7-362-i27" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:msub><m:mi>v</m:mi><m:mn>1</m:mn></m:msub><m:mo>,</m:mo><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>n</m:mi><m:mo>&#8722;</m:mo><m:mi>r</m:mi><m:mi>e</m:mi><m:mi>d</m:mi><m:mi>u</m:mi><m:mi>n</m:mi><m:mi>d</m:mi><m:mi>a</m:mi><m:mi>n</m:mi><m:mi>t</m:mi></m:mrow><m:mrow><m:msub><m:mi>v</m:mi><m:mn>2</m:mn></m:msub><m:mo>,</m:mo><m:mi>k</m:mi><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>w</m:mi><m:mi>n</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODay3aaSbaaWqaaiabigdaXaqabaWccqGGSaalcqWGUbGBcqWGVbWBcqWGUbGBcqGHsislcqWGYbGCcqWGLbqzcqWGKbazcqWG1bqDcqWGUbGBcqWGKbazcqWGHbqycqWGUbGBcqWG0baDaeaacqWG2bGDdaWgaaadbaGaeGOmaidabeaaliabcYcaSiabdUgaRjabd6gaUjabd+gaVjabdEha3jabd6gaUbaaaaa@4DBB@</m:annotation></m:semantics></m:math> is defined by randomly selecting one protein from each SCOP superfamily of the general test set <m:math name="1471-2105-7-362-i28" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:msub><m:mi>v</m:mi><m:mn>1</m:mn></m:msub><m:mo>,</m:mo><m:mi>g</m:mi><m:mi>e</m:mi><m:mi>n</m:mi><m:mi>e</m:mi><m:mi>r</m:mi><m:mi>a</m:mi><m:mi>l</m:mi></m:mrow><m:mrow><m:msub><m:mi>v</m:mi><m:mn>2</m:mn></m:msub><m:mo>,</m:mo><m:mi>k</m:mi><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>w</m:mi><m:mi>n</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODay3aaSbaaWqaaiabigdaXaqabaWccqGGSaalcqWGNbWzcqWGLbqzcqWGUbGBcqWGLbqzcqWGYbGCcqWGHbqycqWGSbaBaeaacqWG2bGDdaWgaaadbaGaeGOmaidabeaaliabcYcaSiabdUgaRjabd6gaUjabd+gaVjabdEha3jabd6gaUbaaaaa@45BD@</m:annotation></m:semantics></m:math>. According to SCOP <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>, proteins from different SCOP superfamilies have low sequence similarity, which suggests that the test proteins in our <it>non-redundant </it>sets should also have low sequence similarity to one another. Table <tblr tid="T4">4</tblr> reports the degree of sequence redundancy for 10 pairs of proteins randomly sampled from the <it>non-redundant </it>set <m:math name="1471-2105-7-362-i25" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:mi>v</m:mi><m:mn>1.67</m:mn><m:mo>,</m:mo><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>n</m:mi><m:mo>&#8722;</m:mo><m:mi>r</m:mi><m:mi>e</m:mi><m:mi>d</m:mi><m:mi>u</m:mi><m:mi>n</m:mi><m:mi>d</m:mi><m:mi>a</m:mi><m:mi>n</m:mi><m:mi>t</m:mi></m:mrow><m:mrow><m:mi>v</m:mi><m:mn>1.69</m:mn><m:mo>,</m:mo><m:mi>k</m:mi><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>w</m:mi><m:mi>n</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeG4naCJaeiilaWIaemOBa4Maem4Ba8MaemOBa4MaeyOeI0IaemOCaiNaemyzauMaemizaqMaemyDauNaemOBa4MaemizaqMaemyyaeMaemOBa4MaemiDaqhabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGyoaKJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@5301@</m:annotation></m:semantics></m:math>; the average sequence identity and sequence similarity are 12.55% and 21.17%, respectively. In addition, the <it>non-redundant </it>test sets prevent a few folds that contribute disproportionately many test proteins from dominating the classification accuracy, as can happen with the general test sets. For example, suppose that 900 of 1000 test proteins in a general test set belong to the same SCOP fold <it>f</it><sub>1</sub>. If most of these 900 proteins are correctly classified, this single fold largely determines the overall accuracy. In Figure <figr fid="F2">2(a)</figr>, <it>E-Predict </it>shows lower accuracy on several sets of <it>non-redundant </it>proteins than on the corresponding general test sets in Table <tblr tid="T2">2</tblr>, which include small folds. This gap indicates that SCOP folds with disproportionately many proteins raise the overall accuracy on the general test sets. Figure <figr fid="F2">2(b)</figr> shows that <it>E-Predict </it>achieves similar accuracy on seven sets of <it>non-redundant </it>proteins and on the corresponding general test sets in Table <tblr tid="T3">3</tblr>, whose folds have at least 10 ground truth proteins. This suggests that, given sufficient ground truth data, <it>non-redundant </it>proteins can still be classified with reasonably high accuracy.</p>
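            <p>The sampling step and the dominant-fold effect described above can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the (chain, superfamily) input layout, the fixed random seed, and the per-fold success counts (855 of 900 and 50 of 100) are assumptions introduced here; only the 900-of-1000 split comes from the text.</p>

```python
import random
from collections import defaultdict


def sample_non_redundant(chains, seed=0):
    """Pick one chain at random from each superfamily.

    `chains` is a list of (chain_id, superfamily_id) pairs. Both this
    layout and the fixed seed are assumptions made for the sketch.
    """
    by_superfamily = defaultdict(list)
    for chain_id, superfamily_id in chains:
        by_superfamily[superfamily_id].append(chain_id)
    rng = random.Random(seed)
    # Sort the keys so the selection is reproducible for a given seed.
    return {sf: rng.choice(members)
            for sf, members in sorted(by_superfamily.items())}


def correct_classification_rate(results):
    """Fraction of correct predictions; `results` is a list of booleans."""
    return sum(results) / len(results)


# Hypothetical outcome counts: 900 test proteins from fold f1 with 855
# correct, plus 100 proteins from all other folds with 50 correct.
fold_f1 = [True] * 855 + [False] * 45
other_folds = [True] * 50 + [False] * 50

overall = correct_classification_rate(fold_f1 + other_folds)
others_only = correct_classification_rate(other_folds)
print(overall)      # 0.905, driven mostly by fold f1
print(others_only)  # 0.5
```

            <p>Under these assumed counts the overall rate is close to the rate on fold <it>f</it><sub>1</sub> alone, which is the effect that sampling one chain per superfamily is meant to remove.</p>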
            <tbl id="T4">
               <title>
                  <p>Table 4</p>
               </title>
               <caption>
                  <p>The sequence redundancy in a set that contains 10 pairs of proteins, which are randomly sampled from <m:math name="1471-2105-7-362-i25" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:mi>v</m:mi><m:mn>1.67</m:mn><m:mo>,</m:mo><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>n</m:mi><m:mo>&#8722;</m:mo><m:mi>r</m:mi><m:mi>e</m:mi><m:mi>d</m:mi><m:mi>u</m:mi><m:mi>n</m:mi><m:mi>d</m:mi><m:mi>a</m:mi><m:mi>n</m:mi><m:mi>t</m:mi></m:mrow><m:mrow><m:mi>v</m:mi><m:mn>1.69</m:mn><m:mo>,</m:mo><m:mi>k</m:mi><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>w</m:mi><m:mi>n</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeG4naCJaeiilaWIaemOBa4Maem4Ba8MaemOBa4MaeyOeI0IaemOCaiNaemyzauMaemizaqMaemyDauNaemOBa4MaemizaqMaemyyaeMaemOBa4MaemiDaqhabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGyoaKJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@5301@</m:annotation></m:semantics></m:math></p>
               </caption>
               <tblbdy cols="7">
                  <r>
                     <c ca="center">
                        <p>
                           <it>pairs</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p><it>pdb</it>_<it>id</it><sub>1</sub></p>
                     </c>
                     <c ca="center">
                        <p><it>superfamily</it>_<it>id</it><sub>1</sub></p>
                     </c>
                     <c ca="center">
                        <p><it>pdb</it>_<it>id</it><sub>2</sub></p>
                     </c>
                     <c ca="center">
                        <p><it>superfamily</it>_<it>id</it><sub>2</sub></p>
                     </c>
                     <c ca="center">
                        <p>sequence identity</p>
                     </c>
                     <c ca="center">
                        <p>sequence similarity</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>01</p>
                     </c>
                     <c ca="center">
                        <p>1<it>osd</it>_<it>A</it></p>
                     </c>
                     <c ca="center">
                        <p>55008</p>
                     </c>
                     <c ca="center">
                        <p>1<it>uta</it>_<it>A</it></p>
                     </c>
                     <c ca="center">
                        <p>110997</p>
                     </c>
                     <c ca="center">
                        <p>2.10%</p>
                     </c>
                     <c ca="center">
                        <p>3.50%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>02</p>
                     </c>
                     <c ca="center">
                        <p>1<it>ug</it>8_<it>A</it></p>
                     </c>
                     <c ca="center">
                        <p>82708</p>
                     </c>
                     <c ca="center">
                        <p>1<it>vm</it>0_<it>A</it></p>
                     </c>
                     <c ca="center">
                        <p>82704</p>
                     </c>
                     <c ca="center">
                        <p>12.80%</p>
                     </c>
                     <c ca="center">
                        <p>26.80%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>03</p>
                     </c>
                     <c ca="center">
                        <p>1<it>v</it>5<it>n</it>_<it>A</it></p>
                     </c>
                     <c ca="center">
                        <p>57889</p>
                     </c>
                     <c ca="center">
                        <p>1<it>rq</it>8_<it>A</it></p>
                     </c>
                     <c ca="center">
                        <p>75471</p>
                     </c>
                     <c ca="center">
                        <p>13.60%</p>
                     </c>
                     <c ca="center">
                        <p>23.50%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>04</p>
                     </c>
                     <c ca="center">
                        <p>1<it>veu</it>_<it>B</it></p>
                     </c>
                     <c ca="center">
                        <p>103196</p>
                     </c>
                     <c ca="center">
                        <p>1<it>j</it>3<it>m</it>_<it>A</it></p>
                     </c>
                     <c ca="center">
                        <p>103247</p>
                     </c>
                     <c ca="center">
                        <p>22.40%</p>
                     </c>
                     <c ca="center">
                        <p>34.20%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>05</p>
                     </c>
                     <c ca="center">
                        <p>1<it>tu</it>1_<it>B</it></p>
                     </c>
                     <c ca="center">
                        <p>55724</p>
                     </c>
                     <c ca="center">
                        <p>1<it>smb</it>_<it>A</it></p>
                     </c>
                     <c ca="center">
                        <p>55797</p>
                     </c>
                     <c ca="center">
                        <p>6.80%</p>
                     </c>
                     <c ca="center">
                        <p>10.80%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>06</p>
                     </c>
                     <c ca="center">
                        <p>1<it>thq</it>_<it>A</it></p>
                     </c>
                     <c ca="center">
                        <p>56925</p>
                     </c>
                     <c ca="center">
                        <p>1<it>xfs</it>_<it>B</it></p>
                     </c>
                     <c ca="center">
                        <p>55961</p>
                     </c>
                     <c ca="center">
                        <p>18.10%</p>
                     </c>
                     <c ca="center">
                        <p>28.40%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>07</p>
                     </c>
                     <c ca="center">
                        <p>1<it>vki</it>_<it>B</it></p>
                     </c>
                     <c ca="center">
                        <p>55826</p>
                     </c>
                     <c ca="center">
                        <p>1<it>sk</it>3_<it>A</it></p>
                     </c>
                     <c ca="center">
                        <p>55846</p>
                     </c>
                     <c ca="center">
                        <p>17.70%</p>
                     </c>
                     <c ca="center">
                        <p>30.50%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>08</p>
                     </c>
                     <c ca="center">
                        <p>1<it>tf</it>1_<it>D</it></p>
                     </c>
                     <c ca="center">
                        <p>55781</p>
                     </c>
                     <c ca="center">
                        <p>1<it>pp</it>6_<it>E</it></p>
                     </c>
                     <c ca="center">
                        <p>55676</p>
                     </c>
                     <c ca="center">
                        <p>10.30%</p>
                     </c>
                     <c ca="center">
                        <p>17.50%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>09</p>
                     </c>
                     <c ca="center">
                        <p>1<it>ucd</it>_<it>A</it></p>
                     </c>
                     <c ca="center">
                        <p>55895</p>
                     </c>
                     <c ca="center">
                        <p>1<it>vkw</it>_<it>A</it></p>
                     </c>
                     <c ca="center">
                        <p>55469</p>
                     </c>
                     <c ca="center">
                        <p>9.00%</p>
                     </c>
                     <c ca="center">
                        <p>14.70%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>1<it>tt</it>4_<it>A</it></p>
                     </c>
                     <c ca="center">
                        <p>55931</p>
                     </c>
                     <c ca="center">
                        <p>1<it>vkp</it>_<it>A</it></p>
                     </c>
                     <c ca="center">
                        <p>55909</p>
                     </c>
                     <c ca="center">
                        <p>12.70%</p>
                     </c>
                     <c ca="center">
                        <p>21.80%</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>Avg. 12.55%</p>
                     </c>
                     <c ca="center">
                        <p>Avg. 21.17%</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
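            <p>As a quick consistency check, the averages reported in the last row of Table 4 can be recomputed from the ten listed pairs. The snippet below is a minimal sketch; the variable and function names are chosen here for illustration, and the values are transcribed from the table.</p>

```python
# Values transcribed from Table 4 (percent), pairs 01 to 10.
identity = [2.10, 12.80, 13.60, 22.40, 6.80, 18.10, 17.70, 10.30, 9.00, 12.70]
similarity = [3.50, 26.80, 23.50, 34.20, 10.80, 28.40, 30.50, 17.50, 14.70, 21.80]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


avg_identity = mean(identity)
avg_similarity = mean(similarity)

# Print both averages rounded to two decimals, as in the table.
print(f"identity: {avg_identity:.2f}%  similarity: {avg_similarity:.2f}%")
```

            <p>The column sums are 125.5 and 211.7, so the recomputed means agree with the reported 12.55% and 21.17%.</p>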
         </sec>
         <sec>
            <st>
               <p>Recognizing novel folds for newly discovered proteins</p>
            </st>
            <p>We measure the accuracies of classifying six sets of proteins with the novel folds from <m:math name="1471-2105-7-362-i29" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:mi>v</m:mi><m:mn>1.57</m:mn></m:mrow><m:mrow><m:mi>v</m:mi><m:mn>1.59</m:mn><m:mo>,</m:mo><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>v</m:mi><m:mi>e</m:mi><m:mi>l</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGynauJaeG4naCdabaGaemODayNaeGymaeJaeiOla4IaeGynauJaeGyoaKJaeiilaWIaemOBa4Maem4Ba8MaemODayNaemyzauMaemiBaWgaaaaa@4092@</m:annotation></m:semantics></m:math> to <m:math name="1471-2105-7-362-i30" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:mi>v</m:mi><m:mn>1.67</m:mn></m:mrow><m:mrow><m:mi>v</m:mi><m:mn>1.69</m:mn><m:mo>,</m:mo><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>v</m:mi><m:mi>e</m:mi><m:mi>l</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeG4naCdabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGyoaKJaeiilaWIaemOBa4Maem4Ba8MaemODayNaemyzauMaemiBaWgaaaaa@4096@</m:annotation></m:semantics></m:math>, which are listed in Table <tblr tid="T2">2</tblr>. We accumulate labeled proteins from the prior SCOP releases to obtain more ground truth data. For example, when an experiment is conducted with test proteins from <m:math name="1471-2105-7-362-i30" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:mi>v</m:mi><m:mn>1.67</m:mn></m:mrow><m:mrow><m:mi>v</m:mi><m:mn>1.69</m:mn><m:mo>,</m:mo><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>v</m:mi><m:mi>e</m:mi><m:mi>l</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeG4naCdabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGyoaKJaeiilaWIaemOBa4Maem4Ba8MaemODayNaemyzauMaemiBaWgaaaaa@4096@</m:annotation></m:semantics></m:math>, our ground truth data is composed of new proteins from <m:math name="1471-2105-7-362-i31" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:mi>v</m:mi><m:mn>1.55</m:mn></m:mrow><m:mrow><m:mi>v</m:mi><m:mn>1.67</m:mn></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGynauJaeGynaudabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeG4naCdaaaaa@38B7@</m:annotation></m:semantics></m:math>. We compare our <it>E-Predict </it>algorithm with two prevalent classification methods, <it>Nearest Neighbor </it>search (NN) <abbrgrp><abbr bid="B25">25</abbr></abbrgrp> and <it>C4.5 Decision Tree </it>(<it>DT</it>) <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>. Figure <figr fid="F4">4</figr> presents a plot of <it>CCR </it>against six test sets <m:math name="1471-2105-7-362-i29" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:mi>v</m:mi><m:mn>1.57</m:mn></m:mrow><m:mrow><m:mi>v</m:mi><m:mn>1.59</m:mn><m:mo>,</m:mo><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>v</m:mi><m:mi>e</m:mi><m:mi>l</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGynauJaeG4naCdabaGaemODayNaeGymaeJaeiOla4IaeGynauJaeGyoaKJaeiilaWIaemOBa4Maem4Ba8MaemODayNaemyzauMaemiBaWgaaaaa@4092@</m:annotation></m:semantics></m:math> to <m:math name="1471-2105-7-362-i30" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:mi>v</m:mi><m:mn>1.67</m:mn></m:mrow><m:mrow><m:mi>v</m:mi><m:mn>1.69</m:mn><m:mo>,</m:mo><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>v</m:mi><m:mi>e</m:mi><m:mi>l</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeG4naCdabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGyoaKJaeiilaWIaemOBa4Maem4Ba8MaemODayNaemyzauMaemiBaWgaaaaa@4096@</m:annotation></m:semantics></m:math>, which are listed in Table <tblr tid="T2">2</tblr>. From computational results, <it>E-Predict </it>outperforms NN and <it>C4.5 DT</it>. There is a noticeable reduction in accuracy when classifying proteins in <m:math name="1471-2105-7-362-i32" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:mi>v</m:mi><m:mn>1.65</m:mn></m:mrow><m:mrow><m:mi>v</m:mi><m:mn>1.67</m:mn><m:mo>,</m:mo><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>v</m:mi><m:mi>e</m:mi><m:mi>l</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGynaudabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeG4naCJaeiilaWIaemOBa4Maem4Ba8MaemODayNaemyzauMaemiBaWgaaaaa@408E@</m:annotation></m:semantics></m:math>. This is probably because the test set, <m:math name="1471-2105-7-362-i32" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:mi>v</m:mi><m:mn>1.65</m:mn></m:mrow><m:mrow><m:mi>v</m:mi><m:mn>1.67</m:mn><m:mo>,</m:mo><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>v</m:mi><m:mi>e</m:mi><m:mi>l</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGynaudabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeG4naCJaeiilaWIaemOBa4Maem4Ba8MaemODayNaemyzauMaemiBaWgaaaaa@408E@</m:annotation></m:semantics></m:math>, is harder to predict correctly than the other sets. To address the concern that accuracies may be biased by particular new structures, we conduct 10-fold cross-validation that sequentially selects 10% of the ground truth data from <m:math name="1471-2105-7-362-i33" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:mi>v</m:mi><m:mn>1.55</m:mn></m:mrow><m:mrow><m:mi>v</m:mi><m:mn>1.69</m:mn></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGynauJaeGynaudabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGyoaKdaaaaa@38BB@</m:annotation></m:semantics></m:math> as a test set and uses the remaining 90% as a training set, repeating the process 10 times. In the 10-fold experiment, our approach achieves 89.27% accuracy in novel fold recognition.</p>
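            <p>The sequential 10-fold split described above can be sketched as follows. This is only an illustration: the function name <it>sequential_folds </it>is hypothetical, and the sketch assumes the ground truth data is taken in plain index order, since the ordering used before splitting is not stated here.</p>

```python
def sequential_folds(n_items, n_folds=10):
    """Yield (test_indices, train_indices) pairs, one per fold.

    Each fold takes the next contiguous slice (about 1/n_folds of the
    data) as the test set and everything else as the training set.
    """
    for fold in range(n_folds):
        start = fold * n_items // n_folds
        stop = (fold + 1) * n_items // n_folds
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_items))
        yield test, train


if __name__ == "__main__":
    # 50 hypothetical labeled proteins, identified only by index.
    for test, train in sequential_folds(50):
        print(len(test), len(train))
```

            <p>Every item appears in exactly one test slice, so the accuracy averaged over the ten rounds is not tied to any single choice of new structures.</p>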
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>The <it>Correct Classification Rates </it>of recognizing the novel SCOP folds for proteins in various SCOP releases</p>
               </caption>
               <text>
                  <p>The <it>Correct Classification Rates </it>of recognizing the novel SCOP folds for proteins in various SCOP releases.</p>
               </text>
               <graphic file="1471-2105-7-362-4"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Efficiency</p>
            </st>
            <p>For efficiency, we measure the average response time of the entire classification process, including feature extraction, nearest neighbor search on an M-tree <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>, and the computation of the SCOP folds by the <it>E-Predict </it>algorithm. The classification process performs <it>one-against-all </it>structural comparisons by scanning the entire SCOP database. Our system runs on a Fedora-Core Linux system with Dual Xeon IV 2.4 GHz processors and 2 GB RAM. A large-scale test set is chosen from the SCOP <it>v</it>1.69 release, comprising 51911 protein chains that each have more than 20 amino acids. Figure <figr fid="F5">5</figr> shows the average response time of fold classification for various protein chain sizes. As the protein size increases, the <it>E-Predict </it>algorithm demands more computational resources to extract features from larger distance matrices. When the protein chain size reaches a certain threshold, the Linux system may swap large distance matrices out to virtual memory, resulting in significant <it>I/O </it>time. This effect appears in Figure <figr fid="F5">5</figr> as long computation times for protein chains larger than 1099 amino acids, where more memory is required to prevent page swapping. On average, classifying a newly-discovered protein to a SCOP fold takes 3.5 seconds. In our test set, the longest protein chain, comprising 1409 amino acids, completes the classification process in 17.4 seconds.</p>
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>The protein chain sizes against the average response time of classifying test proteins</p>
               </caption>
               <text>
                  <p>The protein chain sizes against the average response time of classifying test proteins.</p>
               </text>
               <graphic file="1471-2105-7-362-5"/>
            </fig>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <p>Our approach yields better accuracy and efficiency than structural alignment algorithms. The accuracy is achieved by analyzing the ranked SCOP folds of a nearest neighbor search using the <it>E-Predict </it>algorithm. The efficiency results from using an M-tree <abbrgrp><abbr bid="B27">27</abbr></abbrgrp> for fast nearest neighbor searches. In the following subsections, we compare our performance with the structural alignment algorithms in terms of efficiency and accuracy.</p>
         <sec>
            <st>
               <p>Performance in efficiency</p>
            </st>
            <p>Since structural alignment algorithms usually apply dynamic programming techniques to align each pair of amino acids in two proteins, they demand a huge amount of computational resources. Instead of aligning amino acids, our <it>E-Predict </it>model transforms relevant protein structure information into high-level features, and similar protein structures are then retrieved from a high-dimensional feature space by a nearest neighbor search in the M-tree. Our approach is able to return the classification result in seconds. Running the structural alignment algorithms requires many pairwise alignments of a newly-discovered structure against the known protein structures in the SCOP database, which is known to be computationally expensive <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>; for this reason, the response times of the structural alignment algorithms are not plotted in Figure <figr fid="F5">5</figr>.</p>
         </sec>
         <sec>
            <st>
               <p>The accuracy of assigning newly-discovered proteins to the known folds</p>
            </st>
            <p>For the assignment of proteins to the known SCOP folds, the accuracy comes mainly from the <it>E-Predict </it>algorithm. Traditional structural alignment methods usually apply heuristics to reduce the computational effort of aligning a large combination of amino acids in two proteins. Different heuristics could return diverse results from the same set of proteins since these algorithms might be trapped in locally optimal solutions. Even though a consensus method that combines classification results of multiple structural alignment algorithms outperforms each individual structural alignment approach <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>, it is computationally expensive. Instead of performing structural alignments, our model maps both known proteins from the SCOP database and newly-discovered protein structures into 33-D feature vectors. With a nearest neighbor search for a newly-discovered structure <it>t </it>in the high-dimensional feature space, there may exist multiple candidate folds, which are associated with nearest neighbor proteins in the vicinity of <it>t</it>. One way to assign a SCOP fold to <it>t </it>is to choose the fold of the nearest neighbor protein in the high-dimensional feature space. Since hundreds of folds may partially overlap in the high-dimensional feature space, the nearest neighbor of <it>t </it>may be an outlier that deviates from the majority of proteins in its fold. To avoid selecting an outlier, we apply the <it>E</it>_<it>Measure </it>metric that considers the ranks of at least two nearest neighbor proteins for each fold. The algorithm rewards a SCOP fold in which proteins are highly ranked and penalizes a fold with proteins in the lower ranks. Hence, when a SCOP fold includes only a single highly ranked protein and the other proteins from this fold are ranked much lower, the algorithm is able to avoid assigning this fold to <it>t </it>based on the penalty for low ranking. 
The computational results indicate that <it>E</it>_<it>Measure </it>has a substantial impact on the classification accuracy.</p>
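            <p>The idea of looking beyond the single nearest neighbor can be illustrated with a small sketch. The score used here is a stand-in chosen for illustration only and is not the <it>E</it>_<it>Measure </it>formula: it averages the ranks of the two best-ranked neighbors of each fold, so a lower score is better. The function name <it>pick_fold_by_rank </it>and the fold labels are hypothetical.</p>

```python
from collections import defaultdict


def pick_fold_by_rank(ranked_folds, min_members=2):
    """Choose a fold from a ranked neighbor list (rank 1 = nearest).

    Stand-in score (NOT the paper's E_Measure): the mean rank of the
    `min_members` best-ranked neighbors of each fold; lower is better.
    Folds with fewer than `min_members` neighbors are skipped.
    """
    ranks = defaultdict(list)
    for rank, fold in enumerate(ranked_folds, start=1):
        ranks[fold].append(rank)

    scores = {}
    for fold, fold_ranks in ranks.items():
        if len(fold_ranks) >= min_members:
            best = fold_ranks[:min_members]  # ranks were appended in ascending order
            scores[fold] = sum(best) / min_members

    if not scores:
        return None, scores
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    # Fold labels of the five nearest neighbors of a query, nearest first.
    neighbors = ["fold_A", "fold_C", "fold_C", "fold_B", "fold_A"]
    print(pick_fold_by_rank(neighbors))
```

            <p>In this example a plain nearest neighbor rule would return fold_A, whose second member only appears at rank 5, whereas the rank-based score prefers fold_C, whose two members sit at ranks 2 and 3.</p>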
         </sec>
         <sec>
            <st>
               <p>Misclassifications of assigning newly-discovered proteins to the known folds</p>
            </st>
            <p>Within the framework of ProteinDBS <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr></abbrgrp>, our model, <it>E-Predict</it>, transforms a 3-D protein structure into a 33-D feature vector that represents the geometric properties of folded proteins. Applying these features to measure the structural similarity of proteins, <it>E-Predict </it>outperforms several classification methods that apply structural alignment algorithms on the test set in Table <tblr tid="T1">1</tblr>. <it>E-Predict </it>also yields reasonably high accuracy for several test sets in Table <tblr tid="T3">3</tblr> with sufficient ground truth data. However, misclassifications still exist. The first reason is the limited amount of 33-D ground truth data available for training. As more ground truth data becomes available in small-size SCOP folds, a higher classification accuracy is expected. The second reason is the overlap of folds in the high-dimensional feature space. To further separate overlapping folds, our system needs more relevant features that detect the protein 3-D folding with sufficient discriminating power. Another possible reason for misclassifications is that SCOP may categorize a partial segment of a PDB protein chain (substructure) into a domain. Since our approach measures the global similarity of distance matrices for classification, users need to submit the portion of the protein chain identified in the SCOP domain to ensure a correct classification. In Figure <figr fid="F6">6</figr>, we measure the correlation between the classification accuracy and a structural variation value, <it>S</it>, for a query protein <it>t </it>and the best matched protein of <it>t </it>in our classified SCOP fold. For a pair of proteins (<it>p</it><sub>1</sub>, <it>p</it><sub>2</sub>), the structural variation <it>S </it>is defined as follows:</p>
            <fig id="F6">
               <title>
                  <p>Figure 6</p>
               </title>
               <caption>
                  <p><it>Correct Classification Rates </it>of classifying test proteins against structural variation values</p>
               </caption>
               <text>
                  <p><it>Correct Classification Rates </it>of classifying test proteins against structural variation values.</p>
               </text>
               <graphic file="1471-2105-7-362-6"/>
            </fig>
            <p>
               <m:math name="1471-2105-7-362-i34" xmlns:m="http://www.w3.org/1998/Math/MathML">
                  <m:semantics>
                     <m:mrow>
                        <m:mi>S</m:mi>
                        <m:mo stretchy="false">(</m:mo>
                        <m:msub>
                           <m:mi>p</m:mi>
                           <m:mn>1</m:mn>
                        </m:msub>
                        <m:mo>,</m:mo>
                        <m:msub>
                           <m:mi>p</m:mi>
                           <m:mn>2</m:mn>
                        </m:msub>
                        <m:mo stretchy="false">)</m:mo>
                        <m:mo>=</m:mo>
                        <m:mi>R</m:mi>
                        <m:mi>M</m:mi>
                        <m:mi>S</m:mi>
                        <m:mi>D</m:mi>
                        <m:mo>/</m:mo>
                        <m:mo stretchy="false">(</m:mo>
                        <m:mfrac>
                           <m:mrow>
                              <m:msub>
                                 <m:mi>N</m:mi>
                                 <m:mi>A</m:mi>
                              </m:msub>
                           </m:mrow>
                           <m:mrow>
                              <m:msub>
                                 <m:mi>N</m:mi>
                                 <m:mrow>
                                    <m:msub>
                                       <m:mi>p</m:mi>
                                       <m:mn>1</m:mn>
                                    </m:msub>
                                 </m:mrow>
                              </m:msub>
                              <m:mo>+</m:mo>
                              <m:msub>
                                 <m:mi>N</m:mi>
                                 <m:mrow>
                                    <m:msub>
                                       <m:mi>p</m:mi>
                                       <m:mn>2</m:mn>
                                    </m:msub>
                                 </m:mrow>
                              </m:msub>
                           </m:mrow>
                        </m:mfrac>
                        <m:mo stretchy="false">)</m:mo>
                        <m:mo>,</m:mo>
                        <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                        <m:mrow>
                           <m:mo>(</m:mo>
                           <m:mn>2</m:mn>
                           <m:mo>)</m:mo>
                        </m:mrow>
                     </m:mrow>
                     <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGtbWucqGGOaakcqWGWbaCdaWgaaWcbaGaeGymaedabeaakiabcYcaSiabdchaWnaaBaaaleaacqaIYaGmaeqaaOGaeiykaKIaeyypa0JaemOuaiLaemyta0Kaem4uamLaemiraqKaei4la8IaeiikaGYaaSaaaeaacqWGobGtdaWgaaWcbaGaemyqaeeabeaaaOqaaiabd6eaonaaBaaaleaacqWGWbaCdaWgaaadbaGaeGymaedabeaaaSqabaGccqGHRaWkcqWGobGtdaWgaaWcbaGaemiCaa3aaSbaaWqaaiabikdaYaqabaaaleqaaaaakiabcMcaPiabcYcaSiaaxMaacaWLjaWaaeWaceaacqaIYaGmaiaawIcacaGLPaaaaaa@4D8E@</m:annotation>
                  </m:semantics>
               </m:math>
            </p>
            <p>where <it>RMSD </it>is the root mean square deviation of the aligned segments, and <it>N</it><sub><it>A </it></sub>denotes the number of amino acids in the aligned segments of two proteins. <it>N</it><sub><it>p</it>1 </sub>and <it>N</it><sub><it>p</it>2 </sub>represent the number of amino acid residues in <it>p</it><sub>1 </sub>and <it>p</it><sub>2</sub>, respectively. These measurements are computed using SARF <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>. A smaller <it>S </it>value can be interpreted as a better structural match between two proteins <it>p</it><sub>1 </sub>and <it>p</it><sub>2</sub>. Two proteins that have a high structural similarity can usually be superimposed with longer aligned residue segments and a small <it>RMSD </it>value, resulting in a small <it>S </it>value. For example, suppose the SARF algorithm aligns a query protein <it>t </it>with 100 amino acids and its best matched protein <it>p</it><sub>1 </sub>with 100 amino acids and returns structurally similar segments with 90 amino acid residues and 0.3 <it>&#197; </it>of <it>RMSD</it>. Their structural variation value <it>S </it>is computed as 0.3/(<m:math name="1471-2105-7-362-i35" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mfrac><m:mrow><m:mn>90</m:mn></m:mrow><m:mrow><m:mn>100</m:mn><m:mo>+</m:mo><m:mn>100</m:mn></m:mrow></m:mfrac></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaadaWcaaqaaiabiMda5iabicdaWaqaaiabigdaXiabicdaWiabicdaWiabgUcaRiabigdaXiabicdaWiabicdaWaaaaaa@3524@</m:annotation></m:semantics></m:math>) = 0.67. When <it>S </it>is smaller than 6, we expect the <it>E-Predict </it>algorithm to maintain above 90% classification accuracy. This statistic is obtained from the classification of 41262 testing proteins.</p>
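            <p>Equation (2) and the worked example above can be reproduced with a few lines of code. This is a minimal sketch: the function name <it>structural_variation </it>is hypothetical, and the <it>RMSD </it>and aligned-residue count are passed in as plain numbers rather than obtained from SARF.</p>

```python
def structural_variation(rmsd, n_aligned, n_p1, n_p2):
    """S(p1, p2) = RMSD / (N_A / (N_p1 + N_p2)), following Equation (2)."""
    if n_aligned == 0:
        raise ValueError("no aligned residues: S is undefined")
    coverage = n_aligned / (n_p1 + n_p2)  # fraction of residues that were aligned
    return rmsd / coverage


if __name__ == "__main__":
    # Worked example from the text: 90 aligned residues, two 100-residue chains.
    print(round(structural_variation(0.3, 90, 100, 100), 2))
```

            <p>Shorter aligned segments shrink the denominator and a larger <it>RMSD </it>grows the numerator, so both kinds of structural disagreement push <it>S </it>upward.</p>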
         </sec>
         <sec>
            <st>
               <p>The accuracy of recognizing the novel folds for newly-discovered proteins</p>
            </st>
            <p>Since no protein in our 33-D ground truth data has been labeled with the novel folds, novel fold recognition is a challenging problem. To address this issue, we introduce three features: <it>E</it>_<it>Measure </it>evaluation score, structural variation value, and Euclidean distance measurement. These features measure structural similarity between a newly-discovered protein and the nearest neighbor protein in a candidate known fold suggested by the <it>E-Predict </it>algorithm. Then, our method applies the <it>E-Predict </it>algorithm as a classifier to identify meaningful patterns from ground truth data, which has been obtained by the aggregation of proteins in several prior SCOP releases. Computational results show that using these three features benefits the classification accuracy.</p>
         </sec>
         <sec>
            <st>
               <p>Misclassifications of recognizing the novel folds for newly-discovered proteins</p>
            </st>
            <p>To recognize the novel folds for newly-discovered protein structures, our classification model exploits three relevant features. Under the assumption that protein structures in the novel folds usually present low structural similarities to proteins in the known folds, a high <it>E</it>_<it>Measure </it>evaluation score, a high Euclidean distance, and a high structural variation value are expected for newly-discovered protein structures from the novel folds. Due to noise in ground truth data and imperfect features, a few proteins in the novel folds may have a low structural variation value, a low <it>E</it>_<it>Measure </it>score, or a low Euclidean distance measurement. Even though our approach improves accuracy over NN and <it>C4.5 DT</it>, there is still a need to discover more relevant features for better recognition performance.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>We have developed an automatic SCOP fold classification system that is able to assign the known SCOP folds and recognize the novel folds for newly-discovered proteins. For the known fold assignments, the algorithm transforms protein structures into 33-D feature vectors and constructs an M-tree to index these feature vectors for fast retrieval. The <it>E-Predict </it>algorithm is then applied to classify newly-discovered proteins into the known SCOP folds. For novel fold recognition, the algorithm utilizes three relevant features that are related to the structural similarity of proteins. In the computational results, our method outperforms several structural alignment algorithms such as DALI, CE and VAST, achieving reasonably high classification accuracy and efficiency. This research can help accelerate the classification process of the SCOP database and help the biomedical research community further study the biochemical functions of proteins with similar 3-D structures.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <p>Our classification model, <it>E-Predict</it>, contains three primary functions. First, <it>E-Predict </it>assigns newly-discovered protein structures to the known SCOP folds. Second, <it>E-Predict </it>recognizes the novel folds for newly-discovered protein structures. Third, <it>E-Predict </it>indexes the high-dimensional protein data for a fast nearest neighbor search.</p>
         <sec>
            <st>
               <p>Assigning newly-discovered proteins to the known folds</p>
            </st>
            <p>According to the SCOP hierarchy, proteins that share similar secondary structure arrangements are usually classified together at the <it>fold </it>level <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>. The entire process of assigning newly-discovered proteins to the known folds is shown in Figure <figr fid="F7">7</figr>. The labeling procedure transforms protein structures from the SCOP database into 33-D feature vectors, which are labeled with their corresponding SCOP folds. These labeled proteins are then used as our ground truth data. The testing procedure converts newly-discovered proteins into feature vectors and submits these unlabeled vectors to a classifier to obtain possible SCOP fold assignments. In the following, we discuss several components of the entire process: distance matrix generation, feature extraction, and classifier design.</p>
            <fig id="F7">
               <title>
                  <p>Figure 7</p>
               </title>
               <caption>
                  <p><it>E-Predict </it>model for assigning newly-discovered proteins to the known folds</p>
               </caption>
               <text>
                  <p><it>E-Predict </it>model for assigning newly-discovered proteins to the known folds.</p>
               </text>
               <graphic file="1471-2105-7-362-7"/>
            </fig>
            <sec>
               <st>
                  <p>Mapping 3-D backbone structures into 2-D distance matrices</p>
               </st>
               <p>Proteins are polypeptide chains built from 20 types of amino acids. Instead of considering the side chains of amino acids, many computational biology papers <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B9">9</abbr><abbr bid="B15">15</abbr></abbrgrp> use the <it>C</it><sub><it>&#945; </it></sub>atom to describe each amino acid. In our model, the <it>C</it><sub><it>&#945; </it></sub>backbone of the <it>k</it><sup><it>th </it></sup>protein chain with <it>n </it>amino acids can be represented by a set of vectors, &#937;<sup><it>k </it></sup>= {<m:math name="1471-2105-7-362-i36" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>C</m:mi><m:mi>&#945;</m:mi><m:mrow><m:mover accent="true"><m:mi>k</m:mi><m:mo>&#8594;</m:mo></m:mover><m:mo>,</m:mo><m:mn>1</m:mn></m:mrow></m:msubsup><m:mo>,</m:mo><m:msubsup><m:mi>C</m:mi><m:mi>&#945;</m:mi><m:mrow><m:mover accent="true"><m:mi>k</m:mi><m:mo>&#8594;</m:mo></m:mover><m:mo>,</m:mo><m:mn>2</m:mn></m:mrow></m:msubsup><m:mo>,</m:mo><m:mn>...</m:mn><m:mo>,</m:mo><m:msubsup><m:mi>C</m:mi><m:mi>&#945;</m:mi><m:mrow><m:mover accent="true"><m:mi>k</m:mi><m:mo>&#8594;</m:mo></m:mover><m:mo>,</m:mo><m:mi>n</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGdbWqdaqhaaWcbaacciGae8xSdegabaGafm4AaSMbaSaacqGGSaalcqaIXaqmaaGccqGGSaalcqWGdbWqdaqhaaWcbaGae8xSdegabaGafm4AaSMbaSaacqGGSaalcqaIYaGmaaGccqGGSaalcqGGUaGlcqGGUaGlcqGGUaGlcqGGSaalcqWGdbWqdaqhaaWcbaGae8xSdegabaGafm4AaSMbaSaacqGGSaalcqWGUbGBaaaaaa@44D4@</m:annotation></m:semantics></m:math>}, where <m:math name="1471-2105-7-362-i37" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>C</m:mi><m:mi>&#945;</m:mi><m:mrow><m:mover accent="true"><m:mi>k</m:mi><m:mo>&#8594;</m:mo></m:mover><m:mo>,</m:mo><m:mi>i</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGdbWqdaqhaaWcbaacciGae8xSdegabaGafm4AaSMbaSaacqGGSaalcqWGPbqAaaaaaa@333A@</m:annotation></m:semantics></m:math> denotes the 3-D coordinate of the <it>i</it><sup><it>th </it></sup><it>C</it><sub><it>&#945; </it></sub>atom. Protein backbones can be transformed into 2-D distance matrices. For &#937;<sup><it>k</it></sup>, the corresponding distance matrix, <it>D</it><sup><it>k</it></sup>, is defined as <it>D</it>[<it>i</it>, <it>j</it>] = <it>dist</it>(<m:math name="1471-2105-7-362-i37" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>C</m:mi><m:mi>&#945;</m:mi><m:mrow><m:mover accent="true"><m:mi>k</m:mi><m:mo>&#8594;</m:mo></m:mover><m:mo>,</m:mo><m:mi>i</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGdbWqdaqhaaWcbaacciGae8xSdegabaGafm4AaSMbaSaacqGGSaalcqWGPbqAaaaaaa@333A@</m:annotation></m:semantics></m:math>, <m:math name="1471-2105-7-362-i38" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>C</m:mi><m:mi>&#945;</m:mi><m:mrow><m:mover accent="true"><m:mi>k</m:mi><m:mo>&#8594;</m:mo></m:mover><m:mo>,</m:mo><m:mi>j</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGdbWqdaqhaaWcbaacciGae8xSdegabaGafm4AaSMbaSaacqGGSaalcqWGQbGAaaaaaa@333C@</m:annotation></m:semantics></m:math>), 1 &#8804; <it>i</it>, <it>j </it>&#8804; <it>n</it>, in a Euclidean space. A distance geometry method <abbrgrp><abbr bid="B29">29</abbr></abbrgrp> shows that the 2-D distance matrix is generally sufficient to recover the original 3-D structure in polynomial time. Several examples in the literature convert protein backbone structures into distance matrices and then detect the structural similarity from them <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B30">30</abbr><abbr bid="B31">31</abbr></abbrgrp>. Since 2-D distance matrices maintain sufficient 3-D structural information, similar protein backbones are expected to have similar distance matrices. Figure <figr fid="F8">8</figr> shows 3-D protein backbone structures and their corresponding 2-D distance matrices sampled from the SCOP <it>Heme-dependent peroxidases </it>and <it>Acid proteases </it>folds. Within each SCOP fold, the proteins maintain high similarities in both 3-D backbone structures and 2-D distance matrices. Variations in distance matrices are detectable by comparing structures that belong to different folds.</p>
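               <p>The mapping from a backbone to its distance matrix can be sketched as follows. The sketch assumes the <it>C</it><sub><it>&#945; </it></sub>coordinates have already been read into a list of (x, y, z) tuples; the function name <it>distance_matrix </it>is hypothetical, and parsing of PDB files is left out.</p>

```python
import math


def distance_matrix(ca_coords):
    """Return the n x n matrix of Euclidean distances between C-alpha atoms."""
    n = len(ca_coords)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(ca_coords[i], ca_coords[j])
            matrix[i][j] = d
            matrix[j][i] = d  # the matrix is symmetric
    return matrix


if __name__ == "__main__":
    # Three made-up backbone positions, used only to exercise the function.
    backbone = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
    for row in distance_matrix(backbone):
        print(row)
```

               <p>Because the matrix depends only on pairwise distances, it is unchanged when the whole structure is rotated or translated, which is what makes it convenient for comparing backbones without superimposing them first.</p>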
               <fig id="F8">
                  <title>
                     <p>Figure 8</p>
                  </title>
                  <caption>
                     <p>The 3-D backbone structures and distance matrices of four protein chains, which are selected from the SCOP folds: (1)<it>Heme-dependent peroxidases: </it>1<it>kta</it>_<it>A</it>(a-b), 1<it>ekv</it>_<it>A</it>(c-d), (2)<it>Acid proteases : </it>1<it>lee</it>_<it>A</it>(e-f), 1<it>lf</it>2_<it>A</it>(g-h)</p>
                  </caption>
                  <text>
                     <p>The 3-D backbone structures and distance matrices of four protein chains, which are selected from the SCOP folds: (1)<it>Heme-dependent peroxidases: </it>1<it>kta</it>_<it>A</it>(a-b), 1<it>ekv</it>_<it>A</it>(c-d), (2)<it>Acid proteases : </it>1<it>lee</it>_<it>A</it>(e-f), 1<it>lf</it>2_<it>A</it>(g-h).</p>
                  </text>
                  <graphic file="1471-2105-7-362-8"/>
               </fig>
            </sec>
            <sec>
               <st>
                  <p>Feature extraction</p>
               </st>
               <p>In the area of content-based image retrieval (<it>CBIR</it>), several computational techniques have been developed to retrieve visually similar images from databases for a query image <abbrgrp><abbr bid="B32">32</abbr><abbr bid="B33">33</abbr><abbr bid="B34">34</abbr></abbrgrp>. When each element of the distance matrix is interpreted as a grayscale pixel, the distance matrix can be converted into a 2-D image. After preprocessing protein structures into grayscale images, we apply <it>CBIR </it>techniques to retrieve similar distance matrices in ranked order. In our previous work <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B19">19</abbr></abbrgrp>, we extracted 24 local and 9 global features that are relevant to the visual content of distance matrices using a suite of computer vision algorithms such as histograms and textures <abbrgrp><abbr bid="B35">35</abbr><abbr bid="B36">36</abbr><abbr bid="B37">37</abbr></abbrgrp>. To obtain local features, each distance matrix is partitioned into six band regions, which are parallel to the diagonal of the matrix. In each band, histograms are computed over four bins of distance ranges: [0&#8211;5], [6&#8211;10], [11&#8211;15], and [16&#8211;&#8734;], giving 6 &#215; 4 = 24 local features. Since the distance matrix is symmetric, global features such as <it>Entropy, Homogeneity </it>and <it>Contrast </it>are computed for the entire upper triangle of each distance image. Interested readers are referred to our previous publications <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B19">19</abbr></abbrgrp> for details of the feature extraction algorithms applied in this work. The transformation of a distance matrix into a 33-D feature vector ensures each feature vector uniquely identifies a protein chain. Table <tblr tid="T5">5</tblr> and Table <tblr tid="T6">6</tblr> list the 24 local features and 9 global features extracted for proteins from the SCOP <it>Heme-dependent peroxidases </it>and <it>Acid proteases </it>folds, respectively. 
For a large-scale protein database such as the SCOP database, it is important to develop a classifier that groups proteins within the same fold and separates proteins from different folds.</p>
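               <p>The local features described above can be sketched in Python with NumPy as follows. This is a minimal illustration and not the implementation used in this work: the function name <it>band_histograms</it>, the equal-width bands, the continuous bin edges at 5, 10 and 15, and the normalization by the number of upper-triangle cells are all assumptions. Bands are numbered so that the last band lies next to the diagonal, which is how Table 5 appears to be ordered.</p>

```python
import numpy as np

# Upper edges of the first three distance bins; the fourth bin is open-ended.
# Treating the bins as continuous ranges is an assumption of this sketch.
BIN_EDGES = (5.0, 10.0, 15.0)


def band_histograms(dist, n_bands=6):
    """Return an (n_bands, 4) array of distance histograms, one row per band.

    Assumptions (not taken from the paper): bands have equal width in
    sequence separation j - i, only the strict upper triangle is used, and
    counts are divided by the number of upper-triangle cells.
    """
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]                        # the sketch needs n >= 2
    rows, cols = np.triu_indices(n, k=1)     # strict upper triangle
    separation = cols - rows                 # 1 .. n-1
    # Equal-width band index counted outward from the diagonal: 0 .. n_bands-1.
    from_diagonal = (separation - 1) * n_bands // (n - 1)
    # Flip so that the last row of the result is the band next to the diagonal.
    band = n_bands - 1 - from_diagonal
    # With right=True, values up to 5 fall in bin 0, up to 10 in bin 1,
    # up to 15 in bin 2, and anything larger in bin 3.
    bins = np.digitize(dist[rows, cols], BIN_EDGES, right=True)
    hist = np.zeros((n_bands, len(BIN_EDGES) + 1))
    np.add.at(hist, (band, bins), 1.0)       # accumulate counts per (band, bin)
    return hist / rows.size
```

               <p>Flattening the returned array gives 24 values per protein chain, matching the number of local features. Because the normalization is assumed, the values are not expected to reproduce the entries of Table 5.</p>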
               <tbl id="T5">
                  <title>
                     <p>Table 5</p>
                  </title>
                  <caption>
                     <p>Local features of proteins from the SCOP folds: (1) <it>Heme-dependent peroxidases</it>: 1<it>stq</it>_<it>A</it>, 1<it>sog</it>_<it>A</it>; (2) <it>Acid proteases</it>: 1<it>lee</it>_<it>A</it>, 1<it>lf</it>2_<it>A</it>. Histogram [a,b] denotes the distance histogram for the <it>a</it><sup><it>th </it></sup>band region and the <it>b</it><sup><it>th </it></sup>grayscale bin.</p>
                  </caption>
                  <tblbdy cols="5">
                     <r>
                        <c ca="center">
                           <p>Image Features</p>
                        </c>
                        <c ca="center">
                           <p>1stq_A</p>
                        </c>
                        <c ca="center">
                           <p>1sog_A</p>
                        </c>
                        <c ca="center">
                           <p>1lee_A</p>
                        </c>
                        <c ca="center">
                           <p>1lf2_A</p>
                        </c>
                     </r>
                     <r>
                        <c cspan="5">
                           <hr/>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>Histogram [1,1]</p>
                        </c>
                        <c ca="center">
                           <p>0.0000</p>
                        </c>
                        <c ca="center">
                           <p>0.0000</p>
                        </c>
                        <c ca="center">
                           <p>0.0000</p>
                        </c>
                        <c ca="center">
                           <p>0.0000</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>Histogram [1,2]</p>
                        </c>
                        <c ca="center">
                           <p>0.0002</p>
                        </c>
                        <c ca="center">
                           <p>0.0002</p>
                        </c>
                        <c ca="center">
                           <p>0.0000</p>
                        </c>
                        <c ca="center">
                           <p>0.0000</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>Histogram [1,3]</p>
                        </c>
                        <c ca="center">
                           <p>0.0018</p>
                        </c>
                        <c ca="center">
                           <p>0.0020</p>
                        </c>
                        <c ca="center">
                           <p>0.0001</p>
                        </c>
                        <c ca="center">
                           <p>0.0002</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>Histogram [1,4]</p>
                        </c>
                        <c ca="center">
                           <p>0.0050</p>
                        </c>
                        <c ca="center">
                           <p>0.0053</p>
                        </c>
                        <c ca="center">
                           <p>0.0009</p>
                        </c>
                        <c ca="center">
                           <p>0.0011</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>Histogram [2,1]</p>
                        </c>
                        <c ca="center">
                           <p>0.0000</p>
                        </c>
                        <c ca="center">
                           <p>0.0000</p>
                        </c>
                        <c ca="center">
                           <p>0.0000</p>
                        </c>
                        <c ca="center">
                           <p>0.0000</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>Histogram [2,2]</p>
                        </c>
                        <c ca="center">
                           <p>0.0000</p>
                        </c>
                        <c ca="center">
                           <p>0.0000</p>
                        </c>
                        <c ca="center">
                           <p>0.0000</p>
                        </c>
                        <c ca="center">
                           <p>0.0000</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>Histogram [2,3]</p>
                        </c>
                        <c ca="center">
                           <p>0.0023</p>
                        </c>
                        <c ca="center">
                           <p>0.0022</p>
                        </c>
                        <c ca="center">
                           <p>0.0000</p>
                        </c>
                        <c ca="center">
                           <p>0.0001</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>Histogram [2,4]</p>
                        </c>
                        <c ca="center">
                           <p>0.0044</p>
                        </c>
                        <c ca="center">
                           <p>0.0043</p>
                        </c>
                        <c ca="center">
                           <p>0.0012</p>
                        </c>
                        <c ca="center">
                           <p>0.0010</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>Histogram [3,1]</p>
                        </c>
                        <c ca="center">
                           <p>0.0000</p>
                        </c>
                        <c ca="center">
                           <p>0.0000</p>
                        </c>
                        <c ca="center">
                           <p>0.0000</p>
                        </c>
                        <c ca="center">
                           <p>0.0000</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>Histogram [3,2]</p>
                        </c>
                        <c ca="center">
                           <p>0.0004</p>
                        </c>
                        <c ca="center">
                           <p>0.0004</p>
                        </c>
                        <c ca="center">
                           <p>0.0003</p>
                        </c>
                        <c ca="center">
                           <p>0.0004</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>Histogram [3,3]</p>
                        </c>
                        <c ca="center">
                           <p>0.0020</p>
                        </c>
                        <c ca="center">
                           <p>0.0019</p>
                        </c>
                        <c ca="center">
                           <p>0.0017</p>
                        </c>
                        <c ca="center">
                           <p>0.0019</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>Histogram [3,4]</p>
                        </c>
                        <c ca="center">
                           <p>0.0092</p>
                        </c>
                        <c ca="center">
                           <p>0.0080</p>
                        </c>
                        <c ca="center">
                           <p>0.0048</p>
                        </c>
                        <c ca="center">
                           <p>0.0055</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>Histogram [4,1]</p>
                        </c>
                        <c ca="center">
                           <p>0.0000</p>
                        </c>
                        <c ca="center">
                           <p>0.0000</p>
                        </c>
                        <c ca="center">
                           <p>0.0000</p>
                        </c>
                        <c ca="center">
                           <p>0.0000</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>Histogram [4,2]</p>
                        </c>
                        <c ca="center">
                           <p>0.0006</p>
                        </c>
                        <c ca="center">
                           <p>0.0006</p>
                        </c>
                        <c ca="center">
                           <p>0.0015</p>
                        </c>
                        <c ca="center">
                           <p>0.0012</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>Histogram [4,3]</p>
                        </c>
                        <c ca="center">
                           <p>0.0040</p>
                        </c>
                        <c ca="center">
                           <p>0.0042</p>
                        </c>
                        <c ca="center">
                           <p>0.0056</p>
                        </c>
                        <c ca="center">
                           <p>0.0053</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>Histogram [4,4]</p>
                        </c>
                        <c ca="center">
                           <p>0.0132</p>
                        </c>
                        <c ca="center">
                           <p>0.0130</p>
                        </c>
                        <c ca="center">
                           <p>0.0172</p>
                        </c>
                        <c ca="center">
                           <p>0.0166</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>Histogram [5,1]</p>
                        </c>
                        <c ca="center">
                           <p>0.0000</p>
                        </c>
                        <c ca="center">
                           <p>0.0000</p>
                        </c>
                        <c ca="center">
                           <p>0.0000</p>
                        </c>
                        <c ca="center">
                           <p>0.0000</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>Histogram [5,2]</p>
                        </c>
                        <c ca="center">
                           <p>0.0014</p>
                        </c>
                        <c ca="center">
                           <p>0.0015</p>
                        </c>
                        <c ca="center">
                           <p>0.0036</p>
                        </c>
                        <c ca="center">
                           <p>0.0035</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>Histogram [5,3]</p>
                        </c>
                        <c ca="center">
                           <p>0.0124</p>
                        </c>
                        <c ca="center">
                           <p>0.0128</p>
                        </c>
                        <c ca="center">
                           <p>0.0133</p>
                        </c>
                        <c ca="center">
                           <p>0.0134</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>Histogram [5,4]</p>
                        </c>
                        <c ca="center">
                           <p>0.0423</p>
                        </c>
                        <c ca="center">
                           <p>0.0425</p>
                        </c>
                        <c ca="center">
                           <p>0.0298</p>
                        </c>
                        <c ca="center">
                           <p>0.0304</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>Histogram [6,1]</p>
                        </c>
                        <c ca="center">
                           <p>0.0203</p>
                        </c>
                        <c ca="center">
                           <p>0.0201</p>
                        </c>
                        <c ca="center">
                           <p>0.0179</p>
                        </c>
                        <c ca="center">
                           <p>0.0180</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>Histogram [6,2]</p>
                        </c>
                        <c ca="center">
                           <p>0.0392</p>
                        </c>
                        <c ca="center">
                           <p>0.0386</p>
                        </c>
                        <c ca="center">
                           <p>0.0291</p>
                        </c>
                        <c ca="center">
                           <p>0.0289</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>Histogram [6,3]</p>
                        </c>
                        <c ca="center">
                           <p>0.0503</p>
                        </c>
                        <c ca="center">
                           <p>0.0496</p>
                        </c>
                        <c ca="center">
                           <p>0.0474</p>
                        </c>
                        <c ca="center">
                           <p>0.0485</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>Histogram [6,4]</p>
                        </c>
                        <c ca="center">
                           <p>0.0845</p>
                        </c>
                        <c ca="center">
                           <p>0.0833</p>
                        </c>
                        <c ca="center">
                           <p>0.0796</p>
                        </c>
                        <c ca="center">
                           <p>0.0795</p>
                        </c>
                     </r>
                  </tblbdy>
               </tbl>
               <tbl id="T6">
                  <title>
                     <p>Table 6</p>
                  </title>
                  <caption>
                     <p>Global features of proteins from the SCOP folds: (1) <it>Heme-dependent peroxidases</it>: 1<it>stq</it>_<it>A</it>, 1<it>sog</it>_<it>A</it>; (2) <it>Acid proteases</it>: 1<it>lee</it>_<it>A</it>, 1<it>lf</it>2_<it>A</it>.</p>
                  </caption>
                  <tblbdy cols="5">
                     <r>
                        <c ca="left">
                           <p>Image Features</p>
                        </c>
                        <c ca="left">
                           <p>1stq_A</p>
                        </c>
                        <c ca="left">
                           <p>1sog_A</p>
                        </c>
                        <c ca="left">
                           <p>1lee_A</p>
                        </c>
                        <c ca="left">
                           <p>1lf2_A</p>
                        </c>
                     </r>
                     <r>
                        <c cspan="5">
                           <hr/>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>Dimension</p>
                        </c>
                        <c ca="left">
                           <p>291.00</p>
                        </c>
                        <c ca="left">
                           <p>294.00</p>
                        </c>
                        <c ca="left">
                           <p>331.00</p>
                        </c>
                        <c ca="left">
                           <p>329.00</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>Binary_Threshold</p>
                        </c>
                        <c ca="left">
                           <p>23.0000</p>
                        </c>
                        <c ca="left">
                           <p>23.0000</p>
                        </c>
                        <c ca="left">
                           <p>26.0000</p>
                        </c>
                        <c ca="left">
                           <p>26.0000</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>Texture_Energy</p>
                        </c>
                        <c ca="left">
                           <p>0.0155</p>
                        </c>
                        <c ca="left">
                           <p>0.0153</p>
                        </c>
                        <c ca="left">
                           <p>0.0107</p>
                        </c>
                        <c ca="left">
                           <p>0.0107</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>Texture_Entropy</p>
                        </c>
                        <c ca="left">
                           <p>51.7143</p>
                        </c>
                        <c ca="left">
                           <p>51.8067</p>
                        </c>
                        <c ca="left">
                           <p>54.4426</p>
                        </c>
                        <c ca="left">
                           <p>54.4139</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>Texture_Homogenity1</p>
                        </c>
                        <c ca="left">
                           <p>2.5344</p>
                        </c>
                        <c ca="left">
                           <p>2.5261</p>
                        </c>
                        <c ca="left">
                           <p>2.2184</p>
                        </c>
                        <c ca="left">
                           <p>2.2192</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>Texture_Homogenity2</p>
                        </c>
                        <c ca="left">
                           <p>1.7608</p>
                        </c>
                        <c ca="left">
                           <p>1.7529</p>
                        </c>
                        <c ca="left">
                           <p>1.4467</p>
                        </c>
                        <c ca="left">
                           <p>1.4485</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>Texture_Contrast</p>
                        </c>
                        <c ca="left">
                           <p>0.0027</p>
                        </c>
                        <c ca="left">
                           <p>0.0027</p>
                        </c>
                        <c ca="left">
                           <p>0.0041</p>
                        </c>
                        <c ca="left">
                           <p>0.0041</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>Texture_Correlation</p>
                        </c>
                        <c ca="left">
                           <p>6.8883</p>
                        </c>
                        <c ca="left">
                           <p>6.8914</p>
                        </c>
                        <c ca="left">
                           <p>6.7682</p>
                        </c>
                        <c ca="left">
                           <p>6.7659</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>Texture_Cluster_Tendency</p>
                        </c>
                        <c ca="left">
                           <p>0.0387</p>
                        </c>
                        <c ca="left">
                           <p>0.0392</p>
                        </c>
                        <c ca="left">
                           <p>0.0517</p>
                        </c>
                        <c ca="left">
                           <p>0.0515</p>
                        </c>
                     </r>
                  </tblbdy>
               </tbl>
            </sec>
            <sec>
               <st>
                  <p>Fold assignment</p>
               </st>
               <p>To ensure high accuracy when classifying a newly-discovered protein, we have designed a novel method that extends algorithms from Information Retrieval (<it>IR</it>) <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>. For the assignment of newly-discovered proteins to the known folds, we first discuss two well-recognized methods, <it>C4.5 Decision Tree (DT) </it><abbrgrp><abbr bid="B26">26</abbr></abbrgrp> and <it>Nearest Neighbor </it>(NN) <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>, and then our new approach, <it>E-Predict</it>, which achieves a better classification accuracy than <it>C4.5 DT </it>or NN.</p>
               <p>Decision tree approaches have been developed for classification in supervised machine learning <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>. Using a set of ground truth data that contains feature vectors of proteins and their associated fold labels, a classifier usually divides the high-dimensional feature space described above into multiple subspaces, which normally take the form of <it>hyper-cubes </it>or <it>hyper-spheres</it>. In the labeling process using <it>C4.5 DT</it>, the majority of proteins from the same fold are expected to be clustered into a small number of subspaces, while proteins from different folds are separated into different subspaces by minimizing entropy. A newly-discovered protein can then be assigned to a fold by following the decision features and thresholds of the internal tree nodes down to a leaf. However, a small number of proteins from the same SCOP fold with similar feature values may still be partitioned into different leaf nodes by <it>C4.5 DT </it>when their feature values lie close to the thresholds of internal nodes. Moreover, with hundreds of folds in the SCOP database, the more proteins from different folds that are grouped into a single leaf node, the higher the probability of misclassification.</p>
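               <p>The entropy-based choice of a threshold at one internal node can be sketched as below. This is a simplified illustration in Python, not the full <it>C4.5 </it>procedure, which additionally uses the gain ratio criterion and pruning; the function names are hypothetical. The example values are the Texture_Energy entries of Table 6.</p>

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of fold labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, weighted_entropy) of the best binary split on one feature.

    Candidate thresholds are midpoints between consecutive distinct sorted values.
    """
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for k in range(1, len(pairs)):
        if pairs[k][0] == pairs[k - 1][0]:
            continue  # no threshold can separate equal feature values
        left = [lab for _, lab in pairs[:k]]
        right = [lab for _, lab in pairs[k:]]
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if best[1] > score:
            best = ((pairs[k - 1][0] + pairs[k][0]) / 2, score)
    return best


# Texture_Energy values from Table 6 for 1stq_A, 1sog_A, 1lee_A, 1lf2_A.
energy = [0.0155, 0.0153, 0.0107, 0.0107]
folds = ["Heme-dependent peroxidases"] * 2 + ["Acid proteases"] * 2
threshold, score = best_threshold(energy, folds)
```

               <p>For these four proteins the selected threshold lies midway between 0.0107 and 0.0153 and separates the two folds completely. A protein whose value falls just on either side of such a threshold is sent to a different branch, which is the sensitivity discussed above.</p>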
               <p>Instead of partitioning the high-dimensional space, <it>Nearest Neighbor </it>(NN) <abbrgrp><abbr bid="B25">25</abbr></abbrgrp> assigns a SCOP fold for a newly-discovered protein by searching for its nearest neighbor with the Euclidean distance measurement. Figure <figr fid="F9">9(a)</figr> shows that NN outperforms <it>C4.5 DT </it>by 13%, on average, for fold assignments using the test sets in <m:math name="1471-2105-7-362-i2" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:msub><m:mi>v</m:mi><m:mn>1</m:mn></m:msub></m:mrow><m:mrow><m:msub><m:mi>v</m:mi><m:mn>2</m:mn></m:msub><m:mo>,</m:mo><m:mi>k</m:mi><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>w</m:mi><m:mi>n</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODay3aaSbaaWqaaiabigdaXaqabaaaleaacqWG2bGDdaWgaaadbaGaeGOmaidabeaaliabcYcaSiabdUgaRjabd6gaUjabd+gaVjabdEha3jabd6gaUbaaaaa@3B62@</m:annotation></m:semantics></m:math>, which are selected from the SCOP folds in the SCOP <it>v</it><sub>2 </sub>release that have at least one protein from the SCOP <it>v</it><sub>1 </sub>release. Figure <figr fid="F9">9(b)</figr> shows that NN also outperforms <it>C4.5 DT </it>by 8.45%, on average, for fold assignments using the test sets in <m:math name="1471-2105-7-362-i2" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:msub><m:mi>v</m:mi><m:mn>1</m:mn></m:msub></m:mrow><m:mrow><m:msub><m:mi>v</m:mi><m:mn>2</m:mn></m:msub><m:mo>,</m:mo><m:mi>k</m:mi><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>w</m:mi><m:mi>n</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODay3aaSbaaWqaaiabigdaXaqabaaaleaacqWG2bGDdaWgaaadbaGaeGOmaidabeaaliabcYcaSiabdUgaRjabd6gaUjabd+gaVjabdEha3jabd6gaUbaaaaa@3B62@</m:annotation></m:semantics></m:math>, which are selected from the SCOP folds in the SCOP <it>v</it><sub>2 </sub>release that have at least 10 proteins from the SCOP <it>v</it><sub>1 </sub>release. Even though NN yields better classification performance than <it>C4.5 DT</it>, one important issue remains: misclassification caused by an outlier in the NN search. An outlier is defined as a protein chain whose feature vector deviates greatly from those of the majority of proteins in the same SCOP fold. In a high-dimensional feature space with multiple overlapping SCOP folds, NN search may assign an incorrect SCOP fold to a newly-discovered protein by selecting an outlier as the nearest neighbor. For instance, assume that the true fold of a newly-discovered protein <it>t </it>is <it>F</it><sub>2</sub>. In <it>Result F</it><sub>1</sub>, shown in the second row of Figure <figr fid="F10">10</figr>, the nearest neighbor of <it>t </it>is <it>p</it><sub>1</sub>, which is an outlier relative to the majority of proteins in fold <it>F</it><sub>1</sub>. When NN search is used for classification, the algorithm therefore incorrectly assigns <it>t </it>to fold <it>F</it><sub>1</sub>. One possible way to address this issue is to assign the newly-discovered protein to the SCOP fold that holds the majority among the top <it>k </it>nearest neighbors (<it>k</it>-NN). In Figure <figr fid="F9">9(b)</figr>, 3-NN yields a better accuracy than 5-NN in six test sets. 
Also, we find that 3-NN achieves a better accuracy than NN in <m:math name="1471-2105-7-362-i39" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:mi>v</m:mi><m:mn>1.65</m:mn></m:mrow><m:mrow><m:mi>v</m:mi><m:mn>1.67</m:mn><m:mo>,</m:mo><m:mi>k</m:mi><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>w</m:mi><m:mi>n</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGynaudabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeG4naCJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@40A0@</m:annotation></m:semantics></m:math>. Unfortunately, 3-NN does not perform as well as NN on the other test sets due to the existence of two or more outliers in the 3-NN selection. In general, the <it>k</it>-NN classification method simply takes the majority of the top <it>k </it>nearest neighbors without considering the ranking information of nearest neighbor proteins.</p>
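               <p>The outlier effect and the majority vote described above can be illustrated with a small Python sketch. The 2-D feature vectors, the fold labels and the function name <it>knn_fold </it>are hypothetical and chosen only to reproduce the situation in which an outlier of fold F1 is the nearest neighbor of a query protein that belongs to fold F2; the tie-breaking rule is also an assumption of this sketch.</p>

```python
import math
from collections import Counter


def knn_fold(query, training, k=1):
    """Assign a fold by majority vote among the k nearest training vectors.

    `training` is a list of (feature_vector, fold_label) pairs and distances
    are Euclidean. Ties in the vote go to the fold of the closest neighbor
    among the tied folds (an assumption of this sketch).
    """
    ranked = sorted(training, key=lambda item: math.dist(query, item[0]))
    top = [fold for _, fold in ranked[:k]]
    votes = Counter(top)
    best_count = max(votes.values())
    # `top` is ordered by distance, so the first fold with the best count wins.
    return next(fold for fold in top if votes[fold] == best_count)


# Hypothetical 2-D feature vectors: fold F1 clusters near the origin but has
# one outlier close to the F2 cluster.
training = [
    ((0.0, 0.0), "F1"), ((0.5, 0.2), "F1"), ((0.1, 0.6), "F1"),
    ((4.6, 5.0), "F1"),  # outlier of F1
    ((5.5, 5.0), "F2"), ((5.0, 5.6), "F2"), ((6.0, 6.0), "F2"),
]
t = (4.9, 5.0)  # query protein whose true fold is F2

fold_nn = knn_fold(t, training, k=1)   # "F1": the outlier is the nearest neighbor
fold_3nn = knn_fold(t, training, k=3)  # "F2": two of the three neighbors are in F2
```

               <p>With a second F1 outlier near the query, the 3-NN vote would also fail, because the vote counts neighbors without regard to their rank or distance.</p>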
               <fig id="F9">
                  <title>
                     <p>Figure 9</p>
                  </title>
                  <caption>
                     <p>A comparison of classification performance between <it>E-Predict, NN</it>, 3-NN, 5-NN, and <it>C4.5 DT </it>classifiers using (a) testing proteins in <m:math name="1471-2105-7-362-i2" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:msub><m:mi>v</m:mi><m:mn>1</m:mn></m:msub></m:mrow><m:mrow><m:msub><m:mi>v</m:mi><m:mn>2</m:mn></m:msub><m:mo>,</m:mo><m:mi>k</m:mi><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>w</m:mi><m:mi>n</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODay3aaSbaaWqaaiabigdaXaqabaaaleaacqWG2bGDdaWgaaadbaGaeGOmaidabeaaliabcYcaSiabdUgaRjabd6gaUjabd+gaVjabdEha3jabd6gaUbaaaaa@3B62@</m:annotation></m:semantics></m:math> which are selected from the SCOP folds in <it>v</it><sub>2 </sub>that have at least one protein in <it>v</it><sub>1 </sub>(b) testing proteins in <m:math name="1471-2105-7-362-i2" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:msub><m:mi>v</m:mi><m:mn>1</m:mn></m:msub></m:mrow><m:mrow><m:msub><m:mi>v</m:mi><m:mn>2</m:mn></m:msub><m:mo>,</m:mo><m:mi>k</m:mi><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>w</m:mi><m:mi>n</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODay3aaSbaaWqaaiabigdaXaqabaaaleaacqWG2bGDdaWgaaadbaGaeGOmaidabeaaliabcYcaSiabdUgaRjabd6gaUjabd+gaVjabdEha3jabd6gaUbaaaaa@3B62@</m:annotation></m:semantics></m:math> which are selected from the SCOP folds in <it>v</it><sub>2 </sub>that have at least 10 proteins in <it>v</it><sub>1</sub></p>
                  </caption>
                  <text>
                     <p>A comparison of classification performance between <it>E-Predict, NN</it>, 3-NN, 5-NN, and <it>C4.5 DT </it>classifiers using (a) testing proteins in <m:math name="1471-2105-7-362-i2" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:msub><m:mi>v</m:mi><m:mn>1</m:mn></m:msub></m:mrow><m:mrow><m:msub><m:mi>v</m:mi><m:mn>2</m:mn></m:msub><m:mo>,</m:mo><m:mi>k</m:mi><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>w</m:mi><m:mi>n</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODay3aaSbaaWqaaiabigdaXaqabaaaleaacqWG2bGDdaWgaaadbaGaeGOmaidabeaaliabcYcaSiabdUgaRjabd6gaUjabd+gaVjabdEha3jabd6gaUbaaaaa@3B62@</m:annotation></m:semantics></m:math> which are selected from the SCOP folds in <it>v</it><sub>2 </sub>that have at least one protein in <it>v</it><sub>1</sub>; (b) testing proteins in <m:math name="1471-2105-7-362-i2" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:msub><m:mi>v</m:mi><m:mn>1</m:mn></m:msub></m:mrow><m:mrow><m:msub><m:mi>v</m:mi><m:mn>2</m:mn></m:msub><m:mo>,</m:mo><m:mi>k</m:mi><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>w</m:mi><m:mi>n</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODay3aaSbaaWqaaiabigdaXaqabaaaleaacqWG2bGDdaWgaaadbaGaeGOmaidabeaaliabcYcaSiabdUgaRjabd6gaUjabd+gaVjabdEha3jabd6gaUbaaaaa@3B62@</m:annotation></m:semantics></m:math> which are selected from the SCOP folds in <it>v</it><sub>2 </sub>that have at least 10 proteins in <it>v</it><sub>1</sub>.</p>
                  </text>
                  <graphic file="1471-2105-7-362-9"/>
               </fig>
               <fig id="F10">
                  <title>
                     <p>Figure 10</p>
                  </title>
                  <caption>
                     <p>An example of <it>E_Measure </it>calculations for two SCOP folds in a list of nearest neighbor proteins</p>
                  </caption>
                  <text>
                     <p>An example of <it>E_Measure </it>calculations for two SCOP folds in a list of nearest neighbor proteins.</p>
                  </text>
                  <graphic file="1471-2105-7-362-10"/>
               </fig>
               <p>In this work, we have developed the <it>E-Predict </it>algorithm, which applies the <it>E</it>_<it>Measure </it>metric <abbrgrp><abbr bid="B21">21</abbr></abbrgrp> to exploit the ranking information of nearest neighbor proteins. <it>E_Measure </it>was originally developed to evaluate the effectiveness of retrieval systems in <it>IR</it>: the more <it>relevant </it>documents a system retrieves at high ranks, the higher its retrieval accuracy. In the context of <it>IR</it>, <it>Precision </it>and <it>Recall </it>are two commonly used metrics for evaluating retrieval performance. Let <it>n</it><sub><it>t </it></sub>be the total number of <it>relevant </it>documents in the database for a certain query <it>t </it>and <it>s</it>(<it>R</it>,<it>i</it>) be the rank of the <it>i</it><sup><it>th </it></sup><it>relevant </it>document in the retrieved document set <it>R </it>with 1 &#8804; <it>i </it>&#8804; <it>n</it><sub><it>t</it></sub>. <it>Precision </it>is the ratio of the number of <it>relevant </it>documents retrieved to the total number of documents retrieved.</p>
               <p>
                  <m:math name="1471-2105-7-362-i40" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:mi>P</m:mi>
                           <m:mi>r</m:mi>
                           <m:mi>e</m:mi>
                           <m:mi>c</m:mi>
                           <m:mi>i</m:mi>
                           <m:mi>s</m:mi>
                           <m:mi>i</m:mi>
                           <m:mi>o</m:mi>
                           <m:mi>n</m:mi>
                           <m:mo stretchy="false">(</m:mo>
                           <m:mi>i</m:mi>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo>=</m:mo>
                           <m:mfrac>
                              <m:mi>i</m:mi>
                              <m:mrow>
                                 <m:mi>s</m:mi>
                                 <m:mo stretchy="false">(</m:mo>
                                 <m:mi>R</m:mi>
                                 <m:mo>,</m:mo>
                                 <m:mi>i</m:mi>
                                 <m:mo stretchy="false">)</m:mo>
                              </m:mrow>
                           </m:mfrac>
                           <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                           <m:mrow>
                              <m:mo>(</m:mo>
                              <m:mn>3</m:mn>
                              <m:mo>)</m:mo>
                           </m:mrow>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaaieGacqWFqbaucqWFYbGCcqWGLbqzcqWGJbWycqWGPbqAcqWGZbWCcqWGPbqAcqWGVbWBcqWGUbGBcqGGOaakcqWGPbqAcqGGPaqkcqGH9aqpdaWcaaqaaiabdMgaPbqaaiabdohaZjabcIcaOiabdkfasjabcYcaSiabdMgaPjabcMcaPaaacaWLjaGaaCzcamaabmGabaGaeG4mamdacaGLOaGaayzkaaaaaa@48A2@</m:annotation>
                     </m:semantics>
                  </m:math>
               </p>
               <p>For example, if the second <it>relevant </it>document is ranked seventh in <it>R </it>(i.e., <it>s</it>(<it>R</it>, 2) = 7), then <it>Precision</it>(2) = <m:math name="1471-2105-7-362-i41" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mfrac><m:mn>2</m:mn><m:mn>7</m:mn></m:mfrac></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaadaWcaaqaaiabikdaYaqaaiabiEda3aaaaaa@2EAA@</m:annotation></m:semantics></m:math>. <it>Recall </it>is the ratio of the number of <it>relevant </it>documents <it>i </it>retrieved to the total number of <it>relevant </it>documents <it>n</it><sub><it>t </it></sub>in the database.</p>
               <p>
                  <m:math name="1471-2105-7-362-i42" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:mi>R</m:mi>
                           <m:mi>e</m:mi>
                           <m:mi>c</m:mi>
                           <m:mi>a</m:mi>
                           <m:mi>l</m:mi>
                           <m:mi>l</m:mi>
                           <m:mo stretchy="false">(</m:mo>
                           <m:mi>i</m:mi>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo>=</m:mo>
                           <m:mfrac>
                              <m:mi>i</m:mi>
                              <m:mrow>
                                 <m:msub>
                                    <m:mi>n</m:mi>
                                    <m:mi>t</m:mi>
                                 </m:msub>
                              </m:mrow>
                           </m:mfrac>
                           <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                           <m:mrow>
                              <m:mo>(</m:mo>
                              <m:mn>4</m:mn>
                              <m:mo>)</m:mo>
                           </m:mrow>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaaieGacqWFsbGucqWFLbqzcqWGJbWycqWGHbqycqWGSbaBcqWGSbaBcqGGOaakcqWGPbqAcqGGPaqkcqGH9aqpdaWcaaqaaiabdMgaPbqaaiabd6gaUnaaBaaaleaacqWG0baDaeqaaaaakiaaxMaacaWLjaWaaeWaceaacqaI0aanaiaawIcacaGLPaaaaaa@40DA@</m:annotation>
                     </m:semantics>
                  </m:math>
               </p>
               <p>For example, if there exist 11 <it>relevant </it>documents for query <it>t</it>, the second <it>relevant </it>document in the results will have <it>Recall</it>(2) = <m:math name="1471-2105-7-362-i43" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mfrac><m:mn>2</m:mn><m:mrow><m:mn>11</m:mn></m:mrow></m:mfrac></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaadaWcaaqaaiabikdaYaqaaiabigdaXiabigdaXaaaaaa@2F8E@</m:annotation></m:semantics></m:math>. <it>E</it>_<it>Measure </it>combines <it>Precision </it>and <it>Recall </it>into a single measure of retrieval accuracy, using a weighting factor <it>b</it>, as shown in the following equation:</p>
               <p>
                  <m:math name="1471-2105-7-362-i44" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:mi>E</m:mi>
                           <m:mo>_</m:mo>
                           <m:mi>M</m:mi>
                           <m:mi>e</m:mi>
                           <m:mi>a</m:mi>
                           <m:mi>s</m:mi>
                           <m:mi>u</m:mi>
                           <m:mi>r</m:mi>
                           <m:mi>e</m:mi>
                           <m:mo stretchy="false">(</m:mo>
                           <m:mi>i</m:mi>
                           <m:mo>,</m:mo>
                           <m:mi>b</m:mi>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo>=</m:mo>
                           <m:mn>1</m:mn>
                           <m:mo>&#8722;</m:mo>
                           <m:mfrac>
                              <m:mrow>
                                 <m:mn>1</m:mn>
                                 <m:mo>+</m:mo>
                                 <m:msup>
                                    <m:mi>b</m:mi>
                                    <m:mn>2</m:mn>
                                 </m:msup>
                              </m:mrow>
                              <m:mrow>
                                 <m:mfrac>
                                    <m:mn>1</m:mn>
                                    <m:mrow>
                                       <m:mi>P</m:mi>
                                       <m:mi>r</m:mi>
                                       <m:mi>e</m:mi>
                                       <m:mi>c</m:mi>
                                       <m:mi>i</m:mi>
                                       <m:mi>s</m:mi>
                                       <m:mi>i</m:mi>
                                       <m:mi>o</m:mi>
                                       <m:mi>n</m:mi>
                                       <m:mo stretchy="false">(</m:mo>
                                       <m:mi>i</m:mi>
                                       <m:mo stretchy="false">)</m:mo>
                                    </m:mrow>
                                 </m:mfrac>
                                 <m:mo>+</m:mo>
                                 <m:mfrac>
                                    <m:mrow>
                                       <m:msup>
                                          <m:mi>b</m:mi>
                                          <m:mn>2</m:mn>
                                       </m:msup>
                                    </m:mrow>
                                    <m:mrow>
                                       <m:mi>R</m:mi>
                                       <m:mi>e</m:mi>
                                       <m:mi>c</m:mi>
                                       <m:mi>a</m:mi>
                                       <m:mi>l</m:mi>
                                       <m:mi>l</m:mi>
                                       <m:mo stretchy="false">(</m:mo>
                                       <m:mi>i</m:mi>
                                       <m:mo stretchy="false">)</m:mo>
                                    </m:mrow>
                                 </m:mfrac>
                              </m:mrow>
                           </m:mfrac>
                           <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                           <m:mrow>
                              <m:mo>(</m:mo>
                              <m:mn>5</m:mn>
                              <m:mo>)</m:mo>
                           </m:mrow>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGfbqrcqGGFbWxcqWGnbqtcqWGLbqzcqWGHbqycqWGZbWCcqWG1bqDcqWGYbGCcqWGLbqzcqGGOaakcqWGPbqAcqGGSaalcqWGIbGycqGGPaqkcqGH9aqpcqaIXaqmcqGHsisldaWcaaqaaiabigdaXiabgUcaRiabdkgaInaaCaaaleqabaGaeGOmaidaaaGcbaWaaSaaaeaacqaIXaqmaeaaieGacqWFqbaucqWFYbGCcqWGLbqzcqWGJbWycqWGPbqAcqWGZbWCcqWGPbqAcqWGVbWBcqWGUbGBcqGGOaakcqWGPbqAcqGGPaqkaaGaey4kaSYaaSaaaeaacqWGIbGydaahaaWcbeqaaiabikdaYaaaaOqaaiab=jfasjab=vgaLjabdogaJjabdggaHjabdYgaSjabdYgaSjabcIcaOiabdMgaPjabcMcaPaaaaaGaaCzcaiaaxMaadaqadiqaaiabiwda1aGaayjkaiaawMcaaaaa@6726@</m:annotation>
                     </m:semantics>
                  </m:math>
               </p>
               <p>When a <it>relevant </it>document is highly ranked, a low <it>E</it>_<it>Measure </it>is expected. The <it>effectiveness </it>of a retrieval system <it>&#962; </it>can be evaluated by the summation of <it>E_Measures </it>for all <it>n</it><sub><it>t </it></sub><it>relevant </it>documents.</p>
               <p>
                  <m:math name="1471-2105-7-362-i45" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:msubsup>
                              <m:mi>E</m:mi>
                              <m:mrow>
                                 <m:mi>s</m:mi>
                                 <m:mi>u</m:mi>
                                 <m:mi>m</m:mi>
                              </m:mrow>
                              <m:mi>t</m:mi>
                           </m:msubsup>
                           <m:mo stretchy="false">(</m:mo>
                           <m:mi>&#962;</m:mi>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo>=</m:mo>
                           <m:mstyle displaystyle="true">
                              <m:munderover>
                                 <m:mo>&#8721;</m:mo>
                                 <m:mrow>
                                    <m:mi>i</m:mi>
                                    <m:mo>=</m:mo>
                                    <m:mn>1</m:mn>
                                 </m:mrow>
                                 <m:mrow>
                                    <m:msub>
                                       <m:mi>n</m:mi>
                                       <m:mi>t</m:mi>
                                    </m:msub>
                                 </m:mrow>
                              </m:munderover>
                              <m:mrow>
                                 <m:mi>E</m:mi>
                                 <m:mo>_</m:mo>
                                 <m:mi>M</m:mi>
                                 <m:mi>e</m:mi>
                                 <m:mi>a</m:mi>
                                 <m:mi>s</m:mi>
                                 <m:mi>u</m:mi>
                                 <m:mi>r</m:mi>
                                 <m:mi>e</m:mi>
                              </m:mrow>
                           </m:mstyle>
                           <m:mo stretchy="false">(</m:mo>
                           <m:mi>i</m:mi>
                           <m:mo>,</m:mo>
                           <m:mi>b</m:mi>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                           <m:mrow>
                              <m:mo>(</m:mo>
                              <m:mn>6</m:mn>
                              <m:mo>)</m:mo>
                           </m:mrow>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGfbqrdaqhaaWcbaGaem4CamNaemyDauNaemyBa0gabaGaemiDaqhaaOGaeiikaGccciGae8NWdyLaeiykaKIaeyypa0ZaaabCaeaacqWGfbqrcqGGFbWxcqWGnbqtcqWGLbqzcqWGHbqycqWGZbWCcqWG1bqDcqWGYbGCcqWGLbqzaSqaaiabdMgaPjabg2da9iabigdaXaqaaiabd6gaUnaaBaaameaacqWG0baDaeqaaaqdcqGHris5aOGaeiikaGIaemyAaKMaeiilaWIaemOyaiMaeiykaKIaaCzcaiaaxMaadaqadiqaaiabiAda2aGaayjkaiaawMcaaaaa@556F@</m:annotation>
                     </m:semantics>
                  </m:math>
               </p>
               <p>In practice, the best <it>IR </it>system is the one with the smallest <it>E_sum</it>(<it>&#962;</it>).</p>
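               <p>Eqs. (3) to (6) can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in this work; the function names and the representation of a result list as an increasing list of ranks <it>s</it>(<it>R</it>, <it>i</it>) are assumptions made for the example.</p>
               <p>
```python
def precision(i, rank):
    """Eq. (3): i relevant documents among the first `rank` results, rank = s(R, i)."""
    return i / rank


def recall(i, n_t):
    """Eq. (4): i of the n_t relevant documents have been retrieved."""
    return i / n_t


def e_measure(i, rank, n_t, b=1.0):
    """Eq. (5): 0 is a perfect score; values near 1 indicate poor retrieval."""
    b2 = b * b
    return 1.0 - (1.0 + b2) / (1.0 / precision(i, rank) + b2 / recall(i, n_t))


def e_sum(ranks, b=1.0):
    """Eq. (6): sum E_Measure over all n_t relevant documents.

    `ranks` lists s(R, 1), ..., s(R, n_t) in increasing order
    (an assumed input format, not taken from the paper).
    """
    n_t = len(ranks)
    return sum(e_measure(i, rank, n_t, b) for i, rank in enumerate(ranks, start=1))


# Example from the text: the second relevant document is ranked seventh,
# and there are 11 relevant documents in total.
print(precision(2, 7))             # equals 2/7
print(recall(2, 11))               # equals 2/11
print(e_measure(2, 7, 11, b=1.0))  # 1 - 2 / (7/2 + 11/2) = 7/9
```
               </p>
               <p>Under this sketch, a system that returns its three relevant documents at ranks 1, 2 and 3 receives a smaller sum than one that returns them at ranks 4, 8 and 9, in line with the statement that the smallest sum identifies the best system.</p>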
               <p>Instead of directly applying the above-mentioned evaluation method to our SCOP fold classification task, our <it>E-Predict </it>algorithm extends the method by visiting candidate folds in the top <it>k </it>nearest neighbor results <it>R</it>, and then ranking the folds using <it>E</it>_<it>Measure</it>. The <it>E-Predict </it>algorithm is shown in Appendix 1. In lines 2 to 16, the algorithm collects the SCOP folds of retrieved proteins in <it>R </it>into a set of candidate SCOP folds, &#8719;, with each candidate fold having at least <it>n</it><sub><it>t </it></sub>proteins appearing in <it>R</it>. The algorithm then computes an evaluation score <m:math name="1471-2105-7-362-i46" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>E</m:mi><m:mrow><m:mi>s</m:mi><m:mi>u</m:mi><m:mi>m</m:mi></m:mrow><m:mi>t</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGfbqrdaqhaaWcbaGaem4CamNaemyDauNaemyBa0gabaGaemiDaqhaaaaa@33A2@</m:annotation></m:semantics></m:math>(<it>F</it>) for each candidate SCOP fold, <it>F </it>&#8712; &#8719;, by accumulating <it>E</it>_<it>Measures </it>of the top <it>n</it><sub><it>t </it></sub>proteins labeled with <it>F</it>, as shown from lines 17 to 26. Our approach assumes that the most relevant SCOP fold assigned to a newly-discovered protein <it>t </it>should have proteins that are highly ranked in <it>R</it>. For example, if <it>F</it><sub>1 </sub>&#8712; &#8719; is the candidate SCOP fold to be evaluated, we revisit <it>R </it>by assigning the label '<it>relevant' </it>to proteins that are from <it>F</it><sub>1 </sub>and the label '<it>irrelevant' </it>to those from folds other than <it>F</it><sub>1</sub>. Among these <it>relevant </it>proteins, we select the top <it>n</it><sub><it>t </it></sub>proteins and form <it>R</it><sub><it>F</it>1 </sub>for our classification process. The <it>Result F</it><sub>1 </sub>in Figure <figr fid="F10">10</figr> shows that the top two proteins (<it>n</it><sub><it>t </it></sub>= 2) labeled with <it>F</it><sub>1 </sub>are ranked at 1 and 10.</p>
               <p>For fold <it>F</it><sub>1</sub>, the pairs of (precision, recall) for these two proteins are (<it>Precision</it>(1) = <m:math name="1471-2105-7-362-i47" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mfrac><m:mn>1</m:mn><m:mn>1</m:mn></m:mfrac></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaadaWcaaqaaiabigdaXaqaaiabigdaXaaaaaa@2E9C@</m:annotation></m:semantics></m:math>, <it>Recall</it>(1) = <m:math name="1471-2105-7-362-i48" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mfrac><m:mn>1</m:mn><m:mn>2</m:mn></m:mfrac></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaadaWcaaqaaiabigdaXaqaaiabikdaYaaaaaa@2E9E@</m:annotation></m:semantics></m:math>) and (<it>Precision</it>(2) = <m:math name="1471-2105-7-362-i49" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mfrac><m:mn>2</m:mn><m:mrow><m:mn>10</m:mn></m:mrow></m:mfrac></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaadaWcaaqaaiabikdaYaqaaiabigdaXiabicdaWaaaaaa@2F8C@</m:annotation></m:semantics></m:math>, <it>Recall</it>(2) = <m:math name="1471-2105-7-362-i50" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mfrac><m:mn>2</m:mn><m:mn>2</m:mn></m:mfrac></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaadaWcaaqaaiabikdaYaqaaiabikdaYaaaaaa@2EA0@</m:annotation></m:semantics></m:math>). Applying Eq.(5) with <it>b </it>= 1.0, we obtain <it>E_Measure</it>(1, 1.0) = <m:math name="1471-2105-7-362-i51" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mfrac><m:mn>1</m:mn><m:mn>3</m:mn></m:mfrac></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaadaWcaaqaaiabigdaXaqaaiabiodaZaaaaaa@2EA0@</m:annotation></m:semantics></m:math> and <it>E</it>_<it>Measure</it>(2, 1.0) = <m:math name="1471-2105-7-362-i52" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mfrac><m:mn>2</m:mn><m:mn>3</m:mn></m:mfrac></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaadaWcaaqaaiabikdaYaqaaiabiodaZaaaaaa@2EA2@</m:annotation></m:semantics></m:math>. Substituting these two values into Eq.(6), we compute <m:math name="1471-2105-7-362-i53" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>E</m:mi><m:mrow><m:mi>s</m:mi><m:mi>u</m:mi><m:mi>m</m:mi></m:mrow><m:mi>t</m:mi></m:msubsup><m:mrow><m:mo>(</m:mo><m:mrow><m:msub><m:mi>&#962;</m:mi><m:mrow><m:msub><m:mi>F</m:mi><m:mn>1</m:mn></m:msub></m:mrow></m:msub></m:mrow><m:mo>)</m:mo></m:mrow></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGfbqrdaqhaaWcbaGaem4CamNaemyDauNaemyBa0gabaGaemiDaqhaaOWaaeWaceaaiiGacqWFcpGvdaWgaaWcbaGaemOray0aaSbaaWqaaiabigdaXaqabaaaleqaaaGccaGLOaGaayzkaaaaaa@3956@</m:annotation></m:semantics></m:math> = 1.00. Similarly, for candidate fold <it>F</it><sub>2</sub>, using <it>Result F</it><sub>2 </sub>of Figure <figr fid="F10">10</figr>, the effectiveness of <it>F</it><sub>2 </sub>is <m:math name="1471-2105-7-362-i54" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>E</m:mi><m:mrow><m:mi>s</m:mi><m:mi>u</m:mi><m:mi>m</m:mi></m:mrow><m:mi>t</m:mi></m:msubsup><m:mrow><m:mo>(</m:mo><m:mrow><m:msub><m:mi>&#962;</m:mi><m:mrow><m:msub><m:mi>F</m:mi><m:mn>2</m:mn></m:msub></m:mrow></m:msub></m:mrow><m:mo>)</m:mo></m:mrow></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGfbqrdaqhaaWcbaGaem4CamNaemyDauNaemyBa0gabaGaemiDaqhaaOWaaeWaceaaiiGacqWFcpGvdaWgaaWcbaGaemOray0aaSbaaWqaaiabikdaYaqabaaaleqaaaGccaGLOaGaayzkaaaaaa@3958@</m:annotation></m:semantics></m:math> = 0.70.</p>
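               <p>The candidate-scoring step described above can be sketched as follows. This is an illustrative reading of the description, not the code of Appendix 1: the function names and the input format (the result list <it>R </it>reduced to fold labels, nearest neighbor first) are assumptions. Fold F1 is placed at ranks 1 and 10 as in the text; the placement of F2 at ranks 2 and 3 is hypothetical, chosen only because it yields 0.70 under Eq.(5), and is not read from Figure 10.</p>
               <p>
```python
from collections import defaultdict


def e_sum(ranks, b):
    """Eqs. (3)-(6) for one fold; `ranks` are the 1-based ranks of its top n_t proteins."""
    n_t = len(ranks)
    total = 0.0
    for i, rank in enumerate(ranks, start=1):
        prec, rec = i / rank, i / n_t
        total += 1.0 - (1.0 + b * b) / (1.0 / prec + b * b / rec)
    return total


def score_candidate_folds(neighbor_folds, n_t, b=1.0):
    """Return {fold: E_sum} for every fold with at least n_t proteins in R.

    `neighbor_folds` is the ranked result list R reduced to fold labels,
    nearest neighbor first (an assumed input format).
    """
    ranks_by_fold = defaultdict(list)
    for rank, fold in enumerate(neighbor_folds, start=1):
        ranks_by_fold[fold].append(rank)
    return {
        fold: e_sum(ranks[:n_t], b)
        for fold, ranks in ranks_by_fold.items()
        if len(ranks) >= n_t
    }


# F1 at ranks 1 and 10 as in the text; F2 at ranks 2 and 3 is hypothetical.
# G4..G9 stand for folds that occur only once and so are not candidates.
R = ["F1", "F2", "F2", "G4", "G5", "G6", "G7", "G8", "G9", "F1"]
scores = score_candidate_folds(R, n_t=2, b=1.0)
best = min(scores, key=scores.get)  # fold with the smallest E_sum
print(scores, best)
```
               </p>
               <p>With these inputs the sketch scores F1 at 1.00 and F2 at 0.70 (up to floating-point rounding), so F2 is the fold with the minimum score even though the single nearest neighbor belongs to F1.</p>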
               <p>According to Figure <figr fid="F3">3</figr>, there exists a significant number of small-size folds in the SCOP <it>v1</it>.69 release with 143 folds containing only one protein chain and 140 folds with two protein chains. When a newly-discovered protein belongs to a small-size fold, the algorithm might give a false positive due to insufficient ground truth data. To classify proteins in these small-size folds, we expect the NN search to retrieve a correct fold in the high-dimensional space by turning on a parameter &#955; in the <it>E-Predict </it>algorithm. Let <it>P</it><sub>0 </sub>be the nearest neighbor protein of a query <it>t </it>in <it>R </it>and <m:math name="1471-2105-7-362-i55" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>P</m:mi><m:mrow><m:mi>N</m:mi><m:mi>N</m:mi></m:mrow><m:mrow><m:mi>F</m:mi><m:mo>*</m:mo></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGqbaudaqhaaWcbaGaemOta4KaemOta4eabaGaemOrayKaeiOkaOcaaaaa@323D@</m:annotation></m:semantics></m:math> be the nearest neighbor protein in the candidate fold with the minimum <m:math name="1471-2105-7-362-i46" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>E</m:mi><m:mrow><m:mi>s</m:mi><m:mi>u</m:mi><m:mi>m</m:mi></m:mrow><m:mi>t</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGfbqrdaqhaaWcbaGaem4CamNaemyDauNaemyBa0gabaGaemiDaqhaaaaa@33A2@</m:annotation></m:semantics></m:math> score (see line 28 of Appendix 1). The algorithm computes the structural variation values, <it>S</it>, for one pair (<it>t</it>,<it>P</it><sub>0</sub>) and the other pair (<it>t</it>, <m:math name="1471-2105-7-362-i55" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>P</m:mi><m:mrow><m:mi>N</m:mi><m:mi>N</m:mi></m:mrow><m:mrow><m:mi>F</m:mi><m:mo>*</m:mo></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGqbaudaqhaaWcbaGaemOta4KaemOta4eabaGaemOrayKaeiOkaOcaaaaa@323D@</m:annotation></m:semantics></m:math>) using the function in Eq.(2). The algorithm finally assigns the candidate fold with the minimum <it>S </it>value to the newly-discovered protein.</p>
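               <p>The &#955; step can be sketched as a comparison between two candidate proteins. The sketch below is only an illustration under stated assumptions: the structural variation function of Eq.(2) is not reproduced here, so a plain Euclidean distance stands in for it; the function names, the (fold, descriptor) pairs, the tie-breaking rule, and the behavior when &#955; is off (returning the fold with the minimum score) are likewise assumptions.</p>
               <p>
```python
import math


def assign_fold(query, p0, p_star, structural_variation, lam_on=True):
    """Pick between the fold of P0 and the minimum-E_sum fold F*.

    p0 and p_star are (fold_label, descriptor) pairs: p0 is the overall
    nearest neighbor of the query, p_star the nearest neighbor within F*.
    """
    fold0, desc0 = p0
    fold_star, desc_star = p_star
    if not lam_on or fold0 == fold_star:
        return fold_star
    s0 = structural_variation(query, desc0)
    s_star = structural_variation(query, desc_star)
    # Ties keep the E-Predict fold F* (an assumption; the text does not say).
    return fold0 if s_star > s0 else fold_star


def placeholder_s(a, b):
    """Stand-in for Eq. (2): Euclidean distance, NOT the paper's function S."""
    return math.dist(a, b)


query = (0.0, 0.0)
p0 = ("small_fold", (1.0, 0.0))    # hypothetical nearest neighbor overall
p_star = ("large_fold", (3.0, 4.0))  # hypothetical nearest neighbor within F*
print(assign_fold(query, p0, p_star, placeholder_s))
```
               </p>
               <p>Here the placeholder gives a smaller value for the pair involving the overall nearest neighbor, so the sketch assigns its fold; with <it>lam_on </it>set to false it returns the fold with the minimum score instead.</p>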
               <p>In the <it>E-Predict </it>algorithm, there exist two parameters, <it>b </it>and <it>n</it><sub><it>t</it></sub>, that affect classification results. From our empirical observations, the best setting for the latest SCOP <it>v</it>1.69 release has <it>b </it>= 1.5 and <it>n</it><sub><it>t </it></sub>= 6 with &#955; = <it>on </it>and <it>k </it>set to 500 nearest neighbors. Figure <figr fid="F9">9</figr> shows comparisons of classification accuracies among <it>E-Predict</it>, NN, 3-NN, 5-NN, and <it>C4.5 DT </it>across seven test sets from <m:math name="1471-2105-7-362-i56" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:mi>v</m:mi><m:mn>1.55</m:mn></m:mrow><m:mrow><m:mi>v</m:mi><m:mn>1.57</m:mn><m:mo>,</m:mo><m:mi>k</m:mi><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>w</m:mi><m:mi>n</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGynauJaeGynaudabaGaemODayNaeGymaeJaeiOla4IaeGynauJaeG4naCJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@409C@</m:annotation></m:semantics></m:math> to <m:math name="1471-2105-7-362-i57" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:mi>v</m:mi><m:mn>1.67</m:mn></m:mrow><m:mrow><m:mi>v</m:mi><m:mn>1.69</m:mn><m:mo>,</m:mo><m:mi>k</m:mi><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>w</m:mi><m:mi>n</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeG4naCdabaGaemODayNaeGymaeJaeiOla4IaeGOnayJaeGyoaKJaeiilaWIaem4AaSMaemOBa4Maem4Ba8Maem4DaCNaemOBa4gaaaaa@40A8@</m:annotation></m:semantics></m:math>. For all test sets, <it>E-Predict </it>achieves higher classification accuracy than <it>k</it>-NN and <it>C4.5 DT</it>.</p>
            </sec>
         </sec>
         <sec>
            <st>
               <p>Recognizing the novel folds for newly-discovered proteins</p>
            </st>
            <p>Classifying newly-discovered proteins into either the <it>novel folds </it>or the <it>known folds </it>has been identified as a two-class recognition problem <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>. Let <it>v</it><sub>1</sub>, <it>v</it><sub>2 </sub>and <it>v</it><sub>3 </sub>denote three different SCOP releases in chronological order. To classify proteins from <m:math name="1471-2105-7-362-i58" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:msub><m:mi>v</m:mi><m:mn>2</m:mn></m:msub></m:mrow><m:mrow><m:msub><m:mi>v</m:mi><m:mn>3</m:mn></m:msub><m:mo>,</m:mo><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>v</m:mi><m:mi>e</m:mi><m:mi>l</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODay3aaSbaaWqaaiabikdaYaqabaaaleaacqWG2bGDdaWgaaadbaGaeG4mamdabeaaliabcYcaSiabd6gaUjabd+gaVjabdAha2jabdwgaLjabdYgaSbaaaaa@3B54@</m:annotation></m:semantics></m:math>, our algorithm relies on ground truth data from <m:math name="1471-2105-7-362-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:msub><m:mi>v</m:mi><m:mn>1</m:mn></m:msub></m:mrow><m:mrow><m:msub><m:mi>v</m:mi><m:mn>2</m:mn></m:msub></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODay3aaSbaaWqaaiabigdaXaqabaaaleaacqWG2bGDdaWgaaadbaGaeGOmaidabeaaaaaaaa@3370@</m:annotation></m:semantics></m:math> with three features, which are derived from the result of <it>E-Predict </it>algorithm and will be discussed in great detail in the following section. In the labeling procedure of Figure <figr fid="F11">11</figr>, the algorithm first extracts the three features from proteins in <m:math name="1471-2105-7-362-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:msub><m:mi>v</m:mi><m:mn>1</m:mn></m:msub></m:mrow><m:mrow><m:msub><m:mi>v</m:mi><m:mn>2</m:mn></m:msub></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODay3aaSbaaWqaaiabigdaXaqabaaaleaacqWG2bGDdaWgaaadbaGaeGOmaidabeaaaaaaaa@3370@</m:annotation></m:semantics></m:math>. These proteins are then categorized into either the <it>known folds </it>of <it>v</it><sub>1 </sub>or the <it>novel folds </it>as our ground truth data. In the testing procedure, proteins in the <it>novel folds </it>of <m:math name="1471-2105-7-362-i58" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#916;</m:mi><m:mrow><m:msub><m:mi>v</m:mi><m:mn>2</m:mn></m:msub></m:mrow><m:mrow><m:msub><m:mi>v</m:mi><m:mn>3</m:mn></m:msub><m:mo>,</m:mo><m:mi>n</m:mi><m:mi>o</m:mi><m:mi>v</m:mi><m:mi>e</m:mi><m:mi>l</m:mi></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHuoardaqhaaWcbaGaemODay3aaSbaaWqaaiabikdaYaqabaaaleaacqWG2bGDdaWgaaadbaGaeG4mamdabeaaliabcYcaSiabd6gaUjabd+gaVjabdAha2jabdwgaLjabdYgaSbaaaaa@3B54@</m:annotation></m:semantics></m:math> are selected as our test data; they are disjoint from our ground truth data. Once the three features are extracted from the test proteins, we apply the <it>E-Predict </it>algorithm to classify the test proteins into either the <it>novel folds </it>or the <it>known folds</it>.</p>
            <fig id="F11">
               <title>
                  <p>Figure 11</p>
               </title>
               <caption>
                  <p><it>E-Predict </it>model for recognizing the novel folds for newly-discovered proteins</p>
               </caption>
               <text>
                  <p><it>E-Predict </it>model for recognizing the novel folds for newly-discovered proteins.</p>
               </text>
               <graphic file="1471-2105-7-362-11"/>
            </fig>
            <sec>
               <st>
                  <p>Feature extraction</p>
               </st>
               <p>For a newly-discovered protein <it>P</it><sub><it>N </it></sub>that does not belong to the <it>known folds</it>, we assume this protein has a low structural similarity to those proteins in the <it>known folds</it>. Under this assumption, we identify the three features that are used to achieve the novel fold recognition task. Figure <figr fid="F12">12</figr> illustrates an example showing the three features for <it>P</it><sub><it>N</it></sub>. The first feature, <m:math name="1471-2105-7-362-i59" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>E</m:mi><m:mrow><m:mi>s</m:mi><m:mi>u</m:mi><m:mi>m</m:mi></m:mrow><m:mrow><m:msub><m:mi>P</m:mi><m:mi>N</m:mi></m:msub></m:mrow></m:msubsup><m:mrow><m:mo>(</m:mo><m:mrow><m:msub><m:mi>&#962;</m:mi><m:mrow><m:mi>F</m:mi><m:mo>*</m:mo></m:mrow></m:msub></m:mrow><m:mo>)</m:mo></m:mrow></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGfbqrdaqhaaWcbaGaem4CamNaemyDauNaemyBa0gabaGaemiuaa1aaSbaaWqaaiabd6eaobqabaaaaOWaaeWaceaaiiGacqWFcpGvdaWgaaWcbaGaemOrayKaeiOkaOcabeaaaOGaayjkaiaawMcaaaaa@3A14@</m:annotation></m:semantics></m:math>, is the minimum evaluation score of <it>P</it><sub><it>N </it></sub>using the <it>E-Predict </it>algorithm with a suggested known fold <it>F</it>*. The second feature, <it>Dist</it>, represents the Euclidean distance between <it>P</it><sub><it>N </it></sub>and <m:math name="1471-2105-7-362-i55" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>P</m:mi><m:mrow><m:mi>N</m:mi><m:mi>N</m:mi></m:mrow><m:mrow><m:mi>F</m:mi><m:mo>*</m:mo></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGqbaudaqhaaWcbaGaemOta4KaemOta4eabaGaemOrayKaeiOkaOcaaaaa@323D@</m:annotation></m:semantics></m:math>, which denotes the nearest neighbor protein of <it>P</it><sub><it>N </it></sub>labeled with fold <it>F</it>*. The third feature, <it>S</it>, is the structural variation value between <it>P</it><sub><it>N </it></sub>and <m:math name="1471-2105-7-362-i55" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>P</m:mi><m:mrow><m:mi>N</m:mi><m:mi>N</m:mi></m:mrow><m:mrow><m:mi>F</m:mi><m:mo>*</m:mo></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGqbaudaqhaaWcbaGaemOta4KaemOta4eabaGaemOrayKaeiOkaOcaaaaa@323D@</m:annotation></m:semantics></m:math> using the function defined in Eq.(2). After feature extraction, these feature values are normalized between 0 and 1; each protein is then represented by a 3-D feature vector. The rationale for using these three features is in the following. Let <it>P</it><sub><it>K </it></sub>be a newly-discovered protein that has been classified in the <it>known folds</it>. If <it>P</it><sub><it>N </it></sub>is structurally dissimilar to all known protein structures from the SCOP database, then the Euclidean distance between <it>P</it><sub><it>N </it></sub>and its nearest neighbor protein in a known fold suggested by <it>E-Predict </it>is expected to be greater than the distance between <it>P</it><sub><it>K </it></sub>and its nearest neighbor protein <m:math name="1471-2105-7-362-i55" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>P</m:mi><m:mrow><m:mi>N</m:mi><m:mi>N</m:mi></m:mrow><m:mrow><m:mi>F</m:mi><m:mo>*</m:mo></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGqbaudaqhaaWcbaGaemOta4KaemOta4eabaGaemOrayKaeiOkaOcaaaaa@323D@</m:annotation></m:semantics></m:math>. Similarly, the structural variation value of <it>P</it><sub><it>N </it></sub>and its nearest neighbor protein is expected to be higher than the structural variation value of <it>P</it><sub><it>K </it></sub>and its nearest neighbor protein. Also, the minimum evaluation score of <it>P</it><sub><it>N</it></sub>, <m:math name="1471-2105-7-362-i59" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>E</m:mi><m:mrow><m:mi>s</m:mi><m:mi>u</m:mi><m:mi>m</m:mi></m:mrow><m:mrow><m:msub><m:mi>P</m:mi><m:mi>N</m:mi></m:msub></m:mrow></m:msubsup><m:mrow><m:mo>(</m:mo><m:mrow><m:msub><m:mi>&#962;</m:mi><m:mrow><m:mi>F</m:mi><m:mo>*</m:mo></m:mrow></m:msub></m:mrow><m:mo>)</m:mo></m:mrow></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGfbqrdaqhaaWcbaGaem4CamNaemyDauNaemyBa0gabaGaemiuaa1aaSbaaWqaaiabd6eaobqabaaaaOWaaeWaceaaiiGacqWFcpGvdaWgaaWcbaGaemOrayKaeiOkaOcabeaaaOGaayjkaiaawMcaaaaa@3A14@</m:annotation></m:semantics></m:math>, is expected to be higher than the score of <it>P</it><sub><it>K</it></sub>. Table <tblr tid="T7">7</tblr> lists a brief summary of expected properties of the three features for proteins in the <it>novel folds </it>and the <it>known folds</it>.</p>
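               <p>The text states only that the three feature values are normalized between 0 and 1. The following minimal sketch assumes min-max scaling, with the bounds fitted on the labeled proteins and out-of-range test values clipped; the function names and the sample numbers are hypothetical and are not taken from the original implementation.</p>
               <p><![CDATA[
```python
from typing import List, Sequence, Tuple

Feature = Tuple[float, float, float]  # (E_sum score, Dist, S)


def fit_min_max(rows: Sequence[Feature]) -> List[Tuple[float, float]]:
    """Return the (min, max) of each of the three feature columns."""
    return [(min(col), max(col)) for col in zip(*rows)]


def normalize(row: Feature, bounds: Sequence[Tuple[float, float]]) -> Feature:
    """Min-max scale one feature vector into [0, 1], clipping out-of-range values."""
    scaled = []
    for value, (lo, hi) in zip(row, bounds):
        if hi == lo:
            scaled.append(0.0)  # constant column: no spread to scale by
        else:
            scaled.append(min(1.0, max(0.0, (value - lo) / (hi - lo))))
    return (scaled[0], scaled[1], scaled[2])


# Made-up feature values for three labeled proteins.
labeled = [(2.0, 0.10, 5.0), (4.0, 0.30, 15.0), (6.0, 0.50, 25.0)]
bounds = fit_min_max(labeled)
vectors = [normalize(row, bounds) for row in labeled]
```
]]></p>
               <p>Fitting the bounds on the labeled set and reusing them for test proteins keeps both sets in the same 3-D feature space; this is a design choice of the sketch, not a documented property of the system.</p>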
               <fig id="F12">
                  <title>
                     <p>Figure 12</p>
                  </title>
                  <caption>
                     <p>An example of identifying <m:math name="1471-2105-7-362-i55" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>P</m:mi><m:mrow><m:mi>N</m:mi><m:mi>N</m:mi></m:mrow><m:mrow><m:mi>F</m:mi><m:mo>*</m:mo></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGqbaudaqhaaWcbaGaemOta4KaemOta4eabaGaemOrayKaeiOkaOcaaaaa@323D@</m:annotation></m:semantics></m:math> for a newly-discovered protein <it>P</it><sub><it>N </it></sub>in the <it>novel folds </it>by selecting the nearest neighbor protein in a fold <it>F</it>* derived from the <it>E-Predict </it>algorithm</p>
                  </caption>
                  <text>
                     <p>An example of identifying <m:math name="1471-2105-7-362-i55" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>P</m:mi><m:mrow><m:mi>N</m:mi><m:mi>N</m:mi></m:mrow><m:mrow><m:mi>F</m:mi><m:mo>*</m:mo></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGqbaudaqhaaWcbaGaemOta4KaemOta4eabaGaemOrayKaeiOkaOcaaaaa@323D@</m:annotation></m:semantics></m:math> for a newly-discovered protein <it>P</it><sub><it>N </it></sub>in the <it>novel folds </it>by selecting the nearest neighbor protein in a fold <it>F</it>* derived from the <it>E-Predict </it>algorithm.</p>
                  </text>
                  <graphic file="1471-2105-7-362-12"/>
               </fig>
               <tbl id="T7">
                  <title>
                     <p>Table 7</p>
                  </title>
                  <caption>
                     <p>A comparison of the three features for proteins in the <it>novel folds </it>and the <it>known folds</it>.</p>
                  </caption>
                  <tblbdy cols="4">
                     <r>
                        <c>
                           <p/>
                        </c>
                        <c ca="center">
                           <p>(<it>f</it><sub>1</sub>) <m:math name="1471-2105-7-362-i46" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>E</m:mi><m:mrow><m:mi>s</m:mi><m:mi>u</m:mi><m:mi>m</m:mi></m:mrow><m:mi>t</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGfbqrdaqhaaWcbaGaem4CamNaemyDauNaemyBa0gabaGaemiDaqhaaaaa@33A2@</m:annotation></m:semantics></m:math> (<it>&#962;</it><it>F</it>*)</p>
                        </c>
                        <c ca="center">
                           <p>(<it>f</it><sub>2</sub>)<it>Dist</it></p>
                        </c>
                        <c ca="center">
                           <p>(<it>f</it><sub>3</sub>)<it>S</it></p>
                        </c>
                     </r>
                     <r>
                        <c cspan="4">
                           <hr/>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>
                              <it>novel folds</it>
                           </p>
                        </c>
                        <c ca="center">
                           <p>High</p>
                        </c>
                        <c ca="center">
                           <p>High</p>
                        </c>
                        <c ca="center">
                           <p>High</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>
                              <it>known folds</it>
                           </p>
                        </c>
                        <c ca="center">
                           <p>Low</p>
                        </c>
                        <c ca="center">
                           <p>Low</p>
                        </c>
                        <c ca="center">
                           <p>Low</p>
                        </c>
                     </r>
                  </tblbdy>
               </tbl>
            </sec>
            <sec>
               <st>
                  <p>Novel fold recognition</p>
               </st>
               <p>With the three features, labeling and testing procedures can be conducted to recognize the novel folds for newly-discovered proteins. According to the statistics in Table <tblr tid="T2">2</tblr>, the majority of proteins in any given release of the SCOP database belong to the <it>known folds</it>. From our empirical observations, this imbalance biases the classifier toward the <it>known folds </it>in a 3-D feature space where the two classes overlap. To reduce this bias and the noise from the <it>known folds</it>, our model randomly selects an equal number of proteins from the <it>known folds </it>and the <it>novel folds </it>in the labeling procedure. We then apply the <it>E-Predict </it>algorithm to classify test proteins into either the <it>novel folds </it>or the <it>known folds</it>.</p>
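               <p>The balanced selection in the labeling procedure can be sketched as follows. The sketch assumes that the common sample size is the size of the smaller class, which the text does not state; the function name, the seed and the protein identifiers are hypothetical.</p>
               <p><![CDATA[
```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def balanced_sample(
    known: Sequence[T], novel: Sequence[T], seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Randomly draw the same number of proteins from each class."""
    n = min(len(known), len(novel))  # assumed: sample size of the smaller class
    rng = random.Random(seed)  # seeded so that the draw is repeatable
    return rng.sample(list(known), n), rng.sample(list(novel), n)


known_ids = [f"known_{i}" for i in range(100)]  # hypothetical identifiers
novel_ids = [f"novel_{i}" for i in range(8)]
known_pick, novel_pick = balanced_sample(known_ids, novel_ids)
```
]]></p>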
            </sec>
         </sec>
         <sec>
            <st>
               <p>Online index using the M-tree</p>
            </st>
            <p>Exhaustively searching for the nearest neighbors in a large-scale database such as SCOP is computationally expensive. To improve the efficiency of the <it>k</it>-NN search, we use an M-tree to index the high-dimensional feature vectors of proteins. The M-tree <abbrgrp><abbr bid="B27">27</abbr></abbrgrp> scales well and supports dynamic operations such as insert, delete and update. The root node of each subtree maintains two values, a prototype protein and a covering radius <it>R</it><sub><it>s</it></sub>, which together define a <it>hyper-sphere </it><it>A</it><sub><it>s </it></sub>in the high-dimensional feature space that encloses all proteins within the subtree. During the depth-first search (DFS) traversal, the M-tree algorithm maintains a priority queue with <it>k </it>slots to record the current <it>k </it>nearest neighbors. In addition, a radius <it>R</it><sub><it>q </it></sub>centered at the query protein defines a search space <it>A</it><sub><it>q</it></sub>. Initially, <it>R</it><sub><it>q </it></sub>is set to &#8734;, so <it>A</it><sub><it>q </it></sub>is unbounded. Each time a protein is inserted into the queue, <it>R</it><sub><it>q </it></sub>is updated to the maximum Euclidean distance between the query protein and the proteins currently in the queue, which shrinks the search space <it>A</it><sub><it>q</it></sub>. Two properties make the nearest neighbor search fast: 1) by applying the triangle inequality, the M-tree algorithm avoids traversing subtrees whose hyper-spheres do not overlap with <it>A</it><sub><it>q</it></sub>; and 2) <it>A</it><sub><it>q </it></sub>shrinks rapidly as proteins are inserted into the queue. In our implementation, the M-tree indices are held in memory on several servers to provide a robust, fast nearest neighbor search.</p>
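            <p>The pruning rule can be illustrated with a deliberately simplified sketch. It is not an M-tree: it uses a single flat level of hyper-spheres instead of a tree with DFS traversal, and it assumes that <it>R</it><sub><it>q </it></sub>stays infinite until the queue holds <it>k </it>entries. All names are hypothetical. A hyper-sphere is skipped when the triangle inequality shows that none of its members can lie within <it>R</it><sub><it>q </it></sub>of the query.</p>
            <p><![CDATA[
```python
import heapq
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]
Ball = Tuple[Point, float, List[Point]]  # (prototype, covering radius R_s, members)


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_ball(members: List[Point]) -> Ball:
    """Use the first member as the prototype and cover every member with R_s."""
    prototype = members[0]
    radius = max(euclidean(prototype, m) for m in members)
    return prototype, radius, members


def knn_search(
    balls: Sequence[Ball], query: Point, k: int
) -> Tuple[List[Tuple[float, Point]], int]:
    """Return the k nearest (distance, point) pairs and the number of pruned balls."""
    heap: List[Tuple[float, Point]] = []  # max-heap via negated distances
    r_q = math.inf  # search radius R_q, initially unbounded
    pruned = 0
    for prototype, radius, members in balls:
        # Triangle inequality: every member is at least d(q, prototype) - R_s away.
        if euclidean(query, prototype) - radius > r_q:
            pruned += 1
            continue
        for point in members:
            dist = euclidean(query, point)
            if len(heap) < k:
                heapq.heappush(heap, (-dist, point))
            elif dist < -heap[0][0]:
                heapq.heapreplace(heap, (-dist, point))
            if len(heap) == k:
                r_q = -heap[0][0]  # shrink R_q to the current k-th best distance
    return sorted((-neg, point) for neg, point in heap), pruned


balls = [
    build_ball([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]),
    build_ball([(10.0, 10.0), (11.0, 10.0)]),
]
neighbors, pruned = knn_search(balls, (0.2, 0.1), 2)
```
]]></p>
            <p>In this toy run the second hyper-sphere is never opened, because its closest possible member is farther from the query than the current <it>R</it><sub><it>q</it></sub>.</p>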
         </sec>
         <sec>
            <st>
               <p>Web interface</p>
            </st>
            <p>We have implemented a web interface that suggests a set of known SCOP folds and recognizes novel folds for newly-discovered proteins. Users can submit 3-D protein structures in the PDB format. Our system first converts each protein structure into a 33-D feature vector. An evaluation score for each fold is then computed from the ranked results of a nearest neighbor search. Within seconds, a ranked list of SCOP folds is displayed to the user. To aid visual inspection of the classification results, a tool is provided to superimpose a known protein from a suggested fold on the query structure using the <it>KiNG (Kinemage, Next Generation) </it>graphics package <url>http://kinemage.biochem.duke.edu/software/king.php</url>. In Figure <figr fid="F13">13</figr>, a 3-D superimposition view shows that the query protein is structurally similar to a known protein, 9<it>xim_A</it>, in the suggested fold. Our system, <it>ProteinDBS-predict</it>, is publicly accessible at <url>http://ProteinDBS.rnet.missouri.edu/E-Predict.php</url>.</p>
            <fig id="F13">
               <title>
                  <p>Figure 13</p>
               </title>
               <caption>
                  <p>The superimposition of a newly-discovered protein and a known protein chain from the top ranked SCOP fold</p>
               </caption>
               <text>
                  <p>The superimposition of a newly-discovered protein and a known protein chain from the top ranked SCOP fold.</p>
               </text>
               <graphic file="1471-2105-7-362-13"/>
            </fig>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>Both CS and PC designed the algorithm. PC implemented related programs and a web-based interface. CS supervised the whole project. DX contributed technical advice and helped test the system. PC drafted the manuscript and all authors finalized it. All authors read and approved the final manuscript.</p>
      </sec>
      <sec>
         <st>
            <p>Appendix</p>
         </st>
         <tbl id="T8">
            <title>
               <p/>
            </title>
            <caption>
               <p>Appendix 1 E-Predict Algorithm</p>
            </caption>
            <tblbdy cols="1">
               <r>
                  <c ca="left">
                     <p><b>Require</b>: <it>t</it>, <it>R</it>, <it>b</it>, <it>n</it><sub>t</sub>, <it>&#955;</it></p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>1: &#8719; = &#8709;</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>2: <b>for </b>each protein <it>p </it>&#8712; <it>R </it><b>do</b></p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>3: &#160;&#160;&#160;<b>if </b><it>p</it>&#183;<it>fold </it>&#8713; &#8719; <b>then</b></p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>4: &#160;&#160;&#160;&#160;&#160;&#160;&#8719; = &#8719; &#8746; {<it>p</it>&#183;<it>fold</it>}</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>5: &#160;&#160;&#160;&#160;&#160;&#160;<it>Count</it>[<it>p</it>&#183;<it>fold</it>] &#8592; 1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>6: &#160;&#160;&#160;<b>else</b></p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>7: &#160;&#160;&#160;&#160;&#160;&#160;<it>Count</it>[<it>p</it>&#183;<it>fold</it>] &#8592; <it>Count</it>[<it>p</it>&#183;<it>fold</it>] + 1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>8: &#160;&#160;&#160;<b>end if</b></p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>9: <b>end for</b></p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>10: <b>for </b><it>i </it>&#8592; 0 to |&#8719;| - 1 <b>do</b></p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>11: &#160;&#160;&#160;<b>if </b><it>Count</it>[<it>i</it>] &lt;<it>n</it><sub><it>t </it></sub><b>then</b></p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>12: &#160;&#160;&#160;&#160;&#160;&#160;&#8719; &#8592; &#8719; - {<it>i</it>}</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>13: &#160;&#160;&#160;<b>end if </b></p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>14: &#160;&#160;&#160;<m:math name="1471-2105-7-362-i46" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>E</m:mi><m:mrow><m:mi>s</m:mi><m:mi>u</m:mi><m:mi>m</m:mi></m:mrow><m:mi>t</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGfbqrdaqhaaWcbaGaem4CamNaemyDauNaemyBa0gabaGaemiDaqhaaaaa@33A2@</m:annotation></m:semantics></m:math>[<it>i</it>] &#8592; 0</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>15: &#160;&#160;&#160;<it>Count</it>[<it>i</it>] &#8592; 0</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>16: <b>end for</b></p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>17: <b>for </b>each candidate SCOP fold <it>F</it>&#8712; &#8719; <b>do</b></p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>18: &#160;&#160;&#160;<b>for </b>each <it>p </it>&#8712; <it>R </it>starting from the top ranked protein <b>do</b></p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>19: &#160;&#160;&#160;&#160;&#160;&#160;<b>if </b><it>p</it>&#183;<it>fold </it>= <it>F </it><b>then</b></p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>20: &#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;<it>Count</it>[<it>F</it>] &#8592; <it>Count</it>[<it>F</it>] + 1</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>21: &#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;<b>if </b><it>Count</it>[<it>F</it>] &lt;<it>n</it><sub><it>t </it></sub><b>then</b></p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>22: &#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;<m:math name="1471-2105-7-362-i46" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>E</m:mi><m:mrow><m:mi>s</m:mi><m:mi>u</m:mi><m:mi>m</m:mi></m:mrow><m:mi>t</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGfbqrdaqhaaWcbaGaem4CamNaemyDauNaemyBa0gabaGaemiDaqhaaaaa@33A2@</m:annotation></m:semantics></m:math> [<it>F</it>] &#8592; <m:math name="1471-2105-7-362-i46" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>E</m:mi><m:mrow><m:mi>s</m:mi><m:mi>u</m:mi><m:mi>m</m:mi></m:mrow><m:mi>t</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGfbqrdaqhaaWcbaGaem4CamNaemyDauNaemyBa0gabaGaemiDaqhaaaaa@33A2@</m:annotation></m:semantics></m:math> [<it>F</it>] + <it>E</it>_<it>Measure</it>(<it>p</it>, <it>b</it>)</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>23: &#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;<b>end if </b></p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>24: &#160;&#160;&#160;&#160;&#160;&#160;<b>end if </b></p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>25: &#160;&#160;&#160;<b>end for</b></p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>26: <b>end for</b></p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>27: <it>F</it>* &#8592; arg min<sub><it>f </it></sub><m:math name="1471-2105-7-362-i46" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>E</m:mi><m:mrow><m:mi>s</m:mi><m:mi>u</m:mi><m:mi>m</m:mi></m:mrow><m:mi>t</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGfbqrdaqhaaWcbaGaem4CamNaemyDauNaemyBa0gabaGaemiDaqhaaaaa@33A2@</m:annotation></m:semantics></m:math>[<it>f</it>]</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>28: <b>if </b>(&#955; = <it>on</it>) AND (S(<it>t</it>,<it>P</it><sub>0</sub>) &lt; S(<it>t</it>,<m:math name="1471-2105-7-362-i55" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>P</m:mi><m:mrow><m:mi>N</m:mi><m:mi>N</m:mi></m:mrow><m:mrow><m:mi>F</m:mi><m:mo>*</m:mo></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGqbaudaqhaaWcbaGaemOta4KaemOta4eabaGaemOrayKaeiOkaOcaaaaa@323D@</m:annotation></m:semantics></m:math>)) <b>then</b></p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>29: &#160;&#160;&#160;<it>F</it>* &#8592; <it>P</it><sub>0</sub>&#183;<it>fold</it></p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>30: <b>end if </b></p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>31: <b>return </b><it>F*</it></p>
                  </c>
               </r>
            </tblbdy>
         </tbl>
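         <p>For readers who prefer executable code, the following is a hedged Python transcription of the steps in Appendix 1, not the authors' implementation. Several points are assumptions: <it>E</it>_<it>Measure </it>and the structural variation function S are passed in as caller-supplied functions because their definitions lie outside the appendix; <it>P</it><sub>0 </sub>is taken to be the top-ranked protein in <it>R</it>; the fallback for an empty candidate set is not covered by the appendix; and the strict comparison in step 21 is reproduced exactly as printed. The single pass over <it>R </it>is equivalent to the nested loops of steps 17 to 26.</p>
         <p><![CDATA[
```python
from collections import Counter
from typing import Callable, Dict, List, NamedTuple


class Hit(NamedTuple):
    name: str
    fold: str


def e_predict(
    t: str,
    ranked: List[Hit],
    b: float,
    n_t: int,
    lam: bool,
    e_measure: Callable[[Hit, float], float],
    s: Callable[[str, Hit], float],
) -> str:
    # Steps 1-16: keep only folds that occur at least n_t times in the ranked list.
    totals = Counter(hit.fold for hit in ranked)
    candidates = [fold for fold, n in totals.items() if n >= n_t]
    if not candidates:
        # Not covered by the appendix: fall back to the top-ranked fold (assumption).
        return ranked[0].fold
    # Steps 17-26: accumulate E-measure values per candidate fold, top rank first.
    e_sum: Dict[str, float] = {fold: 0.0 for fold in candidates}
    count: Dict[str, int] = {fold: 0 for fold in candidates}
    for hit in ranked:
        if hit.fold in e_sum:
            count[hit.fold] += 1
            if count[hit.fold] < n_t:  # strict comparison, as printed in step 21
                e_sum[hit.fold] += e_measure(hit, b)
    # Step 27: the suggested fold has the minimum accumulated score.
    best = min(candidates, key=lambda fold: e_sum[fold])
    # Steps 28-30: optional override by the top-ranked protein P_0 (assumed meaning).
    if lam:
        p0 = ranked[0]
        nn_best = next(hit for hit in ranked if hit.fold == best)
        if s(t, p0) < s(t, nn_best):
            best = p0.fold
    return best


# Toy data: made-up scores standing in for the real E-measure and S functions.
ranked = [Hit("p0", "A"), Hit("p1", "B"), Hit("p2", "B"),
          Hit("p3", "B"), Hit("p4", "A"), Hit("p5", "C")]
toy_e = {"p0": 0.9, "p1": 0.2, "p2": 0.3, "p3": 0.4, "p4": 0.5}
toy_s = {"p0": 0.1, "p1": 0.4}

without_override = e_predict("query", ranked, 1.5, 2, False,
                             lambda hit, b: toy_e[hit.name],
                             lambda t, hit: toy_s[hit.name])
with_override = e_predict("query", ranked, 1.5, 2, True,
                          lambda hit, b: toy_e[hit.name],
                          lambda t, hit: toy_s[hit.name])
```
]]></p>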
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>The authors would like to thank Dr. Tolga Can of Middle East Technical University, Ankara, Turkey, for the classification results of the DALI, CE, and VAST algorithms. This research was supported by the University of Missouri-Columbia Research Council.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Structure-based assignment of the biochemical function of a hypothetical protein: A test case of structural genomics</p>
            </title>
            <aug>
               <au>
                  <snm>Zarembinski</snm>
                  <fnm>TI</fnm>
               </au>
               <au>
                  <snm>Hung</snm>
                  <fnm>LW</fnm>
               </au>
               <au>
                  <snm>Mueller-Dieckmann</snm>
                  <fnm>HJ</fnm>
               </au>
               <au>
                  <snm>Kim</snm>
                  <fnm>KK</fnm>
               </au>
               <au>
                  <snm>Yokota</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Kim</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Kim</snm>
                  <fnm>SH</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1998</pubdate>
            <volume>95</volume>
            <fpage>15189</fpage>
            <lpage>15193</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1073/pnas.95.26.15189</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>An overview of structural genomics</p>
            </title>
            <aug>
               <au>
                  <snm>Burley</snm>
                  <fnm>SK</fnm>
               </au>
            </aug>
            <source>Nat Struct Biol</source>
            <pubdate>2000</pubdate>
            <volume>7</volume>
            <fpage>932</fpage>
            <lpage>934</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/80697</pubid>
                  <pubid idtype="pmpid" link="fulltext">11103991</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Global efforts in structural genomics</p>
            </title>
            <aug>
               <au>
                  <snm>Stevens</snm>
                  <fnm>RC</fnm>
               </au>
               <au>
                  <snm>Yokoyama</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Wilson</snm>
                  <fnm>IA</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2001</pubdate>
            <volume>294</volume>
            <fpage>89</fpage>
            <lpage>92</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1066011</pubid>
                  <pubid idtype="pmpid" link="fulltext">11588249</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>TargetDB: a target registration database for structural genomics projects</p>
            </title>
            <aug>
               <au>
                  <snm>Chen</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Oughtred</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Berman</snm>
                  <fnm>HM</fnm>
               </au>
               <au>
                  <snm>Westbrook</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>20</volume>
            <issue>16</issue>
            <fpage>2860</fpage>
            <lpage>2862</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bth300</pubid>
                  <pubid idtype="pmpid" link="fulltext">15130928</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>PDB-UF: database of predicted enzymatic functions for unannotated protein structures from structural genomics</p>
            </title>
            <aug>
               <au>
                  <snm>von Grotthuss</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Plewczynski</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Ginalski</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Rychlewski</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Shakhnovich</snm>
                  <fnm>EI</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2006</pubdate>
            <volume>7</volume>
            <issue>53</issue>
            <note>doi:10.1186/1471-2105-7-53</note>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1409798</pubid>
                  <pubid idtype="pmpid" link="fulltext">16460560</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>The CATH database: an extended protein family resource for structural and functional genomics</p>
            </title>
            <aug>
               <au>
                  <snm>Pearl</snm>
                  <fnm>FM</fnm>
               </au>
               <au>
                  <snm>Bennett</snm>
                  <fnm>CF</fnm>
               </au>
               <au>
                  <snm>Bray</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Harrison</snm>
                  <fnm>AP</fnm>
               </au>
               <au>
                  <snm>Martin</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Shepherd</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Sillitoe</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Thornton</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Orengo</snm>
                  <fnm>CA</fnm>
               </au>
            </aug>
            <source>Nucl Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <issue>1</issue>
            <fpage>452</fpage>
            <lpage>455</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">165509</pubid>
                  <pubid idtype="pmpid" link="fulltext">12520050</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg062</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Protein structure alignment</p>
            </title>
            <aug>
               <au>
                  <snm>Taylor</snm>
                  <fnm>WR</fnm>
               </au>
               <au>
                  <snm>Orengo</snm>
                  <fnm>CA</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1989</pubdate>
            <volume>208</volume>
            <fpage>1</fpage>
            <lpage>22</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0022-2836(89)90084-3</pubid>
                  <pubid idtype="pmpid" link="fulltext">2769748</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Mapping the protein universe</p>
            </title>
            <aug>
               <au>
                  <snm>Holm</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Sander</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1996</pubdate>
            <volume>273</volume>
            <fpage>595</fpage>
            <lpage>602</lpage>
            <xrefbib>
               <pubid idtype="pmpid">8662544</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Protein structure comparison by alignment of distance matrices</p>
            </title>
            <aug>
               <au>
                  <snm>Holm</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Sander</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1993</pubdate>
            <volume>233</volume>
            <fpage>123</fpage>
            <lpage>138</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.1993.1489</pubid>
                  <pubid idtype="pmpid" link="fulltext">8377180</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>The structural alignment between two proteins: Is there a unique answer?</p>
            </title>
            <aug>
               <au>
                  <snm>Godzik</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Protein Science</source>
            <pubdate>1996</pubdate>
            <volume>5</volume>
            <fpage>1325</fpage>
            <lpage>1338</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">8819165</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>SCOP: a structural classification of proteins database for the investigation of sequences and structures</p>
            </title>
            <aug>
               <au>
                  <snm>Murzin</snm>
                  <fnm>AG</fnm>
               </au>
               <au>
                  <snm>Brenner</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Hubbard</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Chothia</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1995</pubdate>
            <volume>247</volume>
            <fpage>536</fpage>
            <lpage>540</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.1995.0159</pubid>
                  <pubid idtype="pmpid" link="fulltext">7723011</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>The RCSB Protein Data Bank: a redesigned query system and relational database based on the mmCIF schema</p>
            </title>
            <aug>
               <au>
                  <snm>Deshpande</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Addess</snm>
                  <fnm>KJ</fnm>
               </au>
               <au>
                  <snm>Bluhm</snm>
                  <fnm>WF</fnm>
               </au>
               <au>
                  <snm>Merino-Ott</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Townsend-Merino</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Knezevich</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Xie</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Feng</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Kramer Green</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Flippen-Anderson</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Westbrook</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Berman</snm>
                  <fnm>HM</fnm>
               </au>
               <au>
                  <snm>Bourne</snm>
                  <fnm>PE</fnm>
               </au>
            </aug>
            <source>Nucl Acids Res</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <issue>suppl 1</issue>
            <fpage>D233</fpage>
            <lpage>D237</lpage>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Automated Protein Classification Using Consensus Decision</p>
            </title>
            <aug>
               <au>
                  <snm>Can</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Camoglu</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Singh</snm>
                  <fnm>AK</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>YF</fnm>
               </au>
            </aug>
            <source>Proceedings of the Third Int. IEEE Computer Society Computational Systems Bioinformatics Conference: 16&#8211;19 August 2004; Stanford</source>
            <pubdate>2004</pubdate>
            <fpage>224</fpage>
            <lpage>235</lpage>
         </bibl>
         <bibl id="B14">
            <title>
               <p>SCOPmap: Automated assignment of protein structures to evolutionary superfamilies</p>
            </title>
            <aug>
               <au>
                  <snm>Cheek</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Qi</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Krishna</snm>
                  <fnm>SS</fnm>
               </au>
               <au>
                  <snm>Kinch</snm>
                  <fnm>LN</fnm>
               </au>
               <au>
                  <snm>Grishin</snm>
                  <fnm>NV</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <issue>1</issue>
            <fpage>197</fpage>
            <lpage>197</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">544345</pubid>
                  <pubid idtype="pmpid" link="fulltext">15598351</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-5-197</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Protein structure alignment by incremental combinatorial extension (CE) of the optimal path</p>
            </title>
            <aug>
               <au>
                  <snm>Shindyalov</snm>
                  <fnm>IN</fnm>
               </au>
               <au>
                  <snm>Bourne</snm>
                  <fnm>PE</fnm>
               </au>
            </aug>
            <source>Protein Engineering</source>
            <pubdate>1998</pubdate>
            <volume>11</volume>
            <fpage>739</fpage>
            <lpage>747</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1093/protein/11.9.739</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Threading a database of protein cores</p>
            </title>
            <aug>
               <au>
                  <snm>Madej</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Gibrat</snm>
                  <fnm>JF</fnm>
               </au>
               <au>
                  <snm>Bryant</snm>
                  <fnm>SH</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>1995</pubdate>
            <volume>23</volume>
            <issue>3</issue>
            <fpage>356</fpage>
            <lpage>369</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/prot.340230309</pubid>
                  <pubid idtype="pmpid">8710828</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>A fast protein structure retrieval system using image-based distance matrices and multidimensional index</p>
            </title>
            <aug>
               <au>
                  <snm>Chi</snm>
                  <fnm>PH</fnm>
               </au>
               <au>
                  <snm>Scott</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Shyu</snm>
                  <fnm>CR</fnm>
               </au>
            </aug>
            <source>International Journal of Software Engineering and Knowledge Engineering, Special Issue on Software and Knowledge Engineering Support in Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>15</volume>
            <issue>3</issue>
            <fpage>527</fpage>
            <lpage>545</lpage>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Protein Matchmaking</p>
            </title>
            <aug>
               <au>
                  <snm>Leslie</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2004</pubdate>
            <volume>305</volume>
            <fpage>1381</fpage>
         </bibl>
         <bibl id="B19">
            <title>
               <p>ProteinDBS &#8211; A content-based retrieval system for protein structure databases</p>
            </title>
            <aug>
               <au>
                  <snm>Shyu</snm>
                  <fnm>CR</fnm>
               </au>
               <au>
                  <snm>Chi</snm>
                  <fnm>PH</fnm>
               </au>
               <au>
                  <snm>Scott</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Xu</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Nucl Acids Res</source>
            <pubdate>2004</pubdate>
            <volume>32</volume>
            <issue>suppl 2</issue>
            <fpage>W572</fpage>
            <lpage>W575</lpage>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Predicting Ranked SCOP Domains by Mining Associations of Visual Contents in Distance Matrices</p>
            </title>
            <aug>
               <au>
                  <snm>Chi</snm>
                  <fnm>PH</fnm>
               </au>
               <au>
                  <snm>Shyu</snm>
                  <fnm>CR</fnm>
               </au>
            </aug>
            <source>Proceedings of The Fourth Asia Pacific Bioinformatics Conference</source>
            <pubdate>2006</pubdate>
            <fpage>49</fpage>
            <lpage>58</lpage>
         </bibl>
         <bibl id="B21">
            <aug>
               <au>
                  <snm>van Rijsbergen</snm>
                  <fnm>CJ</fnm>
               </au>
            </aug>
            <source>Information Retrieval, Butterworths</source>
            <edition>2</edition>
            <pubdate>1979</pubdate>
         </bibl>
         <bibl id="B22">
            <title>
               <p>The FSSP database of structurally aligned protein fold families</p>
            </title>
            <aug>
               <au>
                  <snm>Holm</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Sander</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Nucl Acids Res</source>
            <pubdate>1994</pubdate>
            <volume>22</volume>
            <fpage>3600</fpage>
            <lpage>3609</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">308329</pubid>
                  <pubid idtype="pmpid">7937067</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Surprising similarities in structure comparison</p>
            </title>
            <aug>
               <au>
                  <snm>Gibrat</snm>
                  <fnm>JF</fnm>
               </au>
               <au>
                  <snm>Madej</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Bryant</snm>
                  <fnm>SH</fnm>
               </au>
            </aug>
            <source>Curr Opin Struct Biol</source>
            <pubdate>1996</pubdate>
            <volume>6</volume>
            <issue>3</issue>
            <fpage>377</fpage>
            <lpage>385</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0959-440X(96)80058-3</pubid>
                  <pubid idtype="pmpid" link="fulltext">8804824</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>A general method applicable to the search for similarities in the amino acid sequence of two proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Needleman</snm>
                  <fnm>SB</fnm>
               </au>
               <au>
                  <snm>Wunsch</snm>
                  <fnm>CD</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1970</pubdate>
            <volume>48</volume>
            <fpage>443</fpage>
            <lpage>453</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0022-2836(70)90057-4</pubid>
                  <pubid idtype="pmpid" link="fulltext">5420325</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Discriminant adaptive nearest neighbor classification</p>
            </title>
            <aug>
               <au>
                  <snm>Hastie</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Tibshirani</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>IEEE Trans on Pattern Analysis and Machine Intelligence</source>
            <pubdate>1996</pubdate>
            <volume>18</volume>
            <issue>6</issue>
            <fpage>607</fpage>
            <lpage>616</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1109/34.506411</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <aug>
               <au>
                  <snm>Quinlan</snm>
                  <fnm>JR</fnm>
               </au>
            </aug>
            <source>C4.5: Programs for Machine Learning, Morgan Kaufmann</source>
            <pubdate>1993</pubdate>
         </bibl>
         <bibl id="B27">
            <title>
               <p>M-tree: an efficient access method for similarity search in metric spaces</p>
            </title>
            <aug>
               <au>
                  <snm>Ciaccia</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Patella</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Zezula</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Proceedings of the International Conference on Very Large Databases</source>
            <pubdate>1997</pubdate>
            <fpage>426</fpage>
            <lpage>435</lpage>
         </bibl>
         <bibl id="B28">
            <title>
               <p>SARFing the PDB</p>
            </title>
            <aug>
               <au>
                  <snm>Alexandrov</snm>
                  <fnm>NN</fnm>
               </au>
            </aug>
            <source>Protein Engineering</source>
            <pubdate>1996</pubdate>
            <volume>9</volume>
            <fpage>727</fpage>
            <lpage>732</lpage>
            <xrefbib>
               <pubid idtype="pmpid">8888137</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>The theory and practice of distance geometry</p>
            </title>
            <aug>
               <au>
                  <snm>Havel</snm>
                  <fnm>TF</fnm>
               </au>
               <au>
                  <snm>Kuntz</snm>
                  <fnm>ID</fnm>
               </au>
               <au>
                  <snm>Crippen</snm>
                  <fnm>GM</fnm>
               </au>
            </aug>
            <source>Bull Math Biol</source>
            <pubdate>1983</pubdate>
            <volume>45</volume>
            <fpage>665</fpage>
            <lpage>720</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1016/S0092-8240(83)80020-2</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Mining Residue Contacts in Proteins Using Local Structure Predictions</p>
            </title>
            <aug>
               <au>
                  <snm>Zaki</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Jin</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Bystroff</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>IEEE Trans on Systems, Man and Cybernetics &#8211; Part B, special issue on Bio-imaging and Bio-informatics</source>
            <pubdate>2003</pubdate>
            <volume>33</volume>
            <issue>5</issue>
            <fpage>789</fpage>
            <lpage>801</lpage>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Approximate protein structural alignment in polynomial time</p>
            </title>
            <aug>
               <au>
                  <snm>Kolodny</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Linial</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2004</pubdate>
            <fpage>12201</fpage>
            <lpage>12206</lpage>
            <note>DOI:10.1073/pnas.0404383101</note>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">514457</pubid>
                  <pubid idtype="pmpid" link="fulltext">15304646</pubid>
                  <pubid idtype="doi">10.1073/pnas.0404383101</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Pictorial data-base systems</p>
            </title>
            <aug>
               <au>
                  <snm>Chang</snm>
                  <fnm>SK</fnm>
               </au>
               <au>
                  <snm>Kunii</snm>
                  <fnm>TL</fnm>
               </au>
            </aug>
            <source>IEEE Computer</source>
            <pubdate>1981</pubdate>
            <volume>14</volume>
            <fpage>13</fpage>
            <lpage>21</lpage>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Content-based image retrieval at the end of the early years</p>
            </title>
            <aug>
               <au>
                  <snm>Smeulders</snm>
                  <fnm>AWM</fnm>
               </au>
               <au>
                  <snm>Worring</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Santini</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Gupta</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Jain</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>IEEE Trans on Pattern Analysis and Machine Intelligence</source>
            <pubdate>2000</pubdate>
            <volume>22</volume>
            <fpage>1349</fpage>
            <lpage>1380</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1109/34.895972</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>Special Issue on Content-Based Image Retrieval</p>
            </title>
            <aug>
               <au>
                  <snm>Smeulders</snm>
                  <fnm>AWM</fnm>
               </au>
               <au>
                  <snm>Huang</snm>
                  <fnm>TS</fnm>
               </au>
               <au>
                  <snm>Gevers</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>International Journal of Computer Vision</source>
            <pubdate>2004</pubdate>
            <volume>56</volume>
            <fpage>5</fpage>
            <lpage>6</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1023/B:VISI.0000004865.97704.b9</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <aug>
               <au>
                  <snm>Rosenfeld</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kak</snm>
                  <fnm>AC</fnm>
               </au>
            </aug>
            <source>Digital picture processing</source>
            <publisher>New York: Academic Press</publisher>
            <pubdate>1982</pubdate>
         </bibl>
         <bibl id="B36">
            <title>
               <p>A threshold selection method from gray-level histograms</p>
            </title>
            <aug>
               <au>
                  <snm>Otsu</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>IEEE Trans on Systems, Man and Cybernetics</source>
            <pubdate>1979</pubdate>
            <volume>9</volume>
            <fpage>62</fpage>
            <lpage>66</lpage>
         </bibl>
         <bibl id="B37">
            <title>
               <p>Textural features for image classification</p>
            </title>
            <aug>
               <au>
                  <snm>Haralick</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>Shanmugam</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Dinstein</snm>
                  <fnm>I</fnm>
               </au>
            </aug>
            <source>IEEE Trans on Systems, Man and Cybernetics</source>
            <pubdate>1973</pubdate>
            <volume>3</volume>
            <fpage>610</fpage>
            <lpage>621</lpage>
         </bibl>
         <bibl id="B38">
            <aug>
               <au>
                  <snm>Baeza-Yates</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Ribeiro-Neto</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Modern Information Retrieval, Addison Wesley</source>
            <pubdate>1999</pubdate>
         </bibl>
      </refgrp>
   </bm>
</art>
