<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2105-7-40</ui>
   <ji>1471-2105</ji>
   <fm>
      <dochead>Methodology article</dochead>
      <bibl>
         <title>
            <p>Protein structure similarity from principle component correlation analysis</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Zhou</snm>
               <fnm>Xiaobo</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
               <email>zhou@crystal.harvard.edu</email>
            </au>
            <au id="A2">
               <snm>Chou</snm>
               <fnm>James</fnm>
               <insr iid="I3"/>
               <email>james_chou@hms.harvard.edu</email>
            </au>
            <au id="A3" ca="yes">
               <snm>Wong</snm>
               <mi>TC</mi>
               <fnm>Stephen</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
               <email>stephen_wong@hms.harvard.edu</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Harvard Center for Neurodegeneration and Repair &#8211; Center for Bioinformatics, Harvard Medical School, 1249 Boylston Street, Boston, MA 02215, USA</p>
            </ins>
            <ins id="I2">
               <p>Functional and Molecular Imaging Center, Radiology Department, Brigham and Women's Hospital, One Brigham Circle, 1620 Tremont Street, Boston, MA 02121, USA</p>
            </ins>
            <ins id="I3">
               <p>Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, 240 Longwood Avenue, Boston, MA 02115, USA</p>
            </ins>
         </insg>
         <source>BMC Bioinformatics</source>
         <issn>1471-2105</issn>
         <pubdate>2006</pubdate>
         <volume>7</volume>
         <issue>1</issue>
         <fpage>40</fpage>
         <url>http://www.biomedcentral.com/1471-2105/7/40</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">16436213</pubid>
               <pubid idtype="doi">10.1186/1471-2105-7-40</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>22</day>
               <month>4</month>
               <year>2005</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>25</day>
               <month>1</month>
               <year>2006</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>25</day>
               <month>1</month>
               <year>2006</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2006</year>
         <collab>Zhou et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Owing to the rapid expansion of protein structure databases in recent years, methods of structure comparison are becoming increasingly effective and important in revealing novel information on the functional properties of proteins and their roles in the grand scheme of evolutionary biology. Currently, the structural similarity between two proteins is measured by the root-mean-square deviation (RMSD) of their best-superimposed atomic coordinates. RMSD is the gold standard for measuring structural similarity when the structures are nearly identical; however, it fails to detect higher-order topological similarities in proteins that have evolved into different shapes. We propose new algorithms for extracting geometrical invariants of proteins that can be used to identify homologous protein structures or topologies and thereby quantify both close and remote structural similarities.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>We measure structural similarity between proteins by correlating the principal components of their secondary structure interaction matrices. In our approach, the Principal Component Correlation (PCC) analysis, a symmetric interaction matrix for a protein structure is constructed from relationship parameters between secondary structure elements; these parameters can take the form of distance, orientation, or other relevant structural invariants. When a distance-based construction is used, with or without an encoded N- to C-terminal sense, there are strong correlations between the principal components of the interaction matrices of structurally or topologically similar proteins.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>The PCC method was extensively tested on protein structures that belong to the same topological class but differ significantly by the RMSD measure. The PCC analysis can also differentiate proteins that have similar shapes but different topological arrangements. Additionally, we demonstrate that when two independently defined interaction matrices are used, comparison of their maximum eigenvalues can be highly effective in clustering structurally or topologically similar proteins. We believe that the PCC analysis of the interaction matrix is highly flexible in adopting various structural parameters for protein structure comparison.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="bmc" subtype="user_supplied_xml" id="endnote"/>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Conformational resemblance between proteins, whether remote or close, is often used to infer functional properties of proteins and to reveal distant evolutionary relationships between two proteins that exhibit no similarity in their amino acid sequences. Traditionally, high-resolution structure determination follows the biological and biochemical studies of proteins and provides further mechanistic details of their function. The biological functions of these proteins have usually been suggested prior to their structural studies by <it>in vitro </it>binding assays, <it>in vivo </it>gene knock-out experiments, and sequence homology with proteins of known function. However, with the completion of the sequencing of the genomes of human and other organisms, major structural biology resources have been harnessed to solve the structures of large numbers of proteins encoded by the genomes in a high-throughput but less specific fashion, under the name 'structural genomics' <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. Consequently, large sets of protein structures whose biological roles are largely unknown have accumulated in public databases. This shortfall calls for the development of cost-effective computational methods that predict protein function from three-dimensional structure, with the aim of providing preliminary information to guide later biological experiments.</p>
         <p>In the post-genomic era, large numbers of new protein sequences are available for statistics-based recognition of their biological properties. It has been shown in many cases that, with the help of elegant computational algorithms, amino acid sequence information alone can be used to successfully predict a protein's structural class <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp>, sub-cellular location <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr></abbrgrp>, and even enzymatic activities <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr></abbrgrp>. These approaches, however, are often limited by sequence noise arising from natural mutations along the evolutionary path, in which proteins are structurally and functionally conserved but divergent in amino acid sequence. It is a recurring theme in structural biology that proteins with completely different sequences can adopt very similar global folds. Hence, incorporating structural information into functional genomics could potentially raise predictions to the next level of accuracy. Owing to rapid technical advances in X-ray crystallography and liquid-state NMR spectroscopy, protein structure determination has become more routine than before. It is reasonable to predict that full-scale structure determination can be the first step towards characterizing the biological role and mechanism of a newly sequenced protein. Among the approximately 13,000 structures in the Protein Data Bank (PDB), only approximately 4,000 different folds are represented, with a fold/structure ratio of approximately 1/5 <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>. Therefore, given a new protein structure determined experimentally, chances are high that its topological arrangement of secondary structure fragments already exists in the PDB, either as an individual protein or as a domain within a larger protein.</p>
         <p>Structure comparison is traditionally based on coordinate RMSD <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>. While the RMSD approach is effective in comparing two topologically close structures of similar chain length, it fails when proteins are of different shapes or lengths. One outstanding example is calmodulin, a ubiquitous Ca<sup>2+</sup>-binding protein that plays a key role in numerous cellular Ca<sup>2+</sup>-dependent signaling pathways <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. The backbone RMSD between the Ca<sup>2+</sup>-bound and apo states of an individual calmodulin domain (~64 residues) is as large as 4 &#197;, despite the fact that they are the same molecule with the same topology. When the Ca<sup>2+</sup>-bound structure is used as a starting model, a homology-based NMR residual dipolar coupling (RDC) refinement scheme, which relies heavily on the model having the correct topology, is able to converge the model to an accurate apo structure using RDCs measured for the apo state <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. Numerous proteins have similar arrangements of secondary structure elements in 3D space yet adopt different overall shapes. Clearly, for these proteins, algorithms other than RMSD must be used to reveal their topological similarities. Another well-known program, Matching Molecular Models Obtained from Theory (MAMMOTH), is a sequence-independent protein structural alignment method <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. It compares an experimental protein structure with an arbitrary low-resolution protein tertiary model. The distance defined in MAMMOTH is quite different from the one used in our approach. Many other methods of protein structure comparison have also been proposed <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr></abbrgrp>. Note that all of the aforementioned methods use sequence-based comparison. In contrast, our method adopts secondary-structure-based comparison and focuses on extracting invariant topological features.</p>
         <p>In our study, we measure structural similarity between proteins by correlating the principal components of their secondary structure interaction matrices. In this method, referred to here as the principal component correlation (PCC) analysis, the symmetric matrix for an individual protein is constructed from relationship parameters between secondary structure elements; these parameters can take the form of distance, orientation, or other relevant structural invariants. It is first demonstrated that the maximum eigenvalues of these interaction matrices can be used effectively to group structurally or topologically homologous proteins. Then, by taking into account both the maximum eigenvalues and their corresponding eigenvectors, a more refined pair-wise structure comparison is performed, which is able to differentiate structures of similar shape but different topological backbone traces. It is also shown that the results of the PCC analysis are highly comparable to those given by the scaled Gauss metric (SGM) calculations <abbrgrp><abbr bid="B22">22</abbr></abbrgrp> for the data sets studied. We believe the PCC method is flexible in adopting various structural parameters for pair-wise structure comparison.</p>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <sec>
            <st>
               <p>Materials</p>
            </st>
            <p>A total of fifty-six protein structures, grouped into six sets according to CATH <abbrgrp><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr></abbrgrp>, were used to test our algorithms. Proteins in structure Set I belong to the "mainly alpha" class, including mostly apoptosis regulators in the BCL-x<sub>L </sub>superfamily as well as others with remote conformational resemblance; all have the "Orthogonal Bundle" architecture. The atomic coordinates were retrieved from the PDB with accession codes 1A4F, 1A6G, 1COL (A), 1DDB (A), 1F16 (A), 1G5M (A), 1GJH (A), 1MAZ, 1MDT (A), and 2BID (A), where (A) means chain A. Set II is also "mainly alpha" and has the same architecture as Set I, including structures 1CK7 (A), 1CXW (A), 1E8B (A), 1E88 (A), 1J7M (A), 1KS0 (A), 1PDC, and 2FN2. However, this set consists of DNA helicase domains that have a vastly different topology from Set I. Set II is used here to test the ability of our method to separate proteins that are in the same class of secondary structure but have different topologies. Set III belongs to the "mainly beta" class and has the barrel architecture, consisting of acid protease structures 1A5T, 1BVS (A), 1CUK, 1DV (A), 1F4I (A), 1G4A (E), 1G41 (A), 1HJP, 1IM2 (A), and 1JR3 (E). Set IV consists of "alpha/beta" class proteins with the roll architecture, including structures 1FM0 (D), 1D4B (A), 1C78 (A), 1LM8 (B), 1NDD (A), 1UBQ, 1IBQ (A), and 1IP9 (A). The structures in Set IV all have the Ubiquitin-like topology. Set V consists of "mainly alpha" class proteins with the Alpha/alpha barrel architecture, including 1C82 (A), 1CB8 (A), 1EGU (A), 1F1S (A), 1F9G (A), 1HM2 (A), 1HM3 (A), 1HMU (A), 1HMW (A), 1HV6 (A), 1I8Q (A), and 1QAZ (A). The structures in Set V all have the Glycosyltransferase topology. Set VI consists of "mainly beta" class proteins with the ribbon architecture, including 1AIW, 1E6N (A), 1E6P (A), 1E6R (A), 1E6Z (A), 1E15 (A), 1ED7 (A), and 1GOI (A). The structures in Set VI all have the Seminal Fluid Protein PDC-109 (domain B) topology.</p>
         </sec>
         <sec>
            <st>
               <p>Clustering of structurally similar proteins by SMEC method</p>
            </st>
            <p>One of the goals of this study is to compare and identify structurally or topologically similar proteins. In other words, given a new experimentally determined protein structure, the proposed method is expected to rapidly place the structure into a group of structurally or topologically similar proteins in the database, thereby aiding in correlating topological similarity with functional similarity. To illustrate the application of the SMEC approach, we computed the scaled eigenvalues of the PD and PID interaction matrices (see the Methods section). Figure <figr fid="F2">2a</figr> shows the plot of scaled &#955;<sub>2 </sub>versus &#955;<sub>1</sub>, calculated using the PD matrix, for all proteins in the four data sets. Figure <figr fid="F2">2b</figr> shows the plot of &#955;<sub>1 </sub>of the PID matrix versus that of the PD matrix. The different symbols represent different structural groups. These plots were used to resolve clusters of structurally similar proteins.</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>(a) The plot of scaled &#955;<sub>2 </sub>(the second largest eigenvalue) versus &#955;<sub>1 </sub>(the maximum eigenvalue), calculated using the PD matrix, for all proteins in the four data sets, and (b) the plot of &#955;<sub>1 </sub>of PID matrix versus that of PD matrices</p>
               </caption>
               <text>
                  <p>(a) The plot of scaled &#955;<sub>2 </sub>(the second largest eigenvalue) versus &#955;<sub>1 </sub>(the maximum eigenvalue), calculated using the PD matrix, for all proteins in the four data sets, and (b) the plot of &#955;<sub>1 </sub>of PID matrix versus that of PD matrices. The symbol representations are: &#9675; &#8211; structure set I; &#9661; &#8211; set II; <graphic file="1471-2105-7-40-i1.gif"/> &#8211; set III; &#9633; &#8211; set IV; &#9741; &#8211; set V; and <graphic file="1471-2105-7-40-i2.gif"/> &#8211; set VI.</p>
               </text>
               <graphic file="1471-2105-7-40-2"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Pair-Wise structural comparison by PCC method</p>
            </st>
            <p>In addition to correlating the maximum eigenvalues, the PCC method described in the Methods section, which compares both eigenvalues and eigenvectors, was tested on the four selected data sets. Using the pair-wise distance matrix defined in the Methods section, the difference metric <it>R </it>defined in Eq. 5 was calculated between all pairs of protein structures in the four data sets; the values are shown in Tables 1-6. For the same data sets, writhing numbers computed using the SGM method are presented in the corresponding tables. The <it>R </it>values between a few selected proteins from different groups are also shown to provide a negative control (Table <tblr tid="T2">2</tblr>).</p>
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Pair-wise <it>R </it>values calculated using the PD matrix between representative structures from different structure sets.</p>
               </caption>
               <tblbdy cols="4">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>2BID</p>
                     </c>
                     <c ca="left">
                        <p>1C78</p>
                     </c>
                     <c ca="left">
                        <p>2FN2</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>1A5T</p>
                     </c>
                     <c ca="left">
                        <p>2.1121</p>
                     </c>
                     <c ca="left">
                        <p>6.8168</p>
                     </c>
                     <c ca="left">
                        <p>5.8935</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>1C78</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>4.6893</p>
                     </c>
                     <c ca="left">
                        <p>8.3020</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>1FN2</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>7.6954</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <p>Principal component analysis (PCA) is widely used in mathematics and pattern recognition to simplify a data set. In mathematical terms, it is a transform that chooses a new coordinate system for the data set such that the greatest variance by any projection of the data set lies on the first axis (called the first principal component), the second greatest variance on the second axis, and so on. Because of the large amount of information stored along the first axis, the maximum eigenvalue itself can be characteristic enough to represent the structural features of a protein. Figure <figr fid="F2">2a</figr> plots the eigenvalues &#955;<sub>1 </sub>versus &#955;<sub>2 </sub>derived from the PD matrices of all four sets of structures under study. Clearly, the &#955;<sub>1 </sub>values alone are distinct enough from each other to group most of the structures into their known conformation sets. The same plot also illustrates that the second largest eigenvalue &#955;<sub>2 </sub>is generally not powerful enough to accomplish the grouping. It is therefore expected that smaller components of the interaction matrices are not effective for this purpose. Similarly, when the first number computed with the SGM algorithm is used, the four structure sets can be resolved (see Fig. <figr fid="F3">3</figr>).</p>
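         <p>As a rough illustration of how a maximum eigenvalue can be obtained from a symmetric, distance-based matrix, the following Python sketch applies power iteration to a toy matrix. It is not the implementation used in this study: the PD matrix is defined in the Methods section, whereas the matrix here is simply the pairwise Euclidean distances between three made-up points, and the names <it>distance_matrix</it>, <it>dominant_eigenpair</it> and <it>toy_points</it> are hypothetical. No scaling of the eigenvalue is applied.</p>

```python
import math


def distance_matrix(points):
    """Symmetric matrix of pairwise Euclidean distances (zero diagonal)."""
    return [[math.dist(p, q) for q in points] for p in points]


def dominant_eigenpair(matrix, iterations=500, tol=1e-12):
    """Power iteration for the largest-magnitude eigenvalue of a symmetric matrix.

    Assumption: all off-diagonal entries are positive (as for distances
    between distinct points), so the dominant eigenvalue is positive and is
    also the maximum eigenvalue.
    """
    n = len(matrix)
    vec = [1.0 / math.sqrt(n)] * n  # unit-length starting vector
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(row[j] * vec[j] for j in range(n)) for row in matrix]
        # Rayleigh quotient v^T M v, valid because vec has unit length.
        estimate = sum(vec[i] * product[i] for i in range(n))
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            return 0.0, vec
        vec = [x / norm for x in product]
        if tol > abs(estimate - eigenvalue):
            eigenvalue = estimate
            break
        eigenvalue = estimate
    return eigenvalue, vec


# Hypothetical coordinates standing in for three secondary structure elements.
toy_points = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
toy_matrix = distance_matrix(toy_points)
lambda_1, first_component = dominant_eigenpair(toy_matrix)
print(round(lambda_1, 4))
```

         <p>Power iteration is used here only because it needs nothing beyond the standard library; a full eigendecomposition routine from a numerical library would return the smaller eigenvalues as well.</p>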
         <fig id="F3">
            <title>
               <p>Figure 3</p>
            </title>
            <caption>
               <p>The first number of SGM of proteins in all four structural sets</p>
            </caption>
            <text>
               <p>The first number of SGM of proteins in all four structural sets. The symbol representations are the same as in Figure 2.</p>
            </text>
            <graphic file="1471-2105-7-40-3"/>
         </fig>
         <p>In addition to the PD matrix, the PID matrix was used to provide further separation between clusters of eigenvalues. This is demonstrated in Fig. <figr fid="F2">2b</figr>, in which the plot of &#955;<sub>1 </sub>of the PID matrices versus that of the PD matrices achieves a much better grouping of the four structural sets in the vertical dimension than the plot in Fig. <figr fid="F2">2a</figr>. This further emphasizes the importance of the maximum eigenvalues and of varying the definition of the interaction matrix so that it provides independent structural information. It does not escape our notice that even better resolution could be achieved by correlating &#955;<sub>1 </sub>from three or more different types of interaction matrices in a multi-dimensional plot. The caveat, however, is that the invariant relations used to construct the matrices should not be redundant, as there is only a limited number of independent invariants in a protein structure. Nevertheless, the results here show that the PCA method using the secondary structure interaction matrix is highly flexible in adopting various structural parameters as a means of structure comparison. We also investigated how much of the eigenvalue spectrum is captured by the first eigenvalue in the BCL-x<sub>L </sub>family. We found that the first eigenvalue captures 45.78% of the sum of the 105 eigenvalues. This indicates that additional eigenvalues could be helpful for protein structure classification in our future work.</p>
         <p>A more elaborate method built on PCA, referred to here as the PCC analysis and described in the Methods section, is explored in this study to utilize the directional information contained in the eigenvector corresponding to &#955;<sub>1</sub>. This method is particularly suited to pair-wise structural comparison. Using the simple PD matrix definition (see the Methods section), the pair-wise difference metrics, <it>R</it>, are all small (&lt; 0.4) within each of the four known structural sets (Table <tblr tid="T1">1</tblr> and Figure <figr fid="F5">5(a)&#8211;(f)</figr>). The SGM score in Figure <figr fid="F5">5</figr> is defined as the absolute difference between the SGM values of two proteins. The symbol 'o' denotes that the <it>R </it>score is smaller than the SGM score, and the symbol '*' denotes that the <it>R </it>score is larger than the SGM score. Furthermore, as a negative control, <it>R </it>values between structures from different sets are much larger, typically greater than 2.0 (Figure <figr fid="F5">5(e)</figr>). Based on the <it>R </it>values in Table <tblr tid="T1">1</tblr> and Figure <figr fid="F5">5(a)&#8211;(f)</figr>, we found empirically that by setting the cutoff <it>R </it>value to 0.4, the PCC method can faithfully place all structures in their designated groups.</p>
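         <p>The cutoff rule described above can be sketched in a few lines of Python. The <it>R </it>values are copied from the 2BID row of Table 1 (within Set I) and the 1A5T row of Table 2 (between sets); the function name <it>same_group</it> and the variable names are hypothetical, and the sketch takes precomputed <it>R </it>values as input rather than evaluating Eq. 5.</p>

```python
R_CUTOFF = 0.4  # empirical cutoff quoted in the text


def same_group(r_value, cutoff=R_CUTOFF):
    """Return True when the difference metric R lies below the cutoff."""
    return cutoff > r_value


# PCC R values between 2BID and the other Set I structures (Table 1).
within_set_i = {
    "1F16": 0.0249, "1G5M": 0.0188, "1GJH": 0.2185,
    "1MAZ": 0.2676, "1DDB": 0.0000, "1MDT": 0.0093,
    "1COL": 0.0337, "1A6G": 0.2452, "1A4F": 0.2835,
}

# R values between 1A5T and structures from other sets (Table 2).
cross_set = {"2BID": 2.1121, "1C78": 6.8168, "2FN2": 5.8935}

# Partners of 2BID that the cutoff places in the same group.
grouped_with_2bid = sorted(
    code for code, r in within_set_i.items() if same_group(r)
)
# Partners of 1A5T from other sets that would be wrongly grouped with it.
wrongly_grouped = [code for code, r in cross_set.items() if same_group(r)]

print(len(grouped_with_2bid), len(wrongly_grouped))
```

         <p>With these tabulated values, every Set I partner of 2BID falls below the cutoff and none of the cross-set pairs do, which is consistent with the empirical choice of 0.4.</p>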
         <tbl id="T1">
            <title>
               <p>Table 1</p>
            </title>
            <caption>
               <p>Pair-wise <it>R </it>values calculated using the PD matrix and the first number of SGM for proteins in structure set I.</p>
            </caption>
            <tblbdy cols="11">
               <r>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>1F16</p>
                  </c>
                  <c ca="left">
                     <p>1G5M</p>
                  </c>
                  <c ca="left">
                     <p>1GJH</p>
                  </c>
                  <c ca="left">
                     <p>1MAZ</p>
                  </c>
                  <c ca="left">
                     <p>1DDB</p>
                  </c>
                  <c ca="left">
                     <p>1MDT</p>
                  </c>
                  <c ca="left">
                     <p>1COL</p>
                  </c>
                  <c ca="left">
                     <p>1A6G</p>
                  </c>
                  <c ca="left">
                     <p>1A4F</p>
                  </c>
               </r>
               <r>
                  <c cspan="11">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>2BID</p>
                  </c>
                  <c ca="left">
                     <p>PCC</p>
                  </c>
                  <c ca="left">
                     <p>0.0249</p>
                  </c>
                  <c ca="left">
                     <p>0.0188</p>
                  </c>
                  <c ca="left">
                     <p>0.2185</p>
                  </c>
                  <c ca="left">
                     <p>0.2676</p>
                  </c>
                  <c ca="left">
                     <p>0.0000</p>
                  </c>
                  <c ca="left">
                     <p>0.0093</p>
                  </c>
                  <c ca="left">
                     <p>0.0337</p>
                  </c>
                  <c ca="left">
                     <p>0.2452</p>
                  </c>
                  <c ca="left">
                     <p>0.2835</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>SGM</p>
                  </c>
                  <c ca="left">
                     <p>0.0530</p>
                  </c>
                  <c ca="left">
                     <p>0.3510</p>
                  </c>
                  <c ca="left">
                     <p>0.3510</p>
                  </c>
                  <c ca="left">
                     <p>0.5940</p>
                  </c>
                  <c ca="left">
                     <p>0.0210</p>
                  </c>
                  <c ca="left">
                     <p>0.1810</p>
                  </c>
                  <c ca="left">
                     <p>0.0031</p>
                  </c>
                  <c ca="left">
                     <p>0.4890</p>
                  </c>
                  <c ca="left">
                     <p>0.5420</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>1F16</p>
                  </c>
                  <c ca="left">
                     <p>PCC</p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>0.1630</p>
                  </c>
                  <c ca="left">
                     <p>0.1248</p>
                  </c>
                  <c ca="left">
                     <p>0.1750</p>
                  </c>
                  <c ca="left">
                     <p>0.0000</p>
                  </c>
                  <c ca="left">
                     <p>0.3280</p>
                  </c>
                  <c ca="left">
                     <p>0.0005</p>
                  </c>
                  <c ca="left">
                     <p>0.2915</p>
                  </c>
                  <c ca="left">
                     <p>0.2780</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>SGM</p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>0.2980</p>
                  </c>
                  <c ca="left">
                     <p>0.2980</p>
                  </c>
                  <c ca="left">
                     <p>0.5410</p>
                  </c>
                  <c ca="left">
                     <p>0.0320</p>
                  </c>
                  <c ca="left">
                     <p>0.1280</p>
                  </c>
                  <c ca="left">
                     <p>0.0530</p>
                  </c>
                  <c ca="left">
                     <p>0.4360</p>
                  </c>
                  <c ca="left">
                     <p>0.4890</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>1G5M</p>
                  </c>
                  <c ca="left">
                     <p>PCC</p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>0.2077</p>
                  </c>
                  <c ca="left">
                     <p>0.1836</p>
                  </c>
                  <c ca="left">
                     <p>0.0000</p>
                  </c>
                  <c ca="left">
                     <p>0.0013</p>
                  </c>
                  <c ca="left">
                     <p>0.0145</p>
                  </c>
                  <c ca="left">
                     <p>0.2943</p>
                  </c>
                  <c ca="left">
                     <p>0.2624</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>SGM</p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>0.0005</p>
                  </c>
                  <c ca="left">
                     <p>0.2430</p>
                  </c>
                  <c ca="left">
                     <p>0.3300</p>
                  </c>
                  <c ca="left">
                     <p>0.1700</p>
                  </c>
                  <c ca="left">
                     <p>0.3510</p>
                  </c>
                  <c ca="left">
                     <p>0.1380</p>
                  </c>
                  <c ca="left">
                     <p>0.1910</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>1GJH</p>
                  </c>
                  <c ca="left">
                     <p>PCC</p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>0.1790</p>
                  </c>
                  <c ca="left">
                     <p>0.0000</p>
                  </c>
                  <c ca="left">
                     <p>0.0109</p>
                  </c>
                  <c ca="left">
                     <p>0.0327</p>
                  </c>
                  <c ca="left">
                     <p>0.2421</p>
                  </c>
                  <c ca="left">
                     <p>0.2899</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>SGM</p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>0.2430</p>
                  </c>
                  <c ca="left">
                     <p>0.3300</p>
                  </c>
                  <c ca="left">
                     <p>0.1700</p>
                  </c>
                  <c ca="left">
                     <p>0.3510</p>
                  </c>
                  <c ca="left">
                     <p>0.1380</p>
                  </c>
                  <c ca="left">
                     <p>0.1910</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>1MAZ</p>
                  </c>
                  <c ca="left">
                     <p>PCC</p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>0.0031</p>
                  </c>
                  <c ca="left">
                     <p>0.0092</p>
                  </c>
                  <c ca="left">
                     <p>0.0303</p>
                  </c>
                  <c ca="left">
                     <p>0.0107</p>
                  </c>
                  <c ca="left">
                     <p>0.2537</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>SGM</p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>0.5730</p>
                  </c>
                  <c ca="left">
                     <p>0.4130</p>
                  </c>
                  <c ca="left">
                     <p>0.5940</p>
                  </c>
                  <c ca="left">
                     <p>0.1050</p>
                  </c>
                  <c ca="left">
                     <p>0.0520</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>1DDB</p>
                  </c>
                  <c ca="left">
                     <p>PCC</p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>0.0054</p>
                  </c>
                  <c ca="left">
                     <p>0.0293</p>
                  </c>
                  <c ca="left">
                     <p>0.0068</p>
                  </c>
                  <c ca="left">
                     <p>0.2286</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>SGM</p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>0.1600</p>
                  </c>
                  <c ca="left">
                     <p>0.0210</p>
                  </c>
                  <c ca="left">
                     <p>0.4680</p>
                  </c>
                  <c ca="left">
                     <p>0.5210</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>1MDT</p>
                  </c>
                  <c ca="left">
                     <p>PCC</p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>0.0112</p>
                  </c>
                  <c ca="left">
                     <p>0.2390</p>
                  </c>
                  <c ca="left">
                     <p>0.2904</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>SGM</p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>0.1810</p>
                  </c>
                  <c ca="left">
                     <p>0.3080</p>
                  </c>
                  <c ca="left">
                     <p>0.3610</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>1COL</p>
                  </c>
                  <c ca="left">
                     <p>PCC</p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>0.0081</p>
                  </c>
                  <c ca="left">
                     <p>0.2496</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>SGM</p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>0.4890</p>
                  </c>
                  <c ca="left">
                     <p>0.5420</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>1A6G</p>
                  </c>
                  <c ca="left">
                     <p>PCC</p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>0.1950</p>
                  </c>
               </r>
               <r>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>SGM</p>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c>
                     <p/>
                  </c>
                  <c ca="left">
                     <p>0.0530</p>
                  </c>
               </r>
            </tblbdy>
         </tbl>
         <fig id="F5">
            <title>
               <p>Figure 5</p>
            </title>
            <caption>
               <p>The plot of R score versus the SGM score: (a)-(f) are plotted for datasets from I to VI, respectively</p>
            </caption>
            <text>
               <p>The plot of R score versus the SGM score: (a)-(f) are plotted for datasets from I to VI, respectively. The SGM score is defined as the absolute difference between the SGM values of two proteins. The symbol '*' denotes that the R score is smaller than the SGM score, and the symbol 'o' denotes that the R score is bigger than the SGM score.</p>
            </text>
            <graphic file="1471-2105-7-40-5"/>
         </fig>
         <p>To provide a more in-depth view of the PCC method, the analysis of data set I is described here in detail. This set consists mainly of &#945;-helical structures having the "Orthogonal Bundle" architecture. Proteins 2BID, 1F16, 1G5M, 1GJH, 1MAZ, and 1DDB are apoptosis regulators of cell-death pathways associated with the mitochondrion. Since mitochondria originated from prokaryotes, these proteins are believed to have evolved from the same ancient design. Although they differ substantially in amino acid sequence as well as in shape, the overall scaffold and topology are similar. As expected, the <it>R </it>values among them are all less than 0.4 (Table <tblr tid="T1">1</tblr>). Other proteins in this set, including bacterial toxins that are capable of forming membrane pores (1MDT and 1COL) and myoglobin (1A6G), bear a remote conformational resemblance to the BCL-x<sub>L </sub>proteins. The <it>R </it>values between these structures and the apoptosis regulators are also less than 0.3 and are comparable to those found within the BCL-x<sub>L </sub>family. It is interesting to note that although 1MDT and 1COL are not related to the BCL-x<sub>L </sub>proteins in terms of physiological roles, they do share a similarity with the BCL-x<sub>L </sub>members other than topology; that is, they all are able to form large pores when inserted into the cellular membrane.</p>
         <p>Taken together, the results in Table <tblr tid="T1">1</tblr> and Figure <figr fid="F5">5(a)&#8211;(f)</figr> show that the <it>R </it>values within individual sets are on average very small, with a mean of 0.1102 and a standard deviation of 0.1269. This is expected because the structures have been manually examined and pre-grouped into topologically similar sets. The comparison results from the PCC analyses are generally comparable to those of SGM for the data sets under study (see Table <tblr tid="T1">1</tblr> and Figure <figr fid="F5">5(a)&#8211;(f)</figr>). However, in a few isolated cases, the difference in the scaled writhing numbers within the same structure set can exceed the threshold of 0.4 that governs similarity (for example, protein pairs (1MAZ, 2BID) and (1F16, 1DDB) in Table <tblr tid="T1">1</tblr>, and protein pairs (1C78, 1FM0), (1C78, 1NDD), and (1C78, 1IBQ) in Figure <figr fid="F5">5(b)</figr>). This is because the PCC analysis using the PD matrix places more emphasis on the spatial separation and orientation of secondary segments. It must be mentioned that the PD matrix alone is not expected to detect pure topological similarities. The results for structure sets with predominantly &#946; strands and mixed &#945;/&#946; proteins show similar <it>R </it>values (Figure <figr fid="F5">5(c)</figr> and <figr fid="F5">5(d)</figr>), indicating the generality of this method in protein structure comparison. We also tested these six data sets using MAMMOTH, which can also separate the six classes well.</p>
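         <p>The marker convention of Figure 5 and the 0.4 threshold on the SGM score can be sketched in a few lines of Python. This is only a minimal illustration and not the code used in the study: the function names, the two per-protein SGM values, and the example R score are hypothetical, and the extra "=" marker for ties is an assumption, since the figure legend does not cover that case.</p>
         <p>
```python
SIMILARITY_THRESHOLD = 0.4  # threshold quoted in the text for the SGM score


def sgm_score(sgm_a: float, sgm_b: float) -> float:
    """Absolute difference between the SGM values of two proteins."""
    return abs(sgm_a - sgm_b)


def figure5_marker(r_score: float, sgm: float) -> str:
    """Return '*' if R is smaller than the SGM score, 'o' if it is bigger."""
    if sgm > r_score:
        return "*"
    if r_score > sgm:
        return "o"
    return "="  # tie: not covered by the figure legend (assumed marker)


if __name__ == "__main__":
    # Hypothetical SGM values and R score, for illustration only.
    score = sgm_score(0.75, 0.25)
    marker = figure5_marker(0.10, score)
    print(score, marker, score > SIMILARITY_THRESHOLD)
```
         </p>
         <p>In this hypothetical case the SGM score exceeds the threshold while the R score stays small, which is the kind of isolated disagreement between the two measures described above.</p>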
         <p>Another variation of the PD matrix definition is to take into account the N &#8211; C terminal sense, in an attempt to further emphasize protein topological features. A good example is the comparison between structures 1COL and 1DDB in data set I. A visual examination of the two structures reveals that they share a similar shape, but are considerably different in the topological arrangement of helices 1 and 3. In protein 1COL, the first and third helices are anti-parallel, whereas they are parallel in 1DDB (see Figure <figr fid="F4">4</figr>). This difference is not identified by the PCC analysis using the PD matrix, which gives <it>R </it>= 0.029; the great similarity in shape prevails in the comparison. However, by applying the PDS matrix defined in the Methods section, the <it>R</it>-value increases considerably to 1.707, clearly highlighting the difference in backbone topological traces. Finally, we would also like to point out that the definition of <it>R </it>could be improved by introducing more eigenvalues.</p>
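         <p>The 1COL and 1DDB comparison can be restated as a small check against the 0.4 value quoted earlier. The sketch below is illustrative only: the two <it>R </it>values are those reported in the text, while the function name and the use of 0.4 as a hard cutoff on <it>R </it>are assumptions made for the example.</p>
         <p>
```python
SIMILARITY_THRESHOLD = 0.4  # assumed cutoff on R, based on the 0.4 quoted in the text

# R values reported in the text for the pair (1COL, 1DDB).
R_1COL_1DDB = {"PD": 0.029, "PDS": 1.707}


def is_similar(r_score: float, threshold: float = SIMILARITY_THRESHOLD) -> bool:
    """Treat a pair as similar when its R score lies below the cutoff."""
    return threshold > r_score


if __name__ == "__main__":
    for matrix, r_score in R_1COL_1DDB.items():
        verdict = "similar" if is_similar(r_score) else "different"
        print(f"{matrix}: R = {r_score} -> {verdict}")
```
         </p>
         <p>Under this assumed cutoff the PD matrix groups the two structures together, whereas the PDS matrix separates them.</p>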
         <fig id="F4">
            <title>
               <p>Figure 4</p>
            </title>
            <caption>
               <p>Ribbon representation of protein structures of (a) 1COL and (b) 1DDB</p>
            </caption>
            <text>
               <p>Ribbon representation of protein structures of (a) 1COL and (b) 1DDB. The two proteins have similar shape, but different topological arrangements in helices 1 and 3.</p>
            </text>
            <graphic file="1471-2105-7-40-4"/>
         </fig>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>PCC analysis of the secondary interaction matrix is a conceptually simple method that yields results highly comparable to those of the SGM method. Both are able to distinguish protein conformations based on the more subtle topological features. While the SGM method compares structures in a more topological sense, the outcome of PCC analysis is more dependent on the definition of the interaction matrix. With the PD matrix, the PCC analysis puts more weight on the detailed structure and shape, while it is also capable, to a certain extent, of distinguishing different topological traces. In certain cases of pair-wise comparison, such as that between 1COL and 1DDB, protein shapes can overwhelm their topological features in the analysis; yet the PCC analysis of the PDS matrix is able to completely differentiate between 1COL and 1DDB. Owing to the flexibility offered by the new method, a more effective definition of the interaction matrix can be explored to provide a more efficient structure comparison. Each protein has many invariants, some of which are important for protein classification and some of which are not. Hence, our future work will further explore feature selection, automated classification of the PDB, modeling and statistical learning, as well as protein domain matching.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>Principal component analysis of secondary interaction matrix</p>
            </st>
            <p>Assuming a protein having <it>n </it>secondary fragments denoted by <b>h</b><sub>1</sub>, <b>h</b><sub>2</sub>,..., <b>h</b><sub><it>n</it></sub>, and the number of residues in each secondary structure denoted by <it>l</it><sub>1</sub>, <it>l</it><sub>2</sub>,..., <it>l</it><sub><it>n</it></sub>, respectively, the total number of residues belonging to secondary structures is given by <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i3"><m:semantics><m:mrow><m:mi>N</m:mi><m:mo>=</m:mo><m:mstyle displaystyle="true"><m:munderover><m:mo>&#8721;</m:mo><m:mrow><m:mi>i</m:mi><m:mo>=</m:mo><m:mn>1</m:mn></m:mrow><m:mi>n</m:mi></m:munderover><m:mrow><m:msub><m:mi>l</m:mi><m:mi>i</m:mi></m:msub></m:mrow></m:mstyle></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGobGtcqGH9aqpdaaeWbqaaiabdYgaSnaaBaaaleaacqWGPbqAaeqaaaqaaiabdMgaPjabg2da9iabigdaXaqaaiabd6gaUbqdcqGHris5aaaa@38AC@</m:annotation></m:semantics></m:math>. The invariant relation between a pair of secondary elements (<b>h</b><sub><it>i</it></sub>, <b>h</b><sub><it>j</it></sub>) is described by a block matrix <b>F</b>(<b>h</b><sub><it>i</it></sub>, <b>h</b><sub><it>j</it></sub>), in which the individual matrix elements represent a particular relation between residues of the two secondary structures. Since <b>h</b><sub><it>i </it></sub>has <it>l</it><sub>i </sub>residues (denoted by <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i4"><m:semantics><m:mrow><m:msubsup><m:mi>c</m:mi><m:mi>i</m:mi><m:mn>1</m:mn></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGJbWydaqhaaWcbaGaemyAaKgabaGaeGymaedaaaaa@3073@</m:annotation></m:semantics></m:math>, <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i5"><m:semantics><m:mrow><m:msubsup><m:mi>c</m:mi><m:mi>i</m:mi><m:mn>2</m:mn></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGJbWydaqhaaWcbaGaemyAaKgabaGaeGOmaidaaaaa@3075@</m:annotation></m:semantics></m:math>,..., <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i6"><m:semantics><m:mrow><m:msubsup><m:mi>c</m:mi><m:mi>i</m:mi><m:mrow><m:msub><m:mi>l</m:mi><m:mi>i</m:mi></m:msub></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGJbWydaqhaaWcbaGaemyAaKgabaGaemiBaW2aaSbaaWqaaiabdMgaPbqabaaaaaaa@326C@</m:annotation></m:semantics></m:math>), and <b>h</b><sub><it>j </it></sub>has <it>l</it><sub>j </sub>residues (denoted by <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i7"><m:semantics><m:mrow><m:msubsup><m:mi>c</m:mi><m:mi>j</m:mi><m:mn>1</m:mn></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGJbWydaqhaaWcbaGaemOAaOgabaGaeGymaedaaaaa@3075@</m:annotation></m:semantics></m:math>, <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i8"><m:semantics><m:mrow><m:msubsup><m:mi>c</m:mi><m:mi>j</m:mi><m:mn>2</m:mn></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGJbWydaqhaaWcbaGaemOAaOgabaGaeGOmaidaaaaa@3077@</m:annotation></m:semantics></m:math>,..., <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i9"><m:semantics><m:mrow><m:msubsup><m:mi>c</m:mi><m:mi>j</m:mi><m:mrow><m:msub><m:mi>l</m:mi><m:mi>j</m:mi></m:msub></m:mrow></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGJbWydaqhaaWcbaGaemOAaOgabaGaemiBaW2aaSbaaWqaaiabdQgaQbqabaaaaaaa@3270@</m:annotation></m:semantics></m:math>), the elements of the <it>l</it><sub><it>i </it></sub>&#215; <it>l</it><sub><it>j </it></sub><b>F </b>block matrix, <it>g</it>(<m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i10"><m:semantics><m:mrow><m:msubsup><m:mi>c</m:mi><m:mi>i</m:mi><m:mi>u</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGJbWydaqhaaWcbaGaemyAaKgabaGaemyDauhaaaaa@30F6@</m:annotation></m:semantics></m:math>, <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i11"><m:semantics><m:mrow><m:msubsup><m:mi>c</m:mi><m:mi>j</m:mi><m:mi>v</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGJbWydaqhaaWcbaGaemOAaOgabaGaemODayhaaaaa@30FA@</m:annotation></m:semantics></m:math>), are defined as</p>
            <p>
               <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i12">
                  <m:semantics>
                     <m:mrow>
                        <m:mi>g</m:mi>
                        <m:mo stretchy="false">(</m:mo>
                        <m:msubsup>
                           <m:mi>c</m:mi>
                           <m:mi>i</m:mi>
                           <m:mi>u</m:mi>
                        </m:msubsup>
                        <m:mo>,</m:mo>
                        <m:msubsup>
                           <m:mi>c</m:mi>
                           <m:mi>j</m:mi>
                           <m:mi>v</m:mi>
                        </m:msubsup>
                        <m:mo stretchy="false">)</m:mo>
                        <m:mo>=</m:mo>
                        <m:mrow>
                           <m:mo>{</m:mo>
                           <m:mrow>
                              <m:mtable>
                                 <m:mtr>
                                    <m:mtd>
                                       <m:mrow>
                                          <m:mi>d</m:mi>
                                          <m:mo stretchy="false">(</m:mo>
                                          <m:msubsup>
                                             <m:mi>c</m:mi>
                                             <m:mi>i</m:mi>
                                             <m:mi>u</m:mi>
                                          </m:msubsup>
                                          <m:mo>,</m:mo>
                                          <m:msubsup>
                                             <m:mi>c</m:mi>
                                             <m:mi>j</m:mi>
                                             <m:mi>v</m:mi>
                                          </m:msubsup>
                                          <m:mo stretchy="false">)</m:mo>
                                       </m:mrow>
                                    </m:mtd>
                                    <m:mtd>
                                       <m:mrow>
                                          <m:mi>i</m:mi>
                                          <m:mo>&#8800;</m:mo>
                                          <m:mi>j</m:mi>
                                       </m:mrow>
                                    </m:mtd>
                                 </m:mtr>
                                 <m:mtr>
                                    <m:mtd>
                                       <m:mn>0</m:mn>
                                    </m:mtd>
                                    <m:mtd>
                                       <m:mrow>
                                          <m:mi>i</m:mi>
                                          <m:mo>=</m:mo>
                                          <m:mi>j</m:mi>
                                       </m:mrow>
                                    </m:mtd>
                                 </m:mtr>
                              </m:mtable>
                           </m:mrow>
                        </m:mrow>
                        <m:mo>,</m:mo>
                        <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                        <m:mrow>
                           <m:mo>(</m:mo>
                           <m:mn>1</m:mn>
                           <m:mo>)</m:mo>
                        </m:mrow>
                     </m:mrow>
                     <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGNbWzcqGGOaakcqWGJbWydaqhaaWcbaGaemyAaKgabaGaemyDauhaaOGaeiilaWIaem4yam2aa0baaSqaaiabdQgaQbqaaiabdAha2baakiabcMcaPiabg2da9maaceaabaqbaeqabiGaaaqaaiabdsgaKjabcIcaOiabdogaJnaaDaaaleaacqWGPbqAaeaacqWG1bqDaaGccqGGSaalcqWGJbWydaqhaaWcbaGaemOAaOgabaGaemODayhaaOGaeiykaKcabaGaemyAaKMaeyiyIKRaemOAaOgabaGaeGimaadabaGaemyAaKMaeyypa0JaemOAaOgaaaGaay5EaaGaeiilaWIaaCzcaiaaxMaadaqadaqaaiabigdaXaGaayjkaiaawMcaaaaa@55C8@</m:annotation>
                  </m:semantics>
               </m:math>
            </p>
            <p>where 1 &#8804; <it>u </it>&#8804; <it>l</it><sub><it>i</it></sub>, 1 &#8804; <it>v </it>&#8804; <it>l</it><sub><it>j</it></sub>, and <it>d</it>(<m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i10"><m:semantics><m:mrow><m:msubsup><m:mi>c</m:mi><m:mi>i</m:mi><m:mi>u</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGJbWydaqhaaWcbaGaemyAaKgabaGaemyDauhaaaaa@30F6@</m:annotation></m:semantics></m:math>, <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i11"><m:semantics><m:mrow><m:msubsup><m:mi>c</m:mi><m:mi>j</m:mi><m:mi>v</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGJbWydaqhaaWcbaGaemOAaOgabaGaemODayhaaaaa@30FA@</m:annotation></m:semantics></m:math>) is a real number representing an arbitrary invariant relation between residues of <b>h</b><sub><it>i </it></sub>and <b>h</b><sub><it>j</it></sub>. Note this approach allows the definition of <it>d</it>(<m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i10"><m:semantics><m:mrow><m:msubsup><m:mi>c</m:mi><m:mi>i</m:mi><m:mi>u</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGJbWydaqhaaWcbaGaemyAaKgabaGaemyDauhaaaaa@30F6@</m:annotation></m:semantics></m:math>, <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i11"><m:semantics><m:mrow><m:msubsup><m:mi>c</m:mi><m:mi>j</m:mi><m:mi>v</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGJbWydaqhaaWcbaGaemOAaOgabaGaemODayhaaaaa@30FA@</m:annotation></m:semantics></m:math>) to be chosen freely. The full interaction matrix of a protein structure is square and symmetric and is defined as</p>
            <p>
               <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i13">
                  <m:semantics>
                     <m:mrow>
                        <m:mover accent="true">
                           <m:mi>I</m:mi>
                           <m:mo>^</m:mo>
                        </m:mover>
                        <m:mo>=</m:mo>
                        <m:msub>
                           <m:mrow>
                              <m:mrow>
                                 <m:mo>[</m:mo>
                                 <m:mrow>
                                    <m:mtable>
                                       <m:mtr>
                                          <m:mtd>
                                             <m:mstyle mathvariant="bold" mathsize="normal">
                                                <m:mn>0</m:mn>
                                             </m:mstyle>
                                          </m:mtd>
                                          <m:mtd>
                                             <m:mrow>
                                                <m:mstyle mathvariant="bold" mathsize="normal">
                                                   <m:mi>F</m:mi>
                                                   <m:mo stretchy="false">(</m:mo>
                                                </m:mstyle>
                                                <m:msub>
                                                   <m:mstyle mathvariant="bold" mathsize="normal">
                                                      <m:mi>h</m:mi>
                                                   </m:mstyle>
                                                   <m:mn>1</m:mn>
                                                </m:msub>
                                                <m:mstyle mathvariant="bold" mathsize="normal">
                                                   <m:mo>,</m:mo>
                                                </m:mstyle>
                                                <m:msub>
                                                   <m:mstyle mathvariant="bold" mathsize="normal">
                                                      <m:mi>h</m:mi>
                                                   </m:mstyle>
                                                   <m:mn>2</m:mn>
                                                </m:msub>
                                                <m:mstyle mathvariant="bold" mathsize="normal">
                                                   <m:mo stretchy="false">)</m:mo>
                                                </m:mstyle>
                                             </m:mrow>
                                          </m:mtd>
                                          <m:mtd>
                                             <m:mo>&#8943;</m:mo>
                                          </m:mtd>
                                          <m:mtd>
                                             <m:mrow>
                                                <m:mstyle mathvariant="bold" mathsize="normal">
                                                   <m:mi>F</m:mi>
                                                   <m:mo stretchy="false">(</m:mo>
                                                </m:mstyle>
                                                <m:msub>
                                                   <m:mstyle mathvariant="bold" mathsize="normal">
                                                      <m:mi>h</m:mi>
                                                   </m:mstyle>
                                                   <m:mn>1</m:mn>
                                                </m:msub>
                                                <m:mstyle mathvariant="bold" mathsize="normal">
                                                   <m:mo>,</m:mo>
                                                </m:mstyle>
                                                <m:msub>
                                                   <m:mstyle mathvariant="bold" mathsize="normal">
                                                      <m:mi>h</m:mi>
                                                   </m:mstyle>
                                                   <m:mi>n</m:mi>
                                                </m:msub>
                                                <m:mstyle mathvariant="bold" mathsize="normal">
                                                   <m:mo stretchy="false">)</m:mo>
                                                </m:mstyle>
                                             </m:mrow>
                                          </m:mtd>
                                       </m:mtr>
                                       <m:mtr>
                                          <m:mtd>
                                             <m:mrow>
                                                <m:mstyle mathvariant="bold" mathsize="normal">
                                                   <m:mi>F</m:mi>
                                                   <m:mo stretchy="false">(</m:mo>
                                                </m:mstyle>
                                                <m:msub>
                                                   <m:mstyle mathvariant="bold" mathsize="normal">
                                                      <m:mi>h</m:mi>
                                                   </m:mstyle>
                                                   <m:mn>2</m:mn>
                                                </m:msub>
                                                <m:mstyle mathvariant="bold" mathsize="normal">
                                                   <m:mo>,</m:mo>
                                                </m:mstyle>
                                                <m:msub>
                                                   <m:mstyle mathvariant="bold" mathsize="normal">
                                                      <m:mi>h</m:mi>
                                                   </m:mstyle>
                                                   <m:mn>1</m:mn>
                                                </m:msub>
                                                <m:mstyle mathvariant="bold" mathsize="normal">
                                                   <m:mo stretchy="false">)</m:mo>
                                                </m:mstyle>
                                             </m:mrow>
                                          </m:mtd>
                                          <m:mtd>
                                             <m:mstyle mathvariant="bold" mathsize="normal">
                                                <m:mn>0</m:mn>
                                             </m:mstyle>
                                          </m:mtd>
                                          <m:mtd>
                                             <m:mo>&#8943;</m:mo>
                                          </m:mtd>
                                          <m:mtd>
                                             <m:mrow>
                                                <m:mstyle mathvariant="bold" mathsize="normal">
                                                   <m:mi>F</m:mi>
                                                   <m:mo stretchy="false">(</m:mo>
                                                </m:mstyle>
                                                <m:msub>
                                                   <m:mstyle mathvariant="bold" mathsize="normal">
                                                      <m:mi>h</m:mi>
                                                   </m:mstyle>
                                                   <m:mn>2</m:mn>
                                                </m:msub>
                                                <m:mstyle mathvariant="bold" mathsize="normal">
                                                   <m:mo>,</m:mo>
                                                </m:mstyle>
                                                <m:msub>
                                                   <m:mstyle mathvariant="bold" mathsize="normal">
                                                      <m:mi>h</m:mi>
                                                   </m:mstyle>
                                                   <m:mi>n</m:mi>
                                                </m:msub>
                                                <m:mstyle mathvariant="bold" mathsize="normal">
                                                   <m:mo stretchy="false">)</m:mo>
                                                </m:mstyle>
                                             </m:mrow>
                                          </m:mtd>
                                       </m:mtr>
                                       <m:mtr>
                                          <m:mtd>
                                             <m:mo>&#8942;</m:mo>
                                          </m:mtd>
                                          <m:mtd>
                                             <m:mo>&#8942;</m:mo>
                                          </m:mtd>
                                          <m:mtd>
                                             <m:mo>&#8945;</m:mo>
                                          </m:mtd>
                                          <m:mtd>
                                             <m:mo>&#8942;</m:mo>
                                          </m:mtd>
                                       </m:mtr>
                                       <m:mtr>
                                          <m:mtd>
                                             <m:mrow>
                                                <m:mstyle mathvariant="bold" mathsize="normal">
                                                   <m:mi>F</m:mi>
                                                   <m:mo stretchy="false">(</m:mo>
                                                </m:mstyle>
                                                <m:msub>
                                                   <m:mstyle mathvariant="bold" mathsize="normal">
                                                      <m:mi>h</m:mi>
                                                   </m:mstyle>
                                                   <m:mi>n</m:mi>
                                                </m:msub>
                                                <m:mstyle mathvariant="bold" mathsize="normal">
                                                   <m:mo>,</m:mo>
                                                </m:mstyle>
                                                <m:msub>
                                                   <m:mstyle mathvariant="bold" mathsize="normal">
                                                      <m:mi>h</m:mi>
                                                   </m:mstyle>
                                                   <m:mn>1</m:mn>
                                                </m:msub>
                                                <m:mstyle mathvariant="bold" mathsize="normal">
                                                   <m:mo stretchy="false">)</m:mo>
                                                </m:mstyle>
                                             </m:mrow>
                                          </m:mtd>
                                          <m:mtd>
                                             <m:mrow>
                                                <m:mstyle mathvariant="bold" mathsize="normal">
                                                   <m:mi>F</m:mi>
                                                   <m:mo stretchy="false">(</m:mo>
                                                </m:mstyle>
                                                <m:msub>
                                                   <m:mstyle mathvariant="bold" mathsize="normal">
                                                      <m:mi>h</m:mi>
                                                   </m:mstyle>
                                                   <m:mi>n</m:mi>
                                                </m:msub>
                                                <m:mstyle mathvariant="bold" mathsize="normal">
                                                   <m:mo>,</m:mo>
                                                </m:mstyle>
                                                <m:msub>
                                                   <m:mstyle mathvariant="bold" mathsize="normal">
                                                      <m:mi>h</m:mi>
                                                   </m:mstyle>
                                                   <m:mn>2</m:mn>
                                                </m:msub>
                                                <m:mstyle mathvariant="bold" mathsize="normal">
                                                   <m:mo stretchy="false">)</m:mo>
                                                </m:mstyle>
                                             </m:mrow>
                                          </m:mtd>
                                          <m:mtd>
                                             <m:mo>&#8943;</m:mo>
                                          </m:mtd>
                                          <m:mtd>
                                             <m:mstyle mathvariant="bold" mathsize="normal">
                                                <m:mn>0</m:mn>
                                             </m:mstyle>
                                          </m:mtd>
                                       </m:mtr>
                                    </m:mtable>
                                 </m:mrow>
                                 <m:mo>]</m:mo>
                              </m:mrow>
                           </m:mrow>
                           <m:mrow>
                              <m:mi>N</m:mi>
                              <m:mo>&#215;</m:mo>
                              <m:mi>N</m:mi>
                           </m:mrow>
                        </m:msub>
                        <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                        <m:mrow>
                           <m:mo>(</m:mo>
                           <m:mn>2</m:mn>
                           <m:mo>)</m:mo>
                        </m:mrow>
                     </m:mrow>
                     <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaaieqacuWFjbqsgaqcaiabg2da9maadmaabaqbaeqabqabaaaaaeaacqWHWaamaeaacqWHgbGrcqWHOaakcqWHObaAdaWgaaWcbaGaeGymaedabeaakiabhYcaSiabhIgaOnaaBaaaleaacqaIYaGmaeqaaOGaeCykaKcabaGaeS47IWeabaGaeCOrayKaeCikaGIaeCiAaG2aaSbaaSqaaiabigdaXaqabaGccqWHSaalcqWHObaAdaWgaaWcbaGaemOBa4gabeaakiabhMcaPaqaaiabhAeagjabhIcaOiabhIgaOnaaBaaaleaacqaIYaGmaeqaaOGaeCilaWIaeCiAaG2aaSbaaSqaaiabigdaXaqabaGccqWHPaqkaeaacqWHWaamaeaacqWIVlctaeaacqWHgbGrcqWHOaakcqWHObaAdaWgaaWcbaGaeGOmaidabeaakiabhYcaSiabhIgaOnaaBaaaleaacqWGUbGBaeqaaOGaeCykaKcabaGaeSO7I0eabaGaeSO7I0eabaGaeSy8I8eabaGaeSO7I0eabaGaeCOrayKaeCikaGIaeCiAaG2aaSbaaSqaaiabd6gaUbqabaGccqWHSaalcqWHObaAdaWgaaWcbaGaeGymaedabeaakiabhMcaPaqaaiabhAeagjabhIcaOiabhIgaOnaaBaaaleaacqWGUbGBaeqaaOGaeCilaWIaeCiAaG2aaSbaaSqaaiabikdaYaqabaGccqWHPaqkaeaacqWIVlctaeaacqWHWaamaaaacaGLBbGaayzxaaWaaSbaaSqaaiabd6eaojabgEna0kabd6eaobqabaGccaWLjaGaaCzcamaabmaabaGaeGOmaidacaGLOaGaayzkaaaaaa@7FF5@</m:annotation>
                  </m:semantics>
               </m:math>
            </p>
            <p>The principal components of the interaction matrix are then obtained by orthogonal decomposition as shown below:</p>
            <p>
               <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i14">
                  <m:semantics>
                     <m:mrow>
                        <m:mover accent="true">
                           <m:mi>I</m:mi>
                           <m:mo>^</m:mo>
                        </m:mover>
                        <m:mo>=</m:mo>
                        <m:msup>
                           <m:mstyle mathvariant="bold" mathsize="normal">
                              <m:mi>E</m:mi>
                           </m:mstyle>
                           <m:mi>T</m:mi>
                        </m:msup>
                        <m:mrow>
                           <m:mo>[</m:mo>
                           <m:mrow>
                              <m:mtable>
                                 <m:mtr>
                                    <m:mtd>
                                       <m:mrow>
                                          <m:msub>
                                             <m:mi>&#955;</m:mi>
                                             <m:mn>1</m:mn>
                                          </m:msub>
                                       </m:mrow>
                                    </m:mtd>
                                    <m:mtd>
                                       <m:mrow/>
                                    </m:mtd>
                                    <m:mtd>
                                       <m:mrow/>
                                    </m:mtd>
                                    <m:mtd>
                                       <m:mrow/>
                                    </m:mtd>
                                 </m:mtr>
                                 <m:mtr>
                                    <m:mtd>
                                       <m:mrow/>
                                    </m:mtd>
                                    <m:mtd>
                                       <m:mrow>
                                          <m:msub>
                                             <m:mi>&#955;</m:mi>
                                             <m:mn>2</m:mn>
                                          </m:msub>
                                       </m:mrow>
                                    </m:mtd>
                                    <m:mtd>
                                       <m:mrow/>
                                    </m:mtd>
                                    <m:mtd>
                                       <m:mrow/>
                                    </m:mtd>
                                 </m:mtr>
                                 <m:mtr>
                                    <m:mtd>
                                       <m:mrow/>
                                    </m:mtd>
                                    <m:mtd>
                                       <m:mrow/>
                                    </m:mtd>
                                    <m:mtd>
                                       <m:mo>&#8945;</m:mo>
                                    </m:mtd>
                                    <m:mtd>
                                       <m:mrow/>
                                    </m:mtd>
                                 </m:mtr>
                                 <m:mtr>
                                    <m:mtd>
                                       <m:mrow/>
                                    </m:mtd>
                                    <m:mtd>
                                       <m:mrow/>
                                    </m:mtd>
                                    <m:mtd>
                                       <m:mrow/>
                                    </m:mtd>
                                    <m:mtd>
                                       <m:mrow>
                                          <m:msub>
                                             <m:mi>&#955;</m:mi>
                                             <m:mi>N</m:mi>
                                          </m:msub>
                                       </m:mrow>
                                    </m:mtd>
                                 </m:mtr>
                              </m:mtable>
                           </m:mrow>
                           <m:mo>]</m:mo>
                        </m:mrow>
                        <m:mstyle mathvariant="bold" mathsize="normal">
                           <m:mi>E</m:mi>
                        </m:mstyle>
                        <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                        <m:mrow>
                           <m:mo>(</m:mo>
                           <m:mn>3</m:mn>
                           <m:mo>)</m:mo>
                        </m:mrow>
                     </m:mrow>
                     <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaaieqacuWFjbqsgaqcaiabg2da9iabhweafnaaCaaaleqabaGaemivaqfaaOWaamWaaeaafaqabeabeaaaaaqaaiabeU7aSnaaBaaaleaacqaIXaqmaeqaaaGcbaaabaaabaaabaaabaGaeq4UdW2aaSbaaSqaaiabikdaYaqabaaakeaaaeaaaeaaaeaaaeaacqWIXlYtaeaaaeaaaeaaaeaaaeaacqaH7oaBdaWgaaWcbaGaemOta4eabeaaaaaakiaawUfacaGLDbaacqWHfbqrcaWLjaGaaCzcamaabmaabaGaeG4mamdacaGLOaGaayzkaaaaaa@4304@</m:annotation>
                  </m:semantics>
               </m:math>
            </p>
            <p>where &#955;<sub>1 </sub>&#8805; &#955;<sub>2 </sub>&#8805; &#8943; &#8805; &#955;<sub><it>N </it></sub>are the sorted eigenvalues, the corresponding eigenvectors are <b>e</b><sub>1</sub>, <b>e</b><sub>2</sub>,..., <b>e</b><sub><it>N</it></sub>, and <b>E </b>= [<b>e</b><sub>1</sub>, <b>e</b><sub>2</sub>,..., <b>e</b><sub><it>N</it></sub>] is an invertible matrix. Generally, the maximum eigenvalue, &#955;<sub>1</sub>, and its corresponding eigenvector in <it>N</it>-dimensional space encode the dominant features of the structure and can therefore be used to compare structures directly, as well as to identify less obvious topological features common to the proteins. Since the eigenvalues depend largely on the dimension of the interaction matrix, they are divided by the matrix size <it>N</it>, a treatment similar to the scaling of writhing numbers in the SGM method (Rogen P. and Fain B., 2003). In a relatively crude analysis, the scaled &#955;<sub>1 </sub>values of two structures can be compared directly to infer structural similarity. This method is referred to here as the Scaled Maximum Eigenvalue Comparison (SMEC).</p>
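            <p>The SMEC quantity described above can be illustrated with a minimal sketch, assuming NumPy and an interaction matrix that is already available as a symmetric array. The function name scaled_max_eigen, the toy matrix, and the sign convention applied to the eigenvector are illustrative assumptions and are not part of the published method.</p>

```python
import numpy as np


def scaled_max_eigen(interaction):
    """Return (lambda_1 / N, e_1) for a symmetric N x N interaction matrix."""
    interaction = np.asarray(interaction, dtype=float)
    n = interaction.shape[0]
    # eigh is intended for symmetric matrices and returns eigenvalues in
    # ascending order, so the last column belongs to the maximum eigenvalue.
    eigenvalues, eigenvectors = np.linalg.eigh(interaction)
    e1 = eigenvectors[:, -1]
    # Assumption: an eigenvector is only defined up to sign, so we choose
    # the orientation whose entries sum to a non-negative value.
    if 0 > e1.sum():
        e1 = -e1
    return eigenvalues[-1] / n, e1


# Toy 2 x 2 matrix (hypothetical values, not taken from a real structure).
toy = np.array([[0.0, 2.0], [2.0, 0.0]])
scaled_lambda, e1 = scaled_max_eigen(toy)
print(scaled_lambda, e1)
```

            <p>In this sketch, two structures would be compared by computing the scaled maximum eigenvalue of each interaction matrix and taking the difference of the two values.</p>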
            <p>In addition to the maximum eigenvalues, their corresponding eigenvectors can also be used to correlate similar structures. Particularly for pair-wise structure comparison, the degree of similarity can be measured more accurately by comparing both the eigenvalue and the eigenvector. Since proteins are generally not of the same length, their eigenvectors have different dimensions and cannot be correlated directly. Therefore, a "sliding window" approach is employed to correlate the smaller protein with all matching segments (length-wise) in the larger protein. Let us consider two proteins, A and B, having <it>N </it>and <it>M </it>secondary structure residues, respectively, with <it>N </it>&#8804; <it>M</it>. For protein A, which has fewer secondary structure residues, &#955;<sup>A </sup>and <b>e</b><sup>A </sup>are respectively the maximum eigenvalue and its corresponding <it>N</it>-dimensional eigenvector. For protein B, which has more secondary structure residues, <it>M</it>-<it>N</it>+1 interaction matrices are decomposed, where (&#955;<sup>B</sup><sub>1</sub>, <b>e</b><sup>B</sup><sub>1</sub>) represent the principal components of the interaction matrix constructed from secondary structure residues 1 ... <it>N</it>, (&#955;<sup>B</sup><sub>2</sub>, <b>e</b><sup>B</sup><sub>2</sub>) are from secondary structure residues 2 ... <it>N</it>+1, and so on. To quantify structural similarity, we define a difference metric, <it>R</it><sub><it>j</it></sub>, between <b>&#206; </b>of protein A and <b>&#206; </b>of the <it>j</it>th matching segment of protein B as</p>
            <p>
               <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i15">
                  <m:semantics>
                     <m:mrow>
                        <m:msub>
                           <m:mi>R</m:mi>
                           <m:mi>j</m:mi>
                        </m:msub>
                        <m:mo>=</m:mo>
                        <m:mo>|</m:mo>
                        <m:mo>|</m:mo>
                        <m:msup>
                           <m:mstyle mathvariant="bold" mathsize="normal">
                              <m:mi>e</m:mi>
                           </m:mstyle>
                           <m:mi>A</m:mi>
                        </m:msup>
                        <m:mo>&#8722;</m:mo>
                        <m:msubsup>
                           <m:mstyle mathvariant="bold" mathsize="normal">
                              <m:mi>e</m:mi>
                           </m:mstyle>
                           <m:mi>j</m:mi>
                           <m:mi>B</m:mi>
                        </m:msubsup>
                        <m:mo>|</m:mo>
                        <m:mo>|</m:mo>
                        <m:mrow>
                           <m:mo>|</m:mo>
                           <m:mrow>
                              <m:msup>
                                 <m:mi>&#955;</m:mi>
                                 <m:mi>A</m:mi>
                              </m:msup>
                              <m:mo>&#8722;</m:mo>
                              <m:msubsup>
                                 <m:mi>&#955;</m:mi>
                                 <m:mi>j</m:mi>
                                 <m:mi>B</m:mi>
                              </m:msubsup>
                           </m:mrow>
                           <m:mo>|</m:mo>
                        </m:mrow>
                        <m:mtext>&#160;,&#160;</m:mtext>
                        <m:mn>1</m:mn>
                        <m:mo>&#8804;</m:mo>
                        <m:mi>j</m:mi>
                        <m:mo>&#8804;</m:mo>
                        <m:mi>M</m:mi>
                        <m:mo>&#8722;</m:mo>
                        <m:mi>N</m:mi>
                        <m:mo>+</m:mo>
                        <m:mn>1.</m:mn>
                        <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                        <m:mrow>
                           <m:mo>(</m:mo>
                           <m:mn>4</m:mn>
                           <m:mo>)</m:mo>
                        </m:mrow>
                     </m:mrow>
                     <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGsbGudaWgaaWcbaGaemOAaOgabeaakiabg2da9iabcYha8jabcYha8jabhwgaLnaaCaaaleqabaGaemyqaeeaaOGaeyOeI0IaeCyzau2aa0baaSqaaiabdQgaQbqaaiabdkeacbaakiabcYha8jabcYha8naaemaabaGaeq4UdW2aaWbaaSqabeaacqWGbbqqaaGccqGHsislcqaH7oaBdaqhaaWcbaGaemOAaOgabaGaemOqaieaaaGccaGLhWUaayjcSdGaeeiiaaIaeeilaWIaeeiiaaIaeGymaeJaeyizImQaemOAaOMaeyizImQaemyta0KaeyOeI0IaemOta4Kaey4kaSIaeGymaeJaeiOla4IaaCzcaiaaxMaadaqadaqaaiabisda0aGaayjkaiaawMcaaaaa@5B1C@</m:annotation>
                  </m:semantics>
               </m:math>
            </p>
            <p>A smaller <it>R</it><sub><it>j </it></sub>indicates a better correlation, that is, a higher degree of structural similarity. The overall difference between the two proteins is defined as</p>
            <p><it>R </it>= min(<it>R</it><sub>1</sub>, <it>R</it><sub>2</sub>,..., <it>R</it><sub><it>M</it>-<it>N</it>+1</sub>). &#160;&#160;&#160; (5)</p>
            <p>The minimum of <it>R</it><sub>1</sub>, <it>R</it><sub>2</sub>, ..., <it>R</it><sub><it>M</it>-<it>N</it>+1 </sub>is used here to measure similarity because this potentially allows mapping a smaller structure onto a homologous domain within a larger protein. This method is called the Principle Component Correlation (PCC) analysis.</p>
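            <p>The sliding-window comparison of Eqs. (4) and (5) can be sketched as follows, assuming NumPy. Several choices here are assumptions made for illustration only: the function names top_component and pcc_difference, the toy matrices, the sign convention for eigenvectors, the use of eigenvalues scaled by the window size, and the construction of each window as the principal submatrix of the larger interaction matrix restricted to residues <it>j</it> ... <it>j</it>+<it>N</it>-1.</p>

```python
import numpy as np


def top_component(matrix):
    """Return the scaled maximum eigenvalue and its eigenvector."""
    eigenvalues, eigenvectors = np.linalg.eigh(matrix)
    vec = eigenvectors[:, -1]
    # Assumption: fix the arbitrary sign so that the entries sum to >= 0.
    if 0 > vec.sum():
        vec = -vec
    return eigenvalues[-1] / matrix.shape[0], vec


def pcc_difference(int_a, int_b):
    """Return (R, [R_1, ..., R_(M-N+1)]) for two interaction matrices."""
    int_a = np.asarray(int_a, dtype=float)
    int_b = np.asarray(int_b, dtype=float)
    # Make sure protein A is the smaller one (N not larger than M).
    if int_a.shape[0] > int_b.shape[0]:
        int_a, int_b = int_b, int_a
    n, m = int_a.shape[0], int_b.shape[0]
    lam_a, e_a = top_component(int_a)
    scores = []
    for j in range(m - n + 1):
        # Assumed window: residues j ... j+N-1 of the larger protein.
        window = int_b[j:j + n, j:j + n]
        lam_b, e_b = top_component(window)
        # Eq. (4): eigenvector distance times eigenvalue difference.
        scores.append(np.linalg.norm(e_a - e_b) * abs(lam_a - lam_b))
    # Eq. (5): the best-matching window defines the overall difference.
    return min(scores), scores


# Toy data (hypothetical): protein B contains protein A as residues 2..4.
a = np.array([[0.0, 1.0, 2.0], [1.0, 0.0, 3.0], [2.0, 3.0, 0.0]])
b = np.full((4, 4), 5.0)
np.fill_diagonal(b, 0.0)
b[1:, 1:] = a
r, r_all = pcc_difference(a, b)
print(r, r_all)
```

            <p>In the toy example, the second window of B reproduces A exactly, so the minimum over the windows is expected to be numerically zero while the first window gives a positive difference.</p>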
         </sec>
         <sec>
            <st>
               <p>Defining the matrix elements</p>
            </st>
            <p>The definition of block matrix elements, <it>d</it>(<m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i10"><m:semantics><m:mrow><m:msubsup><m:mi>c</m:mi><m:mi>i</m:mi><m:mi>u</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGJbWydaqhaaWcbaGaemyAaKgabaGaemyDauhaaaaa@30F6@</m:annotation></m:semantics></m:math>, <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i11"><m:semantics><m:mrow><m:msubsup><m:mi>c</m:mi><m:mi>j</m:mi><m:mi>v</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGJbWydaqhaaWcbaGaemOAaOgabaGaemODayhaaaaa@30FA@</m:annotation></m:semantics></m:math>), depends on the desired structural features to be extracted. In the current study, we focus the structural comparison on the protein backbone conformation. The simplest invariant describing the backbone conformation is the Euclidean distance between a pair of C<sup>&#945; </sup>atoms from two different secondary structure segments. Formally, the elements are defined as <it>d</it>(<m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i10"><m:semantics><m:mrow><m:msubsup><m:mi>c</m:mi><m:mi>i</m:mi><m:mi>u</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGJbWydaqhaaWcbaGaemyAaKgabaGaemyDauhaaaaa@30F6@</m:annotation></m:semantics></m:math>, <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i11"><m:semantics><m:mrow><m:msubsup><m:mi>c</m:mi><m:mi>j</m:mi><m:mi>v</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGJbWydaqhaaWcbaGaemOAaOgabaGaemODayhaaaaa@30FA@</m:annotation></m:semantics></m:math>) = ||<m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i10"><m:semantics><m:mrow><m:msubsup><m:mi>c</m:mi><m:mi>i</m:mi><m:mi>u</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGJbWydaqhaaWcbaGaemyAaKgabaGaemyDauhaaaaa@30F6@</m:annotation></m:semantics></m:math> - <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i11"><m:semantics><m:mrow><m:msubsup><m:mi>c</m:mi><m:mi>j</m:mi><m:mi>v</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGJbWydaqhaaWcbaGaemOAaOgabaGaemODayhaaaaa@30FA@</m:annotation></m:semantics></m:math>|| where <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i10"><m:semantics><m:mrow><m:msubsup><m:mi>c</m:mi><m:mi>i</m:mi><m:mi>u</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGJbWydaqhaaWcbaGaemyAaKgabaGaemyDauhaaaaa@30F6@</m:annotation></m:semantics></m:math> and <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i11"><m:semantics><m:mrow><m:msubsup><m:mi>c</m:mi><m:mi>j</m:mi><m:mi>v</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGJbWydaqhaaWcbaGaemOAaOgabaGaemODayhaaaaa@30FA@</m:annotation></m:semantics></m:math> are the coordinates of the two C<sup>&#945; </sup>atoms of residues <it>u </it>of <b>h</b><sub>i </sub>and <it>v </it>of <b>h</b><sub>j</sub>, respectively. For conciseness, we refer to the interaction matrix so defined as the Pair-wise Distance (PD) matrix. For illustration, the interaction matrix for the structure of the PB1 domain of Bem1p (PDB accession code 1IP9) is shown in Fig. <figr fid="F1">1</figr>. This structure, consisting of two &#945; helices and four &#946; strands (Fig. <figr fid="F1">1a</figr>), is used here to show the distances between all pairs of C<sup>&#945; </sup>atoms in the six secondary elements (Fig. <figr fid="F1">1b</figr>).</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>(a) Ribbon representation of 1IP9, showing two &#945; helices and four &#946; strands, and (b) the corresponding symmetric interaction matrix (defined in eq. 2), where <b>h</b><sub>3 </sub>and <b>h</b><sub>5 </sub>are the two &#945; helices, and <b>h</b><sub>1</sub>, <b>h</b><sub>2</sub>, <b>h</b><sub>4 </sub>and <b>h</b><sub>6 </sub>are the four &#946; strands</p>
               </caption>
               <text>
                  <p>(a) Ribbon representation of 1IP9, showing two &#945; helices and four &#946; strands, and (b) the corresponding symmetric interaction matrix (defined in eq. 2), where <b>h</b><sub>3 </sub>and <b>h</b><sub>5 </sub>are the two &#945; helices, and <b>h</b><sub>1</sub>, <b>h</b><sub>2</sub>, <b>h</b><sub>4 </sub>and <b>h</b><sub>6 </sub>are the four &#946; strands. The gray-level values denote the distance between any two C<sup>&#945; </sup>atoms, with white corresponding to the shortest distance, i.e., 0.</p>
               </text>
               <graphic file="1471-2105-7-40-1"/>
            </fig>
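            <p>The PD matrix construction can be sketched in a few lines. The following is a minimal illustration only, not the implementation used in this study: it assumes Python with NumPy, the function names <it>pd_block</it> and <it>pd_matrix</it> are hypothetical, the coordinates are toy values rather than 1IP9 data, and it assumes that the diagonal blocks <b>F</b>(<b>h</b><sub><it>i</it></sub>, <b>h</b><sub><it>i</it></sub>) are filled with the same distance definition.</p>

```python
import numpy as np


def pd_block(ca_i: np.ndarray, ca_j: np.ndarray) -> np.ndarray:
    """Pairwise Euclidean distances between two sets of C-alpha coordinates.

    ca_i has shape (m, 3) and ca_j has shape (n, 3). The result has shape
    (m, n), with entry [u, v] equal to ||c_i^u - c_j^v||.
    """
    diff = ca_i[:, None, :] - ca_j[None, :, :]
    return np.linalg.norm(diff, axis=2)


def pd_matrix(elements: list[np.ndarray]) -> np.ndarray:
    """Assemble one symmetric matrix from the blocks of all element pairs."""
    return np.block([[pd_block(a, b) for b in elements] for a in elements])


# Toy coordinates in angstroms (hypothetical, not taken from 1IP9).
h1 = np.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0]])
h2 = np.array([[0.0, 3.0, 4.0]])
pd = pd_matrix([h1, h2])
```

            <p>In this sketch each secondary element is passed as an array of C<sup>&#945; </sup>coordinates in sequence order, so the block for a pair of elements with <it>m</it> and <it>n</it> residues has <it>m</it> rows and <it>n</it> columns.</p>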
            <p>Furthermore, two variations of the PD matrix definition are explored in an attempt to provide better resolution in structural comparison and classification. Because the physical interaction energy between a pair of atoms typically varies monotonically with the inverse of their separation, the inverse of the distance is used to mimic physical interactions between secondary elements. Here the elements of <b>F</b>(<b>h</b><sub><it>i</it></sub>, <b>h</b><sub><it>j</it></sub>) are defined as</p>
            <p>
               <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i16">
                  <m:semantics>
                     <m:mrow>
                        <m:mi>d</m:mi>
                        <m:mo stretchy="false">(</m:mo>
                        <m:msubsup>
                           <m:mi>c</m:mi>
                           <m:mi>i</m:mi>
                           <m:mi>u</m:mi>
                        </m:msubsup>
                        <m:mo>,</m:mo>
                        <m:msubsup>
                           <m:mi>c</m:mi>
                           <m:mi>j</m:mi>
                           <m:mi>v</m:mi>
                        </m:msubsup>
                        <m:mo stretchy="false">)</m:mo>
                        <m:mo>=</m:mo>
                        <m:mrow>
                           <m:mo>{</m:mo>
                           <m:mrow>
                              <m:mtable>
                                 <m:mtr>
                                    <m:mtd>
                                       <m:mrow>
                                          <m:mfrac>
                                             <m:mn>1</m:mn>
                                             <m:mrow>
                                                <m:mrow>
                                                   <m:mo>|</m:mo>
                                                   <m:mrow>
                                                      <m:mrow>
                                                         <m:mo>|</m:mo>
                                                         <m:mrow>
                                                            <m:msubsup>
                                                               <m:mi>c</m:mi>
                                                               <m:mi>i</m:mi>
                                                               <m:mi>u</m:mi>
                                                            </m:msubsup>
                                                            <m:mo>&#8722;</m:mo>
                                                            <m:msubsup>
                                                               <m:mi>c</m:mi>
                                                               <m:mi>j</m:mi>
                                                               <m:mi>v</m:mi>
                                                            </m:msubsup>
                                                         </m:mrow>
                                                         <m:mo>|</m:mo>
                                                      </m:mrow>
                                                   </m:mrow>
                                                   <m:mo>|</m:mo>
                                                </m:mrow>
                                             </m:mrow>
                                          </m:mfrac>
                                          <m:mo>,</m:mo>
                                       </m:mrow>
                                    </m:mtd>
                                    <m:mtd>
                                       <m:mrow>
                                          <m:mrow>
                                             <m:mo>|</m:mo>
                                             <m:mrow>
                                                <m:mrow>
                                                   <m:mo>|</m:mo>
                                                   <m:mrow>
                                                      <m:msubsup>
                                                         <m:mi>c</m:mi>
                                                         <m:mi>i</m:mi>
                                                         <m:mi>u</m:mi>
                                                      </m:msubsup>
                                                      <m:mo>&#8722;</m:mo>
                                                      <m:msubsup>
                                                         <m:mi>c</m:mi>
                                                         <m:mi>j</m:mi>
                                                         <m:mi>v</m:mi>
                                                      </m:msubsup>
                                                   </m:mrow>
                                                   <m:mo>|</m:mo>
                                                </m:mrow>
                                             </m:mrow>
                                             <m:mo>|</m:mo>
                                          </m:mrow>
                                          <m:mo>&#8805;</m:mo>
                                          <m:msub>
                                             <m:mi>u</m:mi>
                                             <m:mn>0</m:mn>
                                          </m:msub>
                                       </m:mrow>
                                    </m:mtd>
                                 </m:mtr>
                                 <m:mtr>
                                    <m:mtd>
                                       <m:mrow>
                                          <m:mfrac>
                                             <m:mn>1</m:mn>
                                             <m:mrow>
                                                <m:msub>
                                                   <m:mi>u</m:mi>
                                                   <m:mn>0</m:mn>
                                                </m:msub>
                                             </m:mrow>
                                          </m:mfrac>
                                       </m:mrow>
                                    </m:mtd>
                                    <m:mtd>
                                       <m:mrow>
                                          <m:mrow>
                                             <m:mo>|</m:mo>
                                             <m:mrow>
                                                <m:mrow>
                                                   <m:mo>|</m:mo>
                                                   <m:mrow>
                                                      <m:msubsup>
                                                         <m:mi>c</m:mi>
                                                         <m:mi>i</m:mi>
                                                         <m:mi>u</m:mi>
                                                      </m:msubsup>
                                                      <m:mo>&#8722;</m:mo>
                                                      <m:msubsup>
                                                         <m:mi>c</m:mi>
                                                         <m:mi>j</m:mi>
                                                         <m:mi>v</m:mi>
                                                      </m:msubsup>
                                                   </m:mrow>
                                                   <m:mo>|</m:mo>
                                                </m:mrow>
                                             </m:mrow>
                                             <m:mo>|</m:mo>
                                          </m:mrow>
                                          <m:mo>&lt;</m:mo>
                                          <m:msub>
                                             <m:mi>u</m:mi>
                                             <m:mn>0</m:mn>
                                          </m:msub>
                                       </m:mrow>
                                    </m:mtd>
                                 </m:mtr>
                              </m:mtable>
                           </m:mrow>
                        </m:mrow>
                        <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                        <m:mrow>
                           <m:mo>(</m:mo>
                           <m:mn>6</m:mn>
                           <m:mo>)</m:mo>
                        </m:mrow>
                     </m:mrow>
                     <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGKbazcqGGOaakcqWGJbWydaqhaaWcbaGaemyAaKgabaGaemyDauhaaOGaeiilaWIaem4yam2aa0baaSqaaiabdQgaQbqaaiabdAha2baakiabcMcaPiabg2da9maaceaabaqbaeqabiGaaaqaamaalaaabaGaeGymaedabaWaaqWaaeaadaabdaqaaiabdogaJnaaDaaaleaacqWGPbqAaeaacqWG1bqDaaGccqGHsislcqWGJbWydaqhaaWcbaGaemOAaOgabaGaemODayhaaaGccaGLhWUaayjcSdaacaGLhWUaayjcSdaaaiabcYcaSaqaamaaemaabaWaaqWaaeaacqWGJbWydaqhaaWcbaGaemyAaKgabaGaemyDauhaaOGaeyOeI0Iaem4yam2aa0baaSqaaiabdQgaQbqaaiabdAha2baaaOGaay5bSlaawIa7aaGaay5bSlaawIa7aiabgwMiZkabdwha1naaBaaaleaacqaIWaamaeqaaaGcbaWaaSaaaeaacqaIXaqmaeaacqWG1bqDdaWgaaWcbaGaeGimaadabeaaaaaakeaadaabdaqaamaaemaabaGaem4yam2aa0baaSqaaiabdMgaPbqaaiabdwha1baakiabgkHiTiabdogaJnaaDaaaleaacqWGQbGAaeaacqWG2bGDaaaakiaawEa7caGLiWoaaiaawEa7caGLiWoacqGH8aapcqWG1bqDdaWgaaWcbaGaeGimaadabeaaaaaakiaawUhaaiaaxMaacaWLjaWaaeWaaeaacqaI2aGnaiaawIcacaGLPaaaaaa@7C38@</m:annotation>
                  </m:semantics>
               </m:math>
            </p>
            <p>where <it>u</it><sub>0 </sub>represents a hard-sphere boundary below which the interaction is constant. In this study, we arbitrarily set <it>u</it><sub>0 </sub>to 3 &#197;. This definition is referred to as the Pair-wise Inverse Distance (PID) matrix.</p>
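            <p>As a hedged sketch of eq. 6, the two branches can be written as a single clamp on the distance. The code assumes Python with NumPy; the function name <it>pid_block</it> and the toy coordinates are hypothetical and are not part of the original method description.</p>

```python
import numpy as np

U0 = 3.0  # hard-sphere boundary in angstroms, the value quoted in the text


def pid_block(ca_i: np.ndarray, ca_j: np.ndarray, u0: float = U0) -> np.ndarray:
    """Inverse-distance block for two sets of C-alpha coordinates.

    ca_i has shape (m, 3) and ca_j has shape (n, 3); the result is (m, n).
    """
    diff = ca_i[:, None, :] - ca_j[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    # Clamping the distance from below at u0 reproduces both branches of
    # eq. 6: 1/dist when dist is at least u0, and the constant 1/u0 otherwise.
    return 1.0 / np.maximum(dist, u0)


# Toy coordinates in angstroms (hypothetical).
h1 = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
h2 = np.array([[0.0, 3.0, 4.0]])
far = pid_block(h1, h2)   # both distances exceed u0
near = pid_block(h1, h1)  # distances of 0 and 1 are both clamped to u0
```

            <p>Writing the cutoff as a clamp also avoids a division by zero when a C<sup>&#945; </sup>atom is paired with itself.</p>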
            <p>Another variation of the PD matrix definition takes into account the N- to C-terminal sense, in an attempt to further emphasize protein topological features. For a secondary element <b>h</b><sub><it>i</it></sub>, its direction vector <b>v</b><sub><it>i </it></sub>is defined by two points in Cartesian space: the center of mass of the five consecutive N-terminal C<sup>&#945; </sup>atoms and the center of mass of the five consecutive C-terminal C<sup>&#945; </sup>atoms. Given a pair of secondary elements <b>h</b><sub><it>i </it></sub>and <b>h</b><sub><it>j</it></sub>, the new matrix elements are defined as</p>
            <p><it>d</it>(<m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i10"><m:semantics><m:mrow><m:msubsup><m:mi>c</m:mi><m:mi>i</m:mi><m:mi>u</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGJbWydaqhaaWcbaGaemyAaKgabaGaemyDauhaaaaa@30F6@</m:annotation></m:semantics></m:math>, <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i11"><m:semantics><m:mrow><m:msubsup><m:mi>c</m:mi><m:mi>j</m:mi><m:mi>v</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGJbWydaqhaaWcbaGaemOAaOgabaGaemODayhaaaaa@30FA@</m:annotation></m:semantics></m:math>)' = <it>d</it>(<m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i10"><m:semantics><m:mrow><m:msubsup><m:mi>c</m:mi><m:mi>i</m:mi><m:mi>u</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGJbWydaqhaaWcbaGaemyAaKgabaGaemyDauhaaaaa@30F6@</m:annotation></m:semantics></m:math>, <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i11"><m:semantics><m:mrow><m:msubsup><m:mi>c</m:mi><m:mi>j</m:mi><m:mi>v</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGJbWydaqhaaWcbaGaemOAaOgabaGaemODayhaaaaa@30FA@</m:annotation></m:semantics></m:math>)sgn(<b>v</b><sub><it>i</it></sub>&#183;<b>v</b><sub><it>j</it></sub>) &#160;&#160;&#160; (7)</p>
            <p>where sgn(<it>x</it>) is the sign function, which is 1 when <it>x </it>&#8805; 0 and -1 when <it>x </it>&lt; 0. This variation is referred to as the Pair-wise Distance with Sense (PDS) matrix in this study.</p>
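            <p>Eq. 7 can be illustrated with the following minimal sketch, again assuming Python with NumPy. The names <it>direction</it>, <it>sgn</it> and <it>pds_block</it> are hypothetical, the strands are idealized toy coordinates, and the direction vector is taken to point from the N-terminal centroid to the C-terminal centroid. Because all C<sup>&#945; </sup>atoms have the same mass, the center of mass is computed as a plain centroid.</p>

```python
import numpy as np


def direction(ca: np.ndarray, k: int = 5) -> np.ndarray:
    """Vector from the centroid of the first k C-alpha atoms to that of the last k."""
    return ca[-k:].mean(axis=0) - ca[:k].mean(axis=0)


def sgn(x: float) -> float:
    """Sign convention of the text: +1 for x >= 0, -1 otherwise."""
    return 1.0 if x >= 0 else -1.0


def pds_block(ca_i: np.ndarray, ca_j: np.ndarray) -> np.ndarray:
    """Pairwise distances multiplied by the sign of the direction dot product."""
    diff = ca_i[:, None, :] - ca_j[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    sense = sgn(float(np.dot(direction(ca_i), direction(ca_j))))
    return dist * sense


# Two hypothetical six-residue strands running along x, 4.8 angstroms apart.
strand = np.column_stack([3.3 * np.arange(6), np.zeros(6), np.zeros(6)])
parallel = strand + np.array([0.0, 4.8, 0.0])
antiparallel = parallel[::-1]  # same atoms, listed C- to N-terminal

same_sense = pds_block(strand, parallel)
opposite_sense = pds_block(strand, antiparallel)
```

            <p>In this toy case the parallel pair yields a block of positive distances, whereas the antiparallel pair yields the same magnitudes with a negative sign and reversed column order, which is how the PDS matrix separates the two arrangements.</p>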
         </sec>
         <sec>
            <st>
               <p>Linking/Writhing numbers</p>
            </st>
            <p>To evaluate the ability of PCC analysis to extract pure topological features, the linking and writhing numbers, which are good measures of global topology, are also calculated for the four sets of structures for comparison. The linking number of two curves is given by the C&#259;lug&#259;reanu-Fuller-White formula <abbrgrp><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr><abbr bid="B27">27</abbr></abbrgrp>: <it>Lk </it>= <it>Wr </it>+ <it>Tw</it>, where the linking number <it>Lk </it>counts the sum of signed crossings between the ribbon's two boundary curves, the writhing number <it>Wr </it>counts the sum of signed self-crossings of the curve, averaged over all projection directions <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>, and <it>Tw </it>is the twist number. <it>Lk </it>is invariant under any smooth deformation that avoids self-intersections <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>, and it is also independent of the projection direction. <it>Wr </it>and <it>Tw </it>are invariant under some transformations, such as rigid body motions. Here we compute the writhing numbers using the Scaled Gauss Metric (SGM) approach previously described by Rogen and Fain <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>.</p>
            <p>Let <it>c</it><sub>1 </sub>and <it>c</it><sub>2 </sub>be two closed, non-intersecting curves in 3-dimensional space, and define <it>e</it>(<it>s</it>, <it>t</it>) = (<it>c</it><sub>2</sub>(<it>t</it>) - <it>c</it><sub>1</sub>(<it>s</it>))/||<it>c</it><sub>2</sub>(<it>t</it>) - <it>c</it><sub>1</sub>(<it>s</it>)||, where ||&#183;|| denotes the Euclidean norm. For two closed curves, the vector field <it>e</it>(<it>s</it>, <it>t</it>) is doubly periodic. Such mappings have an integer-valued degree that is invariant under topological deformations. The linking number of two curves is further defined as</p>
            <p>
               <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i17">
                  <m:semantics>
                     <m:mrow>
                        <m:mi>L</m:mi>
                        <m:mi>k</m:mi>
                        <m:mo stretchy="false">(</m:mo>
                        <m:msub>
                           <m:mi>c</m:mi>
                           <m:mn>1</m:mn>
                        </m:msub>
                        <m:mo>,</m:mo>
                        <m:msub>
                           <m:mi>c</m:mi>
                           <m:mn>2</m:mn>
                        </m:msub>
                        <m:mo stretchy="false">)</m:mo>
                        <m:mo>=</m:mo>
                        <m:mfrac>
                           <m:mn>1</m:mn>
                           <m:mrow>
                              <m:mn>4</m:mn>
                              <m:mi>&#960;</m:mi>
                           </m:mrow>
                        </m:mfrac>
                        <m:mstyle displaystyle="true">
                           <m:mrow>
                              <m:msub>
                                 <m:mo>&#8747;</m:mo>
                                 <m:mrow>
                                    <m:msub>
                                       <m:mi>c</m:mi>
                                       <m:mn>1</m:mn>
                                    </m:msub>
                                 </m:mrow>
                              </m:msub>
                              <m:mrow>
                                 <m:mstyle displaystyle="true">
                                    <m:mrow>
                                       <m:msub>
                                          <m:mo>&#8747;</m:mo>
                                          <m:mrow>
                                             <m:msub>
                                                <m:mi>c</m:mi>
                                                <m:mn>2</m:mn>
                                             </m:msub>
                                          </m:mrow>
                                       </m:msub>
                                       <m:mrow>
                                          <m:mrow>
                                             <m:mo>[</m:mo>
                                             <m:mrow>
                                                <m:mi>e</m:mi>
                                                <m:mo>,</m:mo>
                                                <m:msub>
                                                   <m:mi>e</m:mi>
                                                   <m:mi>s</m:mi>
                                                </m:msub>
                                                <m:mo>,</m:mo>
                                                <m:msub>
                                                   <m:mi>e</m:mi>
                                                   <m:mi>t</m:mi>
                                                </m:msub>
                                             </m:mrow>
                                             <m:mo>]</m:mo>
                                          </m:mrow>
                                       </m:mrow>
                                    </m:mrow>
                                 </m:mstyle>
                              </m:mrow>
                           </m:mrow>
                        </m:mstyle>
                        <m:mi>d</m:mi>
                        <m:mi>s</m:mi>
                        <m:mi>d</m:mi>
                        <m:mi>t</m:mi>
                        <m:mo>=</m:mo>
                        <m:mfrac>
                           <m:mn>1</m:mn>
                           <m:mrow>
                              <m:mn>4</m:mn>
                              <m:mi>&#960;</m:mi>
                           </m:mrow>
                        </m:mfrac>
                        <m:mstyle displaystyle="true">
                           <m:mrow>
                              <m:msub>
                                 <m:mo>&#8747;</m:mo>
                                 <m:mrow>
                                    <m:msub>
                                       <m:mi>c</m:mi>
                                       <m:mn>1</m:mn>
                                    </m:msub>
                                 </m:mrow>
                              </m:msub>
                              <m:mrow>
                                 <m:mstyle displaystyle="true">
                                    <m:mrow>
                                       <m:msub>
                                          <m:mo>&#8747;</m:mo>
                                          <m:mrow>
                                             <m:msub>
                                                <m:mi>c</m:mi>
                                                <m:mn>2</m:mn>
                                             </m:msub>
                                          </m:mrow>
                                       </m:msub>
                                       <m:mrow>
                                          <m:mfrac>
                                             <m:mrow>
                                                <m:mrow>
                                                   <m:mo>(</m:mo>
                                                   <m:mrow>
                                                      <m:msubsup>
                                                         <m:mi>c</m:mi>
                                                         <m:mn>1</m:mn>
                                                         <m:mo>'</m:mo>
                                                      </m:msubsup>
                                                      <m:mo stretchy="false">(</m:mo>
                                                      <m:mi>s</m:mi>
                                                      <m:mo stretchy="false">)</m:mo>
                                                      <m:mo>&#215;</m:mo>
                                                      <m:msubsup>
                                                         <m:mi>c</m:mi>
                                                         <m:mn>2</m:mn>
                                                         <m:mo>'</m:mo>
                                                      </m:msubsup>
                                                      <m:mo stretchy="false">(</m:mo>
                                                      <m:mi>t</m:mi>
                                                      <m:mo stretchy="false">)</m:mo>
                                                   </m:mrow>
                                                   <m:mo>)</m:mo>
                                                </m:mrow>
                                                <m:mo>&#8901;</m:mo>
                                                <m:mrow>
                                                   <m:mo>(</m:mo>
                                                   <m:mrow>
                                                      <m:msub>
                                                         <m:mi>c</m:mi>
                                                         <m:mn>1</m:mn>
                                                      </m:msub>
                                                      <m:mo stretchy="false">(</m:mo>
                                                      <m:mi>s</m:mi>
                                                      <m:mo stretchy="false">)</m:mo>
                                                      <m:mo>&#8722;</m:mo>
                                                      <m:msub>
                                                         <m:mi>c</m:mi>
                                                         <m:mn>2</m:mn>
                                                      </m:msub>
                                                      <m:mo stretchy="false">(</m:mo>
                                                      <m:mi>t</m:mi>
                                                      <m:mo stretchy="false">)</m:mo>
                                                   </m:mrow>
                                                   <m:mo>)</m:mo>
                                                </m:mrow>
                                             </m:mrow>
                                             <m:mrow>
                                                <m:msup>
                                                   <m:mrow>
                                                      <m:mrow>
                                                         <m:mo>&#8214;</m:mo>
                                                         <m:mrow>
                                                            <m:msub>
                                                               <m:mi>c</m:mi>
                                                               <m:mn>1</m:mn>
                                                            </m:msub>
                                                            <m:mo stretchy="false">(</m:mo>
                                                            <m:mi>s</m:mi>
                                                            <m:mo stretchy="false">)</m:mo>
                                                            <m:mo>&#8722;</m:mo>
                                                            <m:msub>
                                                               <m:mi>c</m:mi>
                                                               <m:mn>2</m:mn>
                                                            </m:msub>
                                                            <m:mo stretchy="false">(</m:mo>
                                                            <m:mi>t</m:mi>
                                                            <m:mo stretchy="false">)</m:mo>
                                                         </m:mrow>
                                                         <m:mo>&#8214;</m:mo>
                                                      </m:mrow>
                                                   </m:mrow>
                                                   <m:mn>3</m:mn>
                                                </m:msup>
                                             </m:mrow>
                                          </m:mfrac>
                                       </m:mrow>
                                    </m:mrow>
                                 </m:mstyle>
                              </m:mrow>
                           </m:mrow>
                        </m:mstyle>
                        <m:mi>d</m:mi>
                        <m:mi>s</m:mi>
                        <m:mi>d</m:mi>
                        <m:mi>t</m:mi>
                        <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                        <m:mrow>
                           <m:mo>(</m:mo>
                           <m:mn>8</m:mn>
                           <m:mo>)</m:mo>
                        </m:mrow>
                     </m:mrow>
                     <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGmbatcqWGRbWAcqGGOaakcqWGJbWydaWgaaWcbaGaeGymaedabeaakiabcYcaSiabdogaJnaaBeaaleaacqaIYaGmaeqaaOGaeiykaKIaeyypa0ZaaSaaaeaacqaIXaqmaeaacqaI0aancqaHapaCaaWaa8qeaeaadaWdraqaamaadmaabaGaemyzauMaeiilaWIaemyzau2aaSbaaSqaaiabdohaZbqabaGccqGGSaalcqWGLbqzdaWgaaWcbaGaemiDaqhabeaaaOGaay5waiaaw2faaaWcbaGaem4yam2aaSbaaWqaaiabikdaYaqabaaaleqaniabgUIiYdaaleaacqWGJbWydaWgaaadbaGaeGymaedabeaaaSqab0Gaey4kIipakiabdsgaKjabdohaZjabdsgaKjabdsha0jabg2da9maalaaabaGaeGymaedabaGaeGinaqJaeqiWdahaamaapebabaWaa8qeaeaadaWcaaqaamaabmaabaGaem4yam2aa0baaSqaaiabigdaXaqaaiabcEcaNaaakiabcIcaOiabdohaZjabcMcaPiabgEna0kabdogaJnaaDaaaleaacqaIYaGmaeaacqGGNaWjaaGccqGGOaakcqWG0baDcqGGPaqkaiaawIcacaGLPaaacqGHflY1daqadaqaaiabdogaJnaaBaaaleaacqaIXaqmaeqaaOGaeiikaGIaem4CamNaeiykaKIaeyOeI0Iaem4yam2aaSbaaSqaaiabikdaYaqabaGccqGGOaakcqWG0baDcqGGPaqkaiaawIcacaGLPaaaaeaadaqbdaqaaiabdogaJnaaBaaaleaacqaIXaqmaeqaaOGaeiikaGIaem4CamNaeiykaKIaeyOeI0Iaem4yam2aaSbaaSqaaiabikdaYaqabaGccqGGOaakcqWG0baDcqGGPaqkaiaawMa7caGLkWoadaahaaWcbeqaaiabiodaZaaaaaaabaGaem4yam2aaSbaaWqaaiabikdaYaqabaaaleqaniabgUIiYdaaleaacqWGJbWydaWgaaadbaGaeGymaedabeaaaSqab0Gaey4kIipakiabdsgaKjabdohaZjabdsgaKjabdsha0jaaxMaacaWLjaWaaeWaaeaacqaI4aaoaiaawIcacaGLPaaaaaa@9CD7@</m:annotation>
                  </m:semantics>
               </m:math>
            </p>
            <p>where <it>e</it><sub><it>s </it></sub>and <it>e</it><sub><it>t </it></sub>are the tangents of <it>e</it>(<it>s</it>, <it>t</it>) at the point (<it>s</it>, <it>t</it>), and <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i18"><m:semantics><m:mrow><m:msubsup><m:mi>c</m:mi><m:mn>1</m:mn><m:mo>'</m:mo></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGJbWydaqhaaWcbaGaeGymaedabaGaei4jaCcaaaaa@2FEE@</m:annotation></m:semantics></m:math>(<it>s</it>) and <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i19"><m:semantics><m:mrow><m:msubsup><m:mi>c</m:mi><m:mn>2</m:mn><m:mo>'</m:mo></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGJbWydaqhaaWcbaGaeGOmaidabaGaei4jaCcaaaaa@2FF0@</m:annotation></m:semantics></m:math>(<it>t</it>) are the tangents along <it>c</it><sub>1 </sub>and <it>c</it><sub>2 </sub>at <it>s </it>and <it>t</it>, respectively. Note that here <it>e</it><sub><it>s</it></sub>, <it>e</it><sub><it>t</it></sub>, <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-7-40-i18"><m:semantics><m:mrow><m:msubsup><m:mi>c</m:mi><m:mn>1</m:mn><m:mo>'</m:mo></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGJbWydaqhaaWcbaGaeGymaedabaGaei4jaC