<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1472-6807-8-45</ui>
   <ji>1472-6807</ji>
   <fm>
      <dochead>Research article</dochead>
      <bibl>
         <title>
            <p>Protein Functional Surfaces: Global Shape Matching and Local Spatial Alignments of Ligand Binding Sites</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Binkowski</snm>
               <fnm>T Andrew</fnm>
               <insr iid="I1"/>
               <email>abinkowski@anl.gov</email>
            </au>
            <au id="A2" ca="yes">
               <snm>Joachimiak</snm>
               <fnm>Andrzej</fnm>
               <insr iid="I1"/>
               <email>andrzejj@anl.gov</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Midwest Center for Structural Genomics and Structural Biology Center, Biosciences Division, Argonne National Laboratory, Argonne, Illinois, 60439, USA</p>
            </ins>
         </insg>
         <source>BMC Structural Biology</source>
         <issn>1472-6807</issn>
         <pubdate>2008</pubdate>
         <volume>8</volume>
         <issue>1</issue>
         <fpage>45</fpage>
         <url>http://www.biomedcentral.com/1472-6807/8/45</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">18954462</pubid>
               <pubid idtype="doi">10.1186/1472-6807-8-45</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>25</day>
               <month>6</month>
               <year>2008</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>27</day>
               <month>10</month>
               <year>2008</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>27</day>
               <month>10</month>
               <year>2008</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2008</year>
         <collab>Binkowski and Joachimiak; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Protein surfaces comprise only a fraction of the total residues but are the most conserved functional features of proteins. Surfaces performing identical functions are found in proteins lacking any sequence or fold similarity. While biochemical activity can be attributed to a few key residues, the broader surrounding environment plays an equally important role.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>We describe a methodology that attempts to optimize two components, global shape and local physicochemical texture, for evaluating the similarity between a pair of surfaces. Surface shape similarity is assessed using a three-dimensional object recognition algorithm and physicochemical texture similarity is assessed through a spatial alignment of conserved residues between the surfaces. The comparisons are used in tandem to efficiently search the Global Protein Surface Survey (GPSS), a library of annotated surfaces derived from structures in the PDB, for studying evolutionary relationships and uncovering novel similarities between proteins.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>We provide an assessment of our method using library retrieval experiments for identifying functionally homologous surfaces binding different ligands, functionally diverse surfaces binding the same ligand, and binding surfaces of ubiquitous and conformationally flexible ligands. Results using surface similarity to predict function for proteins of unknown function are reported. Additionally, an automated analysis of the ATP binding surface landscape is presented to provide insight into the correlation between surface similarity and function for structures in the PDB and for the subset of protein kinases.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="bmc" subtype="user_supplied_xml" id="endnote"/>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>It has become apparent that surfaces, comprising only a fraction of the total residues, are the most conserved functional features of proteins. Proteins utilize common surface motifs to create precise chemical environments designed to perform specific functions. These motifs are not restricted to a single protein scaffold but can be found within different protein folds or at domain/domain and subunit interfaces. While biochemical activity can be attributed to a few key residues (e.g. catalytic triads), the broader surrounding environment (i.e. auxiliary residues in spatial proximity) often plays an equally important role in fine-tuning molecular recognition and/or catalysis.</p>
         <p>Powerful evolutionary forces have allowed proteins to govern ligand binding through seemingly subtle local surface variability. These changes, which are not easily detectable by sequence analysis, may provide a competitive advantage for optimization of co-factor specificity. In some circumstances, surface diversity adversely affects normal cellular processes by providing environments for undesired binding events (e.g. drug side effects) or mutations directly correlated to disease<abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. The conservation of functional surfaces presents an opportunity to compare and analyze proteins independent of sequence or fold. These comparisons can be used to classify protein functions or to infer biochemical activity for proteins with unknown function, such as those targeted by structural genomics programs.</p>
         <p>Several methods have been developed for detecting localized spatial protein similarities, with applications in evolutionary analysis, function prediction and drug discovery. Graph theory has been widely applied to the comparison of three-dimensional patterns. Artymiuk <it>et al</it>. developed an algorithm based on subgraph isomorphism detection to search residue patterns against the PDB<abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. Kinoshita <it>et al</it>. used clique detection algorithms to assign protein biochemical functions using the similarity information of molecular surface geometries and electrostatic potentials<abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. Using a clique-detection algorithm, Schmitt <it>et al</it>. compared generic pseudo-centers that code for possible ligand-protein interactions in protein cavities. Query cavities are searched against Cavbase, a pre-computed database of cavities extracted from the PDB<abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. The method has been applied to identify surfaces in non-homologous proteins as well as for the classification of protein families<abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. Kleywegt searched for motifs of residue pseudo-centers in a library of protein structures using a depth-first search algorithm<abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. Russell also developed an algorithm based on depth-first search that detects atomic geometric patterns common to side chains in proteins and presented new examples of convergent evolution<abbrgrp><abbr bid="B7">7</abbr></abbrgrp>. Parametric statistical evaluations of Russell's atomic superposition method were extended by Stark <it>et al</it>. <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>.</p>
         <p>Another widely used approach is geometric hashing, which is an efficient method for matching features against a database. Jackson and Gold used geometric hashing to perform an all-against-all comparison of protein-ligand binding sites in the SitesBase database <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp>. Their method was also applied for functional annotation and building pharmacophore models for drug discovery<abbrgrp><abbr bid="B11">11</abbr></abbrgrp>. Fischer <it>et al</it>. developed an algorithm based on geometric hashing that detects surface similarities of proteins using spatial patterns of atoms<abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>. A similar method, TESS, has been applied for the derivation and matching of annotated spatial templates<abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. JESS<abbrgrp><abbr bid="B15">15</abbr></abbrgrp>, a successor to TESS, searches small groups of atoms under arbitrary constraints on geometry and chemistry and utilizes statistics to evaluate matches. It is used to query the Catalytic Site Atlas (CSA)<abbrgrp><abbr bid="B16">16</abbr></abbrgrp>, a collection of annotated residue patterns extracted from manual literature searches. JESS is also used in the PROFUNC<abbrgrp><abbr bid="B17">17</abbr></abbrgrp> suite of annotation tools in the reverse template search, where a radius-defined perimeter extends a local residue pattern search for improved search specificity.</p>
         <p>A protein evolution-based method, pvSOAR, was developed that used the unique approach of aligning sub-sequences of surface residues to establish a residue correspondence between surfaces<abbrgrp><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr></abbrgrp>. The residues were then superimposed on each other and statistical significance was evaluated for the resulting RMSD. This method was used to detect similar functional surfaces in non-homologous proteins. Furthermore, in a recent study of shape variation of ligand binding pockets, Kahraman <it>et al</it>. used a shape-only comparison metric based on spherical harmonics<abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. It was shown that shape descriptors could be used to classify ligands into their binding sites.</p>
         <p>In this study, we describe a new method for the sequence order independent comparison and alignment of protein functional surfaces. Our method, <it>SurfaceScreen</it>, attempts to optimize two components, global surface shape and local physicochemical texture, for evaluating the similarity between a pair of surfaces. Surface shape similarity is assessed using a three-dimensional object recognition algorithm and is used to rapidly pre-classify surfaces from a large library of surfaces. Surfaces with sufficient shape complementarity are then aligned by combinatorially identifying the best superimposition of common residues between the two surfaces. We introduce several metrics for scoring different properties of a surface alignment and an overall scoring function used in library searches. Furthermore, we introduce the Global Protein Surface Survey (GPSS), a library of annotated protein surfaces calculated from all structures in the PDB. Querying surfaces from proteins of unknown function against the GPSS library allows <it>SurfaceScreen </it>to be utilized as a predictive tool.</p>
         <p>We describe three types of analysis to assess surface shape comparisons and spatial alignments. First, we describe the retrieval from the GPSS library of surfaces, from the same protein, that bind ligands of various sizes, shapes and pharmacophore properties. For this we use the example of HIV-1 protease. Second, we use the example of heme (iron-protoporphyrin IX) binding sites to describe the retrieval of functionally diverse binding surfaces that bind the same ligand. We provide an example of using our method as an annotation tool, identifying a new member of the heme binding monooxygenase family. Third, we describe how conformational diversity of bound ligands impacts retrieval rate for ubiquitous nucleotide binding sites. We also present an example of a nucleotide binding surface prediction and crystallographic validation for a structural genomics target with a new fold. We conclude with an analysis of the ATP binding surface landscape to provide insight into the correlation between surface similarity and function for structures in the PDB and for the subset of protein kinases.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>Theory and Algorithms</p>
            </st>
            <p>While key conserved residues are localized within a surface for function, additional residues contribute to the overall size and shape of the functional surface environment. In some instances, these non-key residues play an important, but non-obvious, functional role. This has been shown for aminopeptidases, where the mouth opening diameter filters peptide access into the active sites<abbrgrp><abbr bid="B21">21</abbr><abbr bid="B22">22</abbr></abbrgrp>, or for N-acetyltransferases, where the acetyl-CoA and substrate binding surfaces are sub-pockets within a larger cleft<abbrgrp><abbr bid="B23">23</abbr></abbrgrp>. In other instances, these auxiliary residues have no obvious relevance, being positioned simply as a result of folding and structural requirements, codon pressure or unrelated functional specifications. Therefore, a detachment exists between global and local surface properties of functional sites, which can distort comparison measures and mask similarities.</p>
            <p>Our approach to identifying meaningful similarities between two surfaces decomposes surface comparisons into two components, global shape and local physicochemical texture. Global shape comparisons can be a very powerful precursor of overall surface similarity. They are computed very efficiently and can help rapidly reject grossly dissimilar surfaces. This is followed by a non-heuristic spatial residue alignment, which ensures the optimal superposition of conserved residues between two surfaces.</p>
            <sec>
               <st>
                  <p>Surface Shape Signatures for Rapid Global Surface Shape Comparison</p>
               </st>
               <p>Unlike protein folds, a common lexicon has not emerged to precisely describe the shapes formed by surfaces, yet quantitative measures can be computed to empirically describe shape. Here, we introduce a metric, the <it>SurfaceShapeSignature </it>(<it>SSS</it>), which describes the global shape of a protein surface and can be used for comparison. Adapted from three-dimensional database object retrieval, the method represents the signature of an object as a probability distribution sampled from a shape function measuring global geometric properties<abbrgrp><abbr bid="B24">24</abbr></abbrgrp>. The complexity of shape matching is then reduced to the comparison of two probability distributions. This approach was used to quickly and successfully retrieve and classify complex shapes from three-dimensional databases.</p>
               <p>After a protein's binding surface has been identified (see Methods), the <it>SSS </it>is constructed by systematically measuring the Euclidean distance between all unique atom pairs for a given surface. This is seen for the nicotinamide-adenine-dinucleotide phosphate (NADP) binding surface from human pathogen <it>S. pyogenes </it>(PDB:<ext-link ext-link-type="pdb" ext-link-id="2ahr">2ahr</ext-link>) in Figure <figr fid="F1">1c</figr>. The inter-atomic distances are then sorted to form the shape signatures. The <it>SSS </it>distributions for a selection of heme, nicotinamide adenine dinucleotide (NAD) and adenosine 5'-triphosphate (ATP) binding surfaces are shown in Figure <figr fid="F1">1d</figr>. For reference, <it>SSS </it>distributions for a selection of DNA and metal binding surfaces are also shown. Once the shape distributions for two surfaces are computed, we apply the Kolmogorov-Smirnov (KS) test<abbrgrp><abbr bid="B25">25</abbr></abbrgrp> to compare the probability distributions. The KS test identifies the greatest distance between the cumulative frequency distributions being compared and is bounded between 0 (identical distributions) and 1 (different distributions).</p>
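               <p>The construction described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: it assumes that the signature is the sorted list of raw pairwise distances (no binning) and that the comparison is a two-sample KS statistic computed on the empirical cumulative distributions. The function names are hypothetical.</p>
               <p><![CDATA[
```python
import itertools
import math


def surface_shape_signature(coords):
    """Sorted Euclidean distances between all unique atom pairs of a surface."""
    return sorted(math.dist(a, b) for a, b in itertools.combinations(coords, 2))


def ks_distance(sig_a, sig_b):
    """Two-sample KS statistic for two sorted samples.

    Returns the largest gap between the two empirical cumulative
    distributions: 0 for identical samples, 1 for non-overlapping ones.
    """
    i = j = 0
    n, m = len(sig_a), len(sig_b)
    largest_gap = 0.0
    while i < n and j < m:
        x = min(sig_a[i], sig_b[j])
        # Step both cumulative distributions past every value equal to x.
        while i < n and sig_a[i] <= x:
            i += 1
        while j < m and sig_b[j] <= x:
            j += 1
        largest_gap = max(largest_gap, abs(i / n - j / m))
    return largest_gap


if __name__ == "__main__":
    # Toy coordinates (assumed data, not taken from any PDB entry).
    surface_a = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0), (1.0, 1.0, 2.0)]
    surface_b = [(0.0, 0.0, 0.0), (3.5, 0.0, 0.0), (0.0, 4.5, 0.0), (1.0, 1.0, 2.5)]
    sig_a = surface_shape_signature(surface_a)
    sig_b = surface_shape_signature(surface_b)
    print(ks_distance(sig_a, sig_b))
```
]]></p>
               <p>Because only sorted distances are compared, the result does not depend on how either surface is rotated or translated, and surfaces with different atom counts can be compared directly.</p>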
               <fig id="F1">
                  <title>
                     <p>Figure 1</p>
                  </title>
                  <caption>
                     <p>Automated identification of protein binding surfaces and construction of <it>SurfaceShapeSignatures </it>(<it>SSS</it>)</p>
                  </caption>
                  <text>
                     <p><b>Automated identification of protein binding surfaces and construction of <it>SurfaceShapeSignatures </it>(<it>SSS</it>)</b>. The nicotinamide-adenine-dinucleotide phosphate (NADP) binding surface from human pathogen <it>S. pyogenes </it>(PDB:<ext-link ext-link-type="pdb" ext-link-id="2ahr">2ahr</ext-link>, a) is defined by measuring the change in solvent accessibility between the bound and apo structure (b, pink). The <it>SSS </it>of a binding surface is constructed by measuring the inter-atomic Euclidean distances between all unique surface atom pairs (c). The signatures of select DNA, ligand and metal binding surfaces for proteins in the PDB are shown (d).</p>
                  </text>
                  <graphic file="1472-6807-8-45-1"/>
               </fig>
            </sec>
            <sec>
               <st>
                  <p>Advantages and Limitations of Surface Shape Signatures</p>
               </st>
               <p>The primary advantages of this approach are computational robustness and efficiency. In the original work of Osada, shape signatures were shown to be robust to a variety of transformations, including scale, rotation and mirroring, and insensitive to model simplification. These properties are ideal for applications in structural biology, as minor conformational changes or small surface perturbations should not mask overall shape similarities. For example, we should be able to detect the similarity between the apo and bound states of a binding site, yet still discern between the two states. Insensitivity to model simplification allows for meaningful comparisons between surfaces with non-trivial differences in atom count. The implementation and execution of this algorithm are computationally straightforward and allow a query surface to be searched against the GPSS library in minutes.</p>
               <p>Biochemical functions rely on a combination of shape and chemical compatibility and therefore one should not expect that shape alone could describe this complexity. Figure <figr fid="F1">1</figr> shows that NAD, ATP, and heme binding surfaces share similar and, in some cases, overlapping distributions. The <it>SSS </it>comparison can convey gross shape similarity but is devoid of chemical typing information and cannot be used to infer specific functional information. Instead, it provides a fast and robust comparison metric utilized as the entry point into a more comprehensive surface shape matching methodology.</p>
            </sec>
            <sec>
               <st>
                  <p>Surface Shape Signature Similarity Threshold</p>
               </st>
               <p>The choice of a shape similarity threshold has a significant impact on the efficiency and specificity of surface library searching. Operating under the premise that ligands with similar shape and molecular weight are more likely to bind to similar pockets, we correlated these properties to <it>SSS </it>KS distances. Our training data are taken from querying the ATP binding surface of cAMP-dependent kinase (PDB:<ext-link ext-link-type="pdb" ext-link-id="1atp">1atp</ext-link>) against the GPSS library. With a correlation coefficient of 0.458, we see that some degree of surface similarity can be inferred simply by molecular weight (Figure <figr fid="F2">2a</figr>). We highlight a region on the plot (yellow) that corresponds to &#177; 100 D from the molecular weight of ATP and identify the similarity distance of 0.22, along the x-axis, at which only 11% of surfaces exist as outliers. Next, ATP was compared to ligands corresponding to the surfaces in our library using the molecular shape matching application ROCS (OpenEye Scientific Software, Inc). ROCS identifies the best superposition of two molecules by optimizing their overlapping volume and reports a normalized Tanimoto score. Tanimoto values greater than 0.7 are generally regarded as indicating similar shape and are highlighted on our plot (cyan). The Tanimoto scores are correlated to the SSS distance in Figure <figr fid="F2">2b</figr>. We identify a distance of 0.24, at which only 10% are outliers. Since our assumptions about molecular weight and ligand shape similarity are simplistic and surface comparisons aim to identify more evolutionarily distant relationships, we set our default threshold distance at 0.3. In the benchmarking retrieval experiments presented in the Results section, our default threshold excludes less than 1% of true-positive surfaces, all of which can be explained by unusual structural circumstances (e.g. multiple binding pockets, mutation experiments, low resolution structure, crystallographic error).</p>
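               <p>As a rough illustration of how such a cutoff can be used, the sketch below computes the fraction of reference surfaces that fall beyond a candidate distance and then pre-filters a library with the default threshold of 0.3. It assumes one particular reading of "outliers" (reference surfaces whose KS distance exceeds the cutoff) and an inclusive cutoff, neither of which is specified in the text; the function names, surface identifiers and distances are hypothetical.</p>
               <p><![CDATA[
```python
DEFAULT_SSS_THRESHOLD = 0.3  # default KS distance cutoff quoted in the text


def outlier_fraction(distances, cutoff):
    """Fraction of reference surfaces whose KS distance exceeds the cutoff.

    Assumption: an "outlier" is a surface expected to be similar (for
    example, one binding a ligand of similar weight or shape) that
    nevertheless lies beyond the cutoff.
    """
    distances = list(distances)
    return sum(1 for d in distances if d > cutoff) / len(distances)


def prefilter(library_distances, cutoff=DEFAULT_SSS_THRESHOLD):
    """Keep library surfaces whose KS distance to the query is within the cutoff."""
    return {name: d for name, d in library_distances.items() if d <= cutoff}


if __name__ == "__main__":
    # Hypothetical KS distances of library surfaces to a query surface.
    library = {"surf_001": 0.08, "surf_002": 0.27, "surf_003": 0.41}
    print(outlier_fraction(library.values(), 0.22))
    print(sorted(prefilter(library)))
```
]]></p>
               <p>Only the surfaces that survive this inexpensive filter would need to be passed on to the more costly spatial alignment step.</p>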
               <fig id="F2">
                  <title>
                     <p>Figure 2</p>
                  </title>
                  <caption>
                     <p>Identification of a threshold for <it>SurfaceShapeSignatures</it></p>
                  </caption>
                  <text>
                     <p><b>Identification of a threshold for <it>SurfaceShapeSignatures</it></b>. <it>SSS </it>distances obtained by querying the ATP binding site of cAMP-dependent kinase (PDB:<ext-link ext-link-type="pdb" ext-link-id="1atp">1atp</ext-link>) against the GPSS ligand surface library are plotted against the molecular weight of the ligand corresponding to the library surface (a). Ligands with MW &#177; 100 D of ATP are highlighted in yellow. The molecular shape similarity Tanimoto score between ATP and the ligand corresponding to the library surface is plotted in (b). Tanimoto scores greater than 0.7 (blue) are generally regarded as similar. The correlation coefficients for molecular weight and shape similarity are 0.46 and 0.45, respectively, and the corresponding regression lines are shown in red. Our selected threshold distance of 0.3 (green) for use in our <it>SurfaceScreen </it>methodology eliminates less than 1% of true-positive surfaces in our benchmarking exercises.</p>
                  </text>
                  <graphic file="1472-6807-8-45-2"/>
               </fig>
            </sec>
            <sec>
               <st>
                  <p>Local Spatial Surface Residue Alignments for Physicochemical Texture Similarity</p>
               </st>
               <p>The spatial arrangement of localized surface patterns and the orientation of side chains are used to assess biochemical function complementarity. An exhaustive enumeration and search algorithm, <it>SurfaceAlign</it>, is applied to detect groups of spatially conserved amino acid residue sets. In this application, the term "conserved" refers to the identical residues common to two protein surfaces. Each surface residue is represented in three dimensions by a single point located at the center of mass of all of its atoms identified as contributing to a particular surface.</p>
               <p>The alignment of two surfaces is performed by combinatorially identifying the best superposition of the maximum subset of conserved residues between surfaces. For all conserved residues of each type, we enumerate all combinations and permutations to create unique coordinate sets of common residues between the two surfaces. A visual depiction of an alignment between the heme binding pockets of myoglobin from <it>P. catodon </it>(PDB:<ext-link ext-link-type="pdb" ext-link-id="1mbn">1mbn</ext-link>) and structural genomics target hemoglobin alpha-1 (PDB:<ext-link ext-link-type="pdb" ext-link-id="1xq5">1xq5</ext-link>) is shown in Figure <figr fid="F3">3</figr>. The geometric dissimilarity of each coordinate combination is evaluated following the methods of Umeyama<abbrgrp><abbr bid="B26">26</abbr></abbrgrp>, which is based on singular value decomposition of the correlation matrix of the coordinates to identify the least-squares rotation matrix, translation vector and the root mean square distance (RMSD). We utilize two variants of the RMSD: the coordinate root mean square distance (cRMSD), for atomic coordinates as represented in our residue model, and the orientation root mean square distance (oRMSD)<abbrgrp><abbr bid="B18">18</abbr></abbrgrp>. The oRMSD is a derivative dissimilarity measure that reduces the effect of outliers on an RMSD value and simulates the conformational flexibility of amino acid side chains.</p>
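               <p>The superposition step for one candidate residue correspondence can be sketched with NumPy as follows. This is a minimal sketch of an SVD-based least-squares fit in the spirit of Umeyama's method, assuming a rigid transformation without scaling and assuming that the two point sets are already paired row by row; it computes only a coordinate RMSD and does not attempt the oRMSD. The function name is hypothetical.</p>
               <p><![CDATA[
```python
import numpy as np


def superpose(p, q):
    """Least-squares rigid superposition of paired points p onto q.

    Returns (rotation, translation, rmsd) such that each row of q is
    approximated by rotation @ row_of_p + translation.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p_mean, q_mean = p.mean(axis=0), q.mean(axis=0)
    # 3x3 cross-covariance of the centered coordinates.
    covariance = (q - q_mean).T @ (p - p_mean)
    u, _, vt = np.linalg.svd(covariance)
    # Guard against an improper rotation (a reflection).
    sign = np.eye(3)
    if np.linalg.det(u) * np.linalg.det(vt) < 0:
        sign[2, 2] = -1.0
    rotation = u @ sign @ vt
    translation = q_mean - rotation @ p_mean
    moved = p @ rotation.T + translation
    rmsd = float(np.sqrt(((moved - q) ** 2).sum(axis=1).mean()))
    return rotation, translation, rmsd


if __name__ == "__main__":
    # Toy residue centers (assumed data): q is p rotated by 90 degrees
    # about the z-axis and then shifted.
    p = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]])
    rot_z = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
    q = p @ rot_z.T + np.array([1.0, 2.0, 3.0])
    print(superpose(p, q)[2])
```
]]></p>
               <p>In an exhaustive search this function would be called once per enumerated correspondence, and the correspondence with the lowest RMSD (or most significant P-value) would be retained.</p>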
               <fig id="F3">
                  <title>
                     <p>Figure 3</p>
                  </title>
                  <caption>
                     <p>The <it>SurfaceAlign </it>algorithm identifies the optimal alignment of spatially conserved residues</p>
                  </caption>
                  <text>
                     <p><b>The <it>SurfaceAlign </it>algorithm identifies the optimal alignment of spatially conserved residues</b>. <it>6,220,800 </it>alignment combinations and permutations are required for the alignment of 25 conserved residues of the heme binding pockets of myoglobin from <it>P. catodon </it>(a) and structural genomics target hemoglobin alpha-1 from <it>P. flavescens </it>(c). 100 alignment solutions are shown in stick representations (b). An alignment series shows the superposition of the solutions calculated while converging toward the optimal alignment (d). The myoglobin query surface is shaded in grayscale to represent the cRMSD values (black represents a large cRMSD and white represents a small cRMSD) and the hemoglobin surface is colored by the shapely color scheme<abbrgrp><abbr bid="B77">77</abbr></abbrgrp>.</p>
                  </text>
                  <graphic file="1472-6807-8-45-3"/>
               </fig>
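               <p>The size of such a search space can be illustrated with a short enumeration sketch. It assumes that, for each residue type shared by the two surfaces, every way of pairing the smaller group with members of the larger group is a candidate; the text does not spell out the exact enumeration, so this is one plausible reading and the function names are hypothetical. As a purely hypothetical composition, 25 residues with type multiplicities of 6, 5, 3, 3 and 2 plus six singletons yield 6! x 5! x 3! x 3! x 2! = 6,220,800 correspondences, the count quoted in the caption above; the actual residue composition of the two heme pockets is not stated there.</p>
               <p><![CDATA[
```python
import itertools
import math
from collections import defaultdict


def group_by_type(residues):
    """Map residue type to the list of residue indices of that type."""
    groups = defaultdict(list)
    for index, res_type in enumerate(residues):
        groups[res_type].append(index)
    return groups


def count_correspondences(residues_a, residues_b):
    """Number of type-preserving pairings between two surfaces."""
    ga, gb = group_by_type(residues_a), group_by_type(residues_b)
    total = 1
    for res_type in ga.keys() & gb.keys():
        small, large = sorted((len(ga[res_type]), len(gb[res_type])))
        total *= math.perm(large, small)
    return total


def iter_correspondences(residues_a, residues_b):
    """Yield lists of (index_in_a, index_in_b) pairs with matching types."""
    ga, gb = group_by_type(residues_a), group_by_type(residues_b)
    per_type = []
    for res_type in sorted(ga.keys() & gb.keys()):
        a_idx, b_idx = ga[res_type], gb[res_type]
        k = min(len(a_idx), len(b_idx))
        per_type.append([
            list(zip(chosen_a, ordered_b))
            for chosen_a in itertools.combinations(a_idx, k)
            for ordered_b in itertools.permutations(b_idx, k)
        ])
    for choice in itertools.product(*per_type):
        yield [pair for block in choice for pair in block]


if __name__ == "__main__":
    # Hypothetical residue lists for two small surfaces.
    surface_a = ["HIS", "HIS", "LEU", "PHE"]
    surface_b = ["HIS", "LEU", "LEU", "HIS", "ALA"]
    print(count_correspondences(surface_a, surface_b))
    for pairing in iter_correspondences(surface_a, surface_b):
        print(pairing)
```
]]></p>
               <p>The factorial growth shown here is the reason an inexpensive shape pre-filter is valuable before any exhaustive alignment is attempted.</p>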
            </sec>
            <sec>
               <st>
                  <p>Statistical Significance of Aligned Surface Distances</p>
               </st>
               <p>RMSD calculations are sensitive to the number of data points compared, making it necessary to assess the statistical significance of raw distance measures. This is accomplished by converting calculated cRMSD and oRMSD values to a probability value (P-value) measuring the likelihood of obtaining a specific RMSD value for a solution with a given number of residues. This allows meaningful comparisons between alignments with differing numbers of common residues. Following the method described by Binkowski <it>et al</it><abbrgrp><abbr bid="B18">18</abbr></abbrgrp>, random surface alignments were performed for alignment solutions of <it>N</it><sub><it>res </it></sub>residues, where 3 &#8804; <it>N</it><sub><it>res </it></sub>&#8804; 100. A total of 10<sup>10 </sup>calculations were performed to construct lookup tables associating cRMSD and oRMSD scores to P-values. To minimize the inherent bias in the PDB toward particular protein families, a non-redundant surface library was used, consisting of proteins sharing less than 90% whole-protein sequence similarity.</p>
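               <p>A lookup of this kind can be sketched as an empirical P-value: for a given number of aligned residues, the P-value of an observed RMSD is the fraction of random alignments that scored at least as well. The sketch below assumes the random RMSD values have already been generated and stored per residue count; the table layout, function names and numbers are hypothetical and are not taken from the published lookup tables.</p>
               <p><![CDATA[
```python
import bisect


def build_lookup(random_rmsds_by_nres):
    """Sort the random RMSD samples for each residue count once, up front."""
    return {n_res: sorted(values) for n_res, values in random_rmsds_by_nres.items()}


def p_value(lookup, n_res, rmsd):
    """Fraction of random alignments of n_res residues with RMSD not above rmsd."""
    samples = lookup[n_res]
    return bisect.bisect_right(samples, rmsd) / len(samples)


if __name__ == "__main__":
    # Hypothetical random-alignment RMSD values (in Angstrom) for 3 and 4 residues.
    lookup = build_lookup({
        3: [0.4, 0.9, 1.3, 1.8, 2.2, 2.9, 3.1, 3.6],
        4: [1.1, 1.9, 2.4, 2.8, 3.3, 3.7, 4.0, 4.4],
    })
    # The same raw RMSD is less surprising for 3 residues than for 4.
    print(p_value(lookup, 3, 1.5), p_value(lookup, 4, 1.5))
```
]]></p>
               <p>With a finite sample the smallest non-zero P-value is one over the sample size, which is one reason a very large number of random alignments is needed to resolve highly significant matches.</p>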
            </sec>
            <sec>
               <st>
                  <p>Surface Volumes Overlap of Aligned Surfaces</p>
               </st>
               <p>While the alignment compares key residues important for shared biochemical function, the volume overlap of an alignment provides a comparison of all atoms belonging to a surface. The overlap volume is defined as the volume shared by the superimposed surfaces and is calculated from the formula:</p>
               <p>
                  <display-formula><it>V</it><sub><it>AB </it></sub>= <it>V</it><sub><it>A </it></sub>+ <it>V</it><sub><it>B </it></sub>- <it>V</it><sub><it>A</it>&#8746;<it>B</it></sub></display-formula>
               </p>
               <p>where <it>V</it><sub><it>A</it></sub>, <it>V</it><sub><it>B</it></sub>, and <it>V</it><sub><it>A</it>&#8746;<it>B </it></sub>are the volumes of surface A, surface B and the superimposed construct AB, respectively. Volumes are calculated using the weighted Delaunay triangulation and alpha shape methods <abbrgrp><abbr bid="B27">27</abbr><abbr bid="B28">28</abbr><abbr bid="B29">29</abbr></abbrgrp>.</p>
               <p>The overlap volume is then used to calculate a Tanimoto coefficient, which is a normalized similarity measure<abbrgrp><abbr bid="B30">30</abbr><abbr bid="B31">31</abbr></abbrgrp>. By using the self overlap volumes (<it>V</it><sub><it>AA</it></sub>, <it>V</it><sub><it>BB</it></sub>), we define the surface volume overlap Tanimoto (SVOT):</p>
               <p>
                  <display-formula>
                     <m:math name="1472-6807-8-45-i1" xmlns:m="http://www.w3.org/1998/Math/MathML">
                        <m:semantics>
                           <m:mrow>
                              <m:mi>S</m:mi>
                              <m:mi>V</m:mi>
                              <m:mi>O</m:mi>
                              <m:msub>
                                 <m:mi>T</m:mi>
                                 <m:mrow>
                                    <m:mi>A</m:mi>
                                    <m:mi>B</m:mi>
                                 </m:mrow>
                              </m:msub>
                              <m:mo>=</m:mo>
                              <m:mfrac>
                                 <m:mrow>
                                    <m:msub>
                                       <m:mi>V</m:mi>
                                       <m:mrow>
                                          <m:mi>A</m:mi>
                                          <m:mi>B</m:mi>
                                       </m:mrow>
                                    </m:msub>
                                 </m:mrow>
                                 <m:mrow>
                                    <m:msub>
                                       <m:mi>V</m:mi>
                                       <m:mrow>
                                          <m:mi>A</m:mi>
                                          <m:mi>A</m:mi>
                                       </m:mrow>
                                    </m:msub>
                                    <m:mo>+</m:mo>
                                    <m:msub>
                                       <m:mi>V</m:mi>
                                       <m:mrow>
                                          <m:mi>B</m:mi>
                                          <m:mi>B</m:mi>
                                       </m:mrow>
                                    </m:msub>
                                    <m:mo>&#8722;</m:mo>
                                    <m:msub>
                                       <m:mi>V</m:mi>
                                       <m:mrow>
                                          <m:mi>A</m:mi>
                                          <m:mi>B</m:mi>
                                       </m:mrow>
                                    </m:msub>
                                 </m:mrow>
                              </m:mfrac>
                           </m:mrow>
                           <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaem4uamLaemOvayLaem4ta8Kaemivaq1aaSbaaSqaaiabdgeabjabdkeacbqabaGccqGH9aqpjuaGdaWcaaqaaiabdAfawnaaBaaabaGaemyqaeKaemOqaieabeaaaeaacqWGwbGvdaWgaaqaaiabdgeabjabdgeabbqabaGaey4kaSIaemOvay1aaSbaaeaacqWGcbGqcqWGcbGqaeqaaiabgkHiTiabdAfawnaaBaaabaGaemyqaeKaemOqaieabeaaaaaaaa@4458@</m:annotation>
                        </m:semantics>
                     </m:math>
                  </display-formula>
               </p>
               <p>An SVOT score is bounded between 0 (representing non-overlapping surfaces) and 1.0 (representing identical surfaces). The SVOT values for the spatial alignment series shown in Figure <figr fid="F3">3d</figr> are 0.50, 0.52, 0.73, 0.62, 0.91, and 0.89 when viewed from left (black) to right (white). The best spatial alignment, as judged by RMSD values, does not guarantee the best SVOT score (i.e. SVOT scores are not correlated with RMSD).</p>
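               <p>The bounds of the SVOT score can be checked with a short sketch. It assumes that V<sub>AA</sub> and V<sub>BB</sub> denote the volumes of the two individual surfaces and that V<sub>AB</sub> is the volume they share after superposition; the function name <it>svot</it> is hypothetical and is not part of the published software.</p>

```python
def svot(v_aa: float, v_bb: float, v_ab: float) -> float:
    """Surface volume overlap: V_AB / (V_AA + V_BB - V_AB).

    v_aa and v_bb are assumed to be the volumes of surfaces A and B,
    and v_ab the volume shared by both surfaces after superposition.
    """
    combined = v_aa + v_bb - v_ab
    if not combined > 0:
        raise ValueError("combined volume must be positive")
    return v_ab / combined


identical = svot(100.0, 100.0, 100.0)  # two copies of the same surface
disjoint = svot(100.0, 80.0, 0.0)      # surfaces that share no volume
nested = svot(100.0, 50.0, 50.0)       # B lies entirely inside A
```

               <p>In the nested case the score is 0.5 even though the smaller surface is fully contained in the larger one, which illustrates why a large volume disparity lowers the score.</p>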
               <p>The interpretation of the SVOT score is not straightforward for surfaces that have a large volume disparity. Figure <figr fid="F4">4</figr> shows a large surface pocket on F420-0:gamma-glutamyl ligase homolog from <it>A. fulgidus </it>(PDB:<ext-link ext-link-type="pdb" ext-link-id="2g9i">2g9i</ext-link>, a) where a sub-surface (b, forest green) is highly similar to the GDP binding surface in GDP-binding protein from <it>B. taurus </it>(PDB:<ext-link ext-link-type="pdb" ext-link-id="1tad">1tad</ext-link>, c). The volume overlap of the superimposed surfaces has a calculated SVOT score of 0.37, suggesting low similarity (Figure <figr fid="F4">4e</figr>, purple). In this case, the SVOT score fails to account for the strong sub-surface similarity, which is hidden by the overall volume disparity.</p>
               <fig id="F4">
                  <title>
                     <p>Figure 4</p>
                  </title>
                  <caption>
                     <p>Calculating volume overlap between aligned surfaces</p>
                  </caption>
                  <text>
                     <p><b>Calculating volume overlap between aligned surfaces</b>. A surface on F420-0:gamma-glutamyl ligase homolog from <it>A. fulgidus </it>(PDB:<ext-link ext-link-type="pdb" ext-link-id="2g9i">2g9i</ext-link>) (a) has a sub-surface (b, forest green) that is well conserved with respect to the GDP binding surface in GDP-binding protein from <it>B. taurus </it>(c). A superposition of the surfaces from the alignment is shown in (d). When the volume overlap of the alignment is measured (e, purple), the large volume disparity between the surfaces masks the similarity, giving a global surface volume overlap (gSVOT) score of 0.37. Using only the conserved residues of the alignment (f) to measure the local surface volume overlap (lSVOT) reveals the similarity, with an lSVOT score of 0.71 (g, purple).</p>
                  </text>
                  <graphic file="1472-6807-8-45-4"/>
               </fig>
               <p>To improve the alignment scoring, we compute two versions of the SVOT, the local and global surface overlap volumes. In the global SVOT (gSVOT), we apply the rotation matrix from the conserved residues to the entire surface (Figure <figr fid="F4">4e</figr>). The local SVOT (lSVOT) is limited to the subset of conserved residues from the alignment solution (Figure <figr fid="F4">4f</figr>). The local and global SVOT are calculated as follows:</p>
               <p>
                  <display-formula>
                     <m:math name="1472-6807-8-45-i2" xmlns:m="http://www.w3.org/1998/Math/MathML">
                        <m:semantics>
                           <m:mrow>
                              <m:mi>g</m:mi>
                              <m:mi>S</m:mi>
                              <m:mi>V</m:mi>
                              <m:mi>O</m:mi>
                              <m:msub>
                                 <m:mi>T</m:mi>
                                 <m:mrow>
                                    <m:mi>A</m:mi>
                                    <m:mi>B</m:mi>
                                 </m:mrow>
                              </m:msub>
                              <m:mo>=</m:mo>
                              <m:mfrac>
                                 <m:mrow>
                                    <m:msub>
                                       <m:mi>V</m:mi>
                                       <m:mrow>
                                          <m:mi>A</m:mi>
                                          <m:mi>B</m:mi>
                                       </m:mrow>
                                    </m:msub>
                                 </m:mrow>
                                 <m:mrow>
                                    <m:msub>
                                       <m:mi>V</m:mi>
                                       <m:mrow>
                                          <m:mi>A</m:mi>
                                          <m:mi>A</m:mi>
                                       </m:mrow>
                                    </m:msub>
                                    <m:mo>+</m:mo>
                                    <m:msub>
                                       <m:mi>V</m:mi>
                                       <m:mrow>
                                          <m:mi>B</m:mi>
                                          <m:mi>B</m:mi>
                                       </m:mrow>
                                    </m:msub>
                                    <m:mo>&#8722;</m:mo>
                                    <m:msub>
                                       <m:mi>V</m:mi>
                                       <m:mrow>
                                          <m:mi>A</m:mi>
                                          <m:mi>B</m:mi>
                                       </m:mrow>
                                    </m:msub>
                                 </m:mrow>
                              </m:mfrac>
                              <m:mo>,</m:mo>
                              <m:mi>l</m:mi>
                              <m:mi>S</m:mi>
                              <m:mi>V</m:mi>
                              <m:mi>O</m:mi>
                              <m:msub>
                                 <m:mi>T</m:mi>
                                 <m:mrow>
                                    <m:mi>a</m:mi>
                                    <m:mi>b</m:mi>
                                 </m:mrow>
                              </m:msub>
                              <m:mo>=</m:mo>
                              <m:mfrac>
                                 <m:mrow>
                                    <m:msub>
                                       <m:mi>V</m:mi>
                                       <m:mrow>
                                          <m:mi>a</m:mi>
                                          <m:mi>b</m:mi>
                                       </m:mrow>
                                    </m:msub>
                                 </m:mrow>
                                 <m:mrow>
                                    <m:msub>
                                       <m:mi>V</m:mi>
                                       <m:mrow>
                                          <m:mi>a</m:mi>
                                          <m:mi>a</m:mi>
                                       </m:mrow>
                                    </m:msub>
                                    <m:mo>+</m:mo>
                                    <m:msub>
                                       <m:mi>V</m:mi>
                                       <m:mrow>
                                          <m:mi>b</m:mi>
                                          <m:mi>b</m:mi>
                                       </m:mrow>
                                    </m:msub>
                                    <m:mo>&#8722;</m:mo>
                                    <m:msub>
                                       <m:mi>V</m:mi>
                                       <m:mrow>
                                          <m:mi>a</m:mi>
                                          <m:mi>b</m:mi>
                                       </m:mrow>
                                    </m:msub>
                                 </m:mrow>
                              </m:mfrac>
                           </m:mrow>
                           <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaem4zaCMaem4uamLaemOvayLaem4ta8Kaemivaq1aaSbaaSqaaiabdgeabjabdkeacbqabaGccqGH9aqpjuaGdaWcaaqaaiabdAfawnaaBaaabaGaemyqaeKaemOqaieabeaaaeaacqWGwbGvdaWgaaqaaiabdgeabjabdgeabbqabaGaey4kaSIaemOvay1aaSbaaeaacqWGcbGqcqWGcbGqaeqaaiabgkHiTiabdAfawnaaBaaabaGaemyqaeKaemOqaieabeaaaaGccqGGSaalcqWGSbaBcqWGtbWucqWGwbGvcqWGpbWtcqWGubavdaWgaaWcbaGaemyyaeMaemOyaigabeaakiabg2da9KqbaoaalaaabaGaemOvay1aaSbaaeaacqWGHbqycqWGIbGyaeqaaaqaaiabdAfawnaaBaaabaGaemyyaeMaemyyaegabeaacqGHRaWkcqWGwbGvdaWgaaqaaiabdkgaIjabdkgaIbqabaGaeyOeI0IaemOvay1aaSbaaeaacqWGHbqycqWGIbGyaeqaaaaaaaa@62AF@</m:annotation>
                        </m:semantics>
                     </m:math>
                  </display-formula>
               </p>
               <p>Where <it>V</it><sub><it>a </it></sub>and <it>V</it><sub><it>b </it></sub>are the volumes of the surfaces formed only by the conserved residues used in the alignment solution. The lSVOT for the alignment in Figure <figr fid="F4">4g</figr> is 0.71, conveying stronger similarity.</p>
               <p>Finally, we introduce the ratio SVOT (rSVOT), to establish a correspondence between the global and local surface volumes:</p>
               <p>
                  <display-formula>
                     <m:math name="1472-6807-8-45-i3" xmlns:m="http://www.w3.org/1998/Math/MathML">
                        <m:semantics>
                           <m:mrow>
                              <m:mi>r</m:mi>
                              <m:mi>S</m:mi>
                              <m:mi>V</m:mi>
                              <m:mi>O</m:mi>
                              <m:mi>T</m:mi>
                              <m:mo>=</m:mo>
                              <m:mfrac>
                                 <m:mrow>
                                    <m:mi>g</m:mi>
                                    <m:mi>S</m:mi>
                                    <m:mi>V</m:mi>
                                    <m:mi>O</m:mi>
                                    <m:msub>
                                       <m:mi>T</m:mi>
                                       <m:mrow>
                                          <m:mi>A</m:mi>
                                          <m:mi>B</m:mi>
                                       </m:mrow>
                                    </m:msub>
                                 </m:mrow>
                                 <m:mrow>
                                    <m:mi>l</m:mi>
                                    <m:mi>S</m:mi>
                                    <m:mi>V</m:mi>
                                    <m:mi>O</m:mi>
                                    <m:msub>
                                       <m:mi>T</m:mi>
                                       <m:mrow>
                                          <m:mi>a</m:mi>
                                          <m:mi>b</m:mi>
                                       </m:mrow>
                                    </m:msub>
                                 </m:mrow>
                              </m:mfrac>
                           </m:mrow>
                           <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemOCaiNaem4uamLaemOvayLaem4ta8KaemivaqLaeyypa0tcfa4aaSaaaeaacqWGNbWzcqWGtbWucqWGwbGvcqWGpbWtcqWGubavdaWgaaqaaiabdggaHjabdkgaIbqabaaabaGaemiBaWMaem4uamLaemOvayLaem4ta8Kaemivaq1aaSbaaeaacqWGbbqqcqWGcbGqaeqaaaaaaaa@4512@</m:annotation>
                        </m:semantics>
                     </m:math>
                  </display-formula>
               </p>
               <p>The rSVOT is bounded between 0 and 1 and represents the fraction of a surface utilized in the alignment solution. In Figure <figr fid="F4">4</figr>, the rSVOT value of 0.52 confirms that it is a sub-pocket match. The rSVOT score can be used to automatically detect surfaces that do not have the desired properties for a particular search (e.g. excluding sub-surface matches during a library search).</p>
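               <p>As a worked check of the ratio, the sketch below divides the gSVOT score by the lSVOT score reported for Figure <figr fid="F4">4</figr>. The helper <it>is_sub_surface_match</it> and its cutoff of 0.75 are hypothetical and serve only to illustrate how such a filter might be applied during a library search.</p>

```python
def rsvot(gsvot: float, lsvot: float) -> float:
    """Ratio SVOT: global overlap score divided by local overlap score."""
    if not lsvot > 0:
        raise ValueError("lSVOT must be positive")
    return gsvot / lsvot


def is_sub_surface_match(gsvot: float, lsvot: float, cutoff: float = 0.75) -> bool:
    """Flag alignments that use only part of a surface.

    The cutoff is a hypothetical value chosen for illustration only.
    """
    return cutoff > rsvot(gsvot, lsvot)


# Scores reported for the alignment in Figure 4.
ratio = rsvot(0.37, 0.71)
print(round(ratio, 2))                   # 0.52
print(is_sub_surface_match(0.37, 0.71))  # True
```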
            </sec>
         </sec>
         <sec>
            <st>
               <p>Automated Identification of Ligand Binding Surfaces</p>
            </st>
            <p>Functional surfaces of proteins can be derived from structural information extracted from three-dimensional coordinates. This task can be automated for PDB files deposited with heteroatom records (HETATM) describing atoms belonging to small molecule cofactors rather than to part of a biopolymer chain. To extract the functional surfaces that surround structural features of a protein, an exclusion contact surface is generated by measuring the difference in solvent accessibility between a structure with and without a molecule in proximity. This is illustrated for the NADP binding surface from the human pathogen <it>S. pyogenes </it>(PDB:<ext-link ext-link-type="pdb" ext-link-id="2ahr">2ahr</ext-link>) in Figure <figr fid="F1">1a</figr>. Atoms with a change in solvent accessibility between the bound and apo structure are identified as the contact surface (Figure <figr fid="F1">1b</figr>). We utilize the Delaunay triangulation and alpha shape method for measuring solvent accessibility <abbrgrp><abbr bid="B27">27</abbr><abbr bid="B28">28</abbr><abbr bid="B29">29</abbr></abbrgrp>.</p>
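            <p>The selection step can be sketched as follows, assuming that per-atom solvent accessible areas have already been computed for the apo and bound states by some external method. The function name, the atom labels, and the tolerance are hypothetical; the alpha shape calculation itself is not reproduced here.</p>

```python
from typing import Dict, List


def contact_surface_atoms(
    sasa_apo: Dict[str, float],
    sasa_bound: Dict[str, float],
    tolerance: float = 1e-6,
) -> List[str]:
    """Return atoms whose solvent accessibility differs between two states.

    sasa_apo holds per-atom accessible area with the ligand removed and
    sasa_bound holds the same quantity with the ligand present.
    """
    changed = []
    for atom_id, apo_area in sasa_apo.items():
        # Atoms missing from the bound table are treated as unchanged.
        bound_area = sasa_bound.get(atom_id, apo_area)
        if abs(apo_area - bound_area) > tolerance:
            changed.append(atom_id)
    return sorted(changed)


# Toy values (square angstroms), not taken from any real structure.
apo = {"GLY10:CA": 12.0, "SER11:OG": 8.5, "LEU40:CD1": 0.0}
bound = {"GLY10:CA": 3.0, "SER11:OG": 8.5, "LEU40:CD1": 0.0}
print(contact_surface_atoms(apo, bound))  # ['GLY10:CA']
```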
            <sec>
               <st>
                  <p>A Library of Protein Functional Surfaces</p>
               </st>
               <p>An exclusion contact surface has been calculated for every heteroatom molecule associated with a protein's PDB file and organized into the Global Protein Surface Survey (GPSS). The GPSS contains three-dimensional libraries of functionally annotated surfaces from ligand, DNA, metal and peptide binding surfaces and is updated weekly to correspond to PDB deposits. Libraries are publicly accessible through a web browser<abbrgrp><abbr bid="B32">32</abbr></abbrgrp> or via a PyMOL<abbrgrp><abbr bid="B33">33</abbr></abbrgrp> plugin. In this study, we utilize only the ligand binding surfaces of the GPSS: 113,921 members representing 5,575 unique ligands (PDB version: November 2006). For this subset of the GPSS, the average number of residues forming a surface is 12 and the average molecular weight of the bound ligands is 305. To reduce redundancy and improve search efficiency, we further limit our search library to a single ligand of each type from each protein deposit. The first ligand of each type, as listed in the PDB file, was selected.</p>
            </sec>
         </sec>
         <sec>
            <st>
               <p>Similarity Searching Surface Libraries</p>
            </st>
            <p>We incorporate our two comparison algorithms, <it>SurfaceShapeSignatures </it>and <it>SurfaceAlign</it>, into a comprehensive searching methodology, <it>SurfaceScreen</it>, to query a protein surface against the GPSS library. The GPSS library contains pre-computed binding surfaces for ligands, DNA, metals, and peptides from all structures in the PDB. <it>SurfaceScreen </it>is outlined in Figure <figr fid="F5">5</figr>. Given a protein structure, we first identify all solvent accessible surfaces on the structure utilizing the CASTp<abbrgrp><abbr bid="B34">34</abbr></abbrgrp> database and select a query surface of interest. The query surface is compared to each member of our surface library using <it>SSS</it>. Each surface whose shape distance to the query exceeds a threshold is eliminated from the library. In this manner, the <it>SSS </it>is used as a fast pre-classifier before the computationally intensive alignment algorithm and scoring functions are applied. Finally, the spatial alignment is performed on the remaining surfaces, and the alignments are scored and ranked.</p>
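            <p>The two-stage search described above can be sketched generically. This is an illustration of the filter-then-align pattern only, not the published implementation: the function <it>surface_screen</it>, its parameters, and the toy distance and scoring callables are all hypothetical stand-ins for the <it>SSS </it>comparison and the <it>SurfaceAlign </it>scoring.</p>

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Surface = TypeVar("Surface")


def surface_screen(
    query: Surface,
    library: Sequence[Surface],
    shape_distance: Callable[[Surface, Surface], float],
    align_and_score: Callable[[Surface, Surface], float],
    max_shape_distance: float,
) -> List[Tuple[float, Surface]]:
    """Filter by shape distance, then align, score, and rank the survivors."""
    # Stage 1: cheap pre-filter; drop surfaces whose shape is too different.
    candidates = [
        surface for surface in library
        if not shape_distance(query, surface) > max_shape_distance
    ]
    # Stage 2: expensive alignment and scoring on the reduced library only.
    scored = [(align_and_score(query, surface), surface) for surface in candidates]
    # Stage 3: rank, best score first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


# Toy stand-ins: each "surface" is a single number.
toy_library = [1.0, 2.0, 5.0, 9.0]
hits = surface_screen(
    query=2.0,
    library=toy_library,
    shape_distance=lambda q, s: abs(q - s),
    align_and_score=lambda q, s: 1.0 / (1.0 + abs(q - s)),
    max_shape_distance=3.5,
)
print([surface for _, surface in hits])  # [2.0, 1.0, 5.0]
```

            <p>Because the expensive callable runs only on the filtered candidates, the total cost is dominated by how many surfaces pass the shape threshold.</p>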
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>The <it>SurfaceScreen </it>methodology uses the <it>SSS </it>algorithm to rapidly pre-classify surfaces based on shape complementarity</p>
               </caption>
               <text>
                  <p><b>The <it>SurfaceScreen </it>methodology uses the <it>SSS </it>algorithm to rapidly pre-classify surfaces based on shape complementarity</b>. Similarly shaped surfaces are then spatially aligned using the <it>SurfaceAlign </it>algorithm and scored. While the GPSS library also contains DNA, metal and peptide binding surfaces, only ligand binding surfaces were considered in this study.</p>
               </text>
               <graphic file="1472-6807-8-45-5"/>
            </fig>
            <p>Independently, each scoring function conveys unique properties about a surface alignment, but an overall score is necessary to consistently evaluate and rank surfaces in a database search. To this end, we define the <it>SurfaceScreen </it>score:</p>
            <p>
               <display-formula>
                  <m:math name="1472-6807-8-45-i4" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:mi>S</m:mi>
                           <m:mi>u</m:mi>
                           <m:mi>r</m:mi>
                           <m:mi>f</m:mi>
                           <m:mi>a</m:mi>
                           <m:mi>c</m:mi>
                           <m:mi>e</m:mi>
                           <m:mi>S</m:mi>
                           <m:mi>c</m:mi>
                           <m:mi>r</m:mi>
                           <m:mi>e</m:mi>
                           <m:mi>e</m:mi>
                           <m:mi>n</m:mi>
                           <m:mtext>&#160;Score</m:mtext>
                           <m:mo>=</m:mo>
                           <m:mo stretchy="false">(</m:mo>
                           <m:mn>1</m:mn>
                           <m:mo>&#8722;</m:mo>
                           <m:mi>S</m:mi>
                           <m:mi>S</m:mi>
                           <m:mi>S</m:mi>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo>+</m:mo>
                           <m:mfrac>
                              <m:mrow>
                                 <m:mi>log</m:mi>
                                 <m:mo>&#8289;</m:mo>
                                 <m:mo stretchy="false">(</m:mo>
                                 <m:mi>c</m:mi>
                                 <m:mi>R</m:mi>
                                 <m:mi>M</m:mi>
                                 <m:mi>S</m:mi>
                                 <m:mi>D</m:mi>
                                 <m:mtext>&#160;</m:mtext>
                                 <m:mi>P</m:mi>
                                 <m:mo>&#8722;</m:mo>
                                 <m:mi>v</m:mi>
                                 <m:mi>a</m:mi>
                                 <m:mi>l</m:mi>
                                 <m:mi>u</m:mi>
                                 <m:mi>e</m:mi>
                                 <m:mo stretchy="false">)</m:mo>
                              </m:mrow>
                              <m:mrow>
                                 <m:mo>&#8722;</m:mo>
                                 <m:mn>9</m:mn>
                              </m:mrow>
                           </m:mfrac>
                           <m:mo>+</m:mo>
                           <m:mi>g</m:mi>
                           <m:mi>S</m:mi>
                           <m:mi>V</m:mi>
                           <m:mi>O</m:mi>
                           <m:mi>T</m:mi>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xI8qiVKYPFjYdHaVhbbf9v8qqaqFr0xc9vqFj0dXdbba91qpepeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaem4uamLaemyDauNaemOCaiNaemOzayMaemyyaeMaem4yamMaemyzauMaem4uamLaem4yamMaemOCaiNaemyzauMaemyzauMaemOBa4MaeeiiaaIaee4uamLaee4yamMaee4Ba8MaeeOCaiNaeeyzauMaeyypa0JaeiikaGIaeGymaeJaeyOeI0Iaem4uamLaem4uamLaem4uamLaeiykaKIaey4kaSscfa4aaSaaaeaacyGGSbaBcqGGVbWBcqGGNbWzcqGGOaakcqWGJbWycqWGsbGucqWGnbqtcqWGtbWucqWGebarcqqGGaaicqWGqbaucqGHsislcqWG2bGDcqWGHbqycqWGSbaBcqWG1bqDcqWGLbqzcqGGPaqkaeaacqGHsislcqaI5aqoaaGccqGHRaWkcqWGNbWzcqWGtbWucqWGwbGvcqWGpbWtcqWGubavaaa@6CCD@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>The <it>SurfaceScreen </it>score is bounded between 0 and 3, and represents contributions from global shape similarity, local spatial residue alignment, and global alignment volume overlap. The denominator in the cRMSD P-value component reflects the smallest (most significant) probability value estimated by our statistical significance evaluations, 10<sup>-9</sup>. The oRMSD, lSVOT, and rSVOT scores are omitted because they are too highly correlated with the cRMSD and gSVOT scores, respectively. They are, however, still useful for post-processing results.</p>
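            <p>The combined score can be written as a small function. The sketch assumes that the logarithm is base 10 (consistent with the 10<sup>-9</sup> limit above) and that the SSS term is a distance between 0 and 1; the function and parameter names are hypothetical.</p>

```python
import math


def surface_screen_score(sss_distance: float, crmsd_p_value: float, gsvot: float) -> float:
    """(1 - SSS) + log10(cRMSD P-value) / -9 + gSVOT."""
    if not crmsd_p_value > 0:
        raise ValueError("P-value must be positive")
    shape_term = 1.0 - sss_distance                    # global shape similarity
    alignment_term = math.log10(crmsd_p_value) / -9.0  # local residue alignment
    return shape_term + alignment_term + gsvot         # plus global volume overlap


# Extreme cases: a perfect match and a complete mismatch.
best = surface_screen_score(sss_distance=0.0, crmsd_p_value=1e-9, gsvot=1.0)
worst = surface_screen_score(sss_distance=1.0, crmsd_p_value=1.0, gsvot=0.0)
```

            <p>Under these assumptions each of the three terms contributes a value between 0 and 1, which gives the stated range of 0 to 3.</p>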
         </sec>
         <sec>
            <st>
               <p>Data Analysis and Classification</p>
            </st>
            <sec>
               <st>
                  <p>Receiver Operator Characteristic Curves</p>
               </st>
               <p>Surface retrieval benchmarking experiments are summarized in a Receiver Operator Characteristic (ROC) curve, where the sensitivity is plotted against 1-specificity at various significance levels of summed probabilities. In the ROC curve, the <it>x-</it>axis represents the false positive rate, or 1-specificity, which is calculated as 1-TN/(TN+FP), where TN is the number of true negatives and FP is the number of false positives. The <it>y-</it>axis represents the true positive rate, or sensitivity, and is calculated as TP/(TP+FN), where TP is the number of true positives and FN is the number of false negatives. An overall performance measure of a classification test can be calculated by the area under the ROC curve (AUC)<abbrgrp><abbr bid="B35">35</abbr></abbrgrp>. Bounded between 0 and 1, an AUC of 1 is indicative of a perfectly accurate classification test, in which all true positives are distinguished from false positives. An AUC of 0.5 corresponds to a random classification test (e.g. a coin flip). The AUC is a combined measure of sensitivity and specificity.</p>
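               <p>A minimal sketch of the ROC construction is given below. It assumes that every library surface has a score where higher means more similar, plus a true or false label; the AUC is obtained with the trapezoidal rule. The function names are hypothetical and the example data are toy values.</p>

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float], labels: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) pairs, best score first."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    positives = sum(1 for _, is_positive in ranked if is_positive)
    negatives = len(ranked) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative")
    points = [(0.0, 0.0)]
    tp = 0
    fp = 0
    for index, (score, is_positive) in enumerate(ranked):
        if is_positive:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last member of a group of tied scores.
        is_last = index == len(ranked) - 1
        if is_last or ranked[index + 1][0] != score:
            # FP/(FP+TN) equals 1 - TN/(TN+FP); TP/(TP+FN) is the sensitivity.
            points.append((fp / negatives, tp / positives))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


perfect = auc(roc_points([0.9, 0.8, 0.7, 0.6], [True, True, False, False]))
mixed = auc(roc_points([0.9, 0.8, 0.7, 0.6], [True, False, True, False]))
print(perfect, mixed)  # 1.0 0.75
```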
               <p>For our <it>SurfaceScreen </it>methodology, the ROC curves measure our ability to accurately identify similar surfaces from a large database. A true-positive data point in our database retrieval experiments is defined when the ligand from the query surface matches the ligand from the corresponding library surface (e.g. retrieving heme binding sites when a heme binding site is used as the query). This definition should be considered conservative, as protein surfaces have the ability to bind multiple ligands and, in some cases, a false-positive prediction may indeed be a biologically relevant hit.</p>
            </sec>
            <sec>
               <st>
                  <p>ATP Conformation Classification</p>
               </st>
               <p>The coordinates for all ATP molecules in the PDB were identified and extracted. Multiple occurrences of the ligand in a structure deposit were included and treated as unique molecules. A pairwise, least-squares superposition of all extracted molecules was performed and the RMSD values were recorded in a distance matrix. Complete linkage clustering was applied to the distance matrix. For clarity, we sought the minimum number of conformation families that would accurately represent all ATP molecules. A range of cut values was tested, and we set the number of clusters to four based on manual visualization and analysis. The four conformations represent the bent, extended, and two intermediary forms of ATP.</p>
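               <p>Complete linkage clustering on a precomputed distance matrix can be sketched in a few lines. This naive version is for illustration only and assumes a symmetric matrix of pairwise RMSD values; the function name and the toy distances are hypothetical. It stops at a requested number of clusters rather than at a cut value.</p>

```python
from typing import List, Sequence


def complete_linkage(dist: Sequence[Sequence[float]], n_clusters: int) -> List[List[int]]:
    """Agglomerative clustering; cluster distance is the largest pairwise distance."""
    if not n_clusters > 0:
        raise ValueError("n_clusters must be positive")
    clusters = [[index] for index in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: the farthest pair defines the distance.
                d = max(dist[a][b] for a in clusters[i] for b in clusters[j])
                if best is None or best[0] > d:
                    best = (d, i, j)
        _, i, j = best
        # Merge the closest pair of clusters and drop the absorbed one.
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]
    return clusters


# Toy one-dimensional "conformations"; distance is the absolute difference.
values = [0.0, 0.1, 5.0, 5.2]
matrix = [[abs(p - q) for q in values] for p in values]
print(complete_linkage(matrix, 2))  # [[0, 1], [2, 3]]
```

               <p>For data sets of realistic size, a library routine such as SciPy's <it>scipy.cluster.hierarchy.linkage </it>with <it>method="complete" </it>would be a more efficient choice than this cubic-time loop.</p>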
            </sec>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Results and discussion</p>
         </st>
         <sec>
            <st>
               <p>Surface-Based Retrieval of Binding Sites for the Same Protein: HIV-1 Protease</p>
            </st>
            <p>The utility of surface shape comparisons was assessed by retrieving functionally homologous human immunodeficiency virus (HIV-1) protease-ligand complexes from the GPSS. HIV-1 protease is an essential aspartyl protease that cleaves nascent polypeptides, enabling maturation of viral proteins. Inactivation of the protease blocks production of infectious viral particles<abbrgrp><abbr bid="B36">36</abbr></abbrgrp>. Therefore, HIV-1 protease has been an active drug target and one of the early success stories of rational drug design<abbrgrp><abbr bid="B37">37</abbr></abbrgrp>. We identified 151 HIV-1 protease-inhibitor complexes deposited in the PDB with the following criteria: proteins are in the dimer conformation, inhibitors are not compound fragments (molecular weight >100), and inhibitors are unique in our dataset. The proteins in our dataset share at least 48% sequence similarity and secondary structure similarity Z-scores greater than 9.0, as measured using the secondary structure matching (SSM) algorithm<abbrgrp><abbr bid="B38">38</abbr></abbrgrp>.</p>
            <p>The binding surface of HIV-1 protease (PDB:<ext-link ext-link-type="pdb" ext-link-id="1eby">1eby</ext-link>, E.C. 3.4.23.16, CATH<abbrgrp><abbr bid="B39">39</abbr></abbrgrp> 2.40.70.10, Figure <figr fid="F6">6ab</figr>) with bound inhibitor BEB (MW 652.7, Figure <figr fid="F6">6c</figr>) was selected as a query. First, the query was searched against the GPSS library using the SSS comparisons. The sorted KS distance scores between the query surface and all members from the library are plotted in Figure <figr fid="F6">6d</figr>. Points highlighted in red indicate known HIV-1 inhibitor binding surfaces. As expected, 124 of the 151 known surfaces have KS distance scores less than 0.1. Plotting the search results in a receiver operator characteristic (ROC) curve (see Methods), we measure a retrieval rate of 84.7% for <it>SSS </it>from the area under the curve (AUC) (Figure <figr fid="F6">6e</figr>). The poorest ranking HIV-1 protease surfaces are associated with aggressive mutation studies in the binding pocket or with inhibitors of markedly small (&lt;200) or large (>900) molecular weight.</p>
            <fig id="F6">
               <title>
                  <p>Figure 6</p>
               </title>
               <caption>
                  <p>Retrieval of HIV-1 proteases from the GPSS library using surface similarity</p>
               </caption>
               <text>
                  <p><b>Retrieval of HIV-1 proteases from the GPSS library using surface similarity</b>. The binding surface of human HIV-1 protease (ab) complexed with inhibitor BEB (c) was queried against the GPSS library. The sorted KS distances are shown in (d) with other HIV-1 proteases highlighted in red. ROC curves for retrieval using <it>SurfaceShapeSignatures, SurfaceAlign </it>and <it>SurfaceScreen </it>scoring are shown in (e). The highest ranking non-protease surface was the DcmaT (h) binding surface of aclacinomycin methylesterase (RdmC) from <it>S. purpurascens </it>(fg). A superposition of the surfaces based on the <it>SurfaceAlign </it>alignment is shown in (ij) and with their respective ligands in (k).</p>
               </text>
               <graphic file="1472-6807-8-45-6"/>
            </fig>
            <p>Next, we performed the same search using only the spatial alignment scores to evaluate similarity. We observe that all three alignment-based scoring measures provide better specificity than the SSS distance score. The AUC values for cRMSD P-value, oRMSD P-value, and SVOT are 97.5%, 96.8% and 93.0%, respectively. The ROC plots are shown for each measure in Figure <figr fid="F6">6e</figr>. The improved specificity of the spatial alignment scores comes at a significant runtime disadvantage. The SSS shape retrieval method took 24 minutes to compare the query to the GPSS library, while the spatial alignment took 1,657 minutes. When using the SSS scores to pre-filter the search library, as described in the <it>SurfaceScreen </it>methodology, we can achieve an AUC of 95.3% for the combined <it>SurfaceScreen </it>score (Figure <figr fid="F6">6e</figr>) with an overall runtime of 148 minutes. The shape signature filter reduced the library by 86%, to just over 4,000 surfaces, yet did not eliminate any true positives from the library.</p>
            <p>Using the <it>SurfaceScreen </it>score, the most similar (rank 132) non-HIV-1 surface was from plasmepsin II from <it>P. falciparum </it>(PDB:<ext-link ext-link-type="pdb" ext-link-id="1lf3">1lf3</ext-link>, E.C. 3.4.23.39, CATH 2.40.70.10), another aspartic protease and a major virulence factor<abbrgrp><abbr bid="B40">40</abbr></abbrgrp>. <it>P. falciparum </it>is a species of <it>Plasmodium </it>that causes one of the major malaria infections in humans. Plasmepsin II plays an essential role in <it>P. falciparum </it>in the degradation of hemoglobin as a source of amino acids for growth and maturation. The binding surface of inhibitor EH5 exhibited strong similarity to the HIV-1 inhibitor binding site, with a <it>SurfaceScreen </it>score of 1.89. Plasmepsin II shares only 12% sequence identity with HIV-1 protease and an SSM alignment produces a non-significant Z-score of 3.3. The surfaces are both formed at the intersection of loops and &#946;-sheets, although the plasmepsin II binding surface is formed from a single chain, unlike that of HIV-1 protease, which occurs at a dimer interface. There are 15 residues that are conserved between the two surfaces. While it is not surprising that proteases share a similar binding site, the low level of sequence and secondary structure similarity highlights that localized functional conservation can be found in surfaces. Our observation is in agreement with recent reports where HIV-1 inhibitors have been shown to be effective antimalarial agents<abbrgrp><abbr bid="B41">41</abbr></abbrgrp>.</p>
            <p>The highest-ranking non-protease surface was from aclacinomycin methylesterase (RdmC) from <it>S. purpurascens </it>(PDB:<ext-link ext-link-type="pdb" ext-link-id="1q0r">1q0r</ext-link>, Figure <figr fid="F6">6fg</figr>). RdmC modifies the aklavinone skeleton in the biosynthesis of anthracyclines in <it>Streptomyces </it>species<abbrgrp><abbr bid="B42">42</abbr></abbrgrp>. Anthracyclines are a class of aromatic polyketide antibiotics used as chemotherapeutic agents to treat a wide range of cancers, including leukemia, lymphoma, and breast, uterine, ovarian and lung cancers. Despite sharing only 7% sequence identity and being built on a different scaffold (CATH 3.40.50.1820), the RdmC binding surface of DcmaT (Figure <figr fid="F6">6h</figr>) was found to be similar, with a <it>SurfaceScreen </it>score of 1.81 (ranked at position 134). The superposition of the surfaces is shown in Figure <figr fid="F6">6ij</figr> and with their corresponding inhibitors in Figure <figr fid="F6">6k</figr>. The surprising similarity of these surfaces has significant medicinal impact, as it supports the recent reports of the inhibitory effects of anthracycline agents on protease activity<abbrgrp><abbr bid="B43">43</abbr></abbrgrp>. This result shows how identification of similar binding surfaces (and their corresponding ligands) can provide guidance for structure based drug discovery, not only in scaffold design but also in screening for binding site similarities that could result in undesired side effects.</p>
         </sec>
         <sec>
            <st>
               <p>Retrieval and Prediction of Heme Binding Surfaces</p>
            </st>
            <p>Heme is a versatile prosthetic group that plays an important role across many biological systems. Hemoproteins have diverse functions including oxygen binding and transport, electron transfer and redox, and catalysis. Their functional diversity is accomplished through an equally diverse range of protein topologies<abbrgrp><abbr bid="B9">9</abbr><abbr bid="B44">44</abbr></abbrgrp>. A comprehensive analysis of 68 b-type heme binding interactions by Schneider <it>et al</it>. identified over 20 different folds that bind heme in both solvent accessible cavities and buried voids<abbrgrp><abbr bid="B45">45</abbr></abbrgrp>. Even functional homologues show diversity in binding orientation, as observed in HasA and HemS<abbrgrp><abbr bid="B45">45</abbr></abbrgrp>. Surfaces from myoglobin (CATH code = 1.10.490.10, PDB:<ext-link ext-link-type="pdb" ext-link-id="1mbn">1mbn</ext-link>), nitrophorin (CATH code = 1.40.128.20, PDB:<ext-link ext-link-type="pdb" ext-link-id="1np4">1np4</ext-link>), and inducible nitric oxide synthase (iNOS) (CATH code = 3.90.1230.10, PDB:<ext-link ext-link-type="pdb" ext-link-id="4nos">4nos</ext-link>)<abbrgrp><abbr bid="B46">46</abbr></abbrgrp>, representing extrema of heme binding, are shown in Figure <figr fid="F7">7</figr>.</p>
            <fig id="F7">
               <title>
                  <p>Figure 7</p>
               </title>
               <caption>
                  <p>Retrieval of functionally diverse heme binding proteins</p>
               </caption>
               <text>
                  <p><b>Retrieval of functionally diverse heme binding proteins</b>. Heme binding proteins myoglobin (a, CATH code = 1.10.490.10, PDB:<ext-link ext-link-type="pdb" ext-link-id="1mbn">1mbn</ext-link>), nitrophorin (c, CATH code = 1.40.128.20, PDB:<ext-link ext-link-type="pdb" ext-link-id="1np4">1np4</ext-link>), and inducible nitric oxide synthase (iNOS) (e, CATH code = 3.90.1230.10, PDB:<ext-link ext-link-type="pdb" ext-link-id="4nos">4nos</ext-link>)<abbrgrp><abbr bid="B46">46</abbr></abbrgrp>. The structures are positioned such that the propionate groups are all oriented in the same direction. The corresponding heme binding surfaces are shown adjacent, after being rotated 90 degrees about the Y-axis. Shape signatures for each surface are shown in (g). The ROC curves for retrieval of heme binding surfaces, querying myoglobin from <it>P. catodon </it>(PDB:<ext-link ext-link-type="pdb" ext-link-id="1mbn">1mbn</ext-link>) against the GPSS library, are shown in (h). The Ampcpr binding surface (i) from ADPRase is the best non-heme binding surface returned from the search. A superposition of the ligands suggests that ligand-shape complementarity drives the binding surface similarity (j).</p>
               </text>
               <graphic file="1472-6807-8-45-7"/>
            </fig>
            <p>The variability of heme binding proteins presents considerable challenges for automated identification and retrieval of hemoproteins from sequence and structure databases. Using myoglobin from <it>P. catodon </it>(PDB:<ext-link ext-link-type="pdb" ext-link-id="1mbn">1mbn</ext-link>, Figure <figr fid="F7">7ab</figr>) as a query protein, we compared it to a non-redundant (&lt;95% sequence identity) PDB set using BLAST<abbrgrp><abbr bid="B47">47</abbr></abbrgrp>. Using the 690 heme binding proteins in the PDB as true-positive hits, a retrieval rate of 68.7% is calculated from the rank order of sorted (by E-value) search results. Comparing the structure of myoglobin against the same set of proteins using SSM search results provides a retrieval rate of 64.4%. The retrieval rate was calculated from the rank order of sorted Z-scores. The ROC plots are shown for both methods in Figure <figr fid="F7">7h</figr>.</p>
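            <p>As an illustration of how a retrieval rate can be derived from a ranked hit list, the following Python sketch computes the area under the ROC curve directly from rank order. It assumes that the reported retrieval rate corresponds to the ROC area and that the hits are already sorted best-first (by E-value, Z-score, or surface score) with ties broken. The function name and the toy ranking are hypothetical and are not part of any software described here.</p>

```python
def roc_auc_from_ranking(ranked_labels):
    """Area under the ROC curve for a hit list sorted best-first.

    ranked_labels: booleans in rank order, True for a true-positive hit
    (for example, a known heme binding protein).
    """
    labels = [bool(x) for x in ranked_labels]
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative")
    negatives_seen = 0
    correctly_ordered = 0
    for is_positive in labels:
        if is_positive:
            # every negative not yet seen ranks below this positive
            correctly_ordered += n_neg - negatives_seen
        else:
            negatives_seen += 1
    # fraction of (positive, negative) pairs ranked in the right order
    return correctly_ordered / (n_pos * n_neg)


# Toy ranking (hypothetical): hits 1 and 3 are true positives.
print(roc_auc_from_ranking([True, False, True, False]))  # 0.75
```

            <p>A value of 1.0 means every true positive ranks above every negative, while 0.5 corresponds to a random ordering.</p>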
            <p>The myoglobin query was then searched against the GPSS library. The retrieval rate, using the <it>SurfaceScreen </it>score, is 94.8%. All surface scoring measures had superior performance over sequence and structure methods (Figure <figr fid="F7">7h</figr>). The most selective was the SSS KS distance with retrieval at 95.8%. Despite the variability of topologies forming binding surfaces, the binding surface shape appears to be the most conserved feature of hemoproteins. This can be seen in the shape signature plots for myoglobin, nitrophorin and iNOS (Figure <figr fid="F7">7g</figr>). The iNOS heme binding pocket is the lowest ranking true-positive surface against our query, as reflected by the stark difference in shape signatures. It appears that despite evolutionary pressure imposed for functional specification, the surface must maintain the geometry necessary to accommodate the canonical heme shape. Surface shape is better conserved for heme binding than the amino acid environment. These observations agree with those of Schneider <it>et al</it>., in which heme binding interactions were found to be generally diverse, with the exception of only three amino acids at "hot spots".</p>
            <p>While true heme binding surfaces dominate the top scoring surfaces, we find that other binding surfaces have surprising similarity to our query surface. The highest ranking of these, at position 42, was the Ampcpr binding surface from ADP-ribose pyrophosphatase (ADPRase) from <it>E. coli </it>(PDB:<ext-link ext-link-type="pdb" ext-link-id="1khz">1khz</ext-link>). A Nudix hydrolase enzyme, ADPRase catalyzes the Mg<sup>2+</sup>-dependent hydrolysis of ADP-ribose to AMP and ribose 5-phosphate<abbrgrp><abbr bid="B48">48</abbr></abbrgrp>. The surface, formed at the intersection of &#946;-sheets and loops, is shown in Figure <figr fid="F7">7ij</figr>. The SSS plot confirms the strong visual shape similarity between the two surfaces (Figure <figr fid="F7">7g</figr>, red). A shape-based superposition (ROCS, OpenEye Scientific, Inc.) of the ligands (Figure <figr fid="F7">7j</figr>) shows that the similarity of these functionally unrelated proteins may lie strictly in their ability to accommodate similarly sized molecules.</p>
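            <p>The SSS KS distance can be pictured as a two-sample Kolmogorov-Smirnov statistic, that is, the largest gap between two empirical cumulative distributions. The Python sketch below assumes that each shape signature can be treated as a plain sample of scalar values; the actual signature is defined in Methods, and the function name and toy values here are hypothetical.</p>

```python
from bisect import bisect_right


def ks_distance(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic (0 = identical, 1 = disjoint)."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")

    def gap(x):
        # difference between the two empirical CDFs evaluated at x
        return abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))

    # the maximum gap is always reached at one of the observed values
    return max(gap(x) for x in a + b)


# Toy signatures (hypothetical values): two partly overlapping distributions.
print(ks_distance([1, 2, 3, 4], [3, 4, 5, 6]))  # 0.5
```

            <p>Smaller values indicate more similar distributions, so under this assumption a distance near zero would correspond to nearly identical global shapes.</p>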
            <sec>
               <st>
                  <p>Detection of a Convergent Heme Binding Surface</p>
               </st>
               <p>Convergent evolution presents a far more difficult challenge for annotation of proteins of unknown function. The structural genomics target IsdG from <it>S. aureus</it><abbrgrp><abbr bid="B49">49</abbr></abbrgrp> (PDB:<ext-link ext-link-type="pdb" ext-link-id="1xbw">1xbw</ext-link>) shows no significant sequence similarity to the heme-monooxygenase family and contains neither the conserved N-terminal histidine nor the GXXXG motif characteristic of that family, yet this enzyme displays classical heme-monooxygenase activity<abbrgrp><abbr bid="B49">49</abbr></abbrgrp>. Also, while all known members of the heme-oxygenase superfamily are of all &#945;-helical fold, IsdG adopts an &#945;+&#946; sandwich with an anti-parallel &#946;-sheet and ferredoxin-like fold and a &#946;-barrel at the dimer interface<abbrgrp><abbr bid="B49">49</abbr></abbrgrp>.</p>
               <p>The structure of IsdG has a prominent pocket formed between the &#945;-helices and &#946;-sheets (Figure <figr fid="F8">8a</figr>). This is the largest surface pocket identified by the CASTp webserver<abbrgrp><abbr bid="B50">50</abbr></abbrgrp>. Querying this surface against the GPSS library reveals a striking similarity to the heme binding pocket in heme oxygenase (HmuO) from <it>C. diphtheriae </it>(PDB:<ext-link ext-link-type="pdb" ext-link-id="1iw0">1iw0</ext-link>, Figure <figr fid="F8">8b</figr>). The SSS distributions have a distance of 0.06 (Figure <figr fid="F8">8d</figr>). There are 19 conserved residues between the surfaces that come from diverse regions of the primary sequence (Figure <figr fid="F8">8c</figr>). The surfaces align with a cRMSD P-value of 9.84 &#215; 10<sup>-3 </sup>(Figure <figr fid="F8">8ef</figr>) and an oRMSD P-value of 5.32 &#215; 10<sup>-4 </sup>(Figure <figr fid="F8">8gh</figr>). Superposition of the surfaces results in a gSVOT of 0.78, an lSVOT of 0.84, and an rSVOT of 0.93 (Figure <figr fid="F8">8i</figr>). The gSVOT overlap is highlighted in Figure <figr fid="F8">8j</figr>. The <it>SurfaceScreen </it>score for the comparison is 1.98, which ranked fourth overall against the search library.</p>
               <fig id="F8">
                  <title>
                     <p>Figure 8</p>
                  </title>
                  <caption>
                     <p>Identification of a convergent heme binding surface from surface similarity</p>
                  </caption>
                  <text>
                     <p><b>Identification of a convergent heme binding surface from surface similarity</b>. Despite lacking sequence or structural homology to the heme-monooxygenase family, IsdG from <it>S. aureus </it>(a, yellow) contains a conserved surface allowing it to perform heme-monooxygenase activity. When compared to the heme binding surface from heme oxygenase (HmuO) from <it>C. diphtheriae </it>(b, green), 19 residues are conserved (c) with similar global shape characteristics (d). The superposition of the conserved residues is shown for the best scoring cRMSD (e) and oRMSD (g) alignments. The alignments are colored by residue type (IsdG large radius, HmuO small radius) in (fh). The superposition of the surfaces resulting in the maximum volume overlap (i, red) is shown with bound heme from HmuO (j).</p>
                  </text>
                  <graphic file="1472-6807-8-45-8"/>
               </fig>
               <p>Several other structural homologues to IsdG have subsequently been solved in the structural genomics effort. A clustering of the putative binding sites for four additional enzymes is shown in Figure <figr fid="F9">9</figr>. Surface analysis reveals that the heme binding pocket is well conserved in protein TT1390 from <it>T. thermophilus </it>(PDB:<ext-link ext-link-type="pdb" ext-link-id="1iuj">1iuj</ext-link>) and protein BC2969 from <it>B. cereus </it>(PDB:<ext-link ext-link-type="pdb" ext-link-id="1tz0">1tz0</ext-link>), suggesting that both <it>T. thermophilus </it>and <it>B. cereus </it>can acquire iron through heme degradation. Despite overall structural similarity, the binding surface is not well conserved in ActVA-Orf66 from <it>S. coelicolor </it>(PDB:<ext-link ext-link-type="pdb" ext-link-id="1lq9">1lq9</ext-link>), a protein involved in antibiotic synthesis that is known to bind 6-deoxydihydrokalafungin (6-DHHK)<abbrgrp><abbr bid="B51">51</abbr></abbrgrp>. As expected, this surface forms the most distant branch of the clustering. Although all proteins appear to function as monooxygenases, they operate on very different substrates, suggesting that convergent evolution may be an important driving force to evolve new functions from existing protein scaffolds. In this manner, surface analysis could be used to define a chemical structure space by interpolating between known substrates clustered on each node.</p>
               <fig id="F9">
                  <title>
                     <p>Figure 9</p>
                  </title>
                  <caption>
                     <p>Binding surface-based classification of structural homologs</p>
                  </caption>
                  <text>
                     <p><b>Binding surface-based classification of structural homologs</b>. Putative binding surfaces for structural genomics targets with structural homology to IsdG (PDB:<ext-link ext-link-type="pdb" ext-link-id="1xbw">1xbw</ext-link>) and IsdI (PDB:<ext-link ext-link-type="pdb" ext-link-id="1sqe">1sqe</ext-link>) from <it>S. aureus </it>are clustered by <it>SurfaceScreen </it>scores. The heme binding pocket is well conserved in protein TT1390 from <it>T. thermophilus </it>(PDB:<ext-link ext-link-type="pdb" ext-link-id="1iuj">1iuj</ext-link>) and protein BC2969 from <it>B. cereus </it>(PDB:<ext-link ext-link-type="pdb" ext-link-id="1tz0">1tz0</ext-link>). ActVA-Orf66 from <it>S. coelicolor </it>(PDB:<ext-link ext-link-type="pdb" ext-link-id="1lq9">1lq9</ext-link>) is known to bind 6-deoxydihydrokalafungin (6-DHHK)<abbrgrp><abbr bid="B51">51</abbr></abbrgrp>. Cofactors are shown immediately below each protein.</p>
                  </text>
                  <graphic file="1472-6807-8-45-9"/>
               </fig>
            </sec>
         </sec>
         <sec>
            <st>
               <p>Binding Site Retrieval of Functionally Diverse and Conformationally Variable Nucleotides</p>
            </st>
            <sec>
               <st>
                  <p>Specific Nucleotide Binding Site Retrieval</p>
               </st>
               <p>ATP is a multifunctional nucleotide that has been associated with 58 different reactions by the Enzyme Commission (EC). In over 300 structural complexes, ATP binding is associated with domains from 45 homologous superfamilies, some sharing less than 8% sequence identity<abbrgrp><abbr bid="B52">52</abbr></abbrgrp>. The nucleotide is quite flexible and adopts a wide range of conformations, some in less than energetically favorable states<abbrgrp><abbr bid="B52">52</abbr></abbrgrp>. To determine the extent to which conformational variability affects similarity searching, we conducted retrieval experiments with query surfaces binding ATP in diverse conformations: cAMP-dependent kinase (PDB:<ext-link ext-link-type="pdb" ext-link-id="1atp">1atp</ext-link>), protein kinase CK2 from <it>Z. mays </it>(PDB:<ext-link ext-link-type="pdb" ext-link-id="1a6o">1a6o</ext-link>)<abbrgrp><abbr bid="B53">53</abbr></abbrgrp>, ATP:corrinoid adenosyltransferase from <it>S. typhimurium </it>(PDB:<ext-link ext-link-type="pdb" ext-link-id="1g5t">1g5t</ext-link>)<abbrgrp><abbr bid="B54">54</abbr></abbrgrp>, and PurT-encoded glycinamide ribonucleotide transformylase from <it>E. coli </it>(PDB:<ext-link ext-link-type="pdb" ext-link-id="1kj8">1kj8</ext-link>)<abbrgrp><abbr bid="B55">55</abbr></abbrgrp>. The conformations were selected by clustering all ATP molecules by their three-dimensional shape similarity (see Methods) and are shown in Figure <figr fid="F10">10</figr>.</p>
               <fig id="F10">
                  <title>
                     <p>Figure 10</p>
                  </title>
                  <caption>
                     <p>Retrieval of ATP binding proteins from functionally and conformationally diverse classes</p>
                  </caption>
                  <text>
                     <p><b>Retrieval of ATP binding proteins from functionally and conformationally diverse classes</b>. Binding surfaces representing different ATP conformational classes: cAMP-dependent kinase (PDB:<ext-link ext-link-type="pdb" ext-link-id="1atp">1atp</ext-link>, a), protein kinase CK2 from <it>Z. mays </it>(PDB:<ext-link ext-link-type="pdb" ext-link-id="1a6o">1a6o</ext-link>, b), ATP:corrinoid adenosyltransferase from <it>S. typhimurium </it>(PDB:<ext-link ext-link-type="pdb" ext-link-id="1g5t">1g5t</ext-link>, c), and PurT-encoded glycinamide ribonucleotide transformylase from <it>E. coli </it>(PDB:<ext-link ext-link-type="pdb" ext-link-id="1kj8">1kj8</ext-link>, d). A superposition of the molecules from each class is shown in (f). The retrieval rate for each binding surface against the GPSS library is shown as an ROC plot in (e). The retrieval rates are calculated using the <it>SurfaceScreen </it>score.</p>
                  </text>
                  <graphic file="1472-6807-8-45-10"/>
               </fig>
               <p>The AUCs calculated using the <it>SurfaceScreen </it>score were 79.1%, 80.1%, 83.0%, and 85.4% for ATP:corrinoid adenosyltransferase, PurT-encoded glycinamide ribonucleotide transformylase, protein kinase CK2 and cAMP-dependent kinase, respectively (Figure <figr fid="F10">10e</figr>). The extended ATP form, which is the most prominent form in the PDB, had the best retrieval rate, while the bent form of ATP had the poorest. Overall, the rates are lower than those for the HIV-1 inhibitor and heme binding surface retrievals. Despite the influence of ligand conformation on surface conformation, our method appears rather tolerant to flexible ligands (and their corresponding binding surfaces), albeit at the expense of the specificity seen with more rigid molecules.</p>
               <p>It should be noted that the retrieval rates for ATP are especially conservative, as a disproportionate number of ATP binding surfaces complexed with other molecules are in the PDB. For example, there are 11 structures of Protein Kinase A from <it>Bos taurus </it>(PDB:<ext-link ext-link-type="pdb" ext-link-id="1xha">1xha</ext-link>,<ext-link ext-link-type="pdb" ext-link-id="1xh8">1xh8</ext-link>,<ext-link ext-link-type="pdb" ext-link-id="1xh7">1xh7</ext-link>,<ext-link ext-link-type="pdb" ext-link-id="1xh6">1xh6</ext-link>,<ext-link ext-link-type="pdb" ext-link-id="1xh5">1xh5</ext-link>,<ext-link ext-link-type="pdb" ext-link-id="1xh4">1xh4</ext-link>,<ext-link ext-link-type="pdb" ext-link-id="1veb">1veb</ext-link>,<ext-link ext-link-type="pdb" ext-link-id="1svg">1svg</ext-link>,<ext-link ext-link-type="pdb" ext-link-id="1sve">1sve</ext-link>,<ext-link ext-link-type="pdb" ext-link-id="1svh">1svh</ext-link>) which have ATP competitive inhibitors bound. By correcting for protein kinase inhibitor complexes, the AUCs improve by approximately 4% across all query surfaces. Unfortunately, there is no automated method to associate these types of complexes with their native cofactors except through literature analysis, which can be uninformative, as not every structure deposition has a corresponding publication. The authors are developing an automated database that catalogs such natural cofactor/inhibitor relationships between structures in the PDB.</p>
            </sec>
            <sec>
               <st>
                  <p>Non-specific Nucleotide Retrieval</p>
               </st>
               <p>Ligand binding is not always an exclusive event, as some proteins are able to utilize different cofactors to catalyze the same reaction. Casein kinase 2 (CK2) is a highly conserved eukaryotic serine/threonine kinase that plays a key role in various cellular processes and possesses dual-cosubstrate specificity for guanosine-5'-triphosphate (GTP) or ATP<abbrgrp><abbr bid="B53">53</abbr></abbrgrp>. This feature, whose biological significance is not well understood, is exceptional among eukaryotic protein kinases. Querying the CK2 ATP binding surface (Figure <figr fid="F10">10b</figr>), we can retrieve GTP binding surfaces from the GPSS library with an AUC of 83.7%, slightly better than for ATP. Given that the two molecules differ only in their nucleoside, this result is not surprising, as we have shown that ligand shape complementarity is a strong precursor of overall surface similarity. In a comparison of the surface retrieval rates for all nucleotide binding surfaces in the PDB against the CK2 ATP binding surface, we observe trends which mirror ligand shape similarity: purine derivatives are retrieved better than pyrimidines, and tri-phosphate molecules are retrieved better than di-phosphates, which in turn are retrieved better than mono-phosphates. These results suggest that our method may be useful to identify a diverse set of molecular shapes that could potentially bind to a given surface.</p>
            </sec>
            <sec>
               <st>
                  <p>Prediction and Validation of a GDP Binding Site</p>
               </st>
               <p>The F<sub>420 </sub>coenzyme plays important roles in archaea and eubacteria in a variety of biochemical reactions (e.g. methanogenesis, the formation of secondary metabolites, the degradation of nitroaromatic compounds, DNA repair)<abbrgrp><abbr bid="B56">56</abbr></abbrgrp>. CofE, an F<sub>420</sub>-0:gamma-glutamyl ligase, is responsible for the last two enzymatic steps in coenzyme F<sub>420</sub>-2 biosynthesis. Belonging to a structurally uncharacterized family of enzymes, CofE from <it>A. fulgidus </it>was solved as a structural genomics target by the Midwest Center for Structural Genomics and found to be of novel fold (PDB:<ext-link ext-link-type="pdb" ext-link-id="2g9i">2g9i</ext-link>) (Figure <figr fid="F4">4a</figr>). Solvent accessible cavities were calculated using the CASTp webserver<abbrgrp><abbr bid="B27">27</abbr><abbr bid="B28">28</abbr><abbr bid="B29">29</abbr><abbr bid="B34">34</abbr></abbrgrp>, and the largest pocket, presumably the F<sub>420</sub>, GTP and L-glutamate binding pocket, was selected to query against the GPSS ligand surface library. The top-ranking surface was from GDP-binding protein from <it>B. taurus </it>(PDB:<ext-link ext-link-type="pdb" ext-link-id="1tad">1tad</ext-link>, red, Figure <figr fid="F4">4c</figr>). A GDP molecule was posed into the surface based on the superposition from the alignment (Figure <figr fid="F11">11a</figr>, red GDP molecule). Based on this prediction, the protein was co-crystallized with GDP and a model of the complex was determined (Figure <figr fid="F11">11a</figr>, green GDP molecule)<abbrgrp><abbr bid="B56">56</abbr></abbrgrp>. The GDP position had an RMSD of 1.0 &#197; from the predicted pose (Figure <figr fid="F11">11b</figr>). The addition of the ligand also improved the resolution of the structure from 2.50 &#197; to 1.35 &#197; and allowed two loop regions to be modeled where no electron density was previously seen (Figure <figr fid="F11">11a</figr>, magenta).</p>
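               <p>For readers who wish to reproduce this kind of pose comparison, a minimal Python sketch of a root-mean-square deviation between a predicted and an observed ligand pose follows. It assumes that both poses list the same atoms in the same order and are already in the same coordinate frame, so no re-fitting is performed; the function name and the toy coordinates are hypothetical and are not taken from the deposited structures.</p>

```python
import math


def pose_rmsd(predicted, observed):
    """RMSD between two poses given as equal-length lists of (x, y, z) tuples."""
    if not predicted or len(predicted) != len(observed):
        raise ValueError("poses must be non-empty and list the same atoms")
    squared = sum(
        (p - q) ** 2
        for atom_p, atom_o in zip(predicted, observed)
        for p, q in zip(atom_p, atom_o)
    )
    # mean squared displacement per atom, then square root
    return math.sqrt(squared / len(predicted))


# Toy example (hypothetical coordinates): every atom is shifted by 1.0 along z.
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
observed = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(pose_rmsd(predicted, observed))  # 1.0
```

               <p>Because no superposition step is applied, the value reflects both the placement and the internal conformation of the posed ligand.</p>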
               <fig id="F11">
                  <title>
                     <p>Figure 11</p>
                  </title>
                  <caption>
                     <p>Crystallographic validation of GDP binding prediction in structural genomics target</p>
                  </caption>
                  <text>
                     <p><b>Crystallographic validation of GDP binding prediction in structural genomics target</b>. The strong similarity of the putative binding surface of F<sub>420</sub>-0:gamma-glutamyl ligase homolog from <it>A. fulgidus</it><abbrgrp><abbr bid="B56">56</abbr></abbrgrp> (a) to the GDP binding surface in GDP-binding protein from <it>B. taurus </it>(Figure 4c) allows a GDP molecule (red, colored by element) to be posed into the surface based on the surface superposition. The structure was determined with bound GDP (green, colored by element) with an RMSD of 1.0 &#197; from the predicted position (b). The addition of the ligand to the crystallization conditions improved the resolution of the structure from 2.5 &#197; (a, gray) to 1.35 &#197; (a, green) and allowed loop regions (magenta) to be modeled.</p>
                  </text>
                  <graphic file="1472-6807-8-45-11"/>
               </fig>
            </sec>
         </sec>
         <sec>
            <st>
               <p>ATP Binding Surface Landscape</p>
            </st>
            <p>The contributions of molecular flexibility only partially account for reduced retrieval rates observed for ATP binding surfaces. It is surprising that binding surfaces for this essential nucleotide do not exhibit a greater level of conservation. To explore the global relationship between surfaces, enzymatic functions and ligand conformation, we carried out an all-against-all comparison for a limited homology dataset (&lt;50% whole sequence identity) of ATP binding surfaces. This cutoff was selected to encourage surface diversity between functionally homologous proteins yet eliminate redundant analysis. A distance matrix for the 116 surface set was constructed using the <it>SurfaceScreen </it>score and complete linkage clustering was applied. A dendrogram of the clustering is shown in Figure <figr fid="F12">12</figr>.</p>
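            <p>To make the clustering step concrete, the following Python sketch applies naive complete-linkage agglomeration to a precomputed distance matrix. It assumes the matrix is symmetric and already holds pairwise distances derived from the surface comparison; how the score is converted to a distance is not specified here. The function name and the toy 4 x 4 matrix are hypothetical.</p>

```python
from itertools import combinations


def complete_linkage(dist, n_clusters):
    """Merge clusters greedily until n_clusters remain.

    dist: symmetric matrix (list of lists) of pairwise distances.
    Returns the clusters as sorted lists of item indices.
    """
    n_clusters = max(1, n_clusters)
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:

        def linkage(pair):
            a, b = pair
            # complete linkage: distance between the two farthest members
            return max(dist[i][j] for i in clusters[a] for j in clusters[b])

        a, b = min(combinations(range(len(clusters)), 2), key=linkage)
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Toy distance matrix (hypothetical): items 0/1 and 2/3 form two tight pairs.
toy = [
    [0, 1, 9, 8],
    [1, 0, 7, 9],
    [9, 7, 0, 2],
    [8, 9, 2, 0],
]
print(complete_linkage(toy, 2))  # [[0, 1], [2, 3]]
```

            <p>Recording the linkage distance at each merge, rather than stopping at a fixed number of clusters, would give the branch heights needed to draw a dendrogram.</p>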
            <fig id="F12">
               <title>
                  <p>Figure 12</p>
               </title>
               <caption>
                  <p>Clustering of 116 non-redundant ATP binding sites based on their surface similarity</p>
               </caption>
               <text>
                  <p><b>Clustering of 116 non-redundant ATP binding sites based on their surface similarity</b>. The dendrogram represents the results of complete-linkage clustering, applied to the <it>SurfaceScreen </it>score between all surfaces in our dataset (a). Each node is color-coded to represent its biological function as assessed through EC numbers or literature references. A second grayscale-coded shape on each node edge corresponds to the ATP conformations in Figure 10. A representative binding surface from each cluster is shown in (b).</p>
               </text>
               <graphic file="1472-6807-8-45-12"/>
            </fig>
            <p>The cluster results show that there is minimal functional exclusivity between binding surfaces and ATP conformation. The same enzymatic functions can be accomplished using a variety of binding surfaces and, within each surface, multiple ligand conformations can be bound. In the most well represented functional families (hydrolases, ligases, and transferases), we observe different degrees of binding mode conservation. A breakdown of surface clustering and ligand conformations is shown as a balloon plot in Figure <figr fid="F13">13</figr>. Hydrolases have two conformation preferences and favor deep, encapsulating binding surfaces. Bent form ATP is disfavored in hydrolases. Ligases are the most conserved, heavily favoring the bent form of ATP that requires a wide-mouth surface shape. Transferases are the most adaptable of the ATP binding proteins, sampling the most surface/conformation combinations. They do not discriminate between ATP conformations but have a preference for encapsulating binding sites. Several combinations occur with higher frequencies, including an exclusive combination (4-&#9670;), which is the most observed in this family.</p>
            <fig id="F13">
               <title>
                  <p>Figure 13</p>
               </title>
               <caption>
                  <p>Mapping ATP binding surface cluster membership and ATP conformation class</p>
               </caption>
               <text>
                  <p><b>Mapping ATP binding surface cluster membership and ATP conformation class</b>. Observed frequencies for hydrolases (a), ligases (b), and transferases (c) are shown. Surface cluster numbers correspond to Figure 12(b). ATP conformation class labels correspond to Figure 10. The sums for each row and column are shown on the edges of each plot.</p>
               </text>
               <graphic file="1472-6807-8-45-13"/>
            </fig>
            <p>Analysis of a broad collection of ATP binding proteins suggests that some functional families may have conserved binding surfaces while others are more divergent. Binding surfaces themselves also vary in their level of ligand conformation tolerance. It is likely that altering protein surfaces may be the most cost-effective evolutionary mechanism for exploiting functional niches, even within functional families.</p>
            <sec>
               <st>
                  <p>Automated Protein Kinase Classification by ATP Binding Site Comparisons</p>
               </st>
               <p>Within the transferase family, the most well conserved ATP binding surfaces belong to protein kinases. Protein kinases play vital roles in regulating cellular pathways by phosphorylating other proteins. Malfunctioning kinases have been linked to a variety of diseases such as immunodeficiency, endocrine disorders and cancer, making them the target of drug discovery efforts. At their highest level, kinases are divided by the amino acid residues they target (serine/threonine or tyrosine) and further classified by more specific biochemical activity. Functional classification of kinase families has been undertaken by many methodologies utilizing primary sequence, structure, binding sites, pharmacophore profiles and expert manual analysis<abbrgrp><abbr bid="B57">57</abbr><abbr bid="B58">58</abbr><abbr bid="B59">59</abbr><abbr bid="B60">60</abbr></abbrgrp>.</p>
               <p>We apply our comparison methodology to protein kinases to assess our ability to automatically classify them into their functional subfamilies using only ATP binding surfaces. A dataset of 297 protein kinases with bound ligands, including both natural and synthetic molecules, was annotated using a combination of the PDB web query system, EC numbers, KinBase<abbrgrp><abbr bid="B61">61</abbr></abbrgrp> and the Protein Kinase Resource<abbrgrp><abbr bid="B62">62</abbr></abbrgrp>. For consistency, family nomenclature is applied from KinBase.</p>
               <p>An all-against-all comparison was performed, with the results used to populate a distance matrix of <it>SurfaceScreen </it>scores. A dendrogram showing the complete-linkage clustering is shown in Figure <figr fid="F14">14</figr>, where each surface node is color coded by kinase subfamily. Overall, the method shows strong agreement with the annotated classification, as seen by the color banding. CDK2 kinases are the most ordered, with all members perfectly clustered together and distinct sub-grouping separating nucleotide ligands from small compound inhibitors. The CK2 and CAMP families also show divergence between natural ligands and inhibitors. The CAMP groupings are further clustered by the molecular weight of their bound ligands. The mitogen activated protein kinases (MAP) are successfully classified into their sub-families, but on distant nodes in the graph. In all families, we observe differentiation based on the activation state of the kinase.</p>
               <fig id="F14">
                  <title>
                     <p>Figure 14</p>
                  </title>
                  <caption>
                     <p>An all-against-all comparison of ATP binding surfaces in the PDB</p>
                  </caption>
                  <text>
                     <p><b>An all-against-all comparison of ATP binding surfaces in the PDB</b>. The dendrogram represents the results of complete-linkage clustering, applied to the <it>SurfaceScreen </it>score between all surfaces in our dataset (a). The nodes of the dendrogram are color coded for kinase families according to KinBase nomenclature. A branch of the cluster (gray box) is called out to highlight the unexpected similarity discovered between the STI-571 binding site in c-Abl kinase and serine/threonine kinase p38 MAP (b).</p>
                  </text>
                  <graphic file="1472-6807-8-45-14"/>
               </fig>
               <p>Surface-based classification shares many similarities with other methods, but has the advantage of additional functional insight that could not be automated in other methods. The ability to distinguish between different activation states and to organize surfaces based on ligand types and molecular weight differences could prove useful in developing binding profiles for enhanced specificity in kinase drug discovery.</p>
            </sec>
            <sec>
               <st>
                  <p>Binding Surface Similarity of c-Abl Kinase Inhibitors</p>
               </st>
               <p>The surreptitious fusion of the cellular form of Abelson leukemia virus tyrosine kinase (c-Abl) with the breakpoint cluster region (BCR) gene disrupts the internal control mechanism causing increased tyrosine kinase activity<abbrgrp><abbr bid="B63">63</abbr></abbrgrp>. The fusion protein BCR-Abl results in the disease chronic myelogenous leukemia (CML). Five structures of c-Abl proteins can be found in the PDB with two classes of small molecule inhibitors: pyrido [2,3.d]pyrimidine-type (PDB:<ext-link ext-link-type="pdb" ext-link-id="1m52">1m52</ext-link>, <ext-link ext-link-type="pdb" ext-link-id="1opk">1opk</ext-link>) and 2-phenylaminopyrimidine-type (PDB:<ext-link ext-link-type="pdb" ext-link-id="1iep">1iep</ext-link>, <ext-link ext-link-type="pdb" ext-link-id="1fpu">1fpu</ext-link>, <ext-link ext-link-type="pdb" ext-link-id="1opj">1opj</ext-link>). Both inhibitor classes bind in the ATP binding, but 2-phenylaminopyrimidine-type bind exclusively in the inactive conformation of the activation loop. The smaller pyrido [2,3.d]pyrimidine-type class are indifferent to activation state, making them more potent but less specific inhibitors<abbrgrp><abbr bid="B64">64</abbr></abbrgrp>. Our clustering accounts for this behavior and groups them in distinct nodes.</p>
               <p>The 2-phenylaminopyrimidine-type inhibitor STI-571 (Figure <figr fid="F15">15a</figr>) is an effective inhibitor of c-Abl activity for the treatment of CML <abbrgrp><abbr bid="B64">64</abbr><abbr bid="B65">65</abbr><abbr bid="B66">66</abbr></abbrgrp>. It has been shown to be specific for tyrosine kinases and also inhibits the stem-cell factor receptor kinase c-Kit (PDB:<ext-link ext-link-type="pdb" ext-link-id="1t46">1t46</ext-link>). Results from querying c-Abl (PDB:<ext-link ext-link-type="pdb" ext-link-id="1opj">1opj</ext-link>, Figure <figr fid="F15">15a</figr>) against the GPSS library show that c-Kit is the best-scoring non-Abl kinase. This cross-reactivity is detected in our cluster (Figure <figr fid="F14">14b</figr>).</p>
               <fig id="F15">
                  <title>
                     <p>Figure 15</p>
                  </title>
                  <caption>
                     <p>Unique conformation of p38 MAP kinase creates similar binding surface to c-Abl kinase</p>
                  </caption>
                  <text>
                     <p><b>Unique conformation of p38 MAP kinase creates similar binding surface to c-Abl kinase</b>. The binding surface of inhibitor STI-571 in c-Abl kinase (a, PDB:<ext-link ext-link-type="pdb" ext-link-id="1opj">1opj</ext-link>) shows strong similarity to the binding surface of inhibitor B96 in p38 MAP kinase (b, PDB:<ext-link ext-link-type="pdb" ext-link-id="1kv2">1kv2</ext-link>). p38 MAP kinase has a DFG motif configuration (stick representation) similar to that seen in c-Abl. <it>SurfaceAlign </it>superposition of the surfaces (c). STI-571 is posed into the p38 MAP binding surface based on the surface alignments (d).</p>
                  </text>
                  <graphic file="1472-6807-8-45-15"/>
               </fig>
               <p>A surprising member of this cluster node is the serine/threonine kinase p38 mitogen-activated protein (MAP) kinase (PDB:<ext-link ext-link-type="pdb" ext-link-id="1kv2">1kv2</ext-link>, Figure <figr fid="F14">14b</figr>). p38 MAP kinases play critical roles in the regulation of proinflammatory cytokines such as tumor necrosis factor and interleukin-1 and are a target in many inflammatory diseases<abbrgrp><abbr bid="B67">67</abbr></abbrgrp>. STI-571 is not currently known to inhibit, by design or mechanism, any serine/threonine kinase<abbrgrp><abbr bid="B68">68</abbr></abbrgrp>. The structure of 1kv2, complexed with inhibitor B96, adopts a conformation unique among p38 MAP kinases, in which the highly conserved DFG motif is turned out<abbrgrp><abbr bid="B67">67</abbr></abbrgrp> (Figure <figr fid="F15">15b</figr>). This is the first observation of this state in a serine/threonine kinase, whereas it is a hallmark of tyrosine kinases (Figure <figr fid="F15">15a</figr>).</p>
               <p>A superposition, based on the alignment of the binding surfaces of c-Abl and p38 MAP, shows that the inhibitors bind in similar orientations (Figure <figr fid="F15">15c</figr>). Based on the alignment of the surfaces, STI-571 can be posed into the p38 MAP surface with no steric clashes while preserving the orientation of several polar atoms (Figure <figr fid="F15">15d</figr>). While the conformation of this p38 MAP is unique and presumed to occur infrequently, its existence presents an opportunity to explore the use of STI-571 and its analogs for additional therapeutic uses. Automated surface classification can also provide important cross-reactivity analysis, where unexpected binding site similarities could result in undesired side effects.</p>
            </sec>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>Proteins maintain the surprising ability to preserve local, sequentially unordered, surface residue patterns capable of performing explicit biochemical functions in proteins showing negligible evolutionary relationships. Even in homologous proteins, subtle amino acid mutations, which can be underappreciated by sequence analysis, can alter the properties of a surface and protein function. In this study, we describe a novel method for the comparison and analysis of protein functional surfaces. We observe that conservation of surface shape and physicochemical texture provides sufficient discriminative features for accurate retrieval of functionally homologous binding sites. The method serves as a predictive tool allowing for the identification of cross-binding ligands or binding sites on proteins of unknown function and for the comparative analysis of proteins, such as the classification of functionally diverse families. We introduced the Global Protein Surface Survey, a searchable library of functionally annotated protein sites. The <it>SurfaceScreen </it>methodology was benchmarked against binding pockets from HIV-1 protease, heme, and ATP and used to further analyze the relationship between surface similarity, biochemical function and ligand conformation.</p>
         <sec>
            <st>
               <p>Limitations and Outlook of SurfaceScreen</p>
            </st>
            <p>Our results have shown that two components, shape and physicochemical texture, define the functional competence of surfaces well. Surface geometry allows accessibility and proximity for interaction, and accurate residue positioning makes specific functional groups available for biochemical function. A notable limitation of our current method is our spatial residue model, which does not allow for amino acid substitutions during spatial alignments. Previous studies have shown that the substitution rates for localized surfaces differ from those of the whole sequences<abbrgrp><abbr bid="B69">69</abbr><abbr bid="B70">70</abbr></abbrgrp> and that these differences can provide better discrimination in surface sequence comparisons. One option would be to exclude residues less likely to be involved in function. For example, in the study of general enzyme function, Ysteng <it>et al</it><abbrgrp><abbr bid="B69">69</abbr></abbrgrp> studied 3,275 functional surfaces and found that His, Asp, Glu, Ser, and Cys residues account for more than 80% of active site residues in functional pockets. This is similar to previously published reports <abbrgrp><abbr bid="B71">71</abbr><abbr bid="B72">72</abbr><abbr bid="B73">73</abbr></abbrgrp>. Defining a minimum residue set describing different functions is plausible, but would reintroduce difficulties arising from global versus local surface characteristics and properties. It would also naively assume that the residues excluded by such a scheme do not contribute in some way, either mechanistically or structurally, to the function.
Several further factors complicate surface comparisons: many proteins show high promiscuity and can bind several different ligands in the same functional site<abbrgrp><abbr bid="B74">74</abbr></abbrgrp>; solvent-mediated interactions add complexity to the surface comparisons<abbrgrp><abbr bid="B75">75</abbr></abbrgrp>; and electrostatic potential plays an important role in conformational changes and in attracting or rejecting ligands<abbrgrp><abbr bid="B76">76</abbr></abbrgrp>. It is clear that future analyses of protein surface properties will need to include contributions from electrostatic potential, side chain dynamics, and chemical propensities to better describe functional sites.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>AJ and TAB conceived of the study and performed analysis on provided examples. TAB designed the methodologies and implemented the software. All authors read and approved the final manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We wish to thank Dr. Jie Liang for discussion of ideas and access to the CASTp database, and all members of the Midwest Center for Structural Genomics for making their data available and for discussions. Molecular graphics were created using PyMOL<abbrgrp><abbr bid="B33">33</abbr></abbrgrp>. This work was supported by National Institutes of Health Grants GM62414 and GM074942 and by the U.S. Department of Energy, Office of Biological and Environmental Research, under contract DE-AC02-06CH11357.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Modeling and analyzing three-dimensional structures of human disease proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Ye</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Godzik</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Pac Symp Biocomput</source>
            <pubdate>2006</pubdate>
            <fpage>439</fpage>
            <lpage>450</lpage>
            <xrefbib>
               <pubid idtype="pmpid">17094259</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>A graph-theoretic approach to the identification of three-dimensional patterns of amino acid side-chains in protein structures</p>
            </title>
            <aug>
               <au>
                  <snm>Artymiuk</snm>
                  <fnm>PJ</fnm>
               </au>
               <au>
                  <snm>Poirrette</snm>
                  <fnm>AR</fnm>
               </au>
               <au>
                  <snm>Grindley</snm>
                  <fnm>HM</fnm>
               </au>
               <au>
                  <snm>Rice</snm>
                  <fnm>DW</fnm>
               </au>
               <au>
                  <snm>Willett</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1994</pubdate>
            <volume>243</volume>
            <issue>2</issue>
            <fpage>327</fpage>
            <lpage>344</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">7932758</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Identification of protein biochemical functions by similarity search using the molecular surface database eF-site</p>
            </title>
            <aug>
               <au>
                  <snm>Kinoshita</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Nakamura</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Protein Sci</source>
            <pubdate>2003</pubdate>
            <volume>12</volume>
            <issue>8</issue>
            <fpage>1589</fpage>
            <lpage>1595</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2323945</pubid>
                  <pubid idtype="pmpid" link="fulltext">12876308</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>A new method to detect related function among proteins independent of sequence and fold homology</p>
            </title>
            <aug>
               <au>
                  <snm>Schmitt</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kuhn</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Klebe</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2002</pubdate>
            <volume>323</volume>
            <issue>2</issue>
            <fpage>387</fpage>
            <lpage>406</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12381328</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>From the similarity analysis of protein cavities to the functional classification of protein families using cavbase</p>
            </title>
            <aug>
               <au>
                  <snm>Kuhn</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Weskamp</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Schmitt</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Hullermeier</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Klebe</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2006</pubdate>
            <volume>359</volume>
            <issue>4</issue>
            <fpage>1023</fpage>
            <lpage>1044</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16697007</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Recognition of spatial motifs in protein structures</p>
            </title>
            <aug>
               <au>
                  <snm>Kleywegt</snm>
                  <fnm>GJ</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1999</pubdate>
            <volume>285</volume>
            <issue>4</issue>
            <fpage>1887</fpage>
            <lpage>1897</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9917419</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Detection of protein three-dimensional side-chain patterns: new examples of convergent evolution</p>
            </title>
            <aug>
               <au>
                  <snm>Russell</snm>
                  <fnm>RB</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1998</pubdate>
            <volume>279</volume>
            <issue>5</issue>
            <fpage>1211</fpage>
            <lpage>1227</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9642096</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>A model for statistical significance of local similarities in structure</p>
            </title>
            <aug>
               <au>
                  <snm>Stark</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Sunyaev</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Russell</snm>
                  <fnm>RB</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2003</pubdate>
            <volume>326</volume>
            <issue>5</issue>
            <fpage>1307</fpage>
            <lpage>1316</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12595245</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Fold independent structural comparisons of protein-ligand binding sites for exploring functional relationships</p>
            </title>
            <aug>
               <au>
                  <snm>Gold</snm>
                  <fnm>ND</fnm>
               </au>
               <au>
                  <snm>Jackson</snm>
                  <fnm>RM</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2006</pubdate>
            <volume>355</volume>
            <issue>5</issue>
            <fpage>1112</fpage>
            <lpage>1124</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16359705</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>A searchable database for comparing protein-ligand binding sites for the analysis of structure-function relationships</p>
            </title>
            <aug>
               <au>
                  <snm>Gold</snm>
                  <fnm>ND</fnm>
               </au>
               <au>
                  <snm>Jackson</snm>
                  <fnm>RM</fnm>
               </au>
            </aug>
            <source>J Chem Inf Model</source>
            <pubdate>2006</pubdate>
            <volume>46</volume>
            <issue>2</issue>
            <fpage>736</fpage>
            <lpage>742</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16563004</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>SitesBase: a database for structure-based protein-ligand binding site comparisons</p>
            </title>
            <aug>
               <au>
                  <snm>Gold</snm>
                  <fnm>ND</fnm>
               </au>
               <au>
                  <snm>Jackson</snm>
                  <fnm>RM</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <issue>34 Database</issue>
            <fpage>D231</fpage>
            <lpage>234</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1347425</pubid>
                  <pubid idtype="pmpid" link="fulltext">16381853</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Surface motifs by a computer vision technique: searches, detection, and implications for protein-ligand recognition</p>
            </title>
            <aug>
               <au>
                  <snm>Fischer</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Norel</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Wolfson</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Nussinov</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>1993</pubdate>
            <volume>16</volume>
            <issue>3</issue>
            <fpage>278</fpage>
            <lpage>292</lpage>
            <xrefbib>
               <pubid idtype="pmpid">8394000</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Molecular surface recognition by a computer vision-based technique</p>
            </title>
            <aug>
               <au>
                  <snm>Norel</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Fischer</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Wolfson</snm>
                  <fnm>HJ</fnm>
               </au>
               <au>
                  <snm>Nussinov</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Protein Eng</source>
            <pubdate>1994</pubdate>
            <volume>7</volume>
            <issue>1</issue>
            <fpage>39</fpage>
            <lpage>46</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">8140093</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>TESS: a geometric hashing algorithm for deriving 3D coordinate templates for searching structural databases. Application to enzyme active sites</p>
            </title>
            <aug>
               <au>
                  <snm>Wallace</snm>
                  <fnm>AC</fnm>
               </au>
               <au>
                  <snm>Borkakoti</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Thornton</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>Protein Sci</source>
            <pubdate>1997</pubdate>
            <volume>6</volume>
            <issue>11</issue>
            <fpage>2308</fpage>
            <lpage>2323</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2143595</pubid>
                  <pubid idtype="pmpid" link="fulltext">9385633</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>An algorithm for constraint-based structural template matching: application to 3D templates with statistical analysis</p>
            </title>
            <aug>
               <au>
                  <snm>Barker</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Thornton</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <issue>13</issue>
            <fpage>1644</fpage>
            <lpage>1649</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12967960</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>The Catalytic Site Atlas: a resource of catalytic sites and residues identified in enzymes using structural data</p>
            </title>
            <aug>
               <au>
                  <snm>Porter</snm>
                  <fnm>CT</fnm>
               </au>
               <au>
                  <snm>Bartlett</snm>
                  <fnm>GJ</fnm>
               </au>
               <au>
                  <snm>Thornton</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2004</pubdate>
            <issue>32 Database</issue>
            <fpage>D129</fpage>
            <lpage>133</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">308762</pubid>
                  <pubid idtype="pmpid" link="fulltext">14681376</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>ProFunc: a server for predicting protein function from 3D structure</p>
            </title>
            <aug>
               <au>
                  <snm>Laskowski</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Watson</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Thornton</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <issue>33 Web Server</issue>
            <fpage>W89</fpage>
            <lpage>93</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1160175</pubid>
                  <pubid idtype="pmpid" link="fulltext">15980588</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Inferring functional relationships of proteins from local sequence and spatial surface patterns</p>
            </title>
            <aug>
               <au>
                  <snm>Binkowski</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Adamian</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Liang</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2003</pubdate>
            <volume>332</volume>
            <issue>2</issue>
            <fpage>505</fpage>
            <lpage>526</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12948498</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>pvSOAR: detecting similar surface patterns of pocket and void surfaces of amino acid residues on proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Binkowski</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Freeman</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Liang</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2004</pubdate>
            <issue>32 Web Server</issue>
            <fpage>W555</fpage>
            <lpage>558</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">441528</pubid>
                  <pubid idtype="pmpid" link="fulltext">15215448</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Shape variation in protein binding pockets and their ligands</p>
            </title>
            <aug>
               <au>
                  <snm>Kahraman</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Morris</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Laskowski</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Thornton</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2007</pubdate>
            <volume>368</volume>
            <issue>1</issue>
            <fpage>283</fpage>
            <lpage>301</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">17337005</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Aminopeptidases: structure and function</p>
            </title>
            <aug>
               <au>
                  <snm>Taylor</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>FASEB J</source>
            <pubdate>1993</pubdate>
            <volume>7</volume>
            <issue>2</issue>
            <fpage>290</fpage>
            <lpage>298</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">8440407</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Crystal structure of a dodecameric tetrahedral-shaped aminopeptidase</p>
            </title>
            <aug>
               <au>
                  <snm>Russo</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Baumann</snm>
                  <fnm>U</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2004</pubdate>
            <volume>279</volume>
            <issue>49</issue>
            <fpage>51275</fpage>
            <lpage>51281</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15375159</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>GCN5-related N-acetyltransferases: a structural overview</p>
            </title>
            <aug>
               <au>
                  <snm>Dyda</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Klein</snm>
                  <fnm>DC</fnm>
               </au>
               <au>
                  <snm>Hickman</snm>
                  <fnm>AB</fnm>
               </au>
            </aug>
            <source>Annu Rev Biophys Biomol Struct</source>
            <pubdate>2000</pubdate>
            <volume>29</volume>
            <fpage>81</fpage>
            <lpage>103</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10940244</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Shape Distributions</p>
            </title>
            <aug>
               <au>
                  <snm>Osada</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Funkhouser</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Chazelle</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Dobkin</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>ACM Transactions on Graphics</source>
            <pubdate>2002</pubdate>
            <volume>21</volume>
            <issue>4</issue>
            <fpage>807</fpage>
            <lpage>832</lpage>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Handbook of Methods of Applied Statistics</p>
            </title>
            <aug>
               <au>
                  <snm>Chakravarti</snm>
                  <fnm>IM</fnm>
               </au>
               <au>
                  <snm>Laha</snm>
                  <fnm>RG</fnm>
               </au>
               <au>
                  <snm>Roy</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <publisher>John Wiley and Sons</publisher>
            <pubdate>1967</pubdate>
            <volume>I</volume>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Least-squares estimation of transformation parameters between two point patterns</p>
            </title>
            <aug>
               <au>
                  <snm>Umeyama</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>IEEE Trans Pattern Anal Mach Intell</source>
            <pubdate>1991</pubdate>
            <volume>13</volume>
            <fpage>376</fpage>
            <lpage>380</lpage>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Analytical shape computation of macromolecules: I. Molecular area and volume through alpha shape</p>
            </title>
            <aug>
               <au>
                  <snm>Liang</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Edelsbrunner</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Fu</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Sudhakar</snm>
                  <fnm>PV</fnm>
               </au>
               <au>
                  <snm>Subramaniam</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>1998</pubdate>
            <volume>33</volume>
            <issue>1</issue>
            <fpage>1</fpage>
            <lpage>17</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9741840</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Analytical shape computation of macromolecules: II. Inaccessible cavities in proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Liang</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Edelsbrunner</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Fu</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Sudhakar</snm>
                  <fnm>PV</fnm>
               </au>
               <au>
                  <snm>Subramaniam</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>1998</pubdate>
            <volume>33</volume>
            <issue>1</issue>
            <fpage>18</fpage>
            <lpage>29</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9741841</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Anatomy of protein pockets and cavities: measurement of binding site geometry and implications for ligand design</p>
            </title>
            <aug>
               <au>
                  <snm>Liang</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Edelsbrunner</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Woodward</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Protein Sci</source>
            <pubdate>1998</pubdate>
            <volume>7</volume>
            <issue>9</issue>
            <fpage>1884</fpage>
            <lpage>1897</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2144175</pubid>
                  <pubid idtype="pmpid" link="fulltext">9761470</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Chemical Similarity Searching</p>
            </title>
            <aug>
               <au>
                  <snm>Willett</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Barnard</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Downs</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>J Chem Inf Comput Sci</source>
            <pubdate>1998</pubdate>
            <volume>38</volume>
            <fpage>983</fpage>
            <lpage>996</lpage>
         </bibl>
         <bibl id="B31">
            <title>
               <p>A shape-based 3-D scaffold hopping method and its application to a bacterial protein-protein interaction</p>
            </title>
            <aug>
               <au>
                  <snm>Rush</snm>
                  <fnm>TS</fnm>
                  <suf>3rd</suf>
               </au>
               <au>
                  <snm>Grant</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Mosyak</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Nicholls</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>J Med Chem</source>
            <pubdate>2005</pubdate>
            <volume>48</volume>
            <issue>5</issue>
            <fpage>1489</fpage>
            <lpage>1495</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15743191</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Global Protein Surface Survey</p>
            </title>
            <url>http://gpss.mcsg.anl.gov</url>
         </bibl>
         <bibl id="B33">
            <title>
               <p>The PyMOL Molecular Graphics System</p>
            </title>
            <aug>
               <au>
                  <snm>DeLano</snm>
                  <fnm>WL</fnm>
               </au>
            </aug>
            <publisher>Palo Alto, CA, USA: DeLano Scientific</publisher>
            <pubdate>2002</pubdate>
         </bibl>
         <bibl id="B34">
            <title>
               <p>CASTp: Computed Atlas of Surface Topography of proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Binkowski</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Naghibzadeh</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Liang</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <issue>13</issue>
            <fpage>3352</fpage>
            <lpage>3355</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">168919</pubid>
                  <pubid idtype="pmpid" link="fulltext">12824325</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>ROCR: visualizing classifier performance in R</p>
            </title>
            <aug>
               <au>
                  <snm>Sing</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Sander</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Beerenwinkel</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Lengauer</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>21</volume>
            <issue>20</issue>
            <fpage>3940</fpage>
            <lpage>3941</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16096348</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>Structure-based inhibitors of HIV-1 protease</p>
            </title>
            <aug>
               <au>
                  <snm>Wlodawer</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Erickson</snm>
                  <fnm>JW</fnm>
               </au>
            </aug>
            <source>Annu Rev Biochem</source>
            <pubdate>1993</pubdate>
            <volume>62</volume>
            <fpage>543</fpage>
            <lpage>585</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">8352596</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>Inhibitors of HIV-1 protease: a major success of structure-assisted drug design</p>
            </title>
            <aug>
               <au>
                  <snm>Wlodawer</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Vondrasek</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Annu Rev Biophys Biomol Struct</source>
            <pubdate>1998</pubdate>
            <volume>27</volume>
            <fpage>249</fpage>
            <lpage>284</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9646869</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions</p>
            </title>
            <aug>
               <au>
                  <snm>Krissinel</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Henrick</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>Acta Crystallogr D Biol Crystallogr</source>
            <pubdate>2004</pubdate>
            <volume>60</volume>
            <issue>Pt 12 Pt 1</issue>
            <fpage>2256</fpage>
            <lpage>2268</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15572779</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>The CATH domain structure database</p>
            </title>
            <aug>
               <au>
                  <snm>Orengo</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Pearl</snm>
                  <fnm>FM</fnm>
               </au>
               <au>
                  <snm>Thornton</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>Methods Biochem Anal</source>
            <pubdate>2003</pubdate>
            <volume>44</volume>
            <fpage>249</fpage>
            <lpage>271</lpage>
            <xrefbib>
               <pubid idtype="pmpid">12647390</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>Novel uncomplexed and complexed structures of plasmepsin II, an aspartic protease from Plasmodium falciparum</p>
            </title>
            <aug>
               <au>
                  <snm>Asojo</snm>
                  <fnm>OA</fnm>
               </au>
               <au>
                  <snm>Gulnik</snm>
                  <fnm>SV</fnm>
               </au>
               <au>
                  <snm>Afonina</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Yu</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Ellman</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Haque</snm>
                  <fnm>TS</fnm>
               </au>
               <au>
                  <snm>Silva</snm>
                  <fnm>AM</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2003</pubdate>
            <volume>327</volume>
            <issue>1</issue>
            <fpage>173</fpage>
            <lpage>181</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12614616</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <title>
               <p>Antimalarial activity of human immunodeficiency virus type 1 protease inhibitors</p>
            </title>
            <aug>
               <au>
                  <snm>Parikh</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Gut</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Istvan</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Goldberg</snm>
                  <fnm>DE</fnm>
               </au>
               <au>
                  <snm>Havlir</snm>
                  <fnm>DV</fnm>
               </au>
               <au>
                  <snm>Rosenthal</snm>
                  <fnm>PJ</fnm>
               </au>
            </aug>
            <source>Antimicrob Agents Chemother</source>
            <pubdate>2005</pubdate>
            <volume>49</volume>
            <issue>7</issue>
            <fpage>2983</fpage>
            <lpage>2985</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1168637</pubid>
                  <pubid idtype="pmpid" link="fulltext">15980379</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B42">
            <title>
               <p>Crystal structure of aclacinomycin methylesterase with bound product analogues: implications for anthracycline recognition and mechanism</p>
            </title>
            <aug>
               <au>
                  <snm>Jansson</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Niemi</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Mantsala</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Schneider</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2003</pubdate>
            <volume>278</volume>
            <issue>40</issue>
            <fpage>39006</fpage>
            <lpage>39013</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12878604</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B43">
            <title>
               <p>Anthracyclines, proteasome activity and multi-drug-resistance</p>
            </title>
            <aug>
               <au>
                  <snm>Fekete</snm>
                  <fnm>MR</fnm>
               </au>
               <au>
                  <snm>McBride</snm>
                  <fnm>WH</fnm>
               </au>
               <au>
                  <snm>Pajonk</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>BMC Cancer</source>
            <pubdate>2005</pubdate>
            <volume>5</volume>
            <fpage>114</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1242219</pubid>
                  <pubid idtype="pmpid" link="fulltext">16159384</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B44">
            <title>
               <p>On the evolution of protein folds: are similar motifs in different protein folds the result of convergence, insertion, or relics of an ancient peptide world?</p>
            </title>
            <aug>
               <au>
                  <snm>Lupas</snm>
                  <fnm>AN</fnm>
               </au>
               <au>
                  <snm>Ponting</snm>
                  <fnm>CP</fnm>
               </au>
               <au>
                  <snm>Russell</snm>
                  <fnm>RB</fnm>
               </au>
            </aug>
            <source>J Struct Biol</source>
            <pubdate>2001</pubdate>
            <volume>134</volume>
            <issue>2&#8211;3</issue>
            <fpage>191</fpage>
            <lpage>203</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11551179</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B45">
            <title>
               <p>Diversity and conservation of interactions for binding heme in b-type heme proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Schneider</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Marles-Wright</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Sharp</snm>
                  <fnm>KH</fnm>
               </au>
               <au>
                  <snm>Paoli</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Nat Prod Rep</source>
            <pubdate>2007</pubdate>
            <volume>24</volume>
            <issue>3</issue>
            <fpage>621</fpage>
            <lpage>630</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">17534534</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B46">
            <title>
               <p>Structural characterization of nitric oxide synthase isoforms reveals striking active-site conservation</p>
            </title>
            <aug>
               <au>
                  <snm>Fischmann</snm>
                  <fnm>TO</fnm>
               </au>
               <au>
                  <snm>Hruza</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Niu</snm>
                  <fnm>XD</fnm>
               </au>
               <au>
                  <snm>Fossetta</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Lunn</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Dolphin</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Prongay</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Reichert</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Lundell</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Narula</snm>
                  <fnm>SK</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nat Struct Biol</source>
            <pubdate>1999</pubdate>
            <volume>6</volume>
            <issue>3</issue>
            <fpage>233</fpage>
            <lpage>242</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10074942</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B47">
            <title>
               <p>Gapped BLAST and PSI-BLAST: a new generation of protein database search programs</p>
            </title>
            <aug>
               <au>
                  <snm>Altschul</snm>
                  <fnm>SF</fnm>
               </au>
               <au>
                  <snm>Madden</snm>
                  <fnm>TL</fnm>
               </au>
               <au>
                  <snm>Schaffer</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Lipman</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1997</pubdate>
            <volume>25</volume>
            <issue>17</issue>
            <fpage>3389</fpage>
            <lpage>3402</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">146917</pubid>
                  <pubid idtype="pmpid" link="fulltext">9254694</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B48">
            <title>
               <p>Mechanism of the Escherichia coli ADP-ribose pyrophosphatase, a Nudix hydrolase</p>
            </title>
            <aug>
               <au>
                  <snm>Gabelli</snm>
                  <fnm>SB</fnm>
               </au>
               <au>
                  <snm>Bianchet</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Ohnishi</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Ichikawa</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Bessman</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Amzel</snm>
                  <fnm>LM</fnm>
               </au>
            </aug>
            <source>Biochemistry</source>
            <pubdate>2002</pubdate>
            <volume>41</volume>
            <issue>30</issue>
            <fpage>9279</fpage>
            <lpage>9285</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12135348</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B49">
            <title>
               <p>Staphylococcus aureus IsdG and IsdI, heme-degrading enzymes with structural similarity to monooxygenases</p>
            </title>
            <aug>
               <au>
                  <snm>Wu</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Skaar</snm>
                  <fnm>EP</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Joachimiak</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Gornicki</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Schneewind</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Joachimiak</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2005</pubdate>
            <volume>280</volume>
            <issue>4</issue>
            <fpage>2840</fpage>
            <lpage>2846</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15520015</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B50">
            <title>
               <p>CASTp: computed atlas of surface topography of proteins with structural and topographical mapping of functionally annotated residues</p>
            </title>
            <aug>
               <au>
                  <snm>Dundas</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Ouyang</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Tseng</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Binkowski</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Turpaz</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Liang</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <volume>34</volume>
            <issue>Web Server issue</issue>
            <fpage>W116</fpage>
            <lpage>W118</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1538779</pubid>
                  <pubid idtype="pmpid" link="fulltext">16844972</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B51">
            <title>
               <p>The structure of ActVA-Orf6, a novel type of monooxygenase involved in actinorhodin biosynthesis</p>
            </title>
            <aug>
               <au>
                  <snm>Sciara</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Kendrew</snm>
                  <fnm>SG</fnm>
               </au>
               <au>
                  <snm>Miele</snm>
                  <fnm>AE</fnm>
               </au>
               <au>
                  <snm>Marsh</snm>
                  <fnm>NG</fnm>
               </au>
               <au>
                  <snm>Federici</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Malatesta</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Schimperna</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Savino</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Vallone</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Embo J</source>
            <pubdate>2003</pubdate>
            <volume>22</volume>
            <issue>2</issue>
            <fpage>205</fpage>
            <lpage>215</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">140106</pubid>
                  <pubid idtype="pmpid" link="fulltext">12514126</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B52">
            <title>
               <p>Conformational diversity of ligands bound to proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Stockwell</snm>
                  <fnm>GR</fnm>
               </au>
               <au>
                  <snm>Thornton</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2006</pubdate>
            <volume>356</volume>
            <issue>4</issue>
            <fpage>928</fpage>
            <lpage>944</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16405908</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B53">
            <title>
               <p>Inclining the purine base binding plane in protein kinase CK2 by exchanging the flanking side-chains generates a preference for ATP as a cosubstrate</p>
            </title>
            <aug>
               <au>
                  <snm>Yde</snm>
                  <fnm>CW</fnm>
               </au>
               <au>
                  <snm>Ermakova</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Issinger</snm>
                  <fnm>OG</fnm>
               </au>
               <au>
                  <snm>Niefind</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2005</pubdate>
            <volume>347</volume>
            <issue>2</issue>
            <fpage>399</fpage>
            <lpage>414</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15740749</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B54">
            <title>
               <p>Three-dimensional structure of ATP:corrinoid adenosyltransferase from Salmonella typhimurium in its free state, complexed with MgATP, or complexed with hydroxycobalamin and MgATP</p>
            </title>
            <aug>
               <au>
                  <snm>Bauer</snm>
                  <fnm>CB</fnm>
               </au>
               <au>
                  <snm>Fonseca</snm>
                  <fnm>MV</fnm>
               </au>
               <au>
                  <snm>Holden</snm>
                  <fnm>HM</fnm>
               </au>
               <au>
                  <snm>Thoden</snm>
                  <fnm>JB</fnm>
               </au>
               <au>
                  <snm>Thompson</snm>
                  <fnm>TB</fnm>
               </au>
               <au>
                  <snm>Escalante-Semerena</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Rayment</snm>
                  <fnm>I</fnm>
               </au>
            </aug>
            <source>Biochemistry</source>
            <pubdate>2001</pubdate>
            <volume>40</volume>
            <issue>2</issue>
            <fpage>361</fpage>
            <lpage>374</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11148030</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B55">
            <title>
               <p>PurT-encoded glycinamide ribonucleotide transformylase. Accommodation of adenosine nucleotide analogs within the active site</p>
            </title>
            <aug>
               <au>
                  <snm>Thoden</snm>
                  <fnm>JB</fnm>
               </au>
               <au>
                  <snm>Firestine</snm>
                  <fnm>SM</fnm>
               </au>
               <au>
                  <snm>Benkovic</snm>
                  <fnm>SJ</fnm>
               </au>
               <au>
                  <snm>Holden</snm>
                  <fnm>HM</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2002</pubdate>
            <volume>277</volume>
            <issue>26</issue>
            <fpage>23898</fpage>
            <lpage>23908</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11953435</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B56">
            <title>
               <p>Structure of an amide bond forming F(420):gamma-glutamyl ligase from Archaeoglobus fulgidus &#8211; a member of a new family of non-ribosomal peptide synthases</p>
            </title>
            <aug>
               <au>
                  <snm>Nocek</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Evdokimova</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Proudfoot</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kudritska</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Grochowski</snm>
                  <fnm>LL</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>RH</fnm>
               </au>
               <au>
                  <snm>Savchenko</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Yakunin</snm>
                  <fnm>AF</fnm>
               </au>
               <au>
                  <snm>Edwards</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Joachimiak</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2007</pubdate>
            <volume>372</volume>
            <issue>2</issue>
            <fpage>456</fpage>
            <lpage>469</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">17669425</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B57">
            <title>
               <p>Structural evolution of the protein kinase-like superfamily</p>
            </title>
            <aug>
               <au>
                  <snm>Scheeff</snm>
                  <fnm>ED</fnm>
               </au>
               <au>
                  <snm>Bourne</snm>
                  <fnm>PE</fnm>
               </au>
            </aug>
            <source>PLoS Comput Biol</source>
            <pubdate>2005</pubdate>
            <volume>1</volume>
            <issue>5</issue>
            <fpage>e49</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1261164</pubid>
                  <pubid idtype="pmpid" link="fulltext">16244704</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B58">
            <title>
               <p>Functional Classification of Protein Kinase Binding Sites Using Cavbase</p>
            </title>
            <aug>
               <au>
                  <snm>Kuhn</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Weskamp</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>H&#252;llermeier</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Klebe</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>ChemMedChem</source>
            <pubdate>2007</pubdate>
            <volume>2</volume>
            <issue>10</issue>
            <fpage>1432</fpage>
            <lpage>1447</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">17694525</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B59">
            <title>
               <p>Comparative analysis of the surface interaction properties of the binding sites of CDK2, CDK4, and ERK2</p>
            </title>
            <aug>
               <au>
                  <snm>Kelly</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Mancera</snm>
                  <fnm>RL</fnm>
               </au>
            </aug>
            <source>ChemMedChem</source>
            <pubdate>2006</pubdate>
            <volume>1</volume>
            <issue>3</issue>
            <fpage>366</fpage>
            <lpage>375</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16892371</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B60">
            <title>
               <p>A Pharmacophore Map of Small Molecule Protein Kinase Inhibitors</p>
            </title>
            <aug>
               <au>
                  <snm>McGregor</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>J Chem Inf Model</source>
            <pubdate>2007</pubdate>
            <volume>47</volume>
            <issue>6</issue>
            <fpage>2374</fpage>
            <lpage>2382</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">17941626</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B61">
            <title>
               <p>KinBase</p>
            </title>
            <url>http://www.kinase.com</url>
         </bibl>
         <bibl id="B62">
            <title>
               <p>The protein kinase resource</p>
            </title>
            <aug>
               <au>
                  <snm>Smith</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Shindyalov</snm>
                  <fnm>IN</fnm>
               </au>
               <au>
                  <snm>Veretnik</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Gribskov</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Taylor</snm>
                  <fnm>SS</fnm>
               </au>
               <au>
                  <snm>Ten Eyck</snm>
                  <fnm>LF</fnm>
               </au>
               <au>
                  <snm>Bourne</snm>
                  <fnm>PE</fnm>
               </au>
            </aug>
            <source>Trends Biochem Sci</source>
            <pubdate>1997</pubdate>
            <volume>22</volume>
            <issue>11</issue>
            <fpage>444</fpage>
            <lpage>446</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9397688</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B63">
            <title>
               <p>STI-571: an anticancer protein-tyrosine kinase inhibitor</p>
            </title>
            <aug>
               <au>
                  <snm>Roskoski</snm>
                  <fnm>RJ</fnm>
               </au>
            </aug>
            <source>Biochem Biophys Res Commun</source>
            <pubdate>2003</pubdate>
            <volume>309</volume>
            <issue>4</issue>
            <fpage>709</fpage>
            <lpage>717</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">13679030</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B64">
            <title>
               <p>Crystal structures of the kinase domain of c-Abl in complex with the small molecule inhibitors PD173955 and imatinib (STI-571)</p>
            </title>
            <aug>
               <au>
                  <snm>Nagar</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Bornmann</snm>
                  <fnm>WG</fnm>
               </au>
               <au>
                  <snm>Pellicena</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Schindler</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Veach</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>WT</fnm>
               </au>
               <au>
                  <snm>Clarkson</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Kuriyan</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Cancer Res</source>
            <pubdate>2002</pubdate>
            <volume>62</volume>
            <issue>15</issue>
            <fpage>4236</fpage>
            <lpage>4243</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12154025</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B65">
            <title>
               <p>Structural insights into the conformational selectivity of STI-571 and related kinase inhibitors</p>
            </title>
            <aug>
               <au>
                  <snm>Mol</snm>
                  <fnm>CD</fnm>
               </au>
               <au>
                  <snm>Fabbro</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Hosfield</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>Curr Opin Drug Discov Devel</source>
            <pubdate>2004</pubdate>
            <volume>7</volume>
            <issue>5</issue>
            <fpage>639</fpage>
            <lpage>648</lpage>
            <xrefbib>
               <pubid idtype="pmpid">15503866</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B66">
            <title>
               <p>Structural basis for the autoinhibition and STI-571 inhibition of c-Kit tyrosine kinase</p>
            </title>
            <aug>
               <au>
                  <snm>Mol</snm>
                  <fnm>CD</fnm>
               </au>
               <au>
                  <snm>Dougan</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Schneider</snm>
                  <fnm>TR</fnm>
               </au>
               <au>
                  <snm>Skene</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Kraus</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Scheibe</snm>
                  <fnm>DN</fnm>
               </au>
               <au>
                  <snm>Snell</snm>
                  <fnm>GP</fnm>
               </au>
               <au>
                  <snm>Zou</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Sang</snm>
                  <fnm>BC</fnm>
               </au>
               <au>
                  <snm>Wilson</snm>
                  <fnm>KP</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2004</pubdate>
            <volume>279</volume>
            <issue>30</issue>
            <fpage>31655</fpage>
            <lpage>31663</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15123710</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B67">
            <title>
               <p>Inhibition of p38 MAP kinase by utilizing a novel allosteric binding site</p>
            </title>
            <aug>
               <au>
                  <snm>Pargellis</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Tong</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Churchill</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Cirillo</snm>
                  <fnm>PF</fnm>
               </au>
               <au>
                  <snm>Gilmore</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Graham</snm>
                  <fnm>AG</fnm>
               </au>
               <au>
                  <snm>Grob</snm>
                  <fnm>PM</fnm>
               </au>
               <au>
                  <snm>Hickey</snm>
                  <fnm>ER</fnm>
               </au>
               <au>
                  <snm>Moss</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Pav</snm>
                  <fnm>S</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nat Struct Biol</source>
            <pubdate>2002</pubdate>
            <volume>9</volume>
            <issue>4</issue>
            <fpage>268</fpage>
            <lpage>272</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11896401</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B68">
            <title>
               <p>Structural mechanism for STI-571 inhibition of Abelson tyrosine kinase</p>
            </title>
            <aug>
               <au>
                  <snm>Schindler</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Bornmann</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Pellicena</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>WT</fnm>
               </au>
               <au>
                  <snm>Clarkson</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Kuriyan</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2000</pubdate>
            <volume>289</volume>
            <issue>5486</issue>
            <fpage>1938</fpage>
            <lpage>1942</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">10988075</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B69">
            <title>
               <p>Predicting Enzyme Functional Surfaces and Locating Key Residues Automatically from Structures</p>
            </title>
            <aug>
               <au>
                  <snm>Tseng</snm>
                  <fnm>YY</fnm>
               </au>
               <au>
                  <snm>Liang</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Ann Biomed Eng</source>
            <pubdate>2007</pubdate>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">17294116</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B70">
            <title>
               <p>Estimation of amino acid residue substitution rates at local spatial regions and application in protein function inference: a Bayesian Monte Carlo approach</p>
            </title>
            <aug>
               <au>
                  <snm>Tseng</snm>
                  <fnm>YY</fnm>
               </au>
               <au>
                  <snm>Liang</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2006</pubdate>
            <volume>23</volume>
            <issue>2</issue>
            <fpage>421</fpage>
            <lpage>436</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16251508</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B71">
            <title>
               <p>Protein function prediction using local 3D templates</p>
            </title>
            <aug>
               <au>
                  <snm>Laskowski</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Watson</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Thornton</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2005</pubdate>
            <volume>351</volume>
            <issue>3</issue>
            <fpage>614</fpage>
            <lpage>626</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16019027</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B72">
            <title>
               <p>Analysis of catalytic residues in enzyme active sites</p>
            </title>
            <aug>
               <au>
                  <snm>Bartlett</snm>
                  <fnm>GJ</fnm>
               </au>
               <au>
                  <snm>Porter</snm>
                  <fnm>CT</fnm>
               </au>
               <au>
                  <snm>Borkakoti</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Thornton</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2002</pubdate>
            <volume>324</volume>
            <issue>1</issue>
            <fpage>105</fpage>
            <lpage>121</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12421562</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B73">
            <title>
               <p>Are acidic and basic groups in buried proteins predicted to be ionized?</p>
            </title>
            <aug>
               <au>
                  <snm>Kim</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Mao</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Gunner</snm>
                  <fnm>MR</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2005</pubdate>
            <volume>348</volume>
            <issue>5</issue>
            <fpage>1283</fpage>
            <lpage>1298</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15854661</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B74">
            <title>
               <p>Structural and kinetic determinants of aldehyde reduction by aldose reductase</p>
            </title>
            <aug>
               <au>
                  <snm>Srivastava</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Watowich</snm>
                  <fnm>SJ</fnm>
               </au>
               <au>
                  <snm>Petrash</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Srivastava</snm>
                  <fnm>SK</fnm>
               </au>
               <au>
                  <snm>Bhatnagar</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Biochemistry</source>
            <pubdate>1999</pubdate>
            <volume>38</volume>
            <issue>1</issue>
            <fpage>42</fpage>
            <lpage>54</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9890881</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B75">
            <title>
               <p>Mutagenesis supports water mediated recognition in the trp repressor-operator system</p>
            </title>
            <aug>
               <au>
                  <snm>Joachimiak</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Haran</snm>
                  <fnm>TE</fnm>
               </au>
               <au>
                  <snm>Sigler</snm>
                  <fnm>PB</fnm>
               </au>
            </aug>
            <source>EMBO J</source>
            <pubdate>1994</pubdate>
            <volume>13</volume>
            <issue>2</issue>
            <fpage>367</fpage>
            <lpage>372</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">394817</pubid>
                  <pubid idtype="pmpid">8313881</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B76">
            <title>
               <p>Electrostatic control of the membrane targeting of C2 domains</p>
            </title>
            <aug>
               <au>
                  <snm>Murray</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Honig</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Mol Cell</source>
            <pubdate>2002</pubdate>
            <volume>9</volume>
            <issue>1</issue>
            <fpage>145</fpage>
            <lpage>154</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11804593</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B77">
            <title>
               <p>RASMOL: biomolecular graphics for all</p>
            </title>
            <aug>
               <au>
                  <snm>Sayle</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Milner-White</snm>
                  <fnm>EJ</fnm>
               </au>
            </aug>
            <source>Trends Biochem Sci</source>
            <pubdate>1995</pubdate>
            <volume>20</volume>
            <issue>9</issue>
            <fpage>374</fpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">7482707</pubid>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
