Abstract
Background
The prediction of conformational B-cell epitopes is one of the most important goals in immunoinformatics. A solution to this problem, even an approximate one, would help in designing experiments to precisely map the residues of interaction between an antigen and an antibody. Consequently, this area of research has received considerable attention from immunologists, structural biologists and computational biologists. Phage-displayed random peptide libraries are powerful tools for obtaining mimotopes, which are selected by binding to a given monoclonal antibody (mAb) in a similar way to the native epitope. These mimotopes can be considered functional epitope mimics. Methods based on mimotope analysis can predict not only linear but also conformational epitopes, and they have been the focus of much research in recent years. Although some algorithms based on mimotope analysis have been proposed, the precise localization of the interaction site mimicked by the mimotopes is still a challenging task.
Results
In this study, we propose Pep3DSearch, a method for B-cell epitope prediction based on mimotope analysis. Given the 3D structure of an antigen and a set of mimotopes (or a motif sequence derived from the set of mimotopes), Pep3DSearch can be used in two modes: mimotope or motif. To evaluate the performance of Pep3DSearch in predicting epitopes from a set of mimotopes, 10 epitopes defined by crystallography were compared with the results predicted by Pep3DSearch: the average Matthews correlation coefficient (MCC), sensitivity and precision were 0.1758, 0.3642 and 0.6948, respectively. Compared with other available prediction algorithms, Pep3DSearch showed comparable MCC, specificity and precision, and could provide novel, rational results. To verify the capability of Pep3DSearch to align a motif sequence to a 3D structure for predicting epitopes, 6 test cases were used. The predictive performance of Pep3DSearch was shown to be superior to that of other similar programs. Furthermore, a set of test cases with sequences of different lengths was constructed to examine Pep3DSearch's capability in searching for sequences on a 3D structure. The experimental results demonstrated the excellent search capability of Pep3DSearch: as the query sequence became longer, the number of iterations Pep3DSearch needed to precisely localize the target paths did not increase appreciably. This means that Pep3DSearch has the potential to quickly localize the epitope regions mimicked by longer mimotopes.
Conclusion
Pep3DSearch provides a powerful approach for localizing the surface region mimicked by the mimotopes. As a publicly available tool, Pep3DSearch can be used and evaluated conveniently, and it can also complement other existing tools. The data sets and open source code used to obtain the results in this paper are available online and as supplementary material. More detailed materials may be accessed at http://kyc.nenu.edu.cn/Pep3DSearch/.
Background
A B-cell epitope is defined as the part of an antigen recognized by either a particular antibody molecule or a particular B-cell receptor of the immune system. It may be linear (continuous), i.e. a short contiguous stretch of amino acids, or conformational (discontinuous), consisting of sequence segments that are distantly scattered along the protein sequence and are brought together in spatial proximity when the protein is folded [1]. It has been estimated that more than ninety percent of B-cell epitopes are conformational [2,3]. The main purpose of B-cell epitope prediction is to facilitate efficient, rational vaccine design [4]. Furthermore, synthetic peptides mimicking epitopes, as well as anti-peptide antibodies, have many applications in the diagnosis of human diseases [5,6]. B-cell epitope prediction is therefore very important in medical research.
Although B-cell epitopes can be identified directly by biochemical or physical experiments, such as X-ray crystallography of antibody-antigen (Ab-Ag) complexes, these experiments are usually costly and time-consuming, and are not always successful [7]. Computational methods for predicting B-cell epitopes are much more efficient and cost-effective. However, they mainly focus on the prediction of linear epitopes [8-14], because only a few antigens are completely annotated with respect to their conformational epitopes, which makes it difficult to develop a conformational epitope prediction method. To the best of our knowledge, DiscoTope [15] and CEP [16] are the only two methods for conformational epitope prediction that are based on antigen structure information. Recently, researchers tested and evaluated existing epitope prediction methods on benchmark datasets, and concluded that the accuracies of these methods are not high enough to significantly reduce the experimental workload [17-19]. Combining experiments with computational methods can substantially improve the accuracy of epitope prediction at a modest cost in biological experiments. This approach has therefore attracted the attention of many researchers, especially the integration of computational methods with random peptide libraries. Several researchers have reported encouraging preliminary results using phage-display peptide libraries [20-29]. Mimotopes can be selected from phage-displayed random peptide libraries by affinity selection with monoclonal antibodies (mAbs), so-called biopanning. The mimotopes are selected for their capacity to bind the antibody (Ab) directed against a given antigen (Ag). Because the mimotopes and the Ag are both recognized by the same Ab paratope, the mimotopes are expected to mimic the natural epitope. The purpose of the computational approach is to analyze the set of mimotopes and then to localize the mimicked region, which is regarded as the epitope candidate. Thereafter, biological experiments, such as site-directed mutagenesis and deletion analysis, may be carried out for further validation.
Generally, a computational method approaches this goal in three steps: (i) the representation of the surface residues of the antigen; (ii) the search (or alignment) of the mimotopes (or motifs derived from the mimotopes) on the antigen surface; (iii) the output of the epitope candidates based on screening and clustering. Pizzi et al. [20] were the first to combine computational methods with experimental results to assign epitopes. Recently, they published an improved method named MEPS [27]. In MEPS, the surface of the antigen is represented by a collection of peptides below a certain length. The motifs derived from the mimotopes are searched against this surface, and alignment tools like BLAST can be used directly in the method. However, finding all simple paths of a given length (i.e. sequences of neighboring residues) on a surface graph representing the exposed residues of the antigen is an NP-hard (non-deterministic polynomial-time hard) problem [29]. Subsequently, several computational algorithms were proposed, in which some new strategies were adopted [21-26,28,29]. For example, SiteLight [23] divides the antigen surface into overlapping patches and then aligns each mimotope with each patch based on the maximal bipartite matching algorithm. Mapitope [22,28] converts a set of mimotopes into overlapping residue pairs, ranks the pairs by their occurrences to obtain a set of major statistically significant pairs (SSPs), and finally uses them to search the 3D structure of the antigen and links the SSPs into clusters on the antigen surface. More recently, PepSurf [29], an epitope prediction program based on a color-coding algorithm [30], was proposed to search all possible simple paths in the surface graph of an antigen, and it adopted a clustering strategy for epitope prediction. However, the running time of PepSurf depends exponentially on the length of a mimotope. Therefore, on the PepSurf online server, each mimotope must be no longer than 14 amino acids.
Although epitopes and mimotopes are functionally equivalent, they seldom share a similar sequence. The mimicry is thought to rely on similarities in physicochemical properties and on similar spatial organization. Moreover, the binding site of an antibody is a surface, not just a continuous sequence, so the epitope prediction problem is outside the scope of classical string alignment algorithms. Searching all the surface residues of an antigen of interest for the mimotopes is difficult. Therefore, although numerous algorithms based on phage display libraries have been proposed to characterize B-cell epitopes, the precise localization of the interaction site mimicked by the mimotopes on the antigen surface is still an open challenge [25,29].
In this research, we present Pep3DSearch, a method for B-cell epitope prediction based on mimotope analysis. In Pep3DSearch, an ACO (Ant Colony Optimization) algorithm is proposed to search for matching paths on an antigen surface with respect to the query mimotopes or a motif. The ACO algorithm adopts a novel heuristic strategy that makes it effective in dealing with longer mimotopes or motifs. Moreover, a P-value calculation algorithm and the DFS (Depth-First Search) algorithm, a graph search algorithm, are used to screen and cluster the result paths at the output stage. A group of test cases, all taken from published data, was applied to Pep3DSearch to validate its performance. The experimental results showed that the predictive performance of Pep3DSearch was comparable to that of other epitope prediction algorithms, and that some novel, rational results were provided.
Implementation
Algorithm flow
The Pep3DSearch algorithm flow is shown in Figure 1. Its input includes the 3D structure of an antigen (a Protein Data Bank (PDB) [31] file) and a set of mimotopes or a motif. Pep3DSearch identifies all exposed residues of the given antigen and creates a surface graph from them. The algorithm can be employed in two modes. The first is the mimotope mode, which uses the ACO algorithm to search for paths on the antigen surface that match each query mimotope. All paths are scored against the corresponding mimotope according to an amino-acid substitution matrix. Putative candidate epitopes are then picked out by the P-value calculation algorithm and the DFS algorithm. The second is the motif mode, which directly maps the motif onto the antigen surface using the ACO algorithm and takes the top-scoring paths as epitope candidates.
Figure 1. An algorithmic flowchart of Pep3DSearch. Given the 3D structure of an antigen, Pep3DSearch identifies all the surface residues and creates a surface graph. After that, it can be used in two modes: mimotope or motif. In mimotope mode, every mimotope received as an input is aligned to the antigen surface and the epitope candidates are obtained through screening and clustering of the matched paths. In motif mode, a motif received as an input is mapped on to the antigen surface. Subsequently, the top scoring paths are output directly as the epitope candidates.
Graphical representation of the antigen surface
A B-cell epitope is typically a solvent-accessible surface consisting of some 15–20 exposed residues derived from 2 to 3 discontinuous segments of the antigen [32]. Whether or not a residue is exposed can be determined by its solvent accessible surface area (SASA). In this study, the exposed residues of the antigen under study were determined in three steps: (i) the total SASA of a residue composed of N atoms was calculated as SASA = ∑_{i=1}^{N} A_{i}, where A_{i} is the SASA of the ith atom, determined by the Surface Racer program 4.0 [33] with a probe sphere of radius 1.4 Å, corresponding to a water molecule; (ii) the relative solvent accessibility (RSA) of a residue was calculated as the ratio of the SASA of the residue to the maximum exposed surface of the same residue type in an extended Ala-X-Ala tripeptide, where the maximum exposed surface of the residue X in the Ala-X-Ala tripeptide is that calculated by Ahmad et al. [34]; (iii) a residue was considered exposed if its RSA was greater than a predefined threshold (default = 5%). A surface graph representing the exposed residues, G = (V, E), was defined, where V is the vertex set consisting of all exposed residues and E is the edge set, in which any two vertices are connected by an edge if the Euclidean distance between them is not greater than a predefined threshold. Pep3DSearch provides three methods for calculating neighboring residue pairs on the antigen surface. In the first, the distance between two residues is taken as the distance between the C_{α} atoms of the two amino acids, which may better reflect the backbone positions. In the second, the distance between the C_{β} atoms is used, which may better reflect the side-chain positions (the C_{α} atom is still used for glycine because it has no C_{β} atom). In the third, the minimum distance over all pairs of heavy atoms of the two residues is used. In Pep3DSearch, these three methods are denoted CA, CB and AHA, respectively, and CA with a distance threshold of 7 Å is the default.
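The construction of the surface graph described above can be sketched as follows. This is a minimal illustration, not the Pep3DSearch implementation: the function names, the toy RSA values and the toy coordinates are hypothetical, and each residue is reduced to one representative point (for example its C_{α} atom, as in the CA method).

```python
import math

def exposed_residues(rsa, threshold=0.05):
    """Return the ids of residues whose RSA exceeds the threshold (default 5%)."""
    return [res for res, value in rsa.items() if value > threshold]

def surface_graph(coords, exposed, max_dist=7.0):
    """Connect two exposed residues when their representative atoms
    (e.g. C-alpha) are no more than max_dist angstroms apart."""
    graph = {res: set() for res in exposed}
    for a in range(len(exposed)):
        for b in range(a + 1, len(exposed)):
            u, v = exposed[a], exposed[b]
            if math.dist(coords[u], coords[v]) <= max_dist:
                graph[u].add(v)
                graph[v].add(u)
    return graph

# Hypothetical toy data: residue id -> RSA, residue id -> C-alpha coordinates.
rsa = {"A1": 0.40, "K2": 0.02, "G3": 0.30, "W4": 0.10}
coords = {"A1": (0.0, 0.0, 0.0), "K2": (3.0, 0.0, 0.0),
          "G3": (6.0, 0.0, 0.0), "W4": (20.0, 0.0, 0.0)}

exposed = exposed_residues(rsa)          # K2 is buried and is dropped
graph = surface_graph(coords, exposed)   # A1 and G3 are neighbors, W4 is isolated
```

The all-pairs loop is quadratic in the number of exposed residues, which is adequate for a sketch; a spatial index would be the natural choice for large antigens.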
The ACO algorithm
ACO is a multi-agent heuristic algorithm used for combinatorial optimization. It was inspired by the capability of real ants to find the shortest path between their nest and a food source. The original ACO algorithm was introduced by Dorigo et al. [35] for solving the traveling salesman problem (TSP). Since then, many researchers have extended the original algorithm and have successfully applied their new algorithms to large-scale TSP instances and to other problems such as vehicle routing, scheduling and routing in Internet-like networks [36]. The successful application of ACO algorithms to the TSP inspired us to develop a new heuristic algorithm for solving the mimotope prediction problem. Our aim was to find a simple path on a surface graph that yields the alignment to a mimotope or a motif with a maximal score. Like the TSP, our problem is an ordering problem, i.e. the algorithm's aim is to put the different vertices in a certain order. However, several different aspects had to be considered: (i) our problem is a partial vertex permutation of a graph, in which the number of vertices in the permutation equals the number of residues in the mimotope (or the motif); (ii) all edges between neighboring vertices are treated as having the same length, and the score of a resulting path depends only on the vertex permutation and not on the path length; (iii) some insertions/deletions may be permitted in a resulting path. Therefore, some new strategies were needed for solving our problem. The details of these strategies are described below.
Definition of the pheromone trail and the heuristic information
The pheromone trail and the heuristic information are two important parameters in the ACO algorithm. Theoretically, the pheromone trail gives the artificial ants a global guide in their decision-making, whereas the heuristic information guides the ants to explore better paths locally. The quality of an ACO application depends greatly on how the pheromone trail and the heuristic information are defined [35]. According to the features of our problem, the pheromone and the heuristic information for each edge of the surface graph were defined as follows:
Let τ^{(k)}(i, j) be the pheromone from vertex i to vertex j at the kth searching step in a solution, which encodes the favorability of visiting a certain vertex j after vertex i, where 1 ≤ k ≤ L, and L is the number of vertices in a resulting path (i.e. the number of residues in the mimotope or motif). In our approach, τ^{(k)}(i, j) was assigned an initial value at the start point and was updated after each iteration.
Let η^{(k)}(i, j) be the heuristic information from vertex i to vertex j at the kth searching step in a solution, which encodes the preference of visiting a certain vertex j after vertex i, where 1 ≤ k ≤ L, and L is the number of vertices in a resulting path. The value of η^{(k)}(i, j) was assigned according to the input mimotope (or motif) and the amino-acid substitution matrix used (see Scoring amino acid similarities). For example, let the mimotope be "ANYNATRGTVSA", and suppose that a row of the amino-acid substitution matrix used is: "A←A(2.14), K(0.44), I(0.39), G(0.25), V(0.07), D(-0.15), S(-0.22), N(-0.36), Q(-0.36), T(-0.4), F(-0.61), C(-0.61), E(-0.7), L(-0.73), M(-0.91), Y(-0.91), H(-1.15), P(-1.15), R(-1.67), W(-2.61)", which gives the scoring value of each amino-acid substitution for alanine (A). The first, fifth and twelfth amino acids in the mimotope are all alanine (A). In order to make the ants tend to find the maximal alignment score in each step, for k = 1, 5 and 12 we set η^{(k)}(i, j) = 2.14 if the vertex j is an alanine (A) and i is any neighbor vertex of j; in the same way, η^{(k)}(i, j) = 0.44 if the vertex j is a lysine (K) and i is any neighbor vertex of j, and so on; finally, η^{(k)}(i, j) = -2.61 if the vertex j is a tryptophan (W) and i is any neighbor vertex of j. In this way, η^{(k)}(i, j) can be defined for all 1 ≤ k ≤ 12 and for each edge of the surface graph, and it naturally represents the preference of an ant in vertex i for vertex j in each searching step.
In the case of a motif, let Q = (q_{1}, q_{2},..., q_{L}) be the motif; then q_{k} (1 ≤ k ≤ L) may be a set of amino acids (e.g. [STDE], see Epitope prediction based on motif mapping), a gap (-) or the character "X", which means it can be any amino acid. When q_{k} is a set of amino acids (named S), η^{(k)}(i, j) is set to the maximal value among the scoring values for substituting the amino acids in S with the amino acid of vertex j, where i is any neighbor vertex of j. When q_{k} is a gap or the character "X", η^{(k)}(i, j) is set to the average value of the substitution matrix, provided j and i are a pair of neighbors.
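The assignment of the heuristic information for one query position can be sketched as below. This is an illustrative reading of the definitions above rather than the authors' code: the function name, the four-letter toy alphabet and the identity-style toy matrix are hypothetical, and the handling of an amino-acid set (taking the best score over the set for the residue type of vertex j) is an assumption about what the text intends.

```python
def heuristic_for_position(query_item, matrix, alphabet):
    """Return {residue type of vertex j: eta value} for one query position.

    query_item is a single amino acid ("A"), a set written as a string
    ("STDE"), a gap ("-") or the wildcard "X".
    """
    if query_item in ("-", "X"):
        # Gap or wildcard: every neighbor gets the matrix average.
        values = [matrix[a][b] for a in alphabet for b in alphabet]
        average = sum(values) / len(values)
        return {b: average for b in alphabet}
    allowed = set(query_item)
    # Single residue or residue set: best substitution score over the set.
    return {b: max(matrix[a][b] for a in allowed) for b in alphabet}

# Hypothetical toy matrix over a four-letter alphabet (1 on the diagonal, 0 elsewhere).
alphabet = "ASTK"
matrix = {a: {b: (1.0 if a == b else 0.0) for b in alphabet} for a in alphabet}

eta_single = heuristic_for_position("A", matrix, alphabet)
eta_set = heuristic_for_position("ST", matrix, alphabet)
eta_any = heuristic_for_position("X", matrix, alphabet)
```

Because the value depends only on the query position k and on the residue type of vertex j, the table can be computed once per position and reused for every edge that ends in a vertex of that type.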
Scoring amino acid similarities
Algorithms for the alignment of protein sequences typically measure similarity by using a substitution matrix with scores for all possible exchanges of one amino acid with another. The choice of the substitution matrix directly influences the performance of the algorithms. However, the optimal substitution matrices used by the existing epitope prediction algorithms are generally not compatible with each other. Following comparison experiments, we chose the substitution matrix M_Blosum62 by Mayrose et al. [29] as the default for the similar match mode. Moreover, we defined the substitution matrix STRICT as the default for the exact match mode, in which the score for a substitution between two identical amino acids is 1, whereas the score for a substitution between any two different amino acids is 0. A simple path on the surface graph is a path in which all vertices are distinct. When an ant has no unvisited edge connecting it to other vertices, it is allowed to jump to a vertex that is not connected to it by an edge, provided the distance between the two vertices is less than twice the predefined distance threshold. In this situation, a gap can be left on its path. For each unmatched residue, a penalty is added.
According to the above analysis, two methods for scoring the similarity of amino acids are proposed. For mimotope analysis, the similarity score h(q_{i}, p_{i}) of amino acids q_{i }and p_{i }is calculated by Equation (1):
Where minimum refers to the minimum value in the substitution matrix used; the values of penalty are set from 0 to 0.5 (default = 0.5); s(q_{i}, p_{i}) is the observed substitution score in the substitution matrix used.
In the case of motif analysis, let Q = (q_{1}, q_{2},..., q_{L}) be the motif and P = (p_{1}, p_{2},..., p_{L}) be the resulting path on the surface graph, then we calculate the similarity score h(q_{i}, p_{i}) (1 ≤ i ≤ L) by Equation (2):
Where average refers to the average value in the substitution matrix used; minimum denotes the minimum value in the substitution matrix used; the value of penalty is set between 0 and 0.5 (default = 0.5); s(q_{i}, p_{i}) is the observed substitution score in the substitution matrix used.
Building a solution
The pheromone trail and the heuristic information defined above will now be used by the ants to find the best solutions. Suppose the number of residues in the mimotope is L. Every ant starts with a virtual original point named "O", which is permitted to connect to any vertex on the graph. Then an ant will randomly choose a vertex as its first vertex, and builds a solution going from a vertex to another connected vertex. The process will not stop until the ant has visited L vertices on the graph. At the kth searching step (1 ≤ k ≤ L), the probability that an ant A in a vertex i will choose a vertex j as its next vertex is given by equation (3):
Where τ^{(k)}(i, j) and η^{(k)}(i, j) are the pheromone and the heuristic information between i and j at kth searching step, respectively. So the preference of an ant A in vertex i for vertex j is partly defined by the pheromone between i and j, and partly by the heuristic favorability of j after i. Parameters α and β define the relative importance of the pheromone information and the heuristic information (default α = β = 2). J_{A}(i) is the set of vertices that connect to i and have not yet been visited by the ant A in vertex i.
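The choice of the next vertex can be sketched as follows. Because the displayed form of equation (3) is not reproduced here, the sketch assumes the standard random-proportional rule of Ant System, in which the probability of moving from i to j is proportional to τ^{α}·η^{β} over the unvisited neighbors J_{A}(i); the function names and toy values are hypothetical, and the tables `tau` and `eta` are taken to be those of the current searching step k, with non-negative entries.

```python
import random

def transition_probabilities(i, candidates, tau, eta, alpha=2.0, beta=2.0):
    """Probability of moving from vertex i to each unvisited neighbor j,
    proportional to tau(i, j)**alpha * eta(i, j)**beta (assumed form)."""
    if not candidates:
        raise ValueError("vertex has no unvisited neighbors")
    weights = {j: (tau[(i, j)] ** alpha) * (eta[(i, j)] ** beta)
               for j in candidates}
    total = sum(weights.values())
    if total == 0:
        # All weights vanish: fall back to a uniform choice.
        return {j: 1.0 / len(candidates) for j in candidates}
    return {j: w / total for j, w in weights.items()}

def choose_next(i, candidates, tau, eta, rng=random):
    """Sample the next vertex according to the transition probabilities."""
    probs = transition_probabilities(i, candidates, tau, eta)
    vertices = list(probs)
    return rng.choices(vertices, weights=[probs[v] for v in vertices], k=1)[0]

# Toy step: equal pheromone on both edges, heuristic favors vertex "b".
tau = {("a", "b"): 1.0, ("a", "c"): 1.0}
eta = {("a", "b"): 2.0, ("a", "c"): 1.0}
probs = transition_probabilities("a", ["b", "c"], tau, eta)
next_vertex = choose_next("a", ["b", "c"], tau, eta)
```

With the default α = β = 2, the heuristic ratio of 2:1 in the toy data becomes a 4:1 preference for vertex "b".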
The fitness function
In order to guide the algorithm towards good solutions, a fitness function was defined to assess the quality of the solutions. Let Q = (q_{1}, q_{2},..., q_{L}) be a mimotope (or a motif) of length L and P = (p_{1}, p_{2},..., p_{L}) be a simple path on the surface graph obtained by an ant. Then, the alignment score between Q and P is defined as:
Updating the pheromone trail
After all the ants had completed one iteration, the pheromones were updated. First, we defined the elite ants as follows: an ant was appointed as an elite ant only if the fitness value of the path it obtained was greater than a threshold. Only the elite ants were permitted to leave pheromone on their own paths. The pheromones were updated according to equations (5) and (6).
Equation (5) consists of two parts and k represents the kth searching step. The left part makes the pheromone on all edges decay. The speed of this decay is defined by the evaporation parameter ρ (0 <ρ < 1) (default ρ = 0.05). The right part increases the pheromones on all the edges visited by the elite ants. The amount of pheromone that the elite ant deposits on an edge is defined by the fitness value of the path created by the ant, as in equation (6). In this way, the increase of pheromone for an edge depends on the number of the elite ants that use this edge, and on the quality of the solutions found by those ants.
In order to enhance the exploration of the ants and to overcome premature convergence of the ACO algorithm, an adaptive strategy was employed to determine the threshold used to select the elite ants: (i) initially, the threshold was set to 1; (ii) if, within 300 iterations, the total number of elite ants determined in each iteration was less than 5, the threshold was decreased by 0.1; (iii) if, within 20 iterations, the total number of elite ants determined in each iteration was greater than 10, the threshold was increased by 0.1. In addition, following Stützle and Hoos [37], we defined an upper and a lower limit (τ_{max} and τ_{min}) for the pheromone values. Stützle and Hoos defined τ_{max} and τ_{min} algebraically, based on the probability of constructing the best solution found when all the pheromone values have converged to either τ_{max} or τ_{min}. In our approach, the aim of the ACO algorithm was mainly to provide a set of good-quality solutions rather than a single best solution. Therefore, we defined τ_{max} as the maximum value minus the minimum value in the amino-acid substitution matrix used, and τ_{min} as zero.
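The pheromone update described above (evaporation on every edge, a fitness-proportional deposit by the elite ants, and clamping to [τ_{min}, τ_{max}]) can be sketched as follows. Equations (5) and (6) are not reproduced in this text, so the exact form used here is an assumption; the function name, the `(k, i, j)` key layout, the step indexing and the toy values are hypothetical.

```python
def update_pheromone(tau, elite_paths, rho=0.05, tau_min=0.0, tau_max=1.0):
    """Evaporate, deposit and clamp pheromone in place.

    tau:         {(k, i, j): value}, pheromone for edge i -> j at step k
    elite_paths: list of (path, fitness) pairs produced by the elite ants
    """
    deposit = {}
    for path, fitness in elite_paths:
        # Assumed indexing: step k covers the move from path[k-1] to path[k].
        for k in range(1, len(path)):
            key = (k, path[k - 1], path[k])
            deposit[key] = deposit.get(key, 0.0) + fitness
    for key in tau:
        value = (1.0 - rho) * tau[key] + deposit.get(key, 0.0)
        tau[key] = min(tau_max, max(tau_min, value))
    return tau

# Toy example: one elite ant walked a -> b with fitness 0.2.
tau = {(1, "a", "b"): 0.5, (1, "a", "c"): 0.5}
update_pheromone(tau, [(["a", "b"], 0.2)])
```

In the toy run, the edge used by the elite ant rises to about 0.675 while the unused edge decays to about 0.475; in practice `tau_max` would be set from the substitution matrix as described above.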
Output of epitope candidates
While the ACO algorithm was running, all paths obtained by the elite ants were stored in a local database. Putative epitope candidates were then produced from this set of paths using one of two strategies, depending on the kind of input sequence, i.e. a set of mimotopes or a motif. For a set of mimotopes, a clustering strategy was employed (described in the following sections); for a motif, the n highest-scoring paths were chosen directly as the epitope candidates.
P-value calculation for a path
Typically, a set of input mimotopes contains a number of amino-acid sequences of different lengths. In order to assess the paths obtained with different mimotopes on a common scale, we calculated the probability of randomly obtaining a path with a specific score, i.e. the P-value of the path. According to the work by Mayrose et al. [29], the distribution of the scores of random paths can be approximated by an extreme value distribution, whose parameters are fitted from the empirical distribution using the method of moments. To obtain a reasonable empirical distribution of alignment scores, we generated a set of m (default m = 10^{6}) random simple paths on the surface graph for every mimotope, and each random simple path was then aligned to the mimotope.
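The P-value estimate can be sketched as below. The sketch assumes that the extreme value distribution is the Gumbel (type I) distribution for maxima, for which the method of moments gives scale = s·√6/π and location = mean − γ·scale, with γ the Euler-Mascheroni constant; the function names and the toy scores are hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def fit_gumbel(scores):
    """Method-of-moments fit of a Gumbel (maximum) distribution."""
    mean = statistics.fmean(scores)
    sd = statistics.stdev(scores)
    scale = sd * math.sqrt(6.0) / math.pi
    loc = mean - EULER_GAMMA * scale
    return loc, scale

def p_value(score, loc, scale):
    """P(random score >= score) under the fitted Gumbel distribution."""
    z = (score - loc) / scale
    # 1 - exp(-exp(-z)), written with expm1 for accuracy at small P-values.
    return -math.expm1(-math.exp(-z))

# Toy stand-in for the alignment scores of random simple paths.
random_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
loc, scale = fit_gumbel(random_scores)
p_low = p_value(5.0, loc, scale)
p_high = p_value(10.0, loc, scale)
```

A path would then be kept as a result path when its P-value falls below the chosen cut-off.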
Creating a weighted graph of the result paths
We then selected those paths whose P-values were less than or equal to 10^{-3} as the result paths and created a weighted graph of the result paths, G = (V, E), where V is the vertex set consisting of all the result paths and E is the edge set, in which any two vertices are connected by an edge if the corresponding paths share at least one residue. In addition, the weight of each vertex in G was defined as the P-value of the path.
Clustering the result paths based on DFS algorithm
The weighted graph defined above was generally not connected. Each connected component of the graph, which may consist of several connected paths, can be regarded as a potential epitope candidate. Here, the DFS algorithm [30] was employed to compute all the connected components of the weighted graph. According to Mayrose et al. [29], the surface accessible areas of 95% of all available epitopes in the PDB are not greater than 2000 Å^{2}. Moreover, a native epitope generally has fewer than 40 residues. Therefore, if the surface accessible area of a connected component was greater than 2000 Å^{2} or the number of residues in the connected component was greater than 40, the component was reduced in size by iteratively removing paths until the remaining part met both conditions. In each such iteration, the algorithm chose a path for removal such that the remaining connected components kept the maximum score. The score of a connected component was defined as the sum of -log(P-value) of the paths within it. Finally, the n connected components with the highest scores were output as the n epitope candidates (default n = 3).
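The clustering step can be sketched as follows: result paths are linked when they share a residue, and an iterative depth-first search collects the connected components. The function names and toy paths are hypothetical, the size-reduction step is omitted, and the score is taken as the sum of −log(P-value) on the assumption that more significant paths should contribute a larger score.

```python
import math

def cluster_paths(paths):
    """Group result paths into connected components.

    paths: list of sets of residue ids. Two paths are adjacent when they
    share at least one residue. Returns a list of sorted index lists.
    """
    n = len(paths)
    adjacency = {a: [b for b in range(n) if b != a and paths[a] & paths[b]]
                 for a in range(n)}
    seen, components = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:  # iterative depth-first search
            node = stack.pop()
            component.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(sorted(component))
    return components

def component_score(component, p_values):
    """Assumed score: sum of -log(P-value) over the paths in a component."""
    return sum(-math.log(p_values[i]) for i in component)

# Toy result paths given as sets of residue numbers, with toy P-values.
paths = [{1, 2, 3}, {3, 4}, {7, 8}, {8, 9}, {20}]
p_values = [1e-4, 1e-3, 1e-5, 1e-3, 1e-3]
components = cluster_paths(paths)
ranked = sorted(components, key=lambda c: component_score(c, p_values), reverse=True)
```

Taking the first n entries of `ranked` would correspond to reporting the n highest-scoring components as epitope candidates.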
Results
Epitope prediction based on mimotope analysis
In order to assess the predictive performance of Pep3DSearch, we applied it to ten test cases (see Table 1), all taken from previously published studies of similar methods. These test cases fulfilled the following requirements: (i) a set of mimotopes had been derived by screening with an antibody in a biopanning experiment; (ii) a 3D structure of the antibody-antigen complex was available; (iii) the native epitope of each test case had been crystallographically defined. Because PepSurf [29] and Mapitope follow a similar policy of fully scanning the mimotopes (or, in Mapitope [22,28], the neighboring amino acid pairs (AAPs) derived from the mimotopes) against the 3D structure of the antigen, we mainly compared the results from Pep3DSearch with those from these two methods.
Table 1. The test cases used for the assessment of Pep3DSearch's performance in mimotope analysis.
Epitope prediction using antibody-antigen test cases
The first test group (antibody-antigen test cases in Table 1) contained eight test cases from Mapitope, PepSurf and Mimox [26]. The first test case (labeled 1jrh in Table 1) contains 59 mimotopes of 5 residues in length. Lang et al. [38] analyzed the detailed interactions between the mAb A6 and the interferon gamma receptor (IFNgR) by using phage display to select 59 fragments of IFNgR mutants with high affinity for the mAb A6. These fragments can thus be regarded as mimotopes of the IFNgR, and the crystal structure of the mAb A6-IFNgR complex has been resolved (PDB id: 1jrh). In the second test case (labeled 1bj1 in Table 1), mimotopes were obtained by an experiment similar to that of the first case, but here the Fab fragment of a humanized neutralizing antibody (also known as rhuMAb VEGF) was mutated and selected for binding to the vascular endothelial growth factor (VEGF) by phage display [39]. The structure of the rhuMAb-VEGF complex has been deposited in the PDB (PDB id: 1bj1). In test cases three to eight, the six sets of mimotopes were obtained by screening phage display libraries with the 17b [22], 13b5 [22], Herceptin [40], Bo2C11 [41], Cetuximab Fab [42] and 82D6A3 IgG [43] antibodies, respectively (see Table 1), and their corresponding Ab-Ag complex structures have been resolved (PDB id: 1g9m, 1e6j, 1n8z, 1iqd, 1yy9 and 2adf). In addition, the native epitope for each test case (1–8) is present in the CED database [44]. We analyzed the mimotopes in the test cases with Pep3DSearch, PepSurf and Mapitope. The results predicted by the three algorithms, evaluated in terms of the Matthews correlation coefficient (MCC) [45], sensitivity and precision, are shown in Table 2. The results in Table 2 show that Pep3DSearch successfully predicted all the mimotopes in all eight test cases. In particular, for the test cases 1bj1, 1n8z and 1yy9, the MCC, sensitivity and precision values of Pep3DSearch were considerably superior to those of PepSurf and Mapitope. For the test case 1iqd, PepSurf yielded the best performance (MCC: 0.1272; sensitivity: 0.2581; precision: 0.5); although Mapitope achieved the highest precision (0.9375), it gave the lowest MCC (-0.3502) and sensitivity (0.1415); Pep3DSearch yielded an inferior prediction with default parameters (MCC: 0.0356; sensitivity: 0.1277; precision: 0.375), whereas it obtained a better prediction with the distance parameter CB and a threshold of 6.5 (MCC: 0.1604; sensitivity: 0.2326; precision: 0.625, see Table 3). Furthermore, for the test cases 1jrh, 1g9m, 1e6j and 2adf, Pep3DSearch and PepSurf gave better predictions, while Mapitope failed in the test cases 1e6j and 2adf.
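For reference, the three evaluation measures can be computed from residue sets as sketched below. The sketch uses the standard textbook definitions (sensitivity = TP/(TP+FN), precision = TP/(TP+FP), and the usual MCC formula), which may differ in detail from the definitions behind Table 2; the function name, the choice of which residues count as negatives and the toy residue sets are hypothetical.

```python
import math

def evaluate(predicted, native, all_residues):
    """Return (MCC, sensitivity, precision) for a predicted epitope.

    predicted, native: sets of residue ids; all_residues: the set of
    residues considered (assumed here to be the evaluation universe).
    """
    tp = len(predicted & native)
    fp = len(predicted - native)
    fn = len(native - predicted)
    tn = len(all_residues - predicted - native)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return mcc, sensitivity, precision

# Toy example: ten residues, four in the native epitope, three predicted.
all_residues = set(range(10))
native = {0, 1, 2, 3}
predicted = {2, 3, 4}
mcc, sensitivity, precision = evaluate(predicted, native, all_residues)
```

In the toy example two of the three predicted residues are correct and two native residues are missed, so sensitivity is 0.5 and precision is 2/3.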
Table 2. Evaluation and comparison of the performances of Pep3DSearch.
Table 3. Comparison of the predictive performance of Pep3DSearch with different distance parameters (CB).
Using Pep3DSearch for the prediction of protein-protein interacting sites
In order to compare Pep3DSearch with previously published algorithms, we applied it to detect the interface residues of the interacting proteins in two test cases, 1avz and 1hx1 (protein-protein test cases in Table 1), which were taken from PepSurf. Rickles et al. [46] used the Fyn-SH3 domain to screen a semi-combinatorial random peptide library and obtained 18 affinity-selected peptides. The co-crystal structure of the Fyn-SH3 domain with its interacting protein Nef and the Fyn-SH2 domain is now available (PDB id: 1avz). The second test case was taken from the work by Takenaka et al. [47]. They screened a random phage library against the 70 kDa heat shock cognate (Hsc70) protein and obtained a set of peptides that bind Hsc70. The structure of Hsc70 with its interacting protein, the Bag chaperone regulator, has been deposited in the PDB (PDB id: 1hx1). For each of the above test cases, the prediction was compared to the 'true' protein-protein interacting site, which was inferred using the 'Contact Map Analysis' server [48].
Table 2 shows that both Pep3DSearch and PepSurf obtained better results than Mapitope. In particular, for the test case 1hx1, the results showed a complementarity between Pep3DSearch and PepSurf. The 24 contacting residues of protein Hsc70 and the Bag chaperone regulator inferred by the Contact Map Analysis server were R205 KA (208–209) IE (211–212) MK (215–216) LE (218–219) IDTLIL (221–226) R234 RK (237–238) VK (241–242) Q245 L248 D252 E255; the 39 contacting residues predicted by Pep3DSearch were GNS (150–152) E155 V157 K161 H164 K167 K171 AD (173–174) L200 K202 D204 R205 R206 KA (208–209) I211 M215 L218 FKD (230–232) R234 LK (235–236) RK (237–238) G239 VK (241–242) K243 Q245 AF (246–247) L248 AE (249–250); the 25 contacting residues suggested by PepSurf were K161 KHL (163–165) KS (167–168) E182 GI (185–186) D204 R205 R206 KA (208–209) I211 MK (215–216) I217 LE (218–219) E220 DT (222–223) L248 E255. These results show that six epitope residues predicted by Pep3DSearch (R234, R237, K238, V241, K242 and Q245) were missed by PepSurf, while five epitope residues predicted by PepSurf (K216, E219, D222, T223 and E255) were missed by Pep3DSearch.
The overall performance of each method was measured by the average MCC, sensitivity and precision. Compared with PepSurf and Mapitope, Pep3DSearch achieved the best average MCC and precision and the second-best average sensitivity (average MCC, sensitivity and precision were 0.1758, 0.3642 and 0.6948 for Pep3DSearch; 0.1589, 0.3944 and 0.5409 for PepSurf; and 0.1053, 0.3404 and 0.4081 for Mapitope; see Figure 2). In addition, Pep3DSearch provides three parameters for determining neighbor residue pairs on the antigen surface: CB, CA and AHA. The results obtained with each parameter are listed in Tables 3 to 5, and the overall performance in terms of average MCC, sensitivity and precision is shown in Figure 3. In general, Pep3DSearch obtained better results with the parameter CA (distance threshold = 7) than with the other parameters, so CA with a distance threshold of 7 was set as the default.
Figure 2. Overall performance evaluation of Pep3DSearch using average MCC, sensitivity and precision values. Pep3DSearch obtained the best average MCC and precision and the second-best average sensitivity; PepSurf obtained the best average sensitivity and the second-best average MCC and precision; Mapitope gave inferior results in comparison with the other two methods.
Figure 3. Overall performance analysis of Pep3DSearch with different distance parameters CB, CA and AHA. With the parameter CA (DT (distance threshold) = 7), Pep3DSearch obtained the best average MCC (0.1758) and precision (0.6948), and a better average sensitivity (0.3642). In Pep3DSearch, the parameter CA with distance threshold 7 is set as the default.
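For readers who want to reproduce this kind of evaluation, the sketch below computes the three measures from residue sets using their standard textbook definitions. It is an illustration under stated assumptions: the function name `evaluate`, the choice to count true negatives over a supplied set of candidate (e.g. surface) residues, and the toy numbers are all hypothetical and are not taken from Pep3DSearch or from Table 2.

```python
import math


def evaluate(predicted, true_epitope, candidates):
    """Return (MCC, sensitivity, precision) for a residue-level prediction.

    predicted, true_epitope: iterables of residue identifiers
    candidates: all residues that could have been predicted (assumption:
                true negatives are counted over this set)
    """
    predicted, true_epitope = set(predicted), set(true_epitope)
    candidates = set(candidates)
    tp = len(predicted & true_epitope)
    fp = len(predicted - true_epitope)
    fn = len(true_epitope - predicted)
    tn = len(candidates - predicted - true_epitope)

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return mcc, sensitivity, precision


# Toy example with made-up residue numbers: 20 candidate residues,
# 5 true epitope residues, 4 predicted residues of which 3 are correct.
toy_candidates = range(1, 21)
toy_true = {1, 2, 3, 4, 5}
toy_predicted = {3, 4, 5, 6}
mcc, sensitivity, precision = evaluate(toy_predicted, toy_true, toy_candidates)
print(f"MCC={mcc:.4f} sensitivity={sensitivity:.2f} precision={precision:.2f}")
```

Averaging these three values over all test cases gives figures of the kind reported above; the per-case counts needed to do so are in the tables, not in this sketch.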
Table 4. Comparison of the predictive performance of Pep3DSearch with different distance parameters (CA).
Table 5. Comparison of the predictive performance of Pep3DSearch with different distance parameters (AHA).
Epitope prediction based on motif mapping
Pep3DSearch can also predict epitopes by motif mapping. The motif sequence can be derived from the set of mimotopes with a multiple sequence alignment tool such as ClustalW [49], or directly with the Mimox web service, and is thus expected to contain the residues that are important for the interaction between the antibody and the antigen. After mapping the motif sequence onto the antigen surface, Pep3DSearch obtains a set of matched paths, and the top-scoring paths are selected as the epitope candidates. To assess the performance of Pep3DSearch in this mode, six test cases were used; the results are listed in Table 6 and Supplementary Tables S1 to S5 [see Additional file 1]. Here, we describe the experiment for test case 1e6j (Table 6) in detail. This test case is taken from Mapitope and Mimox. Enshell-Seijffers et al. [22] used the mAb 13B5 (which recognizes the HIV-1 capsid protein p24) to screen a phage-displayed random peptide library and obtained a set of 16 mimotopes. The structure of p24 in complex with 13B5 has been resolved [PDB: 1e6j], and the 13B5 epitope, which is composed of ALGPAATEE (204–210, 212, 213) TA (216–217), has been recorded in the CED database as CE0170. Using Mapitope, Enshell-Seijffers et al. suggested that the 13B5 epitope residues might consist of E187 D197 A204 GPAA (206–209) EE (212–213) A217, of which all but E187 and D197 are epitope residues. It should be noted that when all parameters were set to their defaults, Mapitope predicted the candidate residues A194 N195 P196 D197 C198 A217 (i.e. only one of the six predicted residues was an epitope residue). Furthermore, Huang et al. [26] derived a motif sequence, [DE]V[FM]GPL[STDE]TXX[DE], from the 16 mimotopes using Mimox. Because Mimox cannot directly analyze a motif sequence of this type, they manually parsed the motif into three fragments, GPL, ET and EE, and used each fragment in turn as the motif sequence to predict the 13B5 epitope with Mimox.
For the fragment GPL, the top two candidates given by Mimox were G206 P207 L205 and G106 P49 L52; for the fragment ET, the top three candidates were E212 T216, E213 T216 and E212 T210; for the fragment EE, the top three candidates were E28 E29, E29 E28 and E212 E213. Using Pep3DSearch, we directly mapped the motif sequence, [DE]V[FM]GPL[STDE]TXX[DE], onto the antigen surface of p24 to predict the 13B5 epitope. Under the similar match mode (i.e. using the substitution matrix M_Blosum62, see Scoring amino acid similarities) and the parameter AHA (distance threshold = 4), the top ten candidates predicted by Pep3DSearch are listed in Table 6. All ten candidates were localized in the epitope region. The eighth-ranked candidate gave the best result: D197 I201 L205 G206 P207 A209 E213 T210 M214 A217 T216 E212. Taken together, the top ten candidates comprise 25 residues, which cover 10 of the 11 residues of the 13B5 epitope. The other five experiments followed the same procedure, and their results are listed in Supplementary Tables S1 to S5 [see Additional file 1]. These experiments show that Pep3DSearch is effective and efficient in predicting epitopes in motif mode.
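To make the motif notation concrete, the sketch below parses a motif such as [DE]V[FM]GPL[STDE]TXX[DE] into per-position sets of allowed residues, treating X as a wildcard. It illustrates the notation only: the function names are hypothetical, matching is by exact identity rather than by the BLOSUM62-based similarity mode used in the experiment above, and gaps are not handled.

```python
def parse_motif(motif):
    """Parse a motif string into a list with one entry per position.

    Each entry is a frozenset of allowed one-letter residue codes,
    or None for the wildcard X (any residue).
    """
    motif = motif.replace(" ", "")
    positions = []
    i = 0
    while i < len(motif):
        ch = motif[i]
        if ch == "[":
            close = motif.index("]", i)          # end of the bracketed group
            positions.append(frozenset(motif[i + 1:close]))
            i = close + 1
        elif ch == "X":
            positions.append(None)               # wildcard position
            i += 1
        else:
            positions.append(frozenset(ch))      # single fixed residue
            i += 1
    return positions


def count_constrained_matches(positions, peptide):
    """Count non-wildcard motif positions satisfied by an ungapped peptide."""
    return sum(
        1
        for allowed, aa in zip(positions, peptide)
        if allowed is not None and aa in allowed
    )


motif_positions = parse_motif("[DE]V[FM]GPL[STDE]TXX[DE]")
print(len(motif_positions), "positions")
print(count_constrained_matches(motif_positions, "EVFGPLSTAAE"))
```

The motif has 11 positions, 9 of which constrain the residue type. The peptide `EVFGPLSTAAE` is a made-up example that satisfies all of them.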
Additional file 1. Supplementary experiment results. The file contains supplementary tables S1 to S6.
Format: PDF Size: 23KB
Table 6. Epitope prediction of the test case 1e6j (chain: P) based on motif mapping: the motif sequence taken from Mimox is [DE]V[FM]GPL[STDE]TXX[DE]; the native epitope recorded in CED (id: CE0170) is ALGPAATEE (204–210, 212, 213) TA (216–217); the parameters of Pep3DSearch are similarity mode and AHA (distance threshold = 4).
The searching capability of Pep3DSearch
In general, the searching algorithm has a great impact on the effectiveness and efficiency of an epitope prediction program, and is therefore the most important part of the design. In Pep3DSearch, the ACO algorithm, a heuristic algorithm, is employed to search for mimotopes or motifs on an antigen surface. To evaluate the capability of the ACO algorithm to find target paths of various lengths on the antigen surface, we took gp120 (the envelope protein of HIV; chain G; PDB id: 1g9m; the antigen has 304 residues, see Table 2) as the target antigen and randomly selected paths with lengths from 9 to 25 residues (odd numbers) on the antigen surface as the search goals. As shown in Figure 4, a path of 25 residues on the gp120 surface was selected first: E351 S347 K343 Q344 K348 I272 N234 G237 N94 K97 D99 M100 K487 V489 L226 V488 A224 A219 Y217 C218 Q246 V84 L86 N88 T240, in which the Euclidean distance between any two neighboring residues is less than or equal to 7.5 Å. From this path, 9 subpaths with lengths from 9 to 25 residues (odd numbers) were randomly selected as the test cases (see Table 7 and Supplementary Table S6 in Additional file 1). Here, we describe one experiment in detail to explain the search for the target path of 21 residues on the gp120 surface. The target path is E351 S347 K343 Q344 K348 I272 N234 G237 N94 K97 D99 M100 K487 V489 L226 V488 A224 A219 Y217 C218 Q246 (see Table 7). We used the target path itself and mutated versions of it as input sequences for Pep3DSearch to localize the target path on the gp120 surface. In the mutated versions, some residues of the original sequence were randomly changed (mutation rates from 10% to 30%). From Table 7, it can be seen that Pep3DSearch quickly localized the region of the target path, within 5000 iterations.
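The neighbor condition on a path (consecutive residues no more than 7.5 Å apart) can be sketched as a graph over residue coordinates. The code below is an illustration under assumptions: each residue is represented by a single point (for example its CA atom), a path is taken to be simple (no residue visited twice), and the coordinates are made up rather than read from 1g9m.

```python
import math


def build_surface_graph(coords, threshold):
    """Link every pair of residues whose representative points lie
    within `threshold` angstroms of each other.

    coords: dict mapping residue id -> (x, y, z)
    Returns a dict mapping residue id -> set of neighboring residue ids.
    """
    ids = list(coords)
    graph = {r: set() for r in ids}
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if math.dist(coords[a], coords[b]) <= threshold:
                graph[a].add(b)
                graph[b].add(a)
    return graph


def is_valid_path(path, graph):
    """True if no residue repeats and consecutive residues are neighbors."""
    no_repeats = len(set(path)) == len(path)
    connected = all(b in graph[a] for a, b in zip(path, path[1:]))
    return no_repeats and connected


# Hypothetical coordinates: four residues on a line, 5 angstroms apart.
toy_coords = {
    "A": (0.0, 0.0, 0.0),
    "B": (5.0, 0.0, 0.0),
    "C": (10.0, 0.0, 0.0),
    "D": (15.0, 0.0, 0.0),
}
toy_graph = build_surface_graph(toy_coords, threshold=7.5)
print(is_valid_path(["A", "B", "C", "D"], toy_graph))  # neighbors are 5 A apart
print(is_valid_path(["A", "C"], toy_graph))            # 10 A apart, not linked
```

The pairwise loop is quadratic in the number of residues, which is adequate for a surface of a few hundred residues such as the 304-residue antigen used here.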
When the input sequence was the target path itself (ESKQKINGNKDMKVLVAAYCQ), the path localized by Pep3DSearch after 5000 iterations was E351 S347 K343 Q344 K348 I272 N234 G237 N94 K97 D99 V488 K487 V489 L226 V245 A224 A219 Y217 C247 Q246, which overlaps 19 of the 21 residues in the target path; when the iteration number was set to 25000, Pep3DSearch precisely localized the target path. When the iteration number was 30000, the path localized by Pep3DSearch was E351 S347 K343 Q344 K348 I272 N234 G237 N94 K97 D99 M100 K487 V489 L226 V488 A224 A219 Y217 C247 Q246. Although the twentieth residue (C247) on the localized path is not identical to the corresponding residue (C218) on the target path, both are cysteine. When a mutated sequence was used as input, Pep3DSearch still localized the region of the target path. For example, using ESKDRINGNCDMKVHVAAYAQ (mutation rate 25%) as input, Pep3DSearch gave the top-ranked output E267 T232 K231 N229 K485 F233 N234 G237 N94 ___ D99 M100 K487 V488 ___ I491 G222 A219 F223 A224 Q246 after 10000 iterations. As shown in Table 7, although this was the worst result in this test case, the output still overlaps 10 of the 21 residues in the target path.
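The overlap counts quoted above can be reproduced by treating each path as a set of residues, and the mutation rate by comparing the two input sequences position by position. The sketch below uses the paths and sequences from the text; the function names are illustrative, and the gap symbols in the mutated-sequence result are simply left out.

```python
target_path = (
    "E351 S347 K343 Q344 K348 I272 N234 G237 N94 K97 D99 M100 K487 "
    "V489 L226 V488 A224 A219 Y217 C218 Q246"
).split()
found_5000 = (
    "E351 S347 K343 Q344 K348 I272 N234 G237 N94 K97 D99 V488 K487 "
    "V489 L226 V245 A224 A219 Y217 C247 Q246"
).split()
# Top-ranked output for the mutated input; the two gap positions are omitted.
found_mutated = (
    "E267 T232 K231 N229 K485 F233 N234 G237 N94 D99 M100 K487 V488 "
    "I491 G222 A219 F223 A224 Q246"
).split()


def overlap(found, target):
    """Number of target residues present in the found path (order ignored)."""
    return len(set(found) & set(target))


def mutation_rate(original, mutated):
    """Fraction of positions at which two equal-length sequences differ."""
    if len(original) != len(mutated):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(original, mutated)) / len(original)


original_seq = "ESKQKINGNKDMKVLVAAYCQ"
mutated_seq = "ESKDRINGNCDMKVHVAAYAQ"

print(overlap(found_5000, target_path), "of", len(target_path))
print(overlap(found_mutated, target_path), "of", len(target_path))
print(f"{mutation_rate(original_seq, mutated_seq):.1%}")
```

The two sequences differ at 5 of 21 positions, about 24%, which corresponds to the nominal 25% mutation rate stated in the text.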
Figure 4. A path on the gp120 (the envelope protein of HIV) surface. The path on the gp120 surface that is used to evaluate the searching capability of Pep3DSearch is composed of 25 residues, E351 S347 K343 Q344 K348 I272 N234 G237 N94 K97 D99 M100 K487 V489 L226 V488 A224 A219 Y217 C218 Q246 V84 L86 N88 T240, in which the Euclidean distance between any two neighboring residues is less than or equal to 7.5 Å.
Table 7. Evaluation of Pep3DSearch's searching capability.
The experiments for the other eight test cases followed the same procedure as the one described above, and their results are listed in Supplementary Table S6 [see Additional file 1]. The experiments demonstrate the search capability of Pep3DSearch: in particular, as the query sequence became longer, the number of iterations needed to localize the target path on the protein surface did not change significantly. Thus, Pep3DSearch can be used to quickly localize the epitope regions mimicked by longer mimotopes (more than 20 residues), and the proposed ACO algorithm has further potential in other applications involving sequence-structure alignment.
Discussion
In this study we developed a method, Pep3DSearch, for epitope prediction based on mimotope and motif analysis. An ACO algorithm was proposed for aligning a 1D mimotope sequence (or a motif sequence) to the 3D structure of an antigen, and a screening strategy based on P-value calculation and a clustering strategy based on the DFS algorithm were employed to localize epitope candidate regions. Compared with competing methods, Pep3DSearch adopts a simple and natural strategy to deal with matches, gaps and deletions when aligning a sequence to an antigen surface, which makes it more efficient and effective, not only for sequence search but also for motif discovery.
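As a rough illustration of how an ant colony search can align a sequence to a surface graph, the sketch below implements a deliberately simplified variant. It is not the Pep3DSearch algorithm: it scores only exact residue matches, ignores gaps and deletions, and its parameters (`n_iter`, `n_ants`, `rho`, `tau_min`), the heuristic weight of 2.0 for a matching residue, and the toy graph are all assumptions made for this example.

```python
import random


def aco_align(graph, labels, query, n_iter=200, n_ants=10,
              rho=0.1, tau_min=0.01, seed=0):
    """Search for a simple path whose residue labels match `query`.

    graph:  dict node -> set of neighboring nodes
    labels: dict node -> one-letter residue code
    Returns (best_path, best_score), where the score is the number of
    positions at which the path residue equals the query residue.
    """
    rng = random.Random(seed)
    nodes = sorted(graph)
    tau = {(u, v): 1.0 for u in nodes for v in graph[u]}  # pheromone per edge
    best_path, best_score = None, -1

    for _ in range(n_iter):
        iter_path, iter_score = None, -1
        for _ in range(n_ants):
            # Choose a start node, favoring a match to the first query residue.
            start_w = [2.0 if labels[n] == query[0] else 1.0 for n in nodes]
            path = [rng.choices(nodes, weights=start_w)[0]]
            while len(path) < len(query):
                cur = path[-1]
                options = sorted(graph[cur] - set(path))
                if not options:
                    break  # dead end: this ant is discarded
                want = query[len(path)]
                weights = [tau[(cur, v)] * (2.0 if labels[v] == want else 1.0)
                           for v in options]
                path.append(rng.choices(options, weights=weights)[0])
            if len(path) == len(query):
                score = sum(labels[n] == q for n, q in zip(path, query))
                if score > iter_score:
                    iter_path, iter_score = path, score
        # Evaporate pheromone, keeping a floor so no edge is ruled out.
        for edge in tau:
            tau[edge] = max(tau_min, tau[edge] * (1.0 - rho))
        # Reinforce the edges of the best path found in this iteration.
        if iter_path is not None:
            for u, v in zip(iter_path, iter_path[1:]):
                tau[(u, v)] += iter_score / len(query)
            if iter_score > best_score:
                best_path, best_score = iter_path, iter_score
    return best_path, best_score


# Hypothetical five-residue ring with made-up labels.
toy_graph = {"n1": {"n2", "n5"}, "n2": {"n1", "n3"}, "n3": {"n2", "n4"},
             "n4": {"n3", "n5"}, "n5": {"n4", "n1"}}
toy_labels = {"n1": "A", "n2": "C", "n3": "D", "n4": "A", "n5": "G"}
toy_path, toy_score = aco_align(toy_graph, toy_labels, "ACD")
print(toy_path, toy_score)
```

The pheromone floor is borrowed from the general idea of bounded pheromone levels; whether Pep3DSearch uses such a bound is not stated here. A substitution-matrix score could replace the identity test to approximate the similarity mode.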
We conducted different sets of experiments to assess our method's performance. The results show that our method is comparable to other similar methods. In some test cases, our method is superior to the others or can provide complementary information to them. In addition, to examine the searching capability of our method, a set of test cases with sequences of different lengths was constructed. The experiment showed that our method is capable of searching sequences on a structure: in particular, as the query sequence became longer (up to 25 residues), the number of iterations needed by Pep3DSearch to precisely localize the sequence did not change significantly. Thus the method has further potential for localizing the epitope regions mimicked by longer mimotopes. For example, using an mRNA display technique, one can obtain affinity-selected peptides of more than 20 residues against an antibody [50]. Moreover, the method also has potential for other applications, such as querying pathways in protein-protein interaction networks [51]. The Pep3DSearch algorithm depends on several parameters that may influence its prediction accuracy, such as the iteration number, the gap penalty and the distance threshold that defines two neighbor residues. However, because of the limited availability of benchmark datasets, we examined only a limited set of values for each parameter and could not properly learn these parameters. In our experiments, varying these parameters within a reasonable range did not significantly influence the prediction results (see Tables 3 to 5).
The Pep3DSearch algorithm is divided into three steps: generating random paths on the surface graph of an antigen for P-value calculation (which is not needed for motif analysis), searching for the optimal paths for each mimotope (or a motif), and clustering these paths into several epitope candidates. The running time of the algorithm mainly depends on the number of graph edges, the number of mimotopes, the length of each mimotope (or the motif), and the number of random paths generated for P-value calculation. For a mimotope with 14 or 15 amino acids, generating 10^6 random paths to obtain the empirical distribution of alignment scores for P-value calculation may take about 10 minutes (using a PC with an Intel Core 2 processor at 1.86 GHz); searching for the optimal paths may take a few minutes (the iteration number is 20000 by default); clustering the paths completes in a few seconds. The main computational burden of the algorithm therefore comes from the P-value calculation.
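The clustering step can be pictured as finding connected components among the matched paths with a depth-first search. The sketch below is one plausible reading, not the published procedure: it assumes two paths belong to the same cluster when they share at least `min_shared` residues, and both that criterion and the example paths are hypothetical.

```python
def cluster_paths(paths, min_shared=1):
    """Group paths into clusters using an iterative depth-first search.

    paths: list of paths, each a list of residue identifiers
    Two paths are linked if they share at least `min_shared` residues
    (an assumed criterion). Returns a list of clusters, each a sorted
    list of path indices.
    """
    sets = [set(p) for p in paths]
    n = len(paths)
    linked = {
        i: [j for j in range(n)
            if j != i and len(sets[i] & sets[j]) >= min_shared]
        for i in range(n)
    }
    seen, clusters = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:                      # depth-first traversal
            i = stack.pop()
            component.append(i)
            for j in linked[i]:
                if j not in seen:
                    seen.add(j)
                    stack.append(j)
        clusters.append(sorted(component))
    return clusters


def cluster_residues(paths, cluster):
    """Union of the residues of all paths in one cluster."""
    return set().union(*(paths[i] for i in cluster))


# Made-up paths: the first two share T216, the third is isolated.
toy_paths = [["E212", "E213", "T216"], ["T216", "A217"], ["E28", "E29"]]
toy_clusters = cluster_paths(toy_paths)
print(toy_clusters)
print(sorted(cluster_residues(toy_paths, toy_clusters[0])))
```

Each cluster's residue union would then serve as one epitope candidate region.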
In theory, estimating the statistical parameters of the alignment score distribution requires a large number of random paths on the surface graph of the antigen to be aligned to the mimotopes. In practice, the number of random paths is determined by a given time limit, so that the algorithm can trade off computation time against the accuracy of the final results. We set the number to 10^6 by default. In general, when a set of mimotopes is analyzed, the running time of the algorithm increases linearly with the number of mimotopes. However, because a collection of random paths generated for P-value calculation can be reused for all mimotopes of the same length in the set, the actual running time is much shorter in practice.
We plan to improve our method by further research in at least four areas: 1) improving the identification of surface-exposed residues in an antigen; 2) exploring more effective strategies in the ACO algorithm for searching a path and for dealing with matches, gaps and deletions when aligning a sequence to the antigen surface; 3) choosing a better amino-acid substitution matrix in the scoring procedure for specialized applications; and 4) studying more efficient methods for P-value calculation.
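The reuse of random paths across mimotopes of equal length, and the empirical P-value itself, can be sketched as follows. This is a simplified illustration: random paths are drawn as self-avoiding walks, the P-value is the fraction of null scores at least as large as the observed score with an add-one correction (a common convention, assumed here), and the cache, function names and toy graph are hypothetical.

```python
import random


def random_path(graph, length, rng):
    """Draw one random simple path; return None if the walk hits a dead end."""
    path = [rng.choice(sorted(graph))]
    while len(path) < length:
        options = sorted(graph[path[-1]] - set(path))
        if not options:
            return None
        path.append(rng.choice(options))
    return path


def null_paths(graph, length, n_paths, cache, seed=0):
    """Return `n_paths` random paths of a given length, generated once per
    length and reused afterwards (the cache assumes a single graph)."""
    if length not in cache:
        rng = random.Random(seed)
        paths, attempts = [], 0
        while len(paths) < n_paths and attempts < 100 * n_paths:
            attempts += 1
            p = random_path(graph, length, rng)
            if p is not None:
                paths.append(p)
        cache[length] = paths
    return cache[length]


def empirical_p_value(observed, null_scores):
    """Fraction of null scores >= observed, with an add-one correction."""
    at_least = sum(s >= observed for s in null_scores)
    return (at_least + 1) / (len(null_scores) + 1)


# Hypothetical five-residue ring; every walk of length 3 can be completed.
ring = {"n1": {"n2", "n5"}, "n2": {"n1", "n3"}, "n3": {"n2", "n4"},
        "n4": {"n3", "n5"}, "n5": {"n4", "n1"}}
path_cache = {}
first = null_paths(ring, 3, 50, path_cache)
second = null_paths(ring, 3, 50, path_cache)   # served from the cache
print(len(first), first is second)
print(empirical_p_value(3, [0, 1, 1, 2, 3]))
```

With 10^6 null paths the smallest P-value this estimator can report is about 10^-6, which is one reason the number of random paths bounds the accuracy of the result.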
Conclusion
This research makes two contributions to the field of epitope prediction. Firstly, an ACO algorithm was proposed to align a sequence or a motif to an antigen surface. Secondly, an application program, Pep3DSearch, was developed for epitope prediction based on mimotope or motif analysis. Pep3DSearch is a standalone program and is publicly accessible [see Additional file 2]. The program was tested and evaluated on several datasets [see Additional files 1, 3, 4 and 5]. The results indicate that Pep3DSearch is comparable to other similar tools.
Additional file 2. Source code, test datasets, Pep3DSearch toolkit and operation manual. The file is a ZIP archive containing the Visual Basic source code for Pep3DSearch, licensed under the GNU General Public License. It also contains the test datasets, the Pep3DSearch toolkit and the operation manual (in PDF format) of Pep3DSearch. Updated versions will be available at http://kyc.nenu.edu.cn/Pep3DSearch/.
Format: ZIP Size: 5MB
Additional file 3. An example of predicting epitopes based on mimotope analysis. The file is a ZIP archive containing all materials to predict the epitopes in the test case 1n8z using Pep3DSearch based on mimotope analysis.
Format: ZIP Size: 8.6MB
Additional file 4. An example of predicting epitopes based on motif analysis. The file is a ZIP archive containing all materials to predict the epitopes in the test case 1e6j using Pep3DSearch based on motif analysis.
Format: ZIP Size: 3.2MB
Additional file 5. An example of evaluating the searching capability of Pep3DSearch. The file is a ZIP archive containing all materials needed to evaluate Pep3DSearch's searching capability by localizing the 21-residue target path on the surface of the protein 1g9m (chain G), with the original and mutated sequences of the target path as inputs.
Format: ZIP Size: 16.6MB
Availability and requirements
Project name: Pep3DSearch
Project's homepage: http://kyc.nenu.edu.cn/Pep3DSearch/
Operating system: Windows XP Professional with Service Pack 2 (or later) with Microsoft .NET Framework 1.1 (or later) installed
Programming language: Visual Basic.Net
License: GNU GPL
Any restrictions to use by nonacademics: license needed for commercial use
Authors' contributions
YXH designed the algorithm, performed the experiments and the analysis, and drafted the manuscript. YLB conceived of the study and discussed and suggested improvements to the algorithm. SYG and YW collected the test data, carried out part of the experimental work and participated in writing the manuscript. CGZ designed the research and contributed ideas. YXL supervised and directed the development of the whole project and revised the manuscript critically. All authors have read and approved the final manuscript.
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant 30672068), the Distinguished Young Scholars Fund of Jilin Province (20050114), the Key Grant of Jilin Province Science & Technology Committee (2006092301), the Key Grant of Changchun City Science & Technology Committee (06GG147), the Program for New Century Excellent Talents in University (grant NCET060320), the China Postdoctoral Science Foundation (20080431048), and the Cultivation Fund of the Scientific and Technical Innovation Project of Northeast Normal University (grant NENUSTB07008).
References

1. van Regenmortel MH: Antigenicity and immunogenicity of synthetic peptides.
Biologicals 2001, 29:209-213.

2. Barlow DJ, Edwards MS, Thornton JM: Continuous and discontinuous protein antigenic determinants.
Nature 1986, 322:747-748.

3. van Regenmortel MH: Mapping epitope structure and activity: From one-dimensional prediction to four-dimensional description of antigenic specificity.
Methods 1996, 9:465-472.

4. De Groot AS: Immunome-derived vaccines.
Expert Opin Biol Ther 2004, 4:767-772.

5. Gomara MJ, Haro I: Synthetic peptides for the immunodiagnosis of human diseases.
Curr Med Chem 2007, 14(5):531-546.

6. Meloen RH, Puijk WC, Langeveld JP, Langedijk JP, Timmerman P: Design of synthetic peptides for diagnostics.
Curr Protein Pept Sci 2003, 4(4):253-260.

7. Gershoni JM, Roitburd-Berman A, Siman-Tov DD, Tarnovitski FN, Weiss Y: Epitope Mapping: The First Step in Developing Epitope-Based Vaccines.
Drug Development Biodrugs 2007, 21(3):145-156.

8. Alix AJ: Predictive estimation of protein linear epitopes by using the program PEOPLE.
Vaccine 1999, 18(3-4):311-314.

9. Odorico M, Pellequer J: BEPITOPE: predicting the location of continuous epitopes and patterns in proteins.
J Mol Recognit 2003, 16:20-22.

10. Saha S, Raghava GP: BcePred: Prediction of continuous B-cell epitopes in antigenic sequences using physicochemical properties. In ICARIS, LNCS. Volume 3239. Edited by Nicosia G, Cutello V, Bentley PJ, Timis J. Springer; 2004:197-204.

11. Larsen JE, Lund O, Nielsen M: Improved method for predicting linear B-cell epitopes.
Immunome Res 2006, 2:2.

12. Saha S, Raghava GP: Prediction of continuous B-cell epitopes in an antigen using recurrent neural network.
Proteins 2006, 65(1):40-48.

13. Sollner J, Mayer B: Machine learning approaches for prediction of linear B-cell epitopes on proteins.
J Mol Recognit 2006, 19(3):200-208.

14. Sollner J: Selection and combination of machine learning classifiers for prediction of linear B-cell epitopes on proteins.
J Mol Recognit 2006, 19(3):209-214.

15. Anderson PH, Nielsen M, Lund O: Prediction of residues in discontinuous B-cell epitopes using protein 3D structure.
Protein Science 2006, 15:2558-2567.

16. Kulkarni-Kale U, Bhosle S, Kolaskar AS: CEP: a conformational epitope prediction server.
Nucleic Acids Res 2005, 33:W168-W171.

17. Blythe MJ, Flower DR: Benchmarking B cell epitope prediction: underperformance of existing methods.
Protein Sci 2005, 14(1):246-248.

18. Greenbaum JA, Andersen PH, Blythe M, Bui HH, Cachau RE, Crowe J, Davies M, Kolaskar AS, Lund O, Morrison S, et al.: Towards a consensus on datasets and evaluation metrics for developing B-cell epitope prediction tools.
J Mol Recognit 2007, 20(2):75-82.

19. Ponomarenko JV, Bourne PE: Antibody-protein interactions: benchmark datasets and prediction tools evaluation.
BMC Structural Biology 2007, 7(2):64.

20. Pizzi E, Cortese R, Tramontano A: Mapping epitopes on protein surfaces.
Biopolymers 1995, 36:675-680.

21. Mumey BM, Bailey BW, Kirkpatrick B, Jesaitis AJ, Angel T, Dratz EA: A New Method for Mapping Discontinuous Antibody Epitopes to Reveal Structural Features of Proteins.
J Comput Biol 2003, 10:555-567.

22. Enshell-Seijffers D, Denisov D, Groisman B, Smelyanski L, Meyuhas R, Gross G, Denisova G, Gershoni JM: The mapping and reconstitution of a conformational discontinuous B-cell epitope of HIV-1.
J Mol Biol 2003, 334:87-101.

23. Halperin I, Wolfson H, Nussinov R: SiteLight: binding-site prediction using phage display libraries.
Protein Sci 2003, 12:1344-1359.

24. Schreiber A, Humbert M, Benz A, Dietrich U: 3D-Epitope-Explorer (3DEX): Localization of conformational epitopes within three-dimensional structures of proteins.
J of Comput Chem 2005, 26(9):879-887.

25. Moreau V, Granier C, Villard S, Laune D, Molina F: Discontinuous epitope prediction based on mimotope analysis.
Bioinformatics 2006, 22(9):1088-1095.

26. Huang J, Gutteridge A, Honda W, Kanehisa M: MIMOX: a web tool for phage display based epitope mapping.
BMC Bioinformatics 2006, 7:451.

27. Castrignano T, De Meo PD, Carrabino D, Orsini M, Floris M, Tramontano A: The MEPS server for identifying protein conformational epitopes.
BMC Bioinformatics 2007, 8(Suppl 1):S6.

28. Bublil EM, Freund NT, Mayrose I, Penn O, Roitburd-Berman A, Rubinstein ND, Pupko T, Gershoni JM: Stepwise prediction of conformational discontinuous B-cell epitopes using the Mapitope algorithm.
Proteins 2007, 68(1):294-304.

29. Mayrose I, Shlomi T, Rubinstein ND, Gershoni JM, Ruppin E, Sharan R, Pupko T: Epitope mapping using combinatorial phage-display libraries: a graph-based algorithm.
Nucleic Acids Res 2007, 35(1):69-78.

30. Sedgewick Robert: Algorithms in C++ Part 5: Graph Algorithms. 3rd edition. Addison-Wesley; 2001.

31. Sussman JL, Lin D, Jiang J, Manning NO, Prilusky J, Ritter O, Abola EE: Protein Data Bank (PDB): database of three-dimensional structural information of biological macromolecules.
Acta Crystallogr D Biol Crystallogr 1998, 54:1078-1084.

32. van Regenmortel MH, Pellequer JL: Predicting antigenic determinants in proteins: looking for unidimensional solutions to a three-dimensional problem?
Pept Res 1994, 7(4):224-228.

33. Tsodikov OV, Record MT, Sergeev YV: Novel computer program for fast exact calculation of accessible and molecular surface areas and average surface curvature.
J Comput Chem 2002, 23:600-609.

34. Ahmad S, Gromiha M, Fawareh H, Sarai A: ASAView: Database and tool for solvent accessibility representation in proteins.
BMC Bioinformatics 2004, 5:51.

35. Dorigo M, Maniezzo V, Colorni A: Ant System: Optimization by a Colony of Cooperating Agents.
IEEE Trans Syst Man Cybern B Cybern 1996, 26(1):841.

36. Dorigo M, Stützle T: The ant colony optimization metaheuristic: Algorithms, applications and advances. [http://iridia.ulb.ac.be/~meta/newsite/downloads/TR.11MetaHandBook.pdf]

37. Stützle T, Hoos H: MAX-MIN ant system.
Future Generation Computer Systems 2000, 16(8):889-914.

38. Lang S, Xu J, Stuart F, Thomas RM, Vrijbloed JW, Robinson JA: Analysis of antibody A6 binding to the extracellular interferon gamma receptor alpha-chain by alanine-scanning mutagenesis and random mutagenesis with phage display.
Biochemistry 2000, 39:15674-15685.

39. Chen Y, Wiesmann C, Fuh G, Li B, Christinger HW, McKay P, de Vos AM, Lowman HB: Selection and analysis of an optimized anti-VEGF antibody: crystal structure of an affinity-matured Fab in complex with antigen.
J Mol Biol 1999, 293:865-881.

40. Riemer AB, Klinger M, Wagner S, Bernhaus A, Mazzucchelli L, Pehamberger H, Scheiner O, Zielinski CC, Jensen-Jarolim E: Generation of peptide mimics of the epitope recognized by trastuzumab on the oncogenic protein Her-2/neu.
J Immunol 2004, 173:394-401.

41. Villard S, Lacroix-Desmazes S, Kieber-Emmons T, Piquer D, Grailly S, Benhida A, Kaveri SV, Saint-Remy JM, Granier C: Peptide decoys selected by phage display block in vitro and in vivo activity of a human anti-FVIII inhibitor.
Blood 2003, 102:949-952.

42. Riemer AB, Kurz H, Klinger M, Scheiner O, Zielinski CC, Jensen-Jarolim E: Vaccination with cetuximab mimotopes and biological properties of induced anti-epidermal growth factor receptor antibodies.
J Natl Cancer Inst 2005, 97:1663-1670.

43. Vanhoorelbeke K, Depraetere H, Romijn RA, Huizinga EG, De Maeyer M, Deckmyn H: A consensus tetrapeptide selected by phage display adopts the conformation of a dominant discontinuous epitope of a monoclonal anti-VWF antibody that inhibits the von Willebrand factor-collagen interaction.
J Biol Chem 2003, 278:37815-37821.

44. Huang J, Honda W: CED: a conformational epitope database.
BMC Immunol 2006, 7(1):7.

45. Matthews BW: Comparison of the predicted and observed secondary structure of T4 phage lysozyme.
Biochim Biophys Acta 1975, 405(2):442-451.

46. Rickles RJ, Botfield MC, Weng Z, Taylor JA, Green OM, Brugge JS, Zoller MJ: Identification of Src, Fyn, Lyn, PI3K and Abl SH3 domain ligands using phage display libraries.
EMBO J 1994, 13:5598-5604.

47. Takenaka IM, Leung SM, McAndrew SJ, Brown JP, Hightower LE: Hsc70-binding peptides selected from a phage display peptide library that resemble organellar targeting sequences.
J Biol Chem 1995, 270:19839-19844.

48. Sobolev V, Eyal E, Gerzon S, Potapov V, Babor M, Prilusky J, Edelman M: SPACE: a suite of tools for protein structure prediction and analysis based on complementarity and environment.
Nucl Acids Res 2005, 33:W39-W43.

49. Thompson JD, Higgins DG, Gibson TJ: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice.
Nucleic Acids Res 1994, 22(22):4673-4680.

50. Ja WW, Olsen BN, Roberts RW: Epitope mapping using mRNA display and a unidirectional nested deletion library.
Protein Eng Des Sel 2005, 18:309-319.

51. Shlomi T, Segal D, Ruppin E, Sharan R: QPath: a method for querying pathways in a protein-protein interaction network.
BMC Bioinformatics 2006, 7:199.