BMC Bioinformatics

Open Access Research article

Derivation of an amino acid similarity matrix for peptide:MHC binding and its application as a Bayesian prior

Yohan Kim1, John Sidney1, Clemencia Pinilla2, Alessandro Sette1 and Bjoern Peters1*

Author Affiliations

1 Division of Vaccine Discovery, La Jolla Institute for Allergy and Immunology, La Jolla, California, USA

2 Immunology, Torrey Pines Institute for Molecular Studies, San Diego, California, USA

BMC Bioinformatics 2009, 10:394 doi:10.1186/1471-2105-10-394

Published: 30 November 2009

Abstract

Background

Experts in peptide:MHC binding studies are often able to estimate the impact of a single residue substitution based on a heuristic understanding of amino acid similarity in an experimental context. Our aim is to quantify this measure of similarity in order to improve peptide:MHC binding prediction methods. This should help compensate for gaps and bias in the sequence space coverage of existing peptide binding datasets.

Results

Here, a novel amino acid similarity matrix (PMBEC) is directly derived from the binding affinity data of combinatorial peptide mixtures. Like BLOSUM62, this matrix captures well-known physicochemical properties of amino acid residues. However, PMBEC differs markedly from existing matrices in cases where residue substitution involves a reversal of electrostatic charge. To demonstrate its usefulness, we have developed a new peptide:MHC class I binding prediction method, using the matrix as a Bayesian prior. We show that the new method can compensate for missing information on specific residues in the training data. We also carried out a large-scale benchmark, and its results indicate that prediction performance of the new method is comparable to that of the best neural network based approaches for peptide:MHC class I binding.
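
As a rough illustration of how a similarity matrix might be derived from binding data, the sketch below computes the covariance between residues' binding contributions across several measurement contexts. It makes several assumptions not stated in the abstract: that similarity is expressed as a covariance, that each residue has one contribution value per context (for example, per MHC molecule and peptide position), and that the function name `covariance_matrix` and the numbers in `toy_profiles` are hypothetical and invented purely for demonstration. It is not the authors' implementation and does not show the Bayesian prior step.

```python
from itertools import product


def covariance_matrix(profiles):
    """Sample covariance between residues' binding-contribution profiles.

    profiles maps a residue letter to a list of contribution values,
    one value per measurement context; all lists must have equal length.
    Returns a dict keyed by (residue_a, residue_b).
    """
    residues = sorted(profiles)
    n = len(profiles[residues[0]])
    if n < 2 or any(len(profiles[r]) != n for r in residues):
        raise ValueError("each residue needs the same number (>= 2) of values")

    means = {r: sum(profiles[r]) / n for r in residues}
    cov = {}
    for a, b in product(residues, repeat=2):
        cov[(a, b)] = sum(
            (x - means[a]) * (y - means[b])
            for x, y in zip(profiles[a], profiles[b])
        ) / (n - 1)
    return cov


# Made-up illustrative values (NOT experimental data): two acidic residues
# (D, E) that behave alike and one basic residue (K) that behaves oppositely.
toy_profiles = {
    "D": [1.0, 0.8, -0.5, 0.9],
    "E": [0.9, 0.7, -0.4, 1.0],
    "K": [-0.8, -0.9, 0.6, -0.7],
}

cov = covariance_matrix(toy_profiles)
```

With these toy values, `cov[("D", "E")]` is positive and `cov[("D", "K")]` is negative. This mirrors, in miniature, the kind of charge-reversal penalty the abstract attributes to PMBEC.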

Conclusion

A novel amino acid similarity matrix has been derived for peptide:MHC binding interactions. One prominent feature of the matrix is that it disfavors substitution of residues with opposite charges. Given that the matrix was derived from experimentally determined peptide:MHC binding affinity measurements, this feature is likely shared by all peptide:protein interactions. In addition, we have demonstrated the usefulness of the matrix as a Bayesian prior in an improved scoring-matrix based peptide:MHC class I prediction method. A software implementation of the method is available at http://www.mhc-pathway.net/smmpmbec.