An assessment on epitope prediction methods for protozoa genomes
- Equal contributors
1 Programa de Pós-graduação em Ciências Farmacêuticas (CiPharma), Laboratório de Pesquisas Clínicas, Escola de Farmácia, Universidade Federal de Ouro Preto, Campus Morro do Cruzeiro, Ouro Preto, MG, 35400-000, Brazil
2 Laboratório de Imunologia Celular e Molecular, Instituto René Rachou, Av. Augusto de Lima, 1715, Barro Preto, Belo Horizonte, MG, 30190-002, Brazil
3 Laboratório de Imunopatologia, Núcleo de Pesquisas em Ciências Biológicas, Universidade Federal de Ouro Preto, Campus Morro do Cruzeiro, ICEB II, Ouro Preto, MG, 35400-000, Brazil
4 Laboratório de Parasitologia Celular e Molecular, Instituto René Rachou, Av. Augusto de Lima, 1715, Barro Preto, Belo Horizonte, MG 30190-002, Brazil
5 Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Av. Antônio Carlos, Pampulha, 6627, 31270-901, Belo Horizonte, MG, Brazil
6 Pontifícia Universidade Católica, R. Rio Comprido, 4580, Monte Castelo, Contagem, MG, 32285-040, Brazil
7 Centro Universitário UNA, Instituto de Ciências Biológicas e da Saúde (ICBS), R. Guajajaras, 175, Centro, Belo Horizonte, MG 30180-100, Brazil
BMC Bioinformatics 2012, 13:309 doi:10.1186/1471-2105-13-309
Published: 21 November 2012
Epitope prediction using computational methods represents one of the most promising approaches to vaccine development. Reduced time and cost, together with the availability of completely sequenced genomes, are key motivations for the use of reverse vaccinology. Parasites of the genus Leishmania are widespread and are the etiologic agents of leishmaniasis. Currently, there is no efficient vaccine against this pathogen and the drug treatment is highly toxic. The lack of sufficiently large datasets of experimentally validated parasite epitopes represents a serious limitation, especially for trypanosomatid genomes. In this work we highlight the predictive performance of several algorithms, which were evaluated by means of a MySQL database built with the purpose of: a) evaluating the prediction performance of individual algorithms and of their combination for CD8+ T cell epitopes, B cell epitopes and subcellular localization, using AUC (Area Under Curve) and a threshold-dependent method that employs a confusion matrix; b) integrating data from experimentally validated and in silico predicted epitopes; and c) integrating the subcellular localization predictions and experimental data. The NetCTL, NetMHC, BepiPred, BCPred12, and AAP12 algorithms were used for in silico epitope prediction, and WoLF PSORT, Sigcleave and TargetP for in silico subcellular localization prediction against trypanosomatid genomes.
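The two evaluation modes named above, AUC and a threshold-dependent confusion matrix, can be illustrated with a minimal sketch. It assumes that each predictor returns one numeric score per candidate peptide and that experimental validation supplies a binary label (1 for a validated epitope, 0 otherwise). The function names, the example scores and the 0.5 threshold are hypothetical and are not taken from the pipeline described here; AUC is computed in its rank-based form, which is equivalent to the area under the ROC curve.

```python
def auc(scores, labels):
    """AUC as the probability that a randomly chosen positive outscores
    a randomly chosen negative; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def confusion_matrix(scores, labels, threshold):
    """Count TP, FP, TN and FN when scores >= threshold are called positive."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for s, y in zip(scores, labels):
        predicted_positive = s >= threshold
        if predicted_positive and y == 1:
            counts["tp"] += 1
        elif predicted_positive and y == 0:
            counts["fp"] += 1
        elif not predicted_positive and y == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


# Illustrative data only: four peptides with made-up scores and labels.
example_scores = [0.9, 0.8, 0.3, 0.1]
example_labels = [1, 0, 1, 0]
example_auc = auc(example_scores, example_labels)
example_cm = confusion_matrix(example_scores, example_labels, threshold=0.5)
```

AUC summarizes ranking quality over all possible thresholds, whereas the confusion matrix describes behaviour at one chosen cut-off, which is why the two measures are reported side by side.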
A database-driven epitope prediction method was developed with built-in functions capable of: a) removing redundancy from the experimental data; b) parsing the algorithms' predictions and storing experimentally validated and predicted data; and c) evaluating algorithm performance. Results show that better performance is achieved when the combined prediction is considered. This is particularly true for B cell epitope predictors, where the combined prediction of AAP12 and BCPred12 reached an AUC value of 0.77. For CD8+ T cell epitope predictors, the combined prediction of NetCTL and NetMHC reached an AUC value of 0.64. Finally, regarding subcellular localization prediction, the best performance is achieved when the combined prediction of Sigcleave, TargetP and WoLF PSORT is used.
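As a rough illustration of functions a) and of combining two predictors, the sketch below removes duplicate epitope records and merges two score lists. The combination rule is not specified in this text, so averaging min-max normalized scores is purely an assumption made for the example, as are the function names and the (protein identifier, peptide) record layout.

```python
def deduplicate(records):
    """Keep the first occurrence of each (protein_id, peptide) pair.
    Peptides are compared case-insensitively and returned in upper case."""
    seen = set()
    unique = []
    for protein_id, peptide in records:
        key = (protein_id, peptide.upper())
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


def min_max(scores):
    """Rescale scores to [0, 1]; a constant list maps to 0.5 throughout."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.5] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def combine(scores_a, scores_b):
    """Hypothetical combination rule: mean of the two normalized scores,
    peptide by peptide. Both lists must cover the same peptides in order."""
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must have the same length")
    return [(a + b) / 2 for a, b in zip(min_max(scores_a), min_max(scores_b))]


# Illustrative data only.
records = [("P1", "AAK"), ("P1", "aak"), ("P2", "AAK")]
unique_records = deduplicate(records)
combined = combine([0, 5, 10], [1, 1, 3])
```

Normalizing before averaging keeps a predictor with a wider score range from dominating the combined value; other rules, such as requiring both predictors to exceed their own thresholds, would be equally plausible.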
Our study indicates that the combination of B cell epitope predictors is the best tool for predicting epitopes on protozoan parasite proteins. Regarding subcellular localization, the best result was obtained when the predictions of the three algorithms were combined. The developed pipeline is available from the authors upon request.