Open Access, Highly Accessed, Research article

Identification of Proteins Secreted by Malaria Parasite into Erythrocyte using SVM and PSSM profiles

Ruchi Verma1, Ajit Tiwari2, Sukhwinder Kaur2, Grish C Varshney2 and Gajendra PS Raghava1

1Bioinformatics Centre, Institute of Microbial Technology, Sector 39-A, Chandigarh, India

2Cell biology and Immunology, Institute of Microbial Technology, Sector 39-A, Chandigarh, India

BMC Bioinformatics 2008, 9:201 doi:10.1186/1471-2105-9-201

Published: 16 April 2008

Abstract

Background

The malaria parasite secretes various proteins into the infected red blood cell (RBC) for its growth and survival. Identifying these secretory proteins is therefore important for developing vaccines and drugs against malaria. Existing motif-based methods have had limited success because no single motif is shared by all secretory proteins of the malaria parasite.

Results

In this study, a systematic attempt has been made to develop a general method for predicting secretory proteins of the malaria parasite. All models were trained and tested on a non-redundant dataset of 252 secretory and 252 non-secretory proteins. Support vector machine (SVM) models based on amino acid and dipeptide composition achieved a maximum Matthews correlation coefficient (MCC) of 0.72 with 85.65% accuracy and a maximum MCC of 0.74 with 86.45% accuracy, respectively. SVM models based on split-amino acid and split-dipeptide composition achieved a maximum MCC of 0.74 with 86.40% accuracy and a maximum MCC of 0.77 with 88.22% accuracy, respectively. For the first time, position-specific scoring matrix (PSSM) profiles obtained from PSI-BLAST were used for predicting secretory proteins; the PSSM-based SVM model achieved a maximum MCC of 0.86 with 92.66% accuracy. All models developed in this study were evaluated using 5-fold cross-validation.
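The composition features and the two reported metrics can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the 25-residue terminal lengths used for the split composition, and the three-way split itself are assumptions made here for illustration, since the abstract does not state how sequences were split. The resulting vectors would then be passed to an SVM, which is omitted.

```python
from itertools import product
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def aa_composition(seq):
    """Percentage of each of the 20 standard residues (20 features)."""
    seq = seq.upper()
    n = len(seq)
    return [100.0 * seq.count(a) / n if n else 0.0 for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Percentage of each overlapping standard residue pair (400 features)."""
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
            total += 1
    return [100.0 * counts[d] / total if total else 0.0 for d in DIPEPTIDES]


def split_composition(seq, feature_fn, n_term=25, c_term=25):
    """Concatenate feature_fn over N-terminal, middle and C-terminal parts.

    Assumption: a three-way split with hypothetical terminal lengths;
    the source does not specify the split scheme.
    """
    cut = max(n_term, len(seq) - c_term)  # avoids overlap on short sequences
    parts = (seq[:n_term], seq[n_term:cut], seq[cut:])
    features = []
    for part in parts:
        features.extend(feature_fn(part))
    return features


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def accuracy(tp, tn, fp, fn):
    """Percentage of correctly classified proteins."""
    return 100.0 * (tp + tn) / (tp + tn + fp + fn)
```

For example, `split_composition(seq, dipeptide_composition)` yields a 1200-dimensional vector (3 parts of 400 features each), and `mcc` ranges from -1 to 1, with 0 corresponding to random prediction on a balanced dataset such as the one described above.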

Conclusion

This study demonstrates that secretory proteins differ from non-secretory proteins in residue composition. It is therefore possible to predict secretory proteins from their residue composition using machine learning techniques. A multiple sequence alignment provides more information than the sequence alone, so the method based on PSSM profiles is more accurate than the methods based on sequence composition. A web server, PSEApred, has been developed for predicting secretory proteins of malaria parasites; the URL can be found in the Availability and requirements section.

