Open Access Research article

NClassG+: A classifier for non-classically secreted Gram-positive bacterial proteins

Daniel Restrepo-Montoya [1,2,3,5], Camilo Pino [1], Luis F Nino [1,2], Manuel E Patarroyo [4,5] and Manuel A Patarroyo [3,5]*

Author Affiliations

1 Intelligent Systems Research Laboratory - LISI, Universidad Nacional de Colombia, Carrera 45 No. 26-85, Bogotá DC, Colombia

2 Research Group on Combinatorial Algorithms - ALGOS-UN, Universidad Nacional de Colombia, Bogotá DC, Colombia

3 School of Medicine and Health Sciences, Universidad del Rosario, Carrera 24 No. 63C-69, Bogotá DC, Colombia

4 School of Medicine, Universidad Nacional de Colombia, Bogotá DC, Colombia

5 Fundación Instituto de Inmunología de Colombia - FIDIC, Carrera 50 No. 26-20 Bogotá DC, Colombia

BMC Bioinformatics 2011, 12:21  doi:10.1186/1471-2105-12-21

Published: 14 January 2011

Abstract

Background

Most predictive methods currently available for the identification of protein secretion mechanisms have focused on classically secreted proteins. In fact, only two methods have been reported for predicting non-classically secreted proteins of Gram-positive bacteria. This study describes the implementation of a sequence-based classifier, denoted as NClassG+, for identifying non-classically secreted Gram-positive bacterial proteins.

Results

Several feature-based classifiers were trained using different sequence transformation vectors (amino acid frequencies, dipeptides, physicochemical factors and PSSM) and Support Vector Machines (SVMs) with linear, polynomial and Gaussian kernel functions. Nested k-fold cross-validation (CV) was applied to select the best models, using the inner CV loop to tune the model parameters and the outer CV loop to compute the error. Grid search was used to optimize the parameters, the kernel functions and the combinations of feature vectors.
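The nested cross-validation scheme described above can be illustrated with a minimal sketch. This is not the authors' implementation: to stay self-contained it swaps the SVM for a toy 1-D k-nearest-neighbour classifier, uses synthetic data, and tunes a single hypothetical parameter (the number of neighbours). All function names, fold counts and grid values are assumptions chosen for illustration. The point it shows is that the grid search in the inner loop only ever sees the outer-training portion, so the outer-loop error is not biased by the tuning.

```python
import random
from statistics import mean

def k_folds(indices, k):
    """Split a list of indices into k nearly equal folds (round-robin)."""
    return [indices[i::k] for i in range(k)]

def knn_predict(train, x, k):
    """Majority vote among the k nearest 1-D neighbours (toy stand-in for an SVM)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0

def error_rate(train, test, k):
    """Fraction of test points misclassified by k-NN fitted on train."""
    wrong = sum(knn_predict(train, x, k) != y for x, y in test)
    return wrong / len(test)

def split(data, held_out):
    """Return (train, test) where test holds the items at the held-out indices."""
    held = set(held_out)
    train = [data[i] for i in range(len(data)) if i not in held]
    test = [data[i] for i in held_out]
    return train, test

def cross_val_error(data, k_param, n_folds):
    """Plain k-fold CV error for one parameter value (the inner loop)."""
    folds = k_folds(list(range(len(data))), n_folds)
    return mean(error_rate(*split(data, fold), k_param) for fold in folds)

def nested_cv(data, grid, outer_folds=5, inner_folds=3):
    """Outer loop estimates the error; inner loop tunes on outer-training data only."""
    outer_errors, chosen = [], []
    for fold in k_folds(list(range(len(data))), outer_folds):
        train, test = split(data, fold)
        # Grid search: pick the parameter with the lowest inner CV error.
        best = min(grid, key=lambda g: cross_val_error(train, g, inner_folds))
        chosen.append(best)
        outer_errors.append(error_rate(train, test, best))
    return mean(outer_errors), chosen

# Synthetic two-class data (assumption: stands in for real feature vectors).
rng = random.Random(0)
data = [(rng.gauss(0.0, 1.0), 0) for _ in range(30)]
data += [(rng.gauss(3.0, 1.0), 1) for _ in range(30)]
rng.shuffle(data)

estimate, chosen = nested_cv(data, grid=[1, 3, 5])
print(f"outer-loop error estimate: {estimate:.3f}; chosen k per fold: {chosen}")
```

In a setting closer to the one described here, the grid would presumably range over the SVM kernel type, its parameters and the feature-vector combination rather than a single neighbour count, but the loop structure would stay the same.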

Conclusions

The final model was tested against an independent set not previously seen by the model, obtaining better predictive performance than SecretomeP V2.0 and SecretP V2.0 for the identification of non-classically secreted proteins. NClassG+ is freely available on the web at http://www.biolisi.unal.edu.co/web-servers/nclassgpositive/