BMC Bioinformatics

Open Access Research article

A genetic approach for building different alphabets for peptide and protein classification

Loris Nanni* and Alessandra Lumini

Author Affiliations

DEIS, Università di Bologna, Via Venezia 52, 47023 Cesena (FC), Italy

BMC Bioinformatics 2008, 9:45 doi:10.1186/1471-2105-9-45

Published: 24 January 2008

Abstract

Background

This paper proposes an optimization approach, based on a Genetic Algorithm, for producing reduced alphabets for peptide classification. The classification task is performed by a multi-classifier system in which each classifier (a linear or Radial Basis Function Support Vector Machine) is trained on features extracted using a different reduced alphabet. Each alphabet is constructed by a Genetic Algorithm whose objective function is the maximization of the area under the ROC curve obtained in several classification problems.
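To make the notion of a reduced alphabet concrete, the following minimal sketch groups the 20 standard amino acids into a few classes and describes a peptide by the fraction of its residues in each class. The grouping `EXAMPLE_GROUPS`, the function names, and the composition features are illustrative assumptions, not the alphabets or the feature extraction used by the authors; in the proposed approach a Genetic Algorithm would search over such groupings, scoring each one by the area under the ROC curve of a classifier trained on the resulting features.

```python
from collections import Counter

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical 5-group reduced alphabet, chosen only for illustration.
# It is NOT one of the alphabets found in the paper.
EXAMPLE_GROUPS = ["AVLIMC", "FWYH", "STNQ", "KR", "DEGP"]


def build_mapping(groups):
    """Map each amino acid to its group index; the groups must partition the alphabet."""
    mapping = {}
    for index, group in enumerate(groups):
        for residue in group:
            if residue in mapping:
                raise ValueError(f"{residue} appears in more than one group")
            mapping[residue] = index
    missing = set(AMINO_ACIDS) - set(mapping)
    if missing:
        raise ValueError(f"unassigned amino acids: {sorted(missing)}")
    return mapping


def composition_features(peptide, mapping, n_groups):
    """Return the fraction of residues of the peptide that fall in each group."""
    if not peptide:
        raise ValueError("peptide must not be empty")
    counts = Counter(mapping[residue] for residue in peptide)
    return [counts.get(g, 0) / len(peptide) for g in range(n_groups)]


mapping = build_mapping(EXAMPLE_GROUPS)
features = composition_features("AKRD", mapping, len(EXAMPLE_GROUPS))
print(features)
```

Under this assumed encoding, a candidate alphabet is fully described by which group each of the 20 amino acids belongs to, which is one natural way to represent an individual in a Genetic Algorithm.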

Results

The new approach has been tested on three peptide classification problems: HIV-protease, recognition of T-cell epitopes, and prediction of peptides that bind human leukocyte antigens. The tests show that training a pool of classifiers on reduced alphabets created by a Genetic Algorithm improves on other state-of-the-art feature extraction methods.

Conclusion

The validity of the novel strategy for creating reduced alphabets is demonstrated by the performance improvement that the proposed approach obtains over other reduced-alphabet-based methods on the tested problems.