This article is part of the supplement: Proceedings of the 2012 International Conference on Intelligent Computing (ICIC 2012)
Distinct position-specific sequence features of hexa-peptides that form amyloid-fibrils: application to discriminate between amyloid fibril and amorphous β-aggregate forming peptide sequences
1 Department of Crystallography and Biophysics, University of Madras, Chennai 600025, India
2 Biotherapeutics Pharmaceutical Sciences, Pfizer Inc., MC6S, 575 Maryville Centre Drive, St. Louis, MO 63141, USA
3 Department of Biotechnology, Indian Institute of Technology Madras, Chennai 600036, India
BMC Bioinformatics 2013, 14(Suppl 8):S6 doi:10.1186/1471-2105-14-S8-S6
Published: 9 May 2013
Comparing short peptides that form amyloid-fibrils with their homologues that may form amorphous β-aggregates but not fibrils can aid the development of novel amyloid-containing nanomaterials with well-defined morphologies and characteristics. The knowledge gained from this comparative analysis could also be applied to identifying potential aggregation-prone regions in proteins, which are important for biotechnology applications or have been implicated in neurodegenerative diseases. In this work, we systematically analyzed a set of 139 amyloid-fibril forming hexa-peptides, along with a highly homologous set of 168 hexa-peptides that do not form amyloid fibrils, for their position-wise and overall amino acid compositions and for the averages of 49 selected amino acid properties.
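As an illustration of what a position-wise amino acid composition involves, the following minimal sketch tabulates the frequency of each of the 20 residues at each of the six positions. The function name `positional_composition` and the toy peptide strings are hypothetical and chosen only for illustration; they are not taken from the data set or code used in this work.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def positional_composition(peptides, length=6):
    """Return one dict per position mapping residue -> observed frequency.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if not peptides:
        raise ValueError("at least one peptide is required")
    if any(len(p) != length for p in peptides):
        raise ValueError(f"all peptides must have length {length}")

    composition = []
    for pos in range(length):
        # Count which residue each peptide carries at this position.
        counts = Counter(p[pos] for p in peptides)
        composition.append(
            {aa: counts.get(aa, 0) / len(peptides) for aa in AMINO_ACIDS}
        )
    return composition


# Toy input: arbitrary hexa-peptide strings, not the 139/168-peptide data set.
toy_peptides = ["VQIVYK", "VLIVAA", "KQIFYK"]
composition = positional_composition(toy_peptides)
print(composition[0]["V"])  # frequency of Val at the first position
```

Running the same function separately on the amyloid-fibril and non-amyloid sets would give two tables that can be compared position by position.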
Amyloid-fibril forming peptides show distinct preferences and avoidances for amino acid residues at each of the six positions. As expected, the amyloid-fibril peptides are also more hydrophobic than the non-amyloid peptides. We used the results of this analysis to develop statistical potential energy values for each of the 20 amino acid residues at each of the six positions in the hexa-peptides. The distributions of the potential energy values for the 139 amyloid-fibril and 168 non-amyloid peptides are distinct, and the amyloid-fibril peptides tend to be more stable (lower total potential energy values) than the non-amyloid peptides. On average, 72% of the amyloid-fibril peptides and 50% of the non-amyloid peptides have energies below the specific cutoff values at the different positions. The potential energy values were used to devise a statistical discriminator to distinguish between amyloid-fibril and non-amyloid peptides. Our method identified the amyloid-fibril forming hexa-peptides with an accuracy of 89%, whereas the accuracy of identifying non-amyloid peptides was only 54%. Further attempts were made to improve the prediction accuracy via machine learning. This resulted in an overall accuracy of 82.7%, with a sensitivity of 81.3% and a specificity of 83.9%, in 10-fold cross-validation.
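The idea of a position-specific statistical potential and a cutoff-based discriminator can be sketched as follows. The abstract does not state the functional form of the potential, so this sketch assumes a log-odds form, E(a, i) = -ln(f_amyloid(a, i) / f_non-amyloid(a, i)), with a pseudocount to avoid division by zero. The function names, the pseudocount, the cutoff of 0.0 and the toy peptides are all assumptions made for illustration and are not the authors' method or data.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def position_frequencies(peptides, length=6, pseudocount=1.0):
    """Smoothed residue frequencies at each position (assumed smoothing)."""
    total = len(peptides) + pseudocount * len(AMINO_ACIDS)
    freqs = []
    for pos in range(length):
        counts = Counter(p[pos] for p in peptides)
        freqs.append(
            {aa: (counts.get(aa, 0) + pseudocount) / total for aa in AMINO_ACIDS}
        )
    return freqs


def build_potential(positives, negatives, length=6, pseudocount=1.0):
    """Assumed log-odds potential: lower energy favours the positive set."""
    f_pos = position_frequencies(positives, length, pseudocount)
    f_neg = position_frequencies(negatives, length, pseudocount)
    return [
        {aa: -math.log(f_pos[i][aa] / f_neg[i][aa]) for aa in AMINO_ACIDS}
        for i in range(length)
    ]


def total_energy(peptide, potential):
    """Sum the per-position energies of a peptide."""
    return sum(potential[i][aa] for i, aa in enumerate(peptide))


def sensitivity_specificity(positives, negatives, potential, cutoff=0.0):
    """Call a peptide 'amyloid' when its total energy is below the cutoff."""
    true_pos = sum(total_energy(p, potential) < cutoff for p in positives)
    true_neg = sum(total_energy(p, potential) >= cutoff for p in negatives)
    return true_pos / len(positives), true_neg / len(negatives)


# Toy sets only; real use would supply the amyloid and non-amyloid peptides.
toy_positives = ["VVVVVV", "IIIIII"]
toy_negatives = ["KKKKKK", "DDDDDD"]
potential = build_potential(toy_positives, toy_negatives)
print(total_energy("VVVVVV", potential))
print(sensitivity_specificity(toy_positives, toy_negatives, potential))
```

In this sketch the potential is built and evaluated on the same peptides, which overstates performance. An estimate comparable to the 10-fold cross-validation figures above would require rebuilding the potential on each training fold and scoring only the held-out peptides.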
Amyloid-fibril forming hexa-peptides show position-specific sequence features that differ from those of peptides that may form amorphous β-aggregates. These positional preferences are important features for discriminating amyloid-fibril forming peptides from their homologues that do not form amyloid-fibrils.