Open Access Methodology article

POLYAR, a new computer program for prediction of poly(A) sites in human sequences

Malik Nadeem Akhtar1, Syed Abbas Bukhari1, Zeeshan Fazal1, Raheel Qamar1,2 and Ilham A Shahmuradov3*

Author Affiliations

1 Department of Biosciences, COMSATS Institute of Information Technology, Islamabad, Pakistan

2 Shifa College of Medicine, Islamabad, Pakistan

3 Department of Fundamental problems of biological productivity, Institute of Botany, Baku, Azerbaijan

BMC Genomics 2010, 11:646  doi:10.1186/1471-2164-11-646

Published: 19 November 2010

Abstract

Background

mRNA polyadenylation is an essential step of pre-mRNA processing in eukaryotes. Accurate prediction of the pre-mRNA 3'-end cleavage/polyadenylation sites is important for defining the gene boundaries and understanding gene expression mechanisms.

Results

A total of 28,761 mapped human poly(A) sites were classified into three classes according to whether they contain different known forms of the polyadenylation signal (PAS) or none of them (PAS-strong, PAS-weak and PAS-less, respectively), and a new computer program, POLYAR, was developed for the prediction of poly(A) sites of each class. In searches for PAS-strong poly(A) sites in human sequences, POLYAR had significantly higher prediction sensitivity (80.8% versus 65.7%) and specificity (66.4% versus 51.7%) than polya_svm, to date the most accurate computer program for prediction of poly(A) sites. However, in similar searches for PAS-weak and PAS-less poly(A) sites, both programs had very low prediction accuracy, which indicates that our current knowledge of the factors that determine poly(A) sites is insufficient to identify such polyadenylation regions.
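The three-class scheme can be illustrated with a minimal sketch. This is not the POLYAR algorithm, and the abstract does not state which hexamers define each class or how far upstream the signal is sought. The sketch assumes that the two canonical signals (AAUAAA and AUUAAA) count as "strong", that a handful of previously reported single-nucleotide variants count as "weak", and that the signal is searched in a 40-nt window upstream of the cleavage position. The function name, the hexamer sets and the window size are all hypothetical choices made for illustration.

```python
# Illustrative only: hexamer sets and window size are assumptions,
# not the definitions used by POLYAR.
STRONG_PAS = ("AATAAA", "ATTAAA")  # the two canonical signals (DNA alphabet)
WEAK_PAS = (                       # assumed set of variant hexamers
    "AGTAAA", "TATAAA", "CATAAA", "GATAAA",
    "AATATA", "AATACA", "AATAGA", "ACTAAA",
)
UPSTREAM_WINDOW = 40               # assumed search window, in nucleotides


def classify_poly_a_site(sequence, cleavage_pos, window=UPSTREAM_WINDOW):
    """Assign a mapped poly(A) site to PAS-strong, PAS-weak or PAS-less.

    `cleavage_pos` is the 0-based index of the first nucleotide after the
    cleavage point; only the `window` nucleotides upstream are inspected.
    """
    # Normalise to an upper-case DNA alphabet so RNA input also works.
    seq = sequence.upper().replace("U", "T")
    start = max(0, cleavage_pos - window)
    upstream = seq[start:cleavage_pos]

    # A strong signal takes precedence over a weak one.
    if any(hexamer in upstream for hexamer in STRONG_PAS):
        return "PAS-strong"
    if any(hexamer in upstream for hexamer in WEAK_PAS):
        return "PAS-weak"
    return "PAS-less"
```

For example, under these assumptions a site whose upstream window contains AAUAAA is labelled PAS-strong, one containing only AGUAAA is labelled PAS-weak, and one with neither is labelled PAS-less.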

Conclusions

We present a new classification of polyadenylation sites into three classes and a novel computer program, POLYAR, for prediction of poly(A) sites/regions of each class. In tests, POLYAR predicts PAS-strong poly(A) sites with high accuracy; its efficiency in searching for PAS-weak and PAS-less poly(A) sites is not very high, but is comparable to that of other available programs. These findings suggest that additional characteristics of such poly(A) sites remain to be elucidated. POLYAR, including a stand-alone version for download, is available at http://cub.comsats.edu.pk/polyapredict.htm.