A novel method for high accuracy sumoylation site prediction from protein sequences
1 The Key Laboratory of Bioinformatics, Ministry of Education, China, Department of Biological Sciences and Biotechnology, Tsinghua University, Beijing, 100084, China
2 The National Laboratory of Medical Molecular Biology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences, and Peking Union Medical College, Chinese National Human Genome Center, Beijing 100005, China
3 National Laboratory of Biomacromolecules, Institute of Biophysics, Academia Sinica, Beijing, China
BMC Bioinformatics 2008, 9:8 doi:10.1186/1471-2105-9-8
Published: 8 January 2008
Protein sumoylation is an essential, dynamic and reversible post-translational modification that plays a role in dozens of cellular activities, especially the regulation of gene expression and the maintenance of genomic stability. Currently, the complexities of the sumoylation mechanism cannot be fully resolved by experimental approaches. Computational approaches are therefore a promising way to guide the experimental identification of sumoylation sites and to shed light on the reaction mechanism.
Here we present a statistical method for sumoylation site prediction. A 5-fold cross-validation test over the experimentally identified sumoylation sites yielded excellent prediction performance, with correlation coefficient, specificity, sensitivity and accuracy equal to 0.6364, 97.67%, 73.96% and 96.71%, respectively. Additionally, the predictor's performance is maintained when highly homologous sequences are removed.
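For readers unfamiliar with the four reported measures, the following minimal sketch shows how they are commonly derived from the counts of a binary confusion matrix. It assumes that "correlation coefficient" refers to the Matthews correlation coefficient, which is the usual choice for site predictors but is not stated explicitly above. The function name `binary_metrics` and the example counts are hypothetical and are not taken from this study.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Return sensitivity, specificity, accuracy and Matthews correlation
    coefficient (MCC) for a binary classifier.

    tp/fn: true sites predicted as positive/negative.
    tn/fp: non-sites predicted as negative/positive.
    Assumption: "correlation coefficient" is taken to mean MCC.
    """
    total = tp + tn + fp + fn
    # Guard against empty classes so the function never divides by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not the paper's data).
    for name, value in binary_metrics(tp=30, tn=950, fp=10, fn=10).items():
        print(f"{name}: {value:.4f}")
```

Because true sumoylation sites are far rarer than non-modified lysines, accuracy and specificity can be high even when many true sites are missed, which is why a correlation coefficient is typically reported alongside them.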
Using a statistical method, we have developed a new SUMO site predictor, SUMOpre, which shows high accuracy as measured by correlation coefficient, specificity, sensitivity and accuracy.