Research article (Open Access)

A novel method for high accuracy sumoylation site prediction from protein sequences

Jialin Xu 1,2, Yun He 3, Boqin Qiang 2, Jiangang Yuan 2, Xiaozhong Peng 2* and Xian-Ming Pan 1,3*

Author Affiliations

1 The Key Laboratory of Bioinformatics, Ministry of Education, China, Department of Biological Sciences and Biotechnology, Tsinghua University, Beijing, 100084, China

2 The National Laboratory of Medical Molecular Biology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences, and Peking Union Medical College, Chinese National Human Genome Center, Beijing 100005, China

3 National Laboratory of Biomacromolecules, Institute of Biophysics, Academia Sinica, Beijing, China

BMC Bioinformatics 2008, 9:8  doi:10.1186/1471-2105-9-8

Published: 8 January 2008



Protein sumoylation is an essential, dynamic and reversible post-translational modification that plays a role in dozens of cellular activities, especially the regulation of gene expression and the maintenance of genomic stability. Currently, the complexities of the sumoylation mechanism cannot be fully resolved by experimental approaches alone. Computational approaches are therefore a promising way to guide the experimental identification of sumoylation sites and to shed light on the reaction mechanism.


Here we present a statistical method for sumoylation site prediction. A 5-fold cross-validation test over the experimentally identified sumoylation sites yielded excellent prediction performance, with a correlation coefficient of 0.6364, specificity of 97.67%, sensitivity of 73.96% and accuracy of 96.71%. Additionally, the predictor's performance is maintained when highly homologous sequences are removed.
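For readers unfamiliar with the four reported measures, the following minimal Python sketch shows how they are conventionally computed from the counts of a binary confusion matrix. It assumes that "correlation coefficient" refers to the Matthews correlation coefficient, which the abstract does not state explicitly. The function name `binary_metrics` and the counts in the usage lines are hypothetical illustrations, not the authors' code or data.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Standard binary-classification measures from confusion-matrix counts.

    tp/fn: true sites predicted as sites / missed.
    tn/fp: non-sites predicted as non-sites / wrongly predicted as sites.
    """
    def ratio(num, den):
        # Return 0.0 instead of raising when a class is empty.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)   # fraction of true sites recovered
    specificity = ratio(tn, tn + fp)   # fraction of non-sites rejected
    accuracy = ratio(tp + tn, tp + tn + fp + fn)

    # Matthews correlation coefficient (assumed meaning of
    # "correlation coefficient" in the text above).
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = ratio(tp * tn - fp * fn, denom)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts, chosen only to illustrate the call.
metrics = binary_metrics(tp=8, tn=85, fp=5, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because non-sites typically far outnumber true sites in this kind of data, accuracy and specificity can be high even when sensitivity is moderate. This is why a correlation coefficient is usually reported alongside them.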


Using a statistical method, we have developed a new SUMO site predictor, SUMOpre, which performs well as measured by correlation coefficient, specificity, sensitivity and accuracy.