Email updates

Keep up to date with the latest news and content from BMC Genomics and BioMed Central.

This article is part of the supplement: Eleventh International Conference on Bioinformatics (InCoB2012): Computational Biology

Open Access Open Badges Proceedings

The parasite specific substitution matrices improve the annotation of apicomplexan proteins

Jamshaid Ali, Shashi Rekha Thummala and Akash Ranjan*

Author affiliations

Laboratory of Computational and Functional Genomics, Centre for DNA Fingerprinting and Diagnostics (CDFD), A Sun Centre of Excellence in Medical Bioinformatics, Tuljaiguda, Nampally, Hyderabad 500001, India

For all author emails, please log on.

Citation and License

BMC Genomics 2012, 13(Suppl 7):S19  doi:10.1186/1471-2164-13-S7-S19

Published: 13 December 2012



A number of apicomplexan genomes have been sequenced successfully in recent years and this would help in understanding the biology of apicomplexan parasites. The members of the phylum Apicomplexa are important protozoan parasites (Plasmodium, Toxoplasma and Cryptosporidium etc) that cause some of the deadly diseases in humans and animals. In our earlier studies, we have shown that the standard BLOSUM matrices are not suitable for compositionally biased apicomplexan proteins. So we developed a novel series (SMAT and PfFSmat60) of substitution matrices which performed better in comparison to standard BLOSUM matrices and developed ApicoAlign, a sequence search and alignment tool for apicomplexan proteins. In this study, we demonstrate the higher specificity of these matrices and make an attempt to improve the annotation of apicomplexan kinases and proteases.


The ROC curves proved that SMAT80 performs best for apicomplexan proteins followed by compositionally adjusted BLOSUM62 (PSI-BLAST searches), BLOSUM90 and BLOSUM62 matrices in terms of detecting true positives. The poor E-values and/or bit scores given by SMAT80 matrix for the experimentally identified coccidia-specific oocyst wall proteins against hematozoan (non-coccidian) parasites further supported the higher specificity of the same. SMAT80 uniquely detected (missed by BLOSUM) orthologs for 1374 apicomplexan hypothetical proteins against SwissProt database and predicted 70 kinases and 17 proteases. Further analysis confirmed the conservation of functional residues of kinase domain in one of the SMAT80 detected kinases. Similarly, one of the SMAT80 detected proteases was predicted to be a rhomboid protease.


The parasite specific substitution matrices have higher specificity for apicomplexan proteins and are helpful in detecting the orthologs missed by BLOSUM matrices and thereby improve the annotation of apicomplexan proteins which are hypothetical or with unknown function.