This article is part of the supplement: Eleventh International Conference on Bioinformatics (InCoB2012): Computational Biology

Open Access Proceedings

The parasite specific substitution matrices improve the annotation of apicomplexan proteins

Jamshaid Ali, Shashi Rekha Thummala and Akash Ranjan*

Author Affiliations

Laboratory of Computational and Functional Genomics, Centre for DNA Fingerprinting and Diagnostics (CDFD), A Sun Centre of Excellence in Medical Bioinformatics, Tuljaiguda, Nampally, Hyderabad 500001, India

BMC Genomics 2012, 13(Suppl 7):S19  doi:10.1186/1471-2164-13-S7-S19

Published: 13 December 2012

Additional files

Additional file 1:

SMAT80 gives poor E-values for coccidian-specific proteins in non-coccidian parasites. BLAST searches for coccidian-specific oocyst wall proteins of Cryptosporidium parvum were carried out against the hematozoan (non-coccidian) and coccidian apicomplexan parasites using the BLOSUM62 and SMAT80 matrices. As expected, SMAT80 gave poor E-values and/or bit scores for BLAST hits of these coccidian-specific proteins in hematozoans.

Format: XLS Size: 28KB

Open Data

Additional file 2:

Genome-wise BLAST searches for apicomplexan proteins against 1215 bacterial species. The genome-wise BLAST searches were carried out for all the proteins of 15 apicomplexan species studied here against 1215 bacterial species using SMAT80, BLOSUM90 and BLOSUM62 matrices.

Format: XLS Size: 455KB

Additional file 3:

Number of hits found at different E-value thresholds for apicomplexan proteins in genome-wise BLAST searches against one another. The genome-wise BLAST searches were carried out for all the proteins of 15 apicomplexan species against one another using SMAT80, BLOSUM90 and BLOSUM62 matrices.
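The tabulation described above can be sketched as a cumulative count of hits at or below each E-value cut-off. This is only an illustrative sketch: the function name, the example E-values and the thresholds are assumptions, not values taken from the additional file.

```python
from typing import Dict, Iterable, Sequence


def count_hits_at_thresholds(
    evalues: Iterable[float], thresholds: Sequence[float]
) -> Dict[float, int]:
    """Count how many hits have an E-value at or below each threshold."""
    evalues = list(evalues)  # allow generators to be scanned once per threshold
    return {t: sum(1 for e in evalues if e <= t) for t in thresholds}


# Hypothetical E-values for the hits of one genome-wise search.
example_evalues = [1e-50, 3e-12, 2e-6, 0.004, 0.8]
counts = count_hits_at_thresholds(example_evalues, [1e-10, 1e-5, 1e-2])
for threshold, n_hits in counts.items():
    print(f"E <= {threshold:g}: {n_hits} hits")
```

Running the same count on the hit lists obtained with each matrix would give one row per matrix and one column per threshold.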

Format: XLS Size: 93KB

Additional file 4:

Comparison of performance of SMAT80 with that of BLOSUM90. We carried out BLAST searches for all the proteins of 15 apicomplexan parasites against the SwissProt database using the SMAT80 and BLOSUM90 matrices. Each identical hit (best non-self) was assigned to one of eight categories according to how the SMAT80 result compared with the BLOSUM90 result:
(1) better or similar E-values, better or similar scores and better or similar % identity;
(2) better or similar E-values, better or similar scores and poor % identity;
(3) better or similar E-values, poor scores and better or similar % identity;
(4) better or similar E-values, poor scores and poor % identity;
(5) poor E-values, better or similar scores and better or similar % identity;
(6) poor E-values, better or similar scores and poor % identity;
(7) poor E-values, poor scores and better or similar % identity;
(8) poor E-values, poor scores and poor % identity.
As evident in the figure, most apicomplexan proteins fall into categories 1 and 7, which means that SMAT80 performs better.
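The eight categories above are the eight combinations of three yes/no comparisons, so the assignment can be sketched compactly. The sketch below makes assumptions the caption does not state: "better or similar" is taken to mean a lower-or-equal E-value, or a higher-or-equal score or % identity, and the `Hit` record and its field names are hypothetical.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One BLAST hit (hypothetical record layout)."""
    evalue: float
    score: float
    identity: float  # percent identity


def assign_category(smat80: Hit, blosum90: Hit) -> int:
    """Return the category (1-8) of an identical hit found with both matrices."""
    # Assumption: "better or similar" means at least as good as BLOSUM90.
    evalue_ok = smat80.evalue <= blosum90.evalue      # lower E-value is better
    score_ok = smat80.score >= blosum90.score         # higher score is better
    identity_ok = smat80.identity >= blosum90.identity
    # A poor E-value shifts the category by 4, a poor score by 2,
    # and a poor % identity by 1, reproducing the order of the list.
    return 1 + 4 * (not evalue_ok) + 2 * (not score_ok) + 1 * (not identity_ok)


# Hypothetical numbers, for illustration only.
print(assign_category(Hit(1e-10, 50.0, 40.0), Hit(1e-5, 45.0, 38.0)))  # category 1
print(assign_category(Hit(1e-3, 40.0, 45.0), Hit(1e-5, 45.0, 38.0)))   # category 7
```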

Format: TIFF Size: 443KB

Additional file 5:

Apicomplexan proteins for which hits were detected against the SwissProt database by SMAT80 but not by the BLOSUM62 or BLOSUM90 matrices. This is the list of 1374 apicomplexan hypothetical proteins that gave no BLAST hit against the SwissProt database with the BLOSUM series of matrices but for which SMAT80 detected SwissProt hits.

Format: XLS Size: 336KB

Additional file 6:

List of 70 probable apicomplexan protein kinases detected by SMAT80 but not by the BLOSUM series of matrices. This is the list of 70 apicomplexan hypothetical proteins whose SwissProt hits have a probable or known kinase annotation. These hits were detected against the SwissProt database by SMAT80 but not by the BLOSUM series of matrices.

Format: XLS Size: 38KB

Additional file 7:

Results of batch Conserved Domain search for 70 predicted (by SMAT80) apicomplexan protein kinases. The protein sequences of these 70 apicomplexan hypothetical proteins, in FASTA format, were submitted to the Conserved Domain search at the NCBI site. Only 8 proteins gave hits, and no kinase domain was detected.

Format: XLS Size: 31KB

Additional file 8:

Pair-wise alignments of probable apicomplexan protein kinases with a known P. falciparum protein kinase. The pairwise alignments were carried out using the BLOSUM62 and PfFSmat60 matrices at the ApicoAlign server (http://www.cdfd.org.in/apicoalign). Thirty SMAT80-predicted kinases (out of the 70 of Supplementary Table 5) were used as query proteins and PF11_0220 as the subject protein. The P. falciparum protein kinase PF11_0220 is an experimentally known kinase (protein kinase activity GO:0004672, evidence code IDA, source: PlasmoDB version 9.0).

Format: DOC Size: 79KB

Additional file 9:

List of hypothetical apicomplexan proteins whose SwissProt hits are probable or known kinases. The BLAST hits obtained against the SwissProt database using the SMAT80, BLOSUM90 and BLOSUM62 matrices were pooled into one set, and the apicomplexan hypothetical proteins whose subject annotations include 'kinase' were extracted from this set. We expect this list to be useful for researchers working on apicomplexan kinomes.
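The pooling and keyword selection described above can be sketched as follows. The record layout (query ID, query description, subject annotation) and all identifiers in the example are hypothetical; the additional file may have been produced differently.

```python
from typing import Iterable, List, Tuple

# (query_id, query_description, subject_annotation): a hypothetical layout.
HitRecord = Tuple[str, str, str]


def select_by_keyword(
    hit_sets: Iterable[Iterable[HitRecord]], keyword: str
) -> List[str]:
    """Pool hits from several searches and return the hypothetical query
    proteins whose subject annotation contains the keyword."""
    keyword = keyword.lower()
    selected = set()
    for hits in hit_sets:  # one iterable of hits per substitution matrix
        for query_id, query_desc, subject_annot in hits:
            is_hypothetical = "hypothetical" in query_desc.lower()
            if is_hypothetical and keyword in subject_annot.lower():
                selected.add(query_id)
    return sorted(selected)


# Made-up records, for illustration only.
smat80_hits = [("q1", "hypothetical protein", "Serine/threonine-protein kinase X")]
blosum62_hits = [
    ("q2", "hypothetical protein", "Probable protein kinase Y"),
    ("q3", "hypothetical protein", "Rhomboid protease Z"),
    ("q4", "actin", "Casein kinase I"),
]
print(select_by_keyword([smat80_hits, blosum62_hits], "kinase"))
```

The same call with a different keyword selects another functional class from the pooled hits.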

Format: XLS Size: 324KB

Additional file 10:

The hydropathy values of 70 apicomplexan hypothetical proteins or SMAT80-predicted kinases. The GRAVY (grand average of hydropathy) values were calculated for the 70 SMAT80-predicted apicomplexan kinases. A positive GRAVY value indicates hydrophobicity and a negative GRAVY value indicates hydrophilicity.
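A GRAVY value is the sum of the Kyte-Doolittle hydropathy values of all residues in a sequence divided by the sequence length. The sketch below shows that calculation; the caption does not say which tool produced the values in the file, so the function and the example peptides are illustrative assumptions.

```python
# Kyte-Doolittle hydropathy scale for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathy over the standard residues of a sequence."""
    # Assumption: non-standard symbols (X, *, gaps) are ignored.
    values = [KYTE_DOOLITTLE[aa] for aa in sequence.upper() if aa in KYTE_DOOLITTLE]
    if not values:
        raise ValueError("sequence contains no standard amino acids")
    return sum(values) / len(values)


# Made-up peptides: the first is hydrophobic, the second hydrophilic.
print(round(gravy("AILV"), 3))  # positive value
print(round(gravy("KDER"), 3))  # negative value
```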

Format: XLS Size: 25KB

Additional file 11:

List of 17 apicomplexan hypothetical proteins (or proteases as predicted by SMAT80) whose hits were detected by SMAT80 but not by the BLOSUM series of matrices. This is the list of 17 apicomplexan hypothetical proteins whose SwissProt hits have a probable or known protease annotation. These hits were missed by the BLOSUM series of matrices but detected by the SMAT80 matrix.

Format: XLS Size: 25KB

Additional file 12:

Results of batch Conserved Domain search for 17 predicted (by SMAT80) apicomplexan proteases. The Conserved Domain search, run in batch mode at the NCBI site for these 17 apicomplexan proteins, gave hits for only 5 proteins; the rhomboid superfamily of proteases was detected.

Format: XLS Size: 29KB

Additional file 13:

List of hypothetical apicomplexan proteins whose SwissProt hits are probable or known proteases. The BLAST hits obtained against the SwissProt database using the SMAT80, BLOSUM90 and BLOSUM62 matrices were pooled into one set, and the apicomplexan hypothetical proteins whose subject annotations include 'protease' were extracted from this set. We expect this list to be useful for researchers working on the role of proteases in apicomplexan biology.

Format: XLS Size: 72KB

Additional file 14:

The hydropathy values of 17 apicomplexan hypothetical proteins or SMAT80-predicted proteases. The GRAVY (grand average of hydropathy) values were calculated for the 17 SMAT80-predicted apicomplexan proteases. A positive GRAVY value indicates hydrophobicity and a negative GRAVY value indicates hydrophilicity.

Format: XLS Size: 30KB

Additional file 15:

Amino acid composition of SMAT80-predicted apicomplexan kinases and proteases compared to yeast kinases and proteases. The SMAT80-predicted apicomplexan kinases and proteases differ significantly from yeast kinases and proteases, respectively, in their content of non-polar and negatively charged amino acids. We think this was one of the reasons that the standard BLOSUM matrices could not detect orthologs for these proteins against the SwissProt database.
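The kind of composition measure compared above can be sketched as the fraction of residues belonging to a given class. The residue groupings below are one common convention and are an assumption, since the caption does not state which classification was used; the example sequences are made up.

```python
# Assumed residue classes; other textbooks group some residues differently.
NONPOLAR = frozenset("AVLIMFWPG")
NEGATIVELY_CHARGED = frozenset("DE")


def group_fraction(sequence: str, group: frozenset) -> float:
    """Fraction of residues in the sequence that belong to the given class."""
    residues = [aa for aa in sequence.upper() if aa.isalpha()]
    if not residues:
        raise ValueError("empty sequence")
    return sum(aa in group for aa in residues) / len(residues)


# Made-up sequences, for illustration only.
for name, seq in [("protein_a", "MKDEELLNNK"), ("protein_b", "MAVLGIPFWS")]:
    print(
        name,
        f"non-polar={group_fraction(seq, NONPOLAR):.2f}",
        f"negative={group_fraction(seq, NEGATIVELY_CHARGED):.2f}",
    )
```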

Format: XLS Size: 30KB