Table 1

Protease genes identified in the Eimeria tenella genome database
| Protease | Gene Identifier/Contig | Clan | Family | BLAST Apicomplexa (Database: nucleotide) | BLAST NCBI (Database: PDB, Swiss-Prot, NR) | Family Domains (Pfam, MEROPS, InterProScan) | Evidence Rating |
|---|---|---|---|---|---|---|---|
| Aspartic Proteases | | | | | | | |
| Eimepsin 1 | ETH_00001725 on Supercontig_54 | AA | A1 | Theileria annulata strain Ankara genomic DNA chromosome 3 (E=2e-17) | PDB: Porcine Pepsin (E=4e-11) | Partial | ER1 |
| Eimepsin 2 | ETH_00007420 on Supercontig_38 | AA | A1 | Toxoplasma gondii ME49 gcontig_1112359860822 (E=7e-36) | NR: aspartic protease 7 [Toxoplasma gondii] (E=3e-37) | None | ER3/4 |
| Eimepsin 3 | ETH_00008525 on Supercontig_8 | AA | A1 | Plasmodium berghei whole genome shotgun assembly, contig PB_RP2841 (E=1e-93) | PDB: Human pepsin (E=7e-56) | Complete | ER2 |
| Cysteine Proteases | | | | | | | |
| Cathepsin B | ETH_00003570 on Supercontig_23 | CA | C1 | Toxoplasma gondii GAB2-2007-GAL-DOM2 contig00350 (E=1e-12) | PDB: Human Recombinant Procathepsin B (E=2e-58) | Complete | ER2 |
| Cathepsin L | ETH_00033530 on NODE_2923_length_1315_cov_12.253232 | CA | C1 | Toxoplasma gondii ME49 gcontig_1112359872114 (E=9e-43) | PDB: Toxoplasma gondii Cathepsin L (Tgcpl) (E=1e-64) | Complete | ER2 |
| Cathepsin C1 | ETH_00019750 on Supercontig_2 | CA | C1 | Toxoplasma gondii ME49 gcontig_1112359873648 (E=5e-46) | PDB: Porcine Cathepsin H (E=3e-11) | Partial | ER2/3 |
| Cathepsin C2 | ETH_0005000 on NODE_22022_length_2554_cov_8.124119 | CA | C1 | Toxoplasma gondii GT1 gcontig_1107000835548 (E=2e-11) | PDB: Human Dipeptidyl Peptidase I (Cathepsin C) (E=0.016) | Partial | ER3/4 |
| Cathepsin C3 | ETH_00001590, ETH_00001595 and ETH_00001600 on Supercontig_115 | CA | C1 | Cryptosporidium hominis strain TU502 chromosome 4 CHRO014106 (E=1e-34) | PDB: Cathepsin C Rattus norvegicus (E=3e-13) | Partial | ER2 |
| Calpain | ETH_00004075 on Supercontig_49 | CA | C2 | Toxoplasma gondii ME49 gcontig_1112359873650 (E=5e-96) | PDB: Human Calpain 8 (E=1e-16) | Complete | ER2 |
| Ubiquitinyl hydrolase 1 | ETH_00012075 on Supercontig_122 | CA | C19 | Cryptosporidium muris RN66 gcontig_1106632353963 (E=2e-87) | Swiss-Prot: ubiquitin specific peptidase 39 [Mus musculus] (E=1e-116) | Complete | ER2 |
| Ubiquitinyl hydrolase 2 | ETH_00034675 on Supercontig_3 | CA | C19 | Cryptosporidium muris RN66 gcontig_1106632353937 (E=3e-35) | Swiss-Prot: ubiquitin specific peptidase 5 (isopeptidase T) [Mus musculus] (E=7e-81) | Partial | ER2 |
| Ubiquitinyl hydrolase 3 | ETH_00001555 on Supercontig_115 | CA | C19 | Neospora caninum Liverpool ubiquitin carboxyl-terminal hydrolase, related (NCLIV_041690) mRNA, partial cds (E=63e-94) | PDB: Ubp-Family Deubiquitinating Enzyme [human] (E=5e-40) | Complete | ER2 |
| Ubiquitinyl hydrolase 4 | ETH_00007310 on Supercontig_39 | CA | C19 | Cryptosporidium muris RN66 gcontig_1106632353835 (E=1e-60) | PDB: Usp14, A Proteasome-Associated Deubiquitinating Enzyme (E=5e-40) | Complete (disrupted) | ER2 |
| Ubiquitinyl hydrolase 5 | ETH_00003260 on Supercontig_106 | CA | C19 | Plasmodium falciparum VS/1 cont1.2577 (E=2e-49) | PDB: Human Ubiquitin Carboxyl-Terminal Hydrolase 8 (E=2e-37) | Partial | ER2 |
| Ubiquitinyl hydrolase 6 | ETH_00020635 on Supercontig_5 | CA | C19 | Toxoplasma gondii GT1 gcontig_1107000919460 (E=1e-12) | Swiss-Prot: Ubiquitin carboxyl-terminal hydrolase 26 Arabidopsis thaliana (E=2e-12) | Partial | ER2/3 |
| Ubiquitinyl hydrolase 7 | ETH_00008925 on Supercontig_8 | CA | C19 | Plasmodium vivax SaI-1 ctg_6569 (E=1e-30) | PDB: Ubiquitin-Usp2 Complex [human] (E=2e-15) | Partial | ER2 |
| Ubiquitinyl hydrolase 8 | ETH_00003260 on Supercontig_106 | CA | C19 | Neospora caninum Liverpool Ubiquitin carboxyl-terminal hydrolase, related (NCLIV_024510) mRNA, complete cds (E=8e-17) | PDB: Covalent Ubiquitin-Usp2 Complex [human] (E=4e-10) | Partial | ER2/3 |
| OTU protease | no gene name on NODE_10106_length_3351_cov_7.612056 | CA | C88 | Toxoplasma gondii ME49 gcontig_1112359861240 (E=3e-9) | PDB: OTU [Saccharomyces cerevisiae] (E=2e-11) | None | ER3/4 |
| Pyroglutamyl peptidase | ETH_00030160 on Supercontig_14 | CF | C15 | Toxoplasma gondii ME49 gcontig_1112359873116 (E=0.1) | Swiss-Prot: Pyrrolidone Carboxyl Peptidase (pyroglutamyl peptidase) Chromobacterium violaceum (E=8e-5) | Partial | ER3 |
| Metallo Proteases | | | | | | | |
| Aminopeptidase N 1 | ETH_00013105 on Supercontig_9 | MA | M1 | Neospora caninum Liverpool complete genome, chromosome X (E=1e-24) | PDB: Aminopeptidase N From Human Pathogen Neisseria meningitidis (E=1e-144) | Partial | ER2 |
| Aminopeptidase N 2 | ETH_00015595 on Supercontig_153 | MA | M1 | Babesia bovis strain T2Bo chromosome 4 gcontig_1104837696308 (E=7e-57) | PDB: M1 Alanylaminopeptidase From Malaria (E=4e-65) | Partial | ER2 |
| ATP-dependant Zn protease 1 | no gene name on NODE_975_length_1397_cov_15.574087 | MA | M41 | Plasmodium falciparum FCC-2/Hainan cont1.4384 (E=9e-31) | PDB: Ftsh Protease Domain [Aquifex aeolicus] (E=1e-21) | Complete | ER2 |
| ATP-dependant Zn protease 2 | ETH_00018435 on Supercontig_60 | MA | M41 | Plasmodium falciparum VS/1 cont1.4464 (E=2e-68) | PDB: Ftsh [Escherichia coli] (E=1e-65) | Complete | ER2 |
| ATP-dependant Zn protease 3 | ETH_00010985 on Supercontig_4 | MA | M41 | Toxoplasma gondii ME49 gcontig_1112359860098 (E=2e-54) | PDB: Human Paraplegin (FtsH endopeptidase family) (E=7e-40) | Partial | ER3 |
| CaaX prenyl protease | ETH_00017305 on Supercontig_76 | MA | M48 | Babesia bovis strain T2Bo chromosome 4 gcontig_1104837696380 (E=2e-77) | Swiss-Prot: CAAX prenyl protease 1 homolog [Arabidopsis thaliana] (E=2e-83) | Complete | ER2 |
| Insulysin 1 | ETH_00011835 on Supercontig_36 | ME | M16 | Plasmodium vivax SaI-1 ctg_7222 (E=5e-168) | PDB: Bovine Bc1 (Zn-dependant insulinase) (E=1e-101) | Complete | ER2 |
| Insulysin 2 | ETH_00032950 on Supercontig_103 | ME | M16 | Babesia bovis T2Bo chromosome 3 (E=1e-133) | PDB: Yeast Mitochondrial Processing Peptidase (E=4e-60) | Complete | ER2 |
| Insulysin 3 | ETH_00001730 on Supercontig_54 | ME | M16 | Toxoplasma gondii ME49 gcontig_1112359871056 (E=3e-90) | PDB: Human insulin degrading enzyme (Ide) (E=2e-46) | Complete | ER1/2 |
| Insulysin 4 | no gene name on Supercontig_901 | ME | M16 | Toxoplasma gondii VEG gcontig_1104442817478 (E=1e-22) | PDB: Human insulin degrading enzyme (Ide) (E=3e-17) | Partial | ER2/3 |
| Insulysin 5 | no gene name on NODE_2627_length_1769_cov_14.530243 | ME | M16 | - | PDB: Pitrilysin (M16 family) [Escherichia coli O157:H7] (E=7e-05) | None | ER4 |
| Leucine aminopeptidase | ETH_00012380 on Supercontig_27 | MF | M17 | Toxoplasma gondii ME49 cytosol aminopeptidase, mRNA (E=4e-54) | PDB: E. coli Aminopeptidase A (Pepa) (E=1e-53) | Complete | ER2 |
| O-sialoglycoprotease | ETH_00020530 on Supercontig_5 | MK | M22 | Toxoplasma gondii ME49 glycoprotease family domain-containing protein, mRNA (E=1e-47) | PDB: Methanococcus jannaschii Kae1-Bud32 Fusion Protein (Kae1: sialoglycoprotease homologue) (E=1e-51) | Complete (disrupted) | ER2 |
| S2P-like protease | ETH_00009130 on Supercontig_80 | MM | M50 | - | NR: peptidase, M50 family protein [Toxoplasma gondii] (E=4e-21) | Complete | ER3 |
| Serine Proteases | | | | | | | |
| Trypsin 1 | ETH_00028355 on Supercontig_45 | PA | S1 | Babesia bovis strain T2Bo chromosome 4 gcontig_1104837696308 (E=1e-56) | Swiss-Prot: Protease Do-like 10 [Arabidopsis thaliana] (E=2e-63) | Complete | ER2 |
| Trypsin 2 | ETH_00012215 on Supercontig_27 | PA | S1 | Neospora caninum Liverpool complete genome, chromosome IX (E=1e-21) | Swiss-Prot: Protease Do-like 9 [Arabidopsis thaliana] (E=1e-63) | Complete | ER2/3 |
| Trypsin 3 | ETH_00015245 on Supercontig_30 | PA | S1 | Toxoplasma gondii ME49 gcontig_1112359861240 (E=6e-58) | Swiss-Prot: Protease Do-like 2 [Arabidopsis thaliana] (E=2e-80) | Complete | ER2 |
| Subtilisin 1 | ETH_00009790 on Supercontig_570 | SB | S8 | Cryptosporidium parvum Iowa II chromosome 6 chr6.s2 (E=2e-22) | Swiss-Prot: Cell wall-associated protease [Bacillus subtilis] (E=8e-18) | Partial | ER2 |
| Subtilisin 2 | ETH_00025145 on Supercontig_1463 | SB | S8 | Cryptosporidium muris RN66 gcontig_1106632353939 (E=8e-18) | Swiss-Prot: Major intracellular serine protease [Bacillus subtilis] (E=4e-9) | Partial | ER1/2 |
| Subtilisin 3 | ETH_00011050 on Supercontig_4 | SB | S8 | Cryptosporidium muris RN66 gcontig_1106632353939 (E=1e-39) | PDB: Subtilisin [Bacillus licheniformis] (E=3e-28) | Complete | ER2 |
| Subtilisin 4 | ETH_00006825 on Supercontig_65 | SB | S8 | Cryptosporidium parvum Iowa II chromosome 6 chr6.s2 (E=3e-53) | PDB: Thermitase [Thermoactinomyces vulgaris] (E=3e-32) | Complete | ER2 |
| Subtilisin 5 | ETH_00011340 on Supercontig_4 | SB | S8 | Toxoplasma gondii ME49 gcontig_1112359859078 (E=3e-24) | PDB: Thermostable Serine Protease [Bacillus sp] (E=8e-7) | Partial | ER3/4 |
| Subtilisin 6 | ETH_00016890 on Supercontig_22 | SB | S8 | Toxoplasma gondii VEG gcontig_1104442818966 (E=6e-34) | PDB: Thermitase [Thermoactinomyces vulgaris] (E=4e-11) | Partial | ER3/4 |
| Prolyl endopeptidase | ETH_00028960 on Supercontig_1 | SC | S33 | Neospora caninum Liverpool complete genome, chromosome V (E=2e-6) | Swiss-Prot: prolyl endopeptidase [Mus musculus] (E=4e-98) | Complete | ER2 |
| Clp protease | ETH_00030480 on Supercontig_126 | SK | S14 | - | Swiss-Prot: ATP-dependent Clp protease proteolytic subunit [Neisseria meningitidis] (E=3e-11) | Partial | ER3/4 |
| Rhomboid protease | ETH_00020020 on Supercontig_2 | ST | S54 | Plasmodium falciparum Santa Lucia cont1.4986 (E=1e-23) | Swiss-Prot: Rhomboid-like protease 1 [Toxoplasma gondii] (E=3e-47) | Complete | ER1/2 |

The E. tenella genome database (http://www.genedb.org/Homepage/Etenella) was searched for genes predicted to code for proteins with peptidase activity. All auto-annotated peptidase genes identified were manually curated by performing BLAST analysis against apicomplexan genome sequence databases and various protein databases [32] such as the Protein Data Bank (PDB), Swiss-Prot and non-redundant (NR). In addition, signature protein motifs for the protein sequence of each gene were identified through Pfam (http://pfam.sanger.ac.uk/search; [33]), InterProScan (http://www.ebi.ac.uk/Tools/pfa/iprscan/) and the MEROPS databases (http://merops.sanger.ac.uk/; [34]). Further gene sequence manipulations, such as translation into amino acid sequences and ClustalW alignments, were performed using the DNASTAR Lasergene™ 9 Core Suite. Genes were assigned a five-tiered level of confidence for gene function using an Evidence Rating (ER) system giving an overall score of ER1-5, where ER1 indicates extremely reliable experimental data to support function and ER5 indicates no evidence for gene function [17].
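For readers who want to work with Table 1 programmatically, the following is a minimal Python sketch of how its BLAST cells and Evidence Rating labels might be parsed. It is not part of the authors' workflow. The names `BlastHit`, `parse_hit` and `parse_evidence_rating` are hypothetical, and reading a combined label such as ER3/4 as a range spanning two tiers is an assumption, since the text defines only the ER1 and ER5 endpoints.

```python
import re
from typing import NamedTuple, Optional, Tuple


class BlastHit(NamedTuple):
    """Hypothetical container for one BLAST cell of Table 1."""
    description: str
    e_value: float


# Matches a trailing "(E=...)" annotation, tolerating a space after "=".
_E_VALUE = re.compile(r"\(E=\s*([0-9.]+(?:e-?[0-9]+)?)\)\s*$", re.IGNORECASE)


def parse_hit(cell: str) -> Optional[BlastHit]:
    """Split a cell such as 'PDB: Porcine Pepsin (E=4e-11)' into text and E-value.

    Returns None for an empty cell or the '-' placeholder (no hit reported).
    """
    cell = cell.strip()
    if cell in {"", "-"}:
        return None
    match = _E_VALUE.search(cell)
    if match is None:
        raise ValueError(f"no E-value found in {cell!r}")
    description = cell[: match.start()].rstrip(" ,")
    return BlastHit(description, float(match.group(1)))


def parse_evidence_rating(label: str) -> Tuple[int, int]:
    """Turn 'ER2' into (2, 2) and 'ER3/4' into (3, 4).

    Assumption: a slash denotes a rating that falls between two tiers.
    """
    match = re.fullmatch(r"ER([1-5])(?:/([1-5]))?", label.strip())
    if match is None:
        raise ValueError(f"unrecognised evidence rating {label!r}")
    first = int(match.group(1))
    second = int(match.group(2)) if match.group(2) else first
    return first, second
```

A caller could, for example, keep only entries whose second tier is at most 2 to focus on the better-supported annotations. That threshold is illustrative, not one the source prescribes.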


Katrib et al. BMC Genomics 2012 13:685   doi:10.1186/1471-2164-13-685
