Protein functional links in Trypanosoma brucei, identified by gene fusion analysis
Biomedical Research Foundation, Academy of Athens, Athens, Greece
BMC Evolutionary Biology 2011, 11:193 doi:10.1186/1471-2148-11-193
Published: 5 July 2011
Additional file 1:
Comparison of our results with those of Enright et al. To test the selectivity and performance of our automatic in-house software, we used it to analyse the proteomes of the same organisms analysed by Enright et al. The table (Worksheet "COMPARISON") is based on supplementary table one of Enright et al., showing "the 64 fusion events in the genomes of E. coli, H. influenzae and M. jannaschii, detected on the basis of composite proteins in these three genomes plus the genome of S. cerevisiae. Columns COMPONENT list the component gene/protein names (or identifiers); column COMPOSITE lists the composite (fusion) gene/protein name (or identifier); columns with species names list the gene identifiers from the corresponding species; N lists the maximum number of possible pairwise interactions; COMMENT includes various comments for specific cases. White cells in the species columns contain component proteins and blue cells contain composite proteins, on the basis of which interactions are predicted. The sort order follows the three species against the composite-protein sequence-identifiers for the yeast genome, and then the other three species in succession. Genes are named when different names are used; where no name is available, the sequence-identifier is used instead. "#" indicates the absence of a component from a multiple fusion event (case 8). Interacting pairs are separated by "/", while paralogues are separated by commas."
A column next to each species name shows the results from the present analysis, where "-" refers to events not identified by our software (12.5% of the total) but which might be genuine; "c" refers to events not identified by our software which are probably artefactual in the original analysis (13.5% of the total; details for each result are given in the "comments" columns); "p" refers to paralogues (5.7%); "m" to matches, i.e. domains that participate in multiple interactions (34%); "d" to doubles, i.e. domains that are involved in two separate fusion events (12.5% of the total); and "u" to unique results that conform to all the selection criteria (21.5% of the total). The percentages (categories) of results from our analysis are shown at the end of the table, along with comments concerning each fusion event and the reasons for any difference in the results. In total, we can confirm 87.5% of the 88 fusion events reported by Enright et al. In addition, our software identified a further 27 events not reported by Enright et al. (Worksheet "UNIQUE").
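As a rough check of the figures quoted above, the category percentages can be tallied as in the sketch below. This is not the authors' software: the dictionary and variable names are illustrative, and reading the confirmed fraction as "everything except the "-" category" is an assumption made here, not a statement from the file description.

```python
# Category percentages as quoted in the description of Additional file 1.
# Names are illustrative only; they do not come from the original analysis.
category_percent = {
    "-": 12.5,  # not identified by the software, but possibly genuine
    "c": 13.5,  # not identified, probably artefactual in the original analysis
    "p": 5.7,   # paralogues
    "m": 34.0,  # matches: domains that participate in multiple interactions
    "d": 12.5,  # doubles: domains involved in two separate fusion events
    "u": 21.5,  # unique results that meet all selection criteria
}

# Sum of the quoted percentages; rounding in the source may keep it below 100.
total = round(sum(category_percent.values()), 1)

# Assumption: the confirmed fraction is everything except the "-" category.
confirmed_by_exclusion = round(100.0 - category_percent["-"], 1)

print(f"sum of categories: {total}%")
print(f"100% minus the '-' category: {confirmed_by_exclusion}%")
```

Under that assumption the second value matches the confirmed fraction stated in the description, while the sum of the quoted categories falls slightly short of 100%, presumably because the individual figures were rounded.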
Format: XLS Size: 55KB
Additional file 2:
Alignments of the fused proteins identified in different species, with the corresponding split protein pairs in T. brucei. A: BLAST output alignment of the P. infestans EEY58132 composite protein with the T. brucei proteins AAX79027 and AAX70704. B: BLAST output alignment of the M. brevicollis EDQ88211 composite protein with the T. brucei proteins AAX70833 and AAX70835. C: BLAST output alignment of the E. histolytica EAL47672 composite protein with the T. brucei proteins EAN76787 and EAN78273. D: BLAST output alignment of the C. reinhardtii EDP05938 composite protein with the T. brucei proteins AAX79872 and EAN76725. E: BLAST output alignment of the C. reinhardtii EDP08267 composite protein with the T. brucei proteins AAX79657 and EAN76651.
Format: PDF Size: 191KB