Figure 5.

Analyses of the T. c. marinkellei-specific acetyltransferase gene. (A) Maximum likelihood phylogenetic tree of the T. c. marinkellei B7-specific gene (MOQ_006101), based on protein sequences. Phylogenetic inference was performed on a protein dataset extracted with Blast Explorer (E-value <1e-40). The multiple sequence alignment was generated with ClustalW version 2.1 and filtered with Gblocks. The final alignment, on which the tree is based, contained 62 columns. The phylogeny was inferred using RAxML version 7.0.4 with the PROTGAMMAJTT model and 100 bootstrap replicates. Only bootstrap values >40 are shown. Accession numbers for protein sequences are shown in parentheses after species names. The Tcm gene is shown in red. (B) GC content analysis of MOQ_006101 in relation to all genes. Error bars represent one standard deviation. The white dot represents all genes and the blue dot represents MOQ_006101. GC1, GC2 and GC3 refer to the %GC content at the first, second and third codon positions, respectively. A total of 10,342 coding sequences were included in the analysis. (C) Histogram of the Codon Adaptation Index (CAI) for all genes in the Tcm genome. Forty-three ribosomal proteins were used as a reference set of highly expressed genes. The vertical black line represents the median CAI (0.545), the two red lines represent ± one median absolute deviation (0.0548) and the blue line represents the CAI of MOQ_006101 (0.518). CAI was calculated using the EMBOSS programs cai and cusp.
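The GC1, GC2 and GC3 values in panel (B) can be illustrated with a minimal sketch. This is not the authors' pipeline; the function name `gc_by_codon_position` and the handling of trailing partial codons are assumptions made for illustration, and the input is assumed to be an in-frame coding sequence.

```python
def gc_by_codon_position(cds):
    """Return (%GC1, %GC2, %GC3) for an in-frame coding sequence.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    seq = cds.upper()
    # Assumption: ignore any trailing bases that do not form a full codon.
    usable = len(seq) - len(seq) % 3
    if usable == 0:
        raise ValueError("sequence is shorter than one codon")
    percentages = []
    for pos in range(3):
        # Every third base, starting at codon position pos (0-based).
        bases = seq[pos:usable:3]
        gc_count = sum(base in "GC" for base in bases)
        percentages.append(100.0 * gc_count / len(bases))
    return tuple(percentages)


# Codons ATG and GCC: position 1 = A,G; position 2 = T,C; position 3 = G,C
print(gc_by_codon_position("ATGGCC"))  # (50.0, 50.0, 100.0)
```

Applying such a function to every coding sequence and taking the mean and standard deviation per position would give values comparable to the white dot and error bars, assuming the same gene set is used.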

Franzén et al. BMC Genomics 2012 13:531 doi:10.1186/1471-2164-13-531
