Additional file 6: Figure S6.

GATA sequence alignment and phylogenetic analyses. A: Partial GATA protein multiple sequence alignment showing the amino acid sites used for phylogenetic analyses. GATA genes encode zinc-finger transcription factors that bind to the (T/A)GATA(A/G) cis-regulatory motif [45] and are characterised by the presence of two zinc-finger domains (N-terminal ‘N-finger’ and C-terminal ‘C-finger’). The alignment encompasses the N-finger (underlined in red) and C-finger (underlined in blue) zinc-finger domains, each containing the motif CXNC (boxed) X17CNXC (boxed). B: GATA maximum likelihood phylogeny rooted with GATA-type sequences. The Amphimedon queenslandica sequence is highlighted in red. No clear GATA orthologues were identified outside Metazoa or in the genome of the ctenophore Mnemiopsis leidyi [46]. C: Unrooted GATA maximum likelihood phylogeny. Sequence alignment and phylogenetic analyses were performed on the Geneious platform (v.5.1.7). Related sequences were retrieved by protein BLAST searches, using the Amphimedon GATA-like sequence as the query, from GenBank at NCBI [47], the Acropora digitifera genome (Version 1.1) portal at OIST [48], the Mnemiopsis genome project portal at NIH [46] and Compagen at the University of Kiel (Oscarella sequences; [49]). Sequence IDs/accession numbers are shown with the name of each sequence in B and C. Peptide sequences were aligned with MUSCLE (v3.7) [50] (default settings), and ambiguous regions were manually removed. Phylogenetic trees were reconstructed using the maximum likelihood method implemented in the PhyML program [51]. The WAG substitution model [52] was selected, assuming an estimated proportion of invariant sites and four gamma-distributed rate categories to account for rate heterogeneity across sites. The gamma shape parameter was estimated directly from the data. Reliability of internal branches of the maximum likelihood trees was assessed by bootstrapping (100 replicates). Bootstrap support values are shown at each node except where lower than 50%. Branch lengths are in substitutions per site.
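
The zinc-finger pattern described for panel A can be illustrated with a short motif scan. This is a minimal sketch and not part of the original analysis, which was carried out in Geneious. It assumes the canonical GATA-type spacing C-X2-C-X17-C-X2-C; the spacer length of 2 is an assumption, since the caption gives it only as N. The function name `find_gata_fingers` and the toy peptide are hypothetical.

```python
import re


def find_gata_fingers(peptide, spacer=2, loop=17):
    """Return (start, match) for each C-X{spacer}-C-X{loop}-C-X{spacer}-C hit.

    spacer=2 and loop=17 are assumed defaults (canonical GATA-type finger).
    """
    # A lookahead is used so that overlapping candidates are also reported.
    pattern = re.compile(
        rf"(?=(C.{{{spacer}}}C.{{{loop}}}C.{{{spacer}}}C))"
    )
    return [(m.start(), m.group(1)) for m in pattern.finditer(peptide.upper())]


# Toy peptide (hypothetical, not a real GATA sequence): one finger at index 2.
toy = "MA" + "CSNC" + "A" * 17 + "CNAC" + "GG"
hits = find_gata_fingers(toy)
```

A real GATA protein would be expected to yield two hits, one per finger, although divergent sequences may deviate from this spacing and would then be missed by such a strict pattern.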

Nakanishi et al. BMC Biology 2014 12:26 doi:10.1186/1741-7007-12-26