Whole genome duplications and expansion of the vertebrate GATA transcription factor gene family
1 Institute of Molecular Biology, University of Oregon, 1229 University of Oregon, Eugene, OR 97403, USA
2 Current address: Biomolecular Engineering Department, University of California, Santa Cruz, CA 95064, USA
BMC Evolutionary Biology 2009, 9:207 doi:10.1186/1471-2148-9-207
Published: 20 August 2009
GATA transcription factors influence many developmental processes, including the specification of embryonic germ layers. The GATA gene family has significantly expanded in many animal lineages: whereas diverse cnidarians have only one GATA transcription factor, six GATA genes have been identified in many vertebrates, five in many insects, and eleven to thirteen in Caenorhabditis nematodes. All bilaterian animal genomes have at least one member each of two classes, GATA123 and GATA456.
We have identified one GATA123 gene and one GATA456 gene from the genomic sequence of two invertebrate deuterostomes, a cephalochordate (Branchiostoma floridae) and a hemichordate (Saccoglossus kowalevskii). We have also confirmed the presence of six GATA genes in all vertebrate genomes, as well as additional GATA genes in teleost fish. Analyses of conserved sequence motifs and of changes to the exon-intron structure, together with molecular phylogenetic analyses of these deuterostome GATA genes, support their origin from two ancestral deuterostome genes, one GATA123 and one GATA456. Comparison of the conserved genomic organization across vertebrates identified eighteen paralogous gene families linked to multiple vertebrate GATA genes (GATA paralogons), providing the strongest evidence yet for expansion of vertebrate GATA gene families via genome duplication events.
From our analysis, we infer the evolutionary birth order and relationships among vertebrate GATA transcription factors, and define their expansion via multiple rounds of whole genome duplication events. As the genomes of four independent invertebrate deuterostome lineages contain single-copy GATA123 and GATA456 genes, we infer that the 0R (pre-genome duplication) invertebrate deuterostome ancestor also had two GATA genes, one of each class. Synteny analyses identify duplications of paralogous chromosomal regions (paralogons), from single ancestral vertebrate GATA123 and GATA456 chromosomes to four paralogons after the first round of vertebrate genome duplication, to seven paralogons after the second round of vertebrate genome duplication, and to fourteen paralogons after the fish-specific 3R genome duplication. The evolutionary analysis of GATA gene origins and relationships may inform the understanding of vertebrate GATA factor redundancies and specializations.
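The paralogon counts above can be checked against simple doubling. The following is a minimal sketch, not part of the published analysis: it takes the counts stated in the abstract (treating the two ancestral GATA123 and GATA456 chromosomes as two starting regions, which is our reading) and compares each round with twice the previous reported count. The names `REPORTED` and `compare_to_doubling` are hypothetical, and the sketch says nothing about which region accounts for any shortfall or why.

```python
# Paralogon counts as stated in the abstract, keyed by duplication stage.
# "0R": 2 is our reading of "single ancestral GATA123 and GATA456 chromosomes".
REPORTED = {"0R": 2, "1R": 4, "2R": 7, "3R": 14}


def compare_to_doubling(reported):
    """Compare each stage's reported count with twice the previous reported count.

    Returns a list of (stage, expected, observed, shortfall) tuples, where
    shortfall = expected - observed under the assumption of exact doubling.
    """
    stages = list(reported)  # dicts preserve insertion order
    rows = []
    for prev, curr in zip(stages, stages[1:]):
        expected = 2 * reported[prev]
        observed = reported[curr]
        rows.append((curr, expected, observed, expected - observed))
    return rows


for stage, expected, observed, shortfall in compare_to_doubling(REPORTED):
    print(f"{stage}: expected {expected}, reported {observed}, shortfall {shortfall}")
```

Under this reading, only the second round falls short of exact doubling (seven reported against eight expected), while the first round and the fish-specific 3R round match doubling of the preceding count.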