wDBTF: an integrated database resource for studying wheat transcription factor families
1 Institut National de la Recherche Agronomique (INRA), UMR1095, Génétique, Diversité et Ecophysiologie des Céréales, 234 avenue du Brézet, Clermont-Ferrand, F-63100 France; Université Blaise-Pascal, UMR1095, Campus des Cézeaux, F-63170 Aubière, France
2 Institut National de la Recherche Agronomique (INRA), UR Biopolymères, Interactions, Assemblages (BIA), rue de la Géraudière, Nantes, F-44316, France
BMC Genomics 2010, 11:185 doi:10.1186/1471-2164-11-185
Published: 18 March 2010
Transcription factors (TFs) regulate gene expression by interacting with promoters of their target genes and are classified into families based on their DNA-binding domains. Genes coding for TFs have been identified in the sequences of model plant genomes. The rice (Oryza sativa ssp. japonica) genome contains 2,384 TF gene models, each representing the mRNA transcript of a locus, classified into 63 families.
We have created an extensive list of wheat (Triticum aestivum L.) TF sequences based on sequence homology with rice TFs identified and classified in the Database of Rice Transcription Factors (DRTF). We have identified 7,112 wheat sequences (contigs and singletons) from a dataset of 1,033,960 available expressed sequence tag and mRNA (ET) sequences. This is about three times the number of TFs in rice, and so is proportionally very similar once allowance is made for the hexaploidy of wheat. Of these sequences, 3,820 encode gene products with a DNA-binding domain and were thus confirmed as potential regulators. These 3,820 sequences were classified into 40 families and 84 subfamilies, with some members defining orphan families. The results were compiled in the Database of Wheat Transcription Factor (wDBTF), an inventory available on the web at http://wwwappli.nantes.inra.fr:8180/wDBFT/. For each accession, a link to its library source and its Affymetrix identification number is provided. The positions of Pfam (protein family database) motifs are given when known.
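The proportions quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the counts stated in this abstract; the variable names are illustrative, and dividing by three subgenomes (A, B and D) is a simplifying assumption about how to allow for hexaploidy, not the authors' stated method.

```python
# Counts quoted in the abstract.
rice_tf_gene_models = 2384   # rice TF gene models (DRTF)
wheat_candidates = 7112      # wheat contigs and singletons homologous to rice TFs
wheat_confirmed = 3820       # wheat sequences with a DNA-binding domain

# Assumption for illustration: hexaploid wheat carries three subgenomes (A, B, D),
# so a naive per-subgenome estimate divides the candidate count by three.
wheat_subgenomes = 3

ratio = wheat_candidates / rice_tf_gene_models
per_subgenome = wheat_candidates / wheat_subgenomes
confirmed_fraction = wheat_confirmed / wheat_candidates

print(f"wheat candidates / rice TF gene models: {ratio:.2f}")
print(f"wheat candidates per subgenome (naive): {per_subgenome:.0f}")
print(f"fraction confirmed by a DNA-binding domain: {confirmed_fraction:.1%}")
```

Under these assumptions, the naive per-subgenome count is close to the rice total, and slightly more than half of the candidate sequences carry a recognisable DNA-binding domain.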
wDBTF collates 3,820 wheat TF sequences validated by the presence of a DNA-binding domain, out of 7,112 potential TF sequences identified from publicly available gene expression data. We also incorporated in silico expression data on these TFs into the database. This database thus provides a major resource for systematic studies of TF families and their expression in wheat, as illustrated here by a study of DOF family members expressed during seed development.