Phylogenetic analysis and classification of the Brassica rapa SET-domain protein family
1 Hunan Provincial Key Laboratory of Crop Germplasm Innovation and Utilization, Hunan Agricultural University, Changsha 410128, China
2 College of Bioscience and Biotechnology, Hunan Agricultural University, Changsha 410128, China
3 Institut de Biologie Moléculaire des Plantes du CNRS, Université de Strasbourg, 12 rue du Général Zimmer, 67084 Strasbourg Cedex, France
BMC Plant Biology 2011, 11:175 doi:10.1186/1471-2229-11-175
Published: 14 December 2011
The SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domain is an evolutionarily conserved sequence of approximately 130-150 amino acids, and constitutes the catalytic site of lysine methyltransferases (KMTs). KMTs perform many crucial biological functions via histone methylation of chromatin. Histone methylation marks are interpreted differently depending on the histone type (i.e. H3 or H4), the lysine position (e.g. H3K4, H3K9, H3K27, H3K36 or H4K20) and the number of added methyl groups (i.e. me1, me2 or me3). For example, H3K4me3 and H3K36me3 are associated with transcriptional activation, but H3K9me2 and H3K27me3 are associated with gene silencing. The substrate specificity and activity of KMTs are determined by sequences within the SET domain and other regions of the protein.
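The mark notation described above (histone type, lysine position, number of methyl groups) can be illustrated with a minimal Python sketch. The names `HistoneMark`, `parse_mark` and `TRANSCRIPTIONAL_EFFECT` are hypothetical and introduced only for illustration; the pattern accepts only H3 and H4 lysine methylation marks, and the lookup table contains only the four associations stated in the text.

```python
import re
from typing import NamedTuple


class HistoneMark(NamedTuple):
    histone: str        # histone type, e.g. "H3"
    position: int       # lysine position, e.g. 27
    methyl_groups: int  # number of added methyl groups: 1, 2 or 3


# Assumption: only H3/H4 lysine (K) methylation marks such as "H3K27me3".
_MARK_RE = re.compile(r"^(H[34])K(\d+)me([123])$")

# Only the associations given in the text; other marks are deliberately absent.
TRANSCRIPTIONAL_EFFECT = {
    "H3K4me3": "activation",
    "H3K36me3": "activation",
    "H3K9me2": "silencing",
    "H3K27me3": "silencing",
}


def parse_mark(mark: str) -> HistoneMark:
    """Split a mark such as 'H3K27me3' into its three components."""
    m = _MARK_RE.match(mark)
    if m is None:
        raise ValueError(f"not a recognised lysine methylation mark: {mark!r}")
    return HistoneMark(m.group(1), int(m.group(2)), int(m.group(3)))
```

For example, `parse_mark("H3K27me3")` yields histone `H3`, position `27` and three methyl groups, and `TRANSCRIPTIONAL_EFFECT["H3K27me3"]` gives `"silencing"`.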
Here we identified 49 SET-domain proteins from the recently sequenced Brassica rapa genome. We analysed the sequence similarity and protein domain organization of these proteins, along with the SET-domain proteins from the dicot Arabidopsis thaliana, the monocots Oryza sativa and Brachypodium distachyon, and the green alga Ostreococcus tauri. We showed that plant SET-domain proteins can be grouped into six distinct classes, namely KMT1, KMT2, KMT3, KMT6, KMT7 and S-ET. The S-ET class has an interrupted SET domain and may be involved in methylation of nonhistone proteins. The other five classes have characteristics of histone methyltransferases with different substrate specificities: KMT1 for H3K9, KMT2 for H3K4, KMT3 for H3K36, KMT6 for H3K27 and KMT7 also for H3K4. We also propose a coherent and rational nomenclature for plant SET-domain proteins. Comparisons of sequence similarity and synteny between B. rapa and A. thaliana SET-domain proteins revealed recent gene duplication events for some KMTs.
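The class-to-substrate assignments listed above can be summarised as a small lookup table. This is only an illustrative Python sketch: the names `SUBSTRATE_BY_CLASS` and `classes_for_substrate` are hypothetical, and the `None` entry for S-ET reflects the statement that this class may act on nonhistone proteins rather than any measured specificity.

```python
from typing import Dict, List, Optional

# Substrate specificity per class, as stated in the text.
SUBSTRATE_BY_CLASS: Dict[str, Optional[str]] = {
    "KMT1": "H3K9",
    "KMT2": "H3K4",
    "KMT3": "H3K36",
    "KMT6": "H3K27",
    "KMT7": "H3K4",
    "S-ET": None,  # interrupted SET domain; possibly nonhistone substrates
}


def classes_for_substrate(substrate: str) -> List[str]:
    """Return the classes assigned to a given histone lysine, sorted by name."""
    return sorted(c for c, s in SUBSTRATE_BY_CLASS.items() if s == substrate)
```

Looking up `"H3K4"` returns both KMT2 and KMT7, which makes explicit that two distinct classes share this substrate.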
This study provides the first characterization of the SET-domain KMT proteins of B. rapa. The phylogenetic analysis allowed the development of a coherent and rational nomenclature for this important family of proteins in plants, comparable to the one used in animals. The results provide a basis for the nomenclature of KMTs in other plant species and will facilitate the functional characterization of these important epigenetic regulatory genes in Brassica crops.