<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
	<ui>gb-2004-5-7-r46</ui>
	<ji>GBJ</ji>
	<fm>
		<dochead>Research</dochead>
		<bibl>
			<title>
				<p>Identification of conserved gene structures and carboxy-terminal motifs in the Myb gene family of <it>Arabidopsis </it>and <it>Oryza sativa </it>L. ssp. <it>indica</it></p>
			</title>
			<aug>
				<au id="A1">
					<snm>Jiang</snm>
					<fnm>Cizhong</fnm>
					<insr iid="I1"/>
					<email>czjiang@iastate.edu</email>
				</au>
				<au id="A2">
					<snm>Gu</snm>
					<fnm>Xun</fnm>
					<insr iid="I1"/>
					<insr iid="I2"/>
					<email>xgu@iastate.edu</email>
				</au>
				<au id="A3" ca="yes">
					<snm>Peterson</snm>
					<fnm>Thomas</fnm>
					<insr iid="I1"/>
					<email>thomasp@iastate.edu</email>
				</au>
			</aug>
			<insg>
				<ins id="I1">
					<p>Department of Genetics, Development and Cell Biology, and Department of Agronomy, Iowa State University, Ames, IA 50011, USA</p>
				</ins>
				<ins id="I2">
					<p>LHB Center for Bioinformatics and Biological Statistics, Iowa State University, Ames, IA 50011, USA</p>
				</ins>
			</insg>
			<source>Genome Biology</source>
			<issn>1465-6906</issn>
			<pubdate>2004</pubdate>
			<volume>5</volume>
			<issue>7</issue>
			<fpage>R46</fpage>
			<url>http://genomebiology.com/2004/5/7/R46</url>
			<xrefbib>
				<pubidlist><pubid idtype="pmpid">15239831</pubid><pubid idtype="doi">10.1186/gb-2004-5-7-r46</pubid>
				</pubidlist></xrefbib>
		</bibl>
		<history>
			<rec>
				<date>
					<day>22</day>
					<month>12</month>
					<year>2003</year>
				</date>
			</rec>
			<revrec>
				<date>
					<day>23</day>
					<month>3</month>
					<year>2004</year>
				</date>
			</revrec>
			<acc>
				<date>
					<day>29</day>
					<month>5</month>
					<year>2004</year>
				</date>
			</acc>
			<pub>
				<date>
					<day>29</day>
					<month>6</month>
					<year>2004</year>
				</date>
			</pub>
		</history>
		<cpyrt>
			<year>2004</year>
			<collab>Jiang et al.; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.</collab>
		</cpyrt>
		<shorttitle>
			<p>Identification of conserved gene structures and carboxy-terminal motifs in the Myb gene family of <it>Arabidopsis </it>and <it>Oryza sativa </it>L. ssp. <it>indica</it></p>
		</shorttitle>
		<shortabs>
			<p>Myb genes from <it>Arabidopsis</it> and rice were clustered into subgroups. The distribution of introns in the phylogenetic tree suggests that introns were inserted during evolution.</p>
		</shortabs>
		<abs>
			<sec>
				<st>
					<p>Abstract</p>
				</st>
				<sec>
					<st>
						<p>Background</p>
					</st>
					<p>Myb proteins contain a conserved DNA-binding domain composed of one to four repeat motifs (referred to as R0R1R2R3); each repeat is approximately 50 amino acids in length, with regularly spaced tryptophan residues. Although the Myb proteins comprise one of the largest families of transcription factors in plants, little is known about the functions of most Myb genes. Here we use computational techniques to classify Myb genes on the basis of sequence similarity and gene structure, and to identify possible functional relationships among subgroups of Myb genes from <it>Arabidopsis </it>and rice (<it>Oryza sativa </it>L. ssp. <it>indica</it>).</p>
				</sec>
				<sec>
					<st>
						<p>Results</p>
					</st>
					<p>This study analyzed 130 Myb genes from <it>Arabidopsis </it>and 85 from rice. The collected Myb proteins were clustered into subgroups based on sequence similarity and phylogeny. Interestingly, the exon-intron structure differed between subgroups, but was conserved in the same subgroup. Moreover, the Myb domains contained a significant excess of phase 1 and 2 introns, as well as an excess of nonsymmetric exons. Conserved motifs were detected in carboxy-terminal coding regions of Myb genes within subgroups. In contrast, no common regulatory motifs were identified in the noncoding regions. Additionally, some Myb genes with similar functions were clustered in the same subgroups.</p>
				</sec>
				<sec>
					<st>
						<p>Conclusions</p>
					</st>
					<p>The distribution of introns in the phylogenetic tree suggests that Myb domains originally were compact in size; introns were inserted and the splicing sites conserved during evolution. Conserved motifs identified in the carboxy-terminal regions are specific for Myb genes, and the identified Myb gene subgroups may reflect functional conservation.</p>
				</sec>
			</sec>
		</abs>
	</fm>
	<meta>
		<classifications>
			<classification type="BMC" subtype="man_spc_id" id="30010019">Plant biology</classification>
			<classification type="BMC" subtype="man_spc_id" id="30010008">Evolution</classification>
			<classification type="BMC" subtype="man_spc_id" id="30010009">Genetics</classification>
		</classifications>
	</meta>
	<bdy>
		<sec>
			<st>
				<p>Background</p>
			</st>
			<p>Regulation of gene expression at the level of transcription controls many important biological processes in a cell or organism. The process of transcription recruits a number of different transcription factors, which can be activators, repressors, or both <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. Genome-wide comparisons have revealed the diversity in the regulation of transcription during evolution. With the completion of <it>Arabidopsis </it>genome sequencing, 5% of its genome was found to encode more than 1,500 transcription factors <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. On the basis of sequence similarities, transcription factors have been classified into families. In plants, Myb factors comprise one of the largest of these families.</p>
			<p>Myb proteins are defined by a highly conserved DNA-binding domain (termed the Myb domain) composed of one to four helix-turn-helix motifs, which exist as tandem repeats (referred to as R0R1R2R3) in a single Myb protein. Each repeat is about 50 amino acids long, with regularly spaced tryptophan residues, and forms three &#945;-helices. The third &#945;-helix has a recognition role during DNA binding <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. The three-dimensional structure of the Myb domain in the Protein Data Bank (PDB) shows that the DNA recognition &#945;-helix interacts with the DNA major groove. Moreover, previous research indicated that five amino-acid residues in the helix-turn-helix motif bind directly to the major groove <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. It should be noted that sequences outside the Myb domain are highly divergent.</p>
			<p>The first Myb gene found was the v-<it>Myb </it>oncogene from the avian myeloblastosis virus <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. Subsequently, members of the Myb gene family were identified in diverse plants and animals <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr></abbrgrp>. Previous research showed that animal genomes encode relatively few Myb genes <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>. In contrast, flowering plants contain large numbers of Myb genes with very diverse structures and functions <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. To date, the precise functions of most plant Myb genes are unknown, although some well studied examples suggest important roles for Myb genes in regulation of secondary metabolism, cellular morphogenesis, pathogen resistance, and responses to growth regulators and stress <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B8">8</abbr></abbrgrp>.</p>
			<p>With the completion of <it>Arabidopsis </it>and rice (<it>Oryza sativa </it>L. ssp. <it>indica</it>) genome sequencing <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>, the entire complement of Myb genes can be identified and described. However, a great deal of experimental work is required to determine the specific biological function of each gene. In <it>Arabidopsis</it>, R2R3 Myb gene-expression levels were determined in more than 20 different growth conditions; the results indicated that Myb genes were specifically expressed in different tissues and physiological conditions <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. To obtain further functional information on <it>Arabidopsis </it>Myb genes, a process of reverse genetics was applied to isolate insertion mutants. In all, 47 insertion mutants were detected in 36 distinct Myb genes by screening a total of 73 genes. However, none of the insertions gave rise to morphological phenotypes visible in soil-grown plants <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>. The redundancy of Myb genes may diminish the efficiency of the molecular approach by complementation of function. No similar research has been done in rice Myb genes. Here, we have used phylogenetic and computational methods to classify Myb genes in subgroups. The resulting subgroup classification and putative functional conserved motif identification may be useful for research on agronomic traits in rice, which is the most important crop for human consumption, and an important model for other cereal grains.</p>
		</sec>
		<sec>
			<st>
				<p>Results and discussion</p>
			</st>
			<sec>
				<st>
					<p>Expansion of Myb genes in <it>Arabidopsis </it>and rice</p>
				</st>
				<p>The Myb gene family has broadly expanded in plants during evolution. The amplification of the Myb gene family occurred before the divergence of monocots and dicots <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. In our study, 130 Myb genes were found in the <it>Arabidopsis </it>genome and 85 in <it>Oryza sativa </it>L. ssp. <it>indica</it>. The large size of this gene family was also confirmed in <it>Zea mays </it>and sorghum <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. Although most plant Myb genes contain only two repeats, there have been three-repeat Mybs reported in <it>Arabidopsis </it><abbrgrp><abbr bid="B1">1</abbr></abbrgrp>, maize <abbrgrp><abbr bid="B13">13</abbr></abbrgrp> and other plants <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. To date, only three-repeat Myb genes have been detected in animals, and it has been proposed that two-repeat Myb genes died out in the animal lineage <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. The broad presence of three-repeat Myb genes in diverse species indicates the antiquity of these genes. Using homology search in the GenBank non-redundant database, two three-repeat Myb proteins (accession numbers NP_913483 and BAC79618) were identified in <it>Oryza sativa </it>L. ssp. <it>japonica</it>. However, no three-repeat Mybs were detected in rice (<it>indica</it>) in our study. This could be due to the incompleteness of the <it>Oryza indica </it>dataset.</p>
			</sec>
			<sec>
				<st>
					<p>Topology of Myb gene phylogeny</p>
				</st>
				<p>On the basis of sequence similarity and the topology of the phylogeny, we clustered the Myb genes into 42 subgroups, ranging in size from two to 14 Myb genes (Figure <figr fid="F1">1</figr>). The phylogenetic topology and subgroup structures are consistent with previous reports <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B10">10</abbr></abbrgrp>. The detailed comparison is described in Additional data file 1 (also available with all other supplementary material at <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>). However, because of the large number of taxa, the bootstrap values are low (data not shown). Therefore, we sought other evidence to support the reliability of the subgroup designations.</p>
				<fig id="F1">
					<title>
						<p>Figure 1</p>
					</title>
					<caption>
						<p>Phylogeny, subgroup designations, and carboxy-terminal motifs in Myb proteins from <it>Arabidopsis </it>and rice</p>
					</caption>
					<text>
						<p>Phylogeny, subgroup designations, and carboxy-terminal motifs in Myb proteins from <it>Arabidopsis </it>and rice. The phylogenetic tree on the left represents 130 Myb genes from <it>Arabidopsis</it>, 85 from rice, and 43 from other plants, which are clustered into 42 subgroups (triangles) and seven singletons (lines). The 19 gray subgroups contain conserved carboxy-terminal motifs. The arrow indicates a large cluster of genes involved in the phenylpropanoid biosynthetic pathway or ABA response. The scale bar under the tree represents 0.2 substitutions. Some 'landmark' Myb proteins are listed in parentheses for functional reference. The uncompressed tree with full taxa names is available as Additional data file 7. Comparison of the subgroup designations used in this study with that in [1] is described in Additional data file 1. The four blocks (A-D) in the center of the diagram indicate the distribution of the four major splicing patterns in the Myb R2R3 domains; see text for details. The motifs on the right were detected using MEME and drawn to scale. The Myb R2 and R3 repeats are indicated. The black boxes indicate the extension motifs following the R3 repeat. The gray boxes represent the motifs identified in the previous report [1], and the white boxes are the motifs newly discovered here. The thin lines indicate coding regions lacking a detectable motif, with a polypeptide length indicated by the number above the diagonal slash marks. The scale bar is equivalent to 50 amino-acid residues.</p>
					</text>
					<graphic file="gb-2004-5-7-r46-1"/>
				</fig>
				<p>Interestingly, AtMyb33, 65, 101, 104 and At3g60460 were complementary, with few mismatches, to <it>Arabidopsis </it>Myb microRNA (noncoding RNA) miR159 <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. The sequence is 21 nucleotides long and located in the 3' untranslated region (3' UTR) of all five genes. MicroRNAs are proposed to act as regulators of gene expression through interactions with complementary mRNA sequences. Importantly, these five <it>Arabidopsis </it>Mybs are located in subgroup G12 (Figure <figr fid="F1">1</figr>). This clustering provides additional evidence for the reliability of the subgroup designations in our analysis.</p>
			</sec>
			<sec>
				<st>
					<p>Conserved gene structure within each subgroup supports the subgroup designations</p>
				</st>
				<p>The phylogenetic topology and subgroup structures are based on sequence comparisons of the complete predicted Myb genes. To test the reliability of the subgroup designations using independent criteria, we investigated the exon-intron structure of Myb genes subgroup by subgroup. A majority of <it>Arabidopsis </it>(59%) and rice (53%) Myb genes have a conserved splicing pattern of three exons and two introns in R2R3 domains (represented by subgroup G13; Figure <figr fid="F2">2a</figr>). Either or both of the two introns are absent in 19% of <it>Arabidopsis </it>Myb genes and 12% of rice Myb genes. Variable splicing patterns different from G13 were detected in 22% of <it>Arabidopsis </it>and 35% of rice Myb genes, respectively (data not shown). Strikingly, the exon-intron structure is conserved within each subgroup, but varies between subgroups (Figures <figr fid="F2">2b,c</figr>). This supports the subgroup designations from the independent criterion of splicing pattern.</p>
				<fig id="F2">
					<title>
						<p>Figure 2</p>
					</title>
					<caption>
						<p>Intron-exon structure of Myb genes</p>
					</caption>
					<text>
						<p>Intron-exon structure of Myb genes. <b>(a) </b>Locations of introns 1 and 2 splice sites in R2R3 domains. Six representative Myb R2R3 domain sequences are shown. The extent of the R2 and R3 repeats is indicated at the bottom of the alignment. The triangles indicate the positions of the splice sites, and the numbers above the triangles indicate the phases of introns. Subgroup G13 represents the major splicing pattern; that is, 77 (of 130) Myb genes in <it>Arabidopsis</it>, and 45 (of 85) Myb genes in rice have this splicing pattern. The shaded W residues indicate the regularly spaced tryptophan residues. The representative sequences of the six subgroups are: G13, At1g57560; G12, At2g32460; G16, At4g32730; N08, Scaffold479_5; N19, At2g39880; N21, At2g25230. The table at the right of the alignment lists the number of Myb genes with each splicing pattern and the total number of Myb genes in <it>Arabidopsis </it>and rice. Note that 22 <it>Arabidopsis </it>and 18 rice genes have the typical G13 splicing pattern, except that they lack either intron 1 or intron 2. Additionally, six <it>Arabidopsis </it>and 12 rice genes have no introns within the R2R3 domain. Finally, two <it>Arabidopsis </it>and four rice Myb genes have other atypical splicing patterns (data not shown). <b>(b,c) </b>The conserved exon-intron structure of all member genes in subgroups G13 and N21. Boxes and lines indicate exons and introns, respectively. Additional examples are provided in Additional data file 8.</p>
					</text>
					<graphic file="gb-2004-5-7-r46-2"/>
				</fig>
				<p>Interestingly, the Myb gene splicing patterns constitute four major blocks in the Myb gene phylogeny (Figure <figr fid="F1">1</figr>). Block A lacks both introns 1 and 2. There are three splicing patterns in block B: subgroup G15 lacks both introns; subgroup G17 lacks only intron 2; and the remaining genes have altered splicing sites when compared to subgroup G13. Myb genes in block C have the major splicing pattern (81.2%) typified by G13, with some individual genes lacking intron 1 (9.4%), intron 2 (4.7%) or both introns (1.9%), or having minor splicing patterns (2.8%). In contrast, 58.2% of Myb genes in block D retain the typical splicing sites, and the rest lack only intron 1 (G02, G05 and half of the genes in G06).</p>
				<p>In addition to splice-site locations, we also examined the position of splicing with respect to the open reading frame (ORF) - the intron phase. The splicing of each intron is designated as occurring in one of three phases: in phase 0, splicing occurs between two codons (that is, after the third nucleotide of a codon); in phase 1, splicing occurs after the first nucleotide of a codon; and in phase 2, splicing occurs after the second nucleotide of a codon. Figure <figr fid="F2">2a</figr> shows not only the conserved locations in the Myb-domain protein sequences but also the conserved phases of introns within the same subgroup. Moreover, there is a significant excess of phase 1 and 2 introns as well as an excess of nonsymmetric exons in Myb genes. Symmetric exons are exons that are flanked by introns of the same phase.</p>
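				<p>The phase and symmetry definitions can be illustrated with a minimal sketch. It assumes that the coding length of each exon, counted from the first nucleotide of the start codon, is known; the function names and the exon lengths in the example are hypothetical and are not taken from the Myb dataset.</p>
				<p><![CDATA[
```python
from itertools import accumulate


def intron_phases(exon_cds_lengths):
    """Return the phase (0, 1 or 2) of each intron in a gene.

    exon_cds_lengths: coding nucleotides contributed by each exon, in order,
    starting at the first base of the start codon (an assumption of this sketch).
    The intron after exon i has phase = (coding bases upstream of it) mod 3.
    """
    # The last exon is not followed by an intron, so it is left out.
    return [total % 3 for total in accumulate(exon_cds_lengths[:-1])]


def internal_exon_symmetry(phases):
    """For each internal exon, report whether its two flanking introns share a phase."""
    return [left == right for left, right in zip(phases, phases[1:])]


# Hypothetical three-exon gene (lengths invented for illustration only).
example_lengths = [133, 130, 500]
example_phases = intron_phases(example_lengths)
example_symmetry = internal_exon_symmetry(example_phases)
print(example_phases, example_symmetry)
```
]]></p>
				<p>In this invented example the middle exon is flanked by a phase 1 and a phase 2 intron, so it counts as nonsymmetric. An exon flanked by two phase 0 introns would count as symmetric.</p>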
				<p>According to the intron-early theory <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>, an excess of phase 0 introns and symmetric exons may facilitate exon shuffling by avoiding interruptions of the ORF, and thus could accelerate the rate of recombinational fusion and exchange of protein domains. Our results suggest that ancient Myb genes had a compact size without introns. During evolution, by unknown mechanisms, introns were inserted into Myb domains and resulted in the observed splicing patterns. One splicing pattern remained unchanged in the subsequent gene amplification, resulting in the major splicing pattern typified by G13. Consistent with this, transposition of introns occurs very infrequently during evolution <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. This intron-gain model is consistent with previous results showing that numerous introns have been inserted into plant genes and retained in the genome <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>. A similar approach to gene classification using intron/exon structure has been applied in the kinesin family <abbrgrp><abbr bid="B19">19</abbr></abbrgrp> and the bHLH family <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>, and the results support a similar evolutionary pattern.</p>
				<p>Although the splicing sites are conserved, the sizes of both introns vary greatly for different Myb genes. Approximately 85% of introns 1 and 2 of Myb genes are shorter than 300 bp in <it>Arabidopsis </it>and rice. Detailed information about the distribution of intron sizes of Myb genes is available in Additional data file 2. It is worth noting that the size of intron 2 of maize <it>p1 </it>and <it>p2 </it>orthologs is very large, around 5 kb. This intron-size information may be helpful for aligning expressed sequence tags (ESTs) with genomic sequences.</p>
				<p>Strikingly, a 743-base fragment was found in intron 2 of maize <it>P1-rr </it>and <it>P1-wr </it>alleles, but not in <it>P1-rw </it>and <it>p2 </it>alleles. A 10-base direct repeat (5'-TGATTTTGAC-3') flanks this fragment. Interestingly, no <it>Ac </it>elements were found inserted in its adjacent 3.2-kb intronic region, but frequent <it>Ac </it>insertion occurred in other regions. This could be due to a particular chromatin structure refractory to <it>Ac </it>insertion in this region <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. BLAST search detected this fragment (94% identity over 723 base-pairs (bp)) at one other locus in the maize genome, but with a new flanking direct repeat (5'-GGATATCCA-3'). The GenBank accession number is AF466202 (located 84795..85689, 12 March 2002 version). These results are consistent with a previous proposal that some transposable elements could insert into the genome as intronic sequences, a mechanism that has been proposed for the insertion of nuclear introns <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>.</p>
			</sec>
			<sec>
				<st>
					<p>Topology of Myb gene phylogeny may reflect functional conservation</p>
				</st>
				<p>Most plant Myb genes are thought to encode transcription factors that activate or repress target gene expression either independently or together with cofactors. The topology of Myb phylogeny (Figure <figr fid="F1">1</figr>) indicates that some Myb genes in the same subgroup have the same function and that some Myb genes with similar functions are located in the same subgroup. For example, two Myb orthologs, the snapdragon <it>PHAN </it>gene and the maize <it>rs2 </it>gene, are located in subgroup G18, and both are involved in organ development: <it>PHAN </it>has been shown to regulate the development of the proximo-distal axis and dorso-ventral asymmetry of lateral organs such as leaves, bracts and petal lobes <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>, while the <it>rs2 </it>gene controls the development of maize lateral organ primordia by repressing expression of <it>knox </it>(<it>knotted1</it>-like homeobox) genes that are required for the normal initiation and development of lateral organs <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>. In another example, the <it>Arabidopsis </it>genes <it>GL1 </it>and <it>WER </it>located in subgroup G07 are both involved in epidermal cell development: <it>GL1 </it>activates the <it>GLABRA2 </it>homeobox gene for trichome (hair cell) development in some parts of the leaf and in the stem <abbrgrp><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr></abbrgrp>, while <it>WER </it>controls the formation of the root epidermis by regulating expression of the <it>GLABRA2 </it>gene <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>.</p>
				<p>Similar results are observed for Myb genes involved in the phenylpropanoid biosynthetic pathway (Figure <figr fid="F1">1</figr>, subgroups N08, N09, N14): <it>C1 </it><abbrgrp><abbr bid="B28">28</abbr></abbrgrp>, <it>Pl</it>, <it>TT2 </it><abbrgrp><abbr bid="B29">29</abbr></abbrgrp>, <it>AN2 </it><abbrgrp><abbr bid="B30">30</abbr></abbrgrp>, <it>p1 </it><abbrgrp><abbr bid="B31">31</abbr></abbrgrp>, <it>p2 </it><abbrgrp><abbr bid="B32">32</abbr></abbrgrp>, <it>FaMyb1 </it><abbrgrp><abbr bid="B33">33</abbr></abbrgrp>, <it>PAP1 </it>and <it>PAP2 </it><abbrgrp><abbr bid="B34">34</abbr></abbrgrp>. These genes all encode a transcription factor that activates enzymes for phenylpropanoid synthesis, except that the <it>FaMyb1 </it>transcription factor suppresses anthocyanin and flavonol accumulation <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>. In addition, the functional conservation among some Myb genes during evolution could be observed in the cell-cycle protein <it>CDC5 </it>(Figure <figr fid="F1">1</figr>, G19). The CDC5 protein performs an essential function in cell-cycle control at G2/M, and also participates in pre-mRNA splicing <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>.</p>
			</sec>
			<sec>
				<st>
					<p>Carboxy-terminal motifs</p>
				</st>
				<p>The extent of the Myb R1, R2 and R3 repeats is based on similarity to the previously-published consensus Myb repeat sequences <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>. We used computational methods to identify additional conserved sequences downstream of the Myb repeats. A total of 18 motifs were identified in the carboxy-terminal regions, with each motif ranging in size from 9 to 32 amino acids (Figure <figr fid="F1">1</figr>). An exceptionally large domain (91 residues) was found in subgroup G19, gene <it>CDC5</it>, which is a conserved Myb paralog that originated prior to Myb-family amplification <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. In addition, Myb genes maize <it>C1</it>, <it>Pl </it>and AtMyb123 (<it>TT2</it>) in subgroup N08 have a nine-amino-acid motif previously reported <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B37">37</abbr></abbrgrp>. This motif has a high e-value (4.5e-008), so it was excluded from our analysis. Three other motifs identified by Stracke <it>et al</it>. <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> were excluded from our analysis because of their high e-values (see Additional data file 3).</p>
				<p>Interestingly, three short fragments directly following the Myb R3 repeat are highly conserved in some subgroups (Figure <figr fid="F1">1</figr>, black boxes). We designated these extension motifs E1, E2 and E3. Subgroups with an extension motif contain few or no motifs in their carboxy-terminal coding regions when compared to those subgroups without extension motifs. One exception is subgroup G03, which contains motif E1 and two other carboxy-terminal motifs - 1 and 2 (Figure <figr fid="F1">1</figr>). In subgroup G08, a short conserved segment following E1 is termed E2. The three extension motifs are relatively small, ranging from 8 to 13 residues, but they are much more conserved than other motifs (Table <tblr tid="T1">1</tblr>). In the group of three extension motifs, 28 (of 33) sites are occupied by a single residue in more than 50% of the Myb proteins, and this value is greater than twice the relative frequency of the second most frequent residue.</p>
				<tbl id="T1">
					<title>
						<p>Table 1</p>
					</title>
					<caption>
						<p>Consensus sequences of carboxy-terminal motifs</p>
					</caption>
					<tblbdy cols="4">
						<r>
							<c ca="left">
								<p>Motif</p>
							</c>
							<c ca="center">
								<p>Alias</p>
							</c>
							<c ca="left">
								<p>E-value</p>
							</c>
							<c ca="left">
								<p>Consensus sequences</p>
							</c>
						</r>
						<r>
							<c cspan="4">
								<hr/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>E1</p>
							</c>
							<c ca="center">
								<p>24</p>
							</c>
							<c ca="left">
								<p>8.3e-081</p>
							</c>
							<c ca="left">
								<p>LxxMGIDPVTH[KR]P</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>E2</p>
							</c>
							<c>
								<p/>
							</c>
							<c ca="left">
								<p>2.4e-058</p>
							</c>
							<c ca="left">
								<p>FSHLMAEI</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>E3</p>
							</c>
							<c ca="center">
								<p>18</p>
							</c>
							<c ca="left">
								<p>5.4e-062</p>
							</c>
							<c ca="left">
								<p>QRAGLPLYPpE[IV]</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M1</p>
							</c>
							<c ca="center">
								<p>9</p>
							</c>
							<c ca="left">
								<p>4.1e-073</p>
							</c>
							<c ca="left">
								<p>Gq[SA]KnAAxLSH[MT]AQWESARLEAEARLARESKL</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M2</p>
							</c>
							<c>
								<p/>
							</c>
							<c ca="left">
								<p>7.8e-039</p>
							</c>
							<c ca="left">
								<p>exe[DE]NKNYWNSI[LF]NlV[ND]SSpSdSs</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M3</p>
							</c>
							<c ca="center">
								<p>15</p>
							</c>
							<c ca="left">
								<p>1.7e-041</p>
							</c>
							<c ca="left">
								<p>WV[HL][ED]D[DE]FELS[ST]L[TV][MN]M</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M4</p>
							</c>
							<c ca="center">
								<p>1.2</p>
							</c>
							<c ca="left">
								<p>3.9e-032</p>
							</c>
							<c ca="left">
								<p>QGsLSL[IF]EKWLFd[DE]Q[SG]</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M5</p>
							</c>
							<c ca="center">
								<p>2</p>
							</c>
							<c ca="left">
								<p>2.3e-025</p>
							</c>
							<c ca="left">
								<p>DISNsNKDsatSsEDvlAiIDeSFWSeVv</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M6</p>
							</c>
							<c ca="center">
								<p>2</p>
							</c>
							<c ca="left">
								<p>4.3e-033</p>
							</c>
							<c ca="left">
								<p>drNdKgYNhDMEFWFD</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M7</p>
							</c>
							<c ca="center">
								<p>19</p>
							</c>
							<c ca="left">
								<p>2.2e-014</p>
							</c>
							<c ca="left">
								<p>DQ[ST]gENYWg[MV]DD[IL]W[PS]</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M8</p>
							</c>
							<c ca="center">
								<p>16</p>
							</c>
							<c ca="left">
								<p>1.5e-013</p>
							</c>
							<c ca="left">
								<p>PxLfFSEWl</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M9</p>
							</c>
							<c>
								<p/>
							</c>
							<c ca="left">
								<p>4.1e-031</p>
							</c>
							<c ca="left">
								<p>PGSP[ST]GSD[VR]SD[SL]S[HT][GI]</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M10</p>
							</c>
							<c ca="center">
								<p>22.2</p>
							</c>
							<c ca="left">
								<p>1.0e-120</p>
							</c>
							<c ca="left">
								<p>GEFM[AT][VA][VM]QEMI[KR][AT]EVRSYMAe[MV][QG]xx[NA]G[GC]G</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M11</p>
							</c>
							<c ca="center">
								<p>21</p>
							</c>
							<c ca="left">
								<p>1.2e-047</p>
							</c>
							<c ca="left">
								<p>[PV]pF[FI]DFLGVG</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M12</p>
							</c>
							<c>
								<p/>
							</c>
							<c ca="left">
								<p>1.5e-150</p>
							</c>
							<c ca="left">
								<p>Pixx[GS][KR]Y[DE][HW][IL]LExFAEKLVKERP</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M13</p>
							</c>
							<c>
								<p/>
							</c>
							<c ca="left">
								<p>5.4e-112</p>
							</c>
							<c ca="left">
								<p>SPSVTLSL[SA][PS][SA][TA]VA[PA]aP[PA]aP</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M14</p>
							</c>
							<c>
								<p/>
							</c>
							<c ca="left">
								<p>3.4e-079</p>
							</c>
							<c ca="left">
								<p>YDa[AN]DdPRkLRPGEIDPNPEaKPARPDPVDMDEDEKEMLSEARARLANTrGKKAKRKAREKQLEeARRLAsLQKRRELKAAGIdgrhrKRK</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M15</p>
							</c>
							<c>
								<p/>
							</c>
							<c ca="left">
								<p>5.3e-020</p>
							</c>
							<c ca="left">
								<p>IDYNAEIPFEK[KR][AP]paGFYDTaDEDRp[AN]D</p>
							</c>
						</r>
					</tblbdy>
					<tblfn>
						<p>Alias indicates the corresponding motifs identified by Stracke <it>et al</it>. [1]. E-value was calculated by MEME. Consensus sequences follow the criteria of Joshi <it>et al</it>. [44]: a single capital letter is given if the relative frequency of a single residue at a certain position is greater than 50% and greater than twice that of the second most frequent residue. When no single residue satisfied these criteria, a pair of residues was assigned as capital letters in brackets if the sum of their relative frequencies exceeded 75%. If neither of these two criteria was fulfilled, a lower-case letter was given if the relative frequency of a residue is greater than 40%. Otherwise, x is given.</p>
					</tblfn>
				</tbl>
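				<p>As an illustration only, the consensus criteria described in the footnote to Table 1 can be written as a short Python sketch. This is not the code used in the study: the function names consensus_symbol and consensus are hypothetical, the input is assumed to be a set of ungapped motif sites of equal length, and ties between equally frequent residues are broken by order of first appearance, which the footnote does not specify.</p>

```python
from collections import Counter


def consensus_symbol(column):
    """Return the consensus symbol for one alignment column (a string of residues)."""
    total = len(column)
    ranked = Counter(column).most_common()  # most frequent residue first
    top_res, top_n = ranked[0]
    second_n = ranked[1][1] if len(ranked) > 1 else 0
    top_f = top_n / total
    second_f = second_n / total

    # Rule 1: one residue above 50% and more than twice the runner-up.
    if top_f > 0.5 and top_f > 2 * second_f:
        return top_res.upper()
    # Rule 2: the two most frequent residues together exceed 75%.
    if len(ranked) > 1 and top_f + second_f > 0.75:
        return "[" + top_res.upper() + ranked[1][0].upper() + "]"
    # Rule 3: a single residue above 40% is written in lower case.
    if top_f > 0.4:
        return top_res.lower()
    # Otherwise the position is left unspecified.
    return "x"


def consensus(sites):
    """Build a consensus string from equal-length, ungapped motif sites."""
    if len(set(len(site) for site in sites)) != 1:
        raise ValueError("all sites must have the same length")
    return "".join(consensus_symbol("".join(col)) for col in zip(*sites))


# Hypothetical three-residue sites, used only to exercise the rules.
example = consensus(["PAL", "PSL", "PAL", "PSL"])
```

				<p>For the hypothetical sites PAL, PSL, PAL and PSL, the sketch returns P[AS]L: the first and last columns satisfy the single-residue rule, and the middle column satisfies the pair rule.</p>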
				<p>To test the reliability of the motif predictions, similarity scores were calculated over each motif plus its flanking regions. The similarity plots showed much higher scores in the motif region than in the flanking regions (Figure <figr fid="F3">3a</figr>), supporting the identified motifs. The analysis of nonsynonymous (dN) substitutions, a standard way of examining the degree of functional constraint on proteins using evolutionary comparisons <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>, gave similar results: motif regions were less frequently subject to substitution than flanking regions. Most dN values are equal to or less than 0.6 in motif regions and greater than 1 in flanking regions (Figure <figr fid="F3">3</figr>, Table <tblr tid="T2">2</tblr>). Interestingly, seven other motifs identified by Stracke <it>et al</it>. <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> had high dN values and did not pass this test (see Additional data file 3). The presence of carboxy-terminal motifs could reflect either the long-term conservation of critical sequences from antiquity or more recent gene duplications. The low dN values in the motif regions compared with the flanking regions suggest that the motifs are ancient sequences that have been conserved over long periods of time, rather than the result of more recent duplications.</p>
				<fig id="F3">
					<title>
						<p>Figure 3</p>
					</title>
					<caption>
						<p>Similarity scores and dN values</p>
					</caption>
					<text>
						<p>Similarity scores and dN values. <b>(a) </b>Similarity scores and the average dN values of motifs 1 and 3 plus 10-residue flanking fragments. The curve indicates the similarity scores along the upstream flanking region, the motif region (peak), and the downstream flanking region. The dashed line shows the average similarity value for the entire alignment. The vertical axis is the score obtained from the scoring matrix BLOSUM62, with scores ranging from -4 to 11. The horizontal axis indicates the position in the alignment. The three numbers give the nonsynonymous substitution (dN) values in the upstream flanking region, the motif region (peak), and the downstream flanking region, respectively. Diagrams for other motifs are given in Additional data file 9. <b>(b) </b>Distribution of pairwise dN values of motifs 1 and 3. Most dN values in the motif regions are equal to or less than 0.6, whereas most dN values in the flanking regions are equal to or greater than 1. This result indicates that sites in motif regions are highly conserved and less frequently subject to substitution than those in flanking regions. White boxes, amino-terminal regions; gray boxes, motif regions; black boxes, carboxy-terminal regions. Histograms for other motifs are given in Additional data file 10.</p>
					</text>
					<graphic file="gb-2004-5-7-r46-3"/>
				</fig>
				<tbl id="T2" hint_layout="single">
					<title>
						<p>Table 2</p>
					</title>
					<caption>
						<p>The average dN values of carboxy-terminal motifs and their flanking regions</p>
					</caption>
					<tblbdy cols="4">
						<r>
							<c ca="left">
								<p>Motif</p>
							</c>
							<c ca="left">
								<p>Amino-terminal</p>
							</c>
							<c ca="left">
								<p>Motif</p>
							</c>
							<c ca="left">
								<p>Carboxy-terminal</p>
							</c>
						</r>
						<r>
							<c cspan="4">
								<hr/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>E1</p>
							</c>
							<c ca="left">
								<p>0.2516</p>
							</c>
							<c ca="left">
								<p>0.3186</p>
							</c>
							<c ca="left">
								<p>2.0265</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>E2</p>
							</c>
							<c ca="left">
								<p>0.1624</p>
							</c>
							<c ca="left">
								<p>0.4690</p>
							</c>
							<c ca="left">
								<p>1.4686</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>E3</p>
							</c>
							<c ca="left">
								<p>0.2309</p>
							</c>
							<c ca="left">
								<p>0.3326</p>
							</c>
							<c ca="left">
								<p>1.9216</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M1</p>
							</c>
							<c ca="left">
								<p>1.8129</p>
							</c>
							<c ca="left">
								<p>0.3199</p>
							</c>
							<c ca="left">
								<p>2.1799</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M2</p>
							</c>
							<c ca="left">
								<p>1.9477</p>
							</c>
							<c ca="left">
								<p>0.5149</p>
							</c>
							<c ca="left">
								<p>1.9044</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M3</p>
							</c>
							<c ca="left">
								<p>1.9579</p>
							</c>
							<c ca="left">
								<p>0.1662</p>
							</c>
							<c ca="left">
								<p>1.7259</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M4</p>
							</c>
							<c ca="left">
								<p>1.7544</p>
							</c>
							<c ca="left">
								<p>0.5062</p>
							</c>
							<c ca="left">
								<p>2.1131</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M5</p>
							</c>
							<c ca="left">
								<p>2.3692</p>
							</c>
							<c ca="left">
								<p>0.3601</p>
							</c>
							<c ca="left">
								<p>2.2258</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M6</p>
							</c>
							<c ca="left">
								<p>2.2981</p>
							</c>
							<c ca="left">
								<p>0.4278</p>
							</c>
							<c>
								<p/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M7</p>
							</c>
							<c ca="left">
								<p>1.9431</p>
							</c>
							<c ca="left">
								<p>0.3928</p>
							</c>
							<c ca="left">
								<p>1.6694</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M8</p>
							</c>
							<c ca="left">
								<p>2.2754</p>
							</c>
							<c ca="left">
								<p>0.2326</p>
							</c>
							<c ca="left">
								<p>1.6436</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M9</p>
							</c>
							<c ca="left">
								<p>1.8157</p>
							</c>
							<c ca="left">
								<p>0.5401</p>
							</c>
							<c ca="left">
								<p>2.2248</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M10</p>
							</c>
							<c ca="left">
								<p>2.1319</p>
							</c>
							<c ca="left">
								<p>0.3695</p>
							</c>
							<c ca="left">
								<p>1.8878</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M11</p>
							</c>
							<c ca="left">
								<p>2.1358</p>
							</c>
							<c ca="left">
								<p>0.3209</p>
							</c>
							<c>
								<p/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M12</p>
							</c>
							<c ca="left">
								<p>1.5835</p>
							</c>
							<c ca="left">
								<p>0.3525</p>
							</c>
							<c ca="left">
								<p>1.6608</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M13</p>
							</c>
							<c ca="left">
								<p>1.5503</p>
							</c>
							<c ca="left">
								<p>0.4099</p>
							</c>
							<c ca="left">
								<p>1.9772</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M14</p>
							</c>
							<c ca="left">
								<p>1.9661</p>
							</c>
							<c ca="left">
								<p>0.2701</p>
							</c>
							<c ca="left">
								<p>1.2742</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>M15</p>
							</c>
							<c ca="left">
								<p>1.4746</p>
							</c>
							<c ca="left">
								<p>0.4667</p>
							</c>
							<c ca="left">
								<p>1.9047</p>
							</c>
						</r>
					</tblbdy>
					<tblfn>
						<p>Amino-terminal and carboxy-terminal indicate the upstream and downstream flanking regions, respectively. The upstream flanking regions of extension motifs E1, E2 and E3 are the carboxy-terminal ending fragments of the Myb R3 repeat, which is highly conserved; their amino-terminal dN values are therefore low. The carboxy-terminal flanking dN values for motifs M6 and M11 could not be calculated because these motifs lie close to the carboxyl terminus.</p>
					</tblfn>
				</tbl>
			</sec>
			<sec>
				<st>
					<p>Specificity of motifs to Myb genes</p>
				</st>
				<p>We wished to determine whether the detected motifs are present specifically in Myb genes and absent from non-Myb genes. We therefore used the protein motif sequences as queries in Blastp searches of the Swiss-Prot database. For motifs of 15 amino acids or fewer, the homologous hits spanning 85% of the query motif length are all Myb-domain-containing proteins. For motifs longer than 15 amino acids, the homologous hits spanning 70% of the query motif length all contain Myb domains. The search results are given in Additional data file 4.</p>
				<p>We obtained similar results in EST searches. When translated into protein, all 14 ESTs detected in the extension motif searches also contain a Myb domain. Interestingly, we detected more ESTs with E1 than with the other extension motifs, most probably because E1 is present in more Myb genes than the other motifs (Figure <figr fid="F1">1</figr>). The search results are given in Additional data file 5.</p>
				<p>Comparing the two sets of search results, we found that not all carboxy-terminal motifs detected homologous ESTs. This could be due to the low expression levels of some Myb genes, such that their EST sequences are not yet available. In some cases, the ESTs detected did not contain Myb domains; because the carboxy-terminal motifs are located some distance downstream of the Myb domains, these ESTs are probably too short to reach the Myb domains. Nevertheless, alignment of these ESTs with known Myb genes showed high identity not only in the motif sequence but also over considerable lengths of the flanking regions, suggesting that such ESTs are very likely derived from Myb genes.</p>
				<p>We also checked each carboxy-terminal motif against the 258 Myb proteins and found that the motifs are subgroup specific. For example, motif 1 from subgroup G03 was not detected in other subgroups.</p>
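				<p>The length-dependent criterion applied to the Blastp hits can be sketched in Python as follows. The sketch assumes that a hit "with 85% of the query motif length" means an aligned length of at least that fraction of the motif length; this reading, and the function names min_coverage and hit_is_counted, are assumptions and not part of the original analysis.</p>

```python
def min_coverage(motif_length):
    """Required fraction of the query motif that a hit must span."""
    # Motifs of 15 residues or fewer: 85%; longer motifs: 70% (values from the text).
    return 0.70 if motif_length > 15 else 0.85


def hit_is_counted(motif_length, aligned_length):
    """Decide whether a Blastp hit is long enough to be counted.

    Assumption: coverage is aligned_length divided by motif_length and must
    reach the threshold returned by min_coverage.
    """
    if motif_length == 0:
        raise ValueError("motif_length must be positive")
    return aligned_length / motif_length >= min_coverage(motif_length)


# A 10-residue motif needs 9 aligned residues; a 23-residue motif needs 17.
short_ok = hit_is_counted(10, 9)
long_ok = hit_is_counted(23, 17)
```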
			</sec>
			<sec>
				<st>
					<p>Identification of regulatory elements in noncoding regions</p>
				</st>
				<p>In addition to the carboxy-terminal motifs detected in the predicted Myb proteins, we wanted to test whether any conserved DNA sequence motifs could be identified among the Myb gene subgroups. We applied motif-searching tools to detect conserved regulatory elements in the promoter region plus 5' UTR of the Myb genes and in intron regions. In contrast to the carboxy-terminal coding regions, no conserved DNA sequence motifs were identified in the Myb gene noncoding regions. This could be because the Myb genes clustered in each subgroup are probably not orthologs or paralogs. An exception is subgroup N14 (Figure <figr fid="F1">1</figr>), which contains the maize <it>p1 </it>and <it>p2 </it>genes and their orthologs/paralogs from sorghum and rice: here a highly conserved arrangement of TATA-box, transcription start site sequences and 5' UTR CA-box was found (data not shown). No significant regulatory elements were detected in the noncoding regions of other Myb genes. It should be noted, however, that segments of intron sequence closer to flanking exons are significantly more conserved than interior intron sequence. It has been reported that this level of intron sequence conservation may have a functional role in gene regulation <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>. Our results suggest that it will be difficult to identify regulatory motifs in noncoding regions directly using only existing computational techniques. The chance of identifying regulatory elements should be higher among orthologs/paralogs. The identification of co-regulated genes by microarray analysis may also assist in the identification of common regulatory elements.</p>
			</sec>
		</sec>
		<sec>
			<st>
				<p>Conclusions</p>
			</st>
			<p>The expansion of Myb genes in plants makes the Myb family one of the largest families of transcription factors known to date. However, the specific roles of Myb genes in regulating plant traits are still unclear. Here, we used overall sequence similarity to cluster Myb genes from <it>Arabidopsis </it>and rice into 42 subgroups. The subgroup designations were well supported by sequence similarity and exon-intron structure. In one subgroup, significant complementarity to a specific miRNA was also observed. Furthermore, we found that the splicing sites and the phase of introns are conserved in Myb domains within the same subgroup, but differ between subgroups. The phylogenetic topology of splicing patterns suggested that Myb domains may originally have been compact in size, and that introns were inserted and remained in place during evolution. Computational searches were used to identify conserved carboxy-terminal motifs present in the different subgroups. These motifs appear to be specific characteristics of the Myb subgroups. In contrast to the carboxy-terminal motifs specifically present in Myb genes, no conserved regulatory elements were identified in the noncoding regions.</p>
		</sec>
		<sec>
			<st>
				<p>Materials and methods</p>
			</st>
			<sec>
				<st>
					<p>Myb proteins used in the analysis</p>
				</st>
				<p>At the initiation of our project, the international rice genome sequencing project was not finished. Two finished rice genomes - one by Monsanto, the other by Syngenta (<it>Oryza sativa </it>L. ssp. <it>japonica</it>) - were not available to the public. Only the rice (<it>indica</it>) genome sequenced by Beijing Genomics Institute was publicly available. Sequence-quality assessments using sequence-tagged site (STS) markers, UniGene clusters and nonredundant cDNAs showed that 92% of the functional sequences that encode genes, together with their immediate regulatory elements, were present in the assembled sequences <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. Therefore, we chose subspecies <it>indica </it>rather than <it>japonica </it>for this study.</p>
				<p>Rice genome sequences (scaffold dataset) were obtained from Beijing Genomics Institute <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>. FGeneSH, which has been used successfully to predict genes in rice <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>, and GenScan were both run on the rice genomic sequences, and the two sets of predictions were combined to form the complement of rice proteins. We performed Blastp and HMMER <abbrgrp><abbr bid="B40">40</abbr></abbrgrp> searches to identify Myb genes in this rice protein dataset. For Blastp, we used a set of Myb R2R3 domains as query sequences. For HMMER, we used the Myb profile from Pfam. We parsed and combined the results of both searches, and obtained the final complement of rice Myb proteins after manual inspection of each sequence to confirm the identification of <it>bona fide </it>Myb genes. In the end, 85 typical Myb genes with complete R2R3 domains (one R0R1R2R3 and 84 R2R3) and 28 partial Myb genes were detected in the rice genome. Partial Myb genes contain a segment similar to one Myb repeat or to part of a repeat. The sequences of rice Myb genes are listed in Additional data file 6.</p>
				<p>The complement of <it>Arabidopsis </it>proteins from GenBank was used to identify Myb proteins with complete R2R3 domains. The same methods as above were applied. We obtained 130 typical Myb proteins containing complete R2R3 domains (one R0R1R2R3 protein, five R1R2R3 proteins and 124 R2R3 proteins) and 11 partial Myb proteins. The results are consistent with previous findings <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>.</p>
				<p>To collect reference information on Myb gene functions, we ran a Blastp search against the nonredundant dataset in GenBank. The search yielded 43 plant Myb proteins with complete R2R3 domains; for most of these, some experimental information regarding functions or expression patterns had been deposited by individual researchers.</p>
			</sec>
			<sec>
				<st>
					<p>Construction of phylogeny and subgroup designations</p>
				</st>
				<p>For phylogenetic analysis, the above 258 Myb proteins (130 <it>Arabidopsis</it>, 85 rice and 43 from various other plants) with complete R2R3 domains were included. The sequences were aligned using ClustalX (version 1.81). The phylogenetic tree was constructed by the neighbor-joining method using MEGA version 2.0 <abbrgrp><abbr bid="B41">41</abbr></abbrgrp>, with pairwise gap deletion and the Poisson distance. Bootstrapping (1,000 replicates) was performed to evaluate the degree of support for a particular grouping in the neighbor-joining analysis. To enable the identification of motifs in the carboxy-terminal regions within each subgroup, we did not employ complete gap deletion, as this tends to exclude the contribution of carboxy-terminal residues because of their high divergence. The p-distance is the simplest genetic distance calculation and can be highly biased, so it was not used. In addition, attempts to construct the phylogeny from the carboxy-terminal regions alone were unsuccessful because of their high divergence. Therefore, we used the complete Myb proteins in clustering.</p>
				<p>Three trees were constructed with the above settings, then taxa were classified into subgroups based on the topology of the phylogeny. Tree I used the 43 landmark Myb proteins with 130 <it>Arabidopsis </it>Myb proteins. The clustering result is consistent with the previous report <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>; that is, the taxa that were clustered as subgroups in Stracke <it>et al</it>.'s findings <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> are located within a subgroup in tree I. Tree II replaced the 130 <it>Arabidopsis </it>Myb proteins in tree I with 85 rice Myb proteins. Tree III used the total 258 Myb proteins. We found that the clustering result of Myb proteins from <it>Arabidopsis</it>, rice and the landmark Myb proteins was consistent among all three trees. Therefore, we used tree III as the representative in this study.</p>
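				<p>The distance setting used for the neighbor-joining trees can be sketched in Python as follows. The sketch assumes the standard Poisson correction d = -ln(1 - p), where p is the proportion of differing residues among the aligned columns in which neither sequence has a gap (pairwise gap deletion). The function names are hypothetical and this is not the MEGA implementation.</p>

```python
import math


def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing residues, skipping columns with a gap in either sequence."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    different = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # pairwise gap deletion: drop this column for this pair only
        compared += 1
        if a != b:
            different += 1
    if compared == 0:
        raise ValueError("no ungapped columns shared by the two sequences")
    return different / compared


def poisson_distance(seq_a, seq_b, gap="-"):
    """Poisson-corrected amino-acid distance, d = -ln(1 - p)."""
    p = p_distance(seq_a, seq_b, gap)
    if p >= 1.0:
        raise ValueError("distance is undefined when every compared site differs")
    return -math.log(1.0 - p)


# Hypothetical aligned fragments: five ungapped columns, one difference (p = 0.2).
d = poisson_distance("ACDE-G", "ACDFHG")
```

				<p>Unlike the p-distance, the corrected value grows without bound as p approaches 1, which allows for multiple substitutions at the same site.</p>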
			</sec>
			<sec>
				<st>
					<p>Motif identification</p>
				</st>
				<p>Within each subgroup, motifs were detected using MEME <abbrgrp><abbr bid="B42">42</abbr></abbrgrp> with the following parameter settings: distribution of motifs, zero or one per sequence; maximum number of motifs to find, 16; minimum motif width, 6; maximum motif width, 117 (in order to identify long R2R3 domains); minimum number of sites for each motif, the number of sequences (that is, the motif must be present in all members of the same subgroup). All other options were left at their default values. Only motifs with an E-value &lt;= 1e-10 were kept for further analysis.</p>
			</sec>
			<sec>
				<st>
					<p>Motif analysis: similarity scores and nonsynonymous (dN) substitution</p>
				</st>
				<p>To confirm the reliability of the 38 motif candidates identified by MEME, we used PlotSimilarity from the GCG package (Genetics Computer Group, Inc.) to calculate the similarity score of each motif plus its 10-residue flanking fragments (protein sequences). There were 33 motifs with values above the average score in the motif region and below the average score in the flanking regions, and these were tested further using dN values. The program YN00 from the PAML package <abbrgrp><abbr bid="B43">43</abbr></abbrgrp> was applied to analyze the conservation of each motif plus its flanking regions (coding DNA sequences). Because the frequency of synonymous substitution is too high to reveal conservation, the nonsynonymous substitution value was calculated instead. Low dN values indicate conservation, whereas high dN values indicate divergence. We detected 18 motifs with dN &lt;0.5 in the motif region and &gt;1 in flanking regions; these 18 motifs comprised 3 extension and 15 carboxy-terminal motifs.</p>
				<p>Originally, MEME identified 38 motif candidates with E-value &lt;= 1e-10. Five motifs were then removed by the similarity score test, and a further 15 were discarded by the nonsynonymous substitution test. This result suggests that the similarity score test is not sufficiently powerful to determine the reliability of motif candidates, and could safely be omitted in future analyses.</p>
			</sec>
			<sec>
				<st>
					<p>Specificity of motifs</p>
				</st>
				<p>To test whether the carboxy-terminal motifs are specific to Myb genes, motif sequences were used in homology searches of the Swiss-Prot database and the EST dataset from GenBank. The latter can also provide information on the expression pattern of Myb genes. Low-complexity filtering was turned off in both searches for optimal short-sequence searching. In addition, for the EST search, 10 downstream residues were appended to motifs of less than 15 residues, and the elongated sequence was used as the query. The corresponding Myb R2 repeats were used in a tblastn EST search as an internal positive control.</p>
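				<p>The construction of the EST queries can be sketched in Python as follows. The sketch assumes that the appended residues are taken from the protein in which the motif occurs, immediately downstream of the motif, and that fewer residues are appended when the motif lies near the carboxyl terminus. The function name build_est_query and the placeholder sequence are hypothetical.</p>

```python
def build_est_query(protein, motif_start, motif_length, extension=10, short_cutoff=15):
    """Return the query for the EST search.

    protein      -- full protein sequence containing the motif
    motif_start  -- 0-based index of the first motif residue
    motif_length -- number of residues in the motif
    Motifs shorter than short_cutoff are extended by up to `extension`
    downstream residues; longer motifs are used unchanged.
    """
    end = motif_start + motif_length
    if end > len(protein):
        raise ValueError("motif extends beyond the end of the protein")
    if short_cutoff > motif_length:
        # Short motif: append downstream residues, stopping at the protein end.
        end = min(len(protein), end + extension)
    return protein[motif_start:end]


# Placeholder 30-letter string standing in for a protein sequence.
placeholder = "ABCDEFGHIJKLMNOPQRSTUVWXYZABCD"
short_query = build_est_query(placeholder, 2, 5)   # 5-residue motif, extended to 15
long_query = build_est_query(placeholder, 0, 20)   # 20-residue motif, unchanged
```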
			</sec>
		</sec>
		<sec>
			<st>
				<p>Additional data files</p>
			</st>
			<p>The following additional files are available: Additional data file <supplr sid="s1">1</supplr> gives the mapping relations of subgroups; Additional data file <supplr sid="s2">2</supplr> gives the distribution of intron sizes in Myb genes; Additional data file <supplr sid="s3">3</supplr> gives the previously identified carboxy-terminal motifs not included in this study; Additional data file <supplr sid="s4">4</supplr> gives the Blastp search results for homologous Myb genes; Additional data file <supplr sid="s5">5</supplr> gives the homologous EST search results; Additional data file <supplr sid="s6">6</supplr> (a .FAS file) gives the sequences of all rice Myb genes; Additional data file <supplr sid="s7">7</supplr> is a tree of the relationship of 130 <it>Arabidopsis</it>, 85 rice Myb proteins and 43 'landmark' Myb proteins; Additional data file <supplr sid="s8">8</supplr> gives the intron-exon structure of all R2R3 domains; Additional data file <supplr sid="s9">9</supplr> gives similarity scores and average dN values for all motifs; Additional data file <supplr sid="s10">10</supplr> gives the distribution of pairwise dN values of motifs and flanking regions.</p>
			<suppl id="s1">
				<title>
					<p>Additional data file 1</p>
				</title>
				<caption>
					<p>The mapping relations of subgroups</p>
				</caption>
				<text>
					<p>The mapping relations of subgroups</p>
				</text>
				<file name="gb-2004-5-7-r46-s1.pdf">
					<p>Click here for additional data file</p>
				</file>
			</suppl>
			<suppl id="s2">
				<title>
					<p>Additional data file 2</p>
				</title>
				<caption>
					<p>The distribution of intron sizes in Myb genes</p>
				</caption>
				<text>
					<p>The distribution of intron sizes in Myb genes</p>
				</text>
				<file name="gb-2004-5-7-r46-s2.pdf">
					<p>Click here for additional data file</p>
				</file>
			</suppl>
			<suppl id="s3">
				<title>
					<p>Additional data file 3</p>
				</title>
				<caption>
					<p>The previously identified carboxy-terminal motifs not included in this study</p>
				</caption>
				<text>
					<p>The previously identified carboxy-terminal motifs not included in this study</p>
				</text>
				<file name="gb-2004-5-7-r46-s3.pdf">
					<p>Click here for additional data file</p>
				</file>
			</suppl>
			<suppl id="s4">
				<title>
					<p>Additional data file 4</p>
				</title>
				<caption>
					<p>The Blastp search results for homologous Myb genes</p>
				</caption>
				<text>
					<p>The Blastp search results for homologous Myb genes</p>
				</text>
				<file name="gb-2004-5-7-r46-s4.pdf">
					<p>Click here for additional data file</p>
				</file>
			</suppl>
			<suppl id="s5">
				<title>
					<p>Additional data file 5</p>
				</title>
				<caption>
					<p>The homologous EST search results</p>
				</caption>
				<text>
					<p>The homologous EST search results</p>
				</text>
				<file name="gb-2004-5-7-r46-s5.pdf">
					<p>Click here for additional data file</p>
				</file>
			</suppl>
			<suppl id="s6">
				<title>
					<p>Additional data file 6</p>
				</title>
				<caption>
					<p>The sequences of all rice Myb genes</p>
				</caption>
				<text>
					<p>The sequences of all rice Myb genes</p>
				</text>
				<file name="gb-2004-5-7-r46-s6.fas">
					<p>Click here for additional data file</p>
				</file>
			</suppl>
			<suppl id="s7">
				<title>
					<p>Additional data file 7</p>
				</title>
				<caption>
					<p>A tree of the relationship of 130 <it>Arabidopsis</it>, 85 rice Myb proteins and 43 'landmark' Myb proteins</p>
				</caption>
				<text>
					<p>A tree of the relationship of 130 <it>Arabidopsis</it>, 85 rice Myb proteins and 43 'landmark' Myb proteins</p>
				</text>
				<file name="gb-2004-5-7-r46-s7.doc">
					<p>Click here for additional data file</p>
				</file>
			</suppl>
			<suppl id="s8">
				<title>
					<p>Additional data file 8</p>
				</title>
				<caption>
					<p>The intron-exon structure of all R2R3 domains</p>
				</caption>
				<text>
					<p>The intron-exon structure of all R2R3 domains</p>
				</text>
				<file name="gb-2004-5-7-r46-s8.pdf">
					<p>Click here for additional data file</p>
				</file>
			</suppl>
			<suppl id="s9">
				<title>
					<p>Additional data file 9</p>
				</title>
				<caption>
					<p>Similarity scores and average dN values for all motifs</p>
				</caption>
				<text>
					<p>Similarity scores and average dN values for all motifs</p>
				</text>
				<file name="gb-2004-5-7-r46-s9.pdf">
					<p>Click here for additional data file</p>
				</file>
			</suppl>
			<suppl id="s10">
				<title>
					<p>Additional data file 10</p>
				</title>
				<caption>
					<p>The distribution of pairwise dN values of motifs and flanking regions</p>
				</caption>
				<text>
					<p>The distribution of pairwise dN values of motifs and flanking regions</p>
				</text>
				<file name="gb-2004-5-7-r46-s10.pdf">
					<p>Click here for additional data file</p>
				</file>
			</suppl>
		</sec>
	</bdy>
	<bm>
		<refgrp>
			<bibl id="B1">
				<title>
					<p>The R2R3-MYB gene family in <it>Arabidopsis thaliana</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Stracke</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Werber</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Weisshaar</snm>
						<fnm>B</fnm>
					</au>
				</aug>
				<source>Curr Opin Plant Biol</source>
				<pubdate>2001</pubdate>
				<volume>4</volume>
				<fpage>447</fpage>
				<lpage>456</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S1369-5266(00)00199-0</pubid>
						<pubid idtype="pmpid" link="fulltext">11597504</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B2">
				<title>
					<p><it>Arabidopsis </it>transcription factors: genome-wide comparative analysis among eukaryotes.</p>
				</title>
				<aug>
					<au>
						<snm>Riechmann</snm>
						<fnm>JL</fnm>
					</au>
					<au>
						<snm>Heard</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Martin</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Reuber</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Jiang</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Keddie</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Adam</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Pineda</snm>
						<fnm>O</fnm>
					</au>
					<au>
						<snm>Ratcliffe</snm>
						<fnm>OJ</fnm>
					</au>
					<au>
						<snm>Samaha</snm>
						<fnm>RR</fnm>
					</au>
					<etal/>
				</aug>
				<source>Science</source>
				<pubdate>2000</pubdate>
				<volume>290</volume>
				<fpage>2105</fpage>
				<lpage>2110</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.290.5499.2105</pubid>
						<pubid idtype="pmpid" link="fulltext">11118137</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B3">
				<title>
					<p>Maize R2R3 Myb genes: sequence analysis reveals amplification in the higher plants.</p>
				</title>
				<aug>
					<au>
						<snm>Rabinowicz</snm>
						<fnm>PD</fnm>
					</au>
					<au>
						<snm>Braun</snm>
						<fnm>EL</fnm>
					</au>
					<au>
						<snm>Wolfe</snm>
						<fnm>AD</fnm>
					</au>
					<au>
						<snm>Bowen</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Grotewold</snm>
						<fnm>E</fnm>
					</au>
				</aug>
				<source>Genetics</source>
				<pubdate>1999</pubdate>
				<volume>153</volume>
				<fpage>427</fpage>
				<lpage>444</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">10471724</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B4">
				<title>
					<p>Solution structure of a specific DNA complex of the Myb DNA-binding domain with cooperative recognition helices.</p>
				</title>
				<aug>
					<au>
						<snm>Ogata</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Morikawa</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Nakamura</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Sekikawa</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Inoue</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Kanai</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Sarai</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Ishii</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Nishimura</snm>
						<fnm>Y</fnm>
					</au>
				</aug>
				<source>Cell</source>
				<pubdate>1994</pubdate>
				<volume>79</volume>
				<fpage>639</fpage>
				<lpage>648</lpage>
				<xrefbib>
					<pubid idtype="pmpid">7954830</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B5">
				<title>
					<p>Nucleotide sequence of the retroviral leukemia gene v-<it>myb </it>and its cellular progenitor c-<it>myb</it>: the architecture of a transduced oncogene.</p>
				</title>
				<aug>
					<au>
						<snm>Klempnauer</snm>
						<fnm>KH</fnm>
					</au>
					<au>
						<snm>Gonda</snm>
						<fnm>TJ</fnm>
					</au>
					<au>
						<snm>Bishop</snm>
						<fnm>JM</fnm>
					</au>
				</aug>
				<source>Cell</source>
				<pubdate>1982</pubdate>
				<volume>31</volume>
				<fpage>453</fpage>
				<lpage>463</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/0092-8674(82)90138-6</pubid>
						<pubid idtype="pmpid">6297766</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B6">
				<title>
					<p>MYB transcription factors in plants.</p>
				</title>
				<aug>
					<au>
						<snm>Martin</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Paz-Ares</snm>
						<fnm>J</fnm>
					</au>
				</aug>
				<source>Trends Genet</source>
				<pubdate>1997</pubdate>
				<volume>13</volume>
				<fpage>67</fpage>
				<lpage>73</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0168-9525(96)10049-4</pubid>
						<pubid idtype="pmpid" link="fulltext">9055608</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B7">
				<title>
					<p>Molecular evolution of the Myb family of transcription factors: evidence for polyphyletic origin.</p>
				</title>
				<aug>
					<au>
						<snm>Rosinski</snm>
						<fnm>JA</fnm>
					</au>
					<au>
						<snm>Atchley</snm>
						<fnm>WR</fnm>
					</au>
				</aug>
				<source>J Mol Evol</source>
				<pubdate>1998</pubdate>
				<volume>46</volume>
				<fpage>74</fpage>
				<lpage>83</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">9419227</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B8">
				<title>
					<p>The MYB transcription factor family: from maize to <it>Arabidopsis</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Petroni</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Tonelli</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Paz-Ares</snm>
						<fnm>J</fnm>
					</au>
				</aug>
				<source>Maydica</source>
				<pubdate>2002</pubdate>
				<volume>47</volume>
				<fpage>213</fpage>
				<lpage>232</lpage>
			</bibl>
			<bibl id="B9">
				<title>
					<p>A draft sequence of the rice genome (<it>Oryza sativa </it>L. ssp. <it>indica</it>).</p>
				</title>
				<aug>
					<au>
						<snm>Yu</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Hu</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Wang</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Wong</snm>
						<fnm>GK</fnm>
					</au>
					<au>
						<snm>Li</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Liu</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Deng</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Dai</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Zhou</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Zhang</snm>
						<fnm>X</fnm>
					</au>
					<etal/>
				</aug>
				<source>Science</source>
				<pubdate>2002</pubdate>
				<volume>296</volume>
				<fpage>79</fpage>
				<lpage>92</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.1068037</pubid>
						<pubid idtype="pmpid" link="fulltext">11935017</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B10">
				<title>
					<p>Towards functional characterisation of the members of the R2R3-MYB gene family from <it>Arabidopsis thaliana</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Kranz</snm>
						<fnm>HD</fnm>
					</au>
					<au>
						<snm>Denekamp</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Greco</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Jin</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Leyva</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Meissner</snm>
						<fnm>RC</fnm>
					</au>
					<au>
						<snm>Petroni</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Urzainqui</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Bevan</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Martin</snm>
						<fnm>C</fnm>
					</au>
					<etal/>
				</aug>
				<source>Plant J</source>
				<pubdate>1998</pubdate>
				<volume>16</volume>
				<fpage>263</fpage>
				<lpage>276</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1046/j.1365-313x.1998.00278.x</pubid>
						<pubid idtype="pmpid" link="fulltext">9839469</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B11">
				<title>
					<p>Function search in a large transcription factor gene family in <it>Arabidopsis</it>: assessing the potential of reverse genetics to identify insertional mutations in R2R3 MYB genes.</p>
				</title>
				<aug>
					<au>
						<snm>Meissner</snm>
						<fnm>RC</fnm>
					</au>
					<au>
						<snm>Jin</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Cominelli</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Denekamp</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Fuertes</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Greco</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Kranz</snm>
						<fnm>HD</fnm>
					</au>
					<au>
						<snm>Penfield</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Petroni</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Urzainqui</snm>
						<fnm>A</fnm>
					</au>
					<etal/>
				</aug>
				<source>Plant Cell</source>
				<pubdate>1999</pubdate>
				<volume>11</volume>
				<fpage>1827</fpage>
				<lpage>1840</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1105/tpc.11.10.1827</pubid>
						<pubid idtype="pmpid" link="fulltext">10521515</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B12">
				<title>
					<p>Ordered origin of the typical two- and three-repeated Myb genes.</p>
				</title>
				<aug>
					<au>
						<snm>Jiang</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Gu</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Chopra</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Gu</snm>
						<fnm>X</fnm>
					</au>
					<au>
						<snm>Peterson</snm>
						<fnm>T</fnm>
					</au>
				</aug>
				<source>Gene</source>
				<pubdate>2004</pubdate>
				<volume>326</volume>
				<fpage>13</fpage>
				<lpage>22</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/j.gene.2003.09.049</pubid>
						<pubid idtype="pmpid" link="fulltext">14729259</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B13">
				<title>
					<p>Newly discovered plant c-<it>myb</it>-like genes rewrite the evolution of the plant myb gene family.</p>
				</title>
				<aug>
					<au>
						<snm>Braun</snm>
						<fnm>EL</fnm>
					</au>
					<au>
						<snm>Grotewold</snm>
						<fnm>E</fnm>
					</au>
				</aug>
				<source>Plant Physiol</source>
				<pubdate>1999</pubdate>
				<volume>121</volume>
				<fpage>21</fpage>
				<lpage>24</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1104/pp.121.1.21</pubid>
						<pubid idtype="pmpid" link="fulltext">10482656</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B14">
				<title>
					<p>Supplemental materials: index of /~czjiang/Myb</p>
				</title>
				<url>http://www.public.iastate.edu/~czjiang/Myb</url>
			</bibl>
			<bibl id="B15">
				<title>
					<p>Prediction of plant microRNA targets.</p>
				</title>
				<aug>
					<au>
						<snm>Rhoades</snm>
						<fnm>MW</fnm>
					</au>
					<au>
						<snm>Reinhart</snm>
						<fnm>BJ</fnm>
					</au>
					<au>
						<snm>Lim</snm>
						<fnm>LP</fnm>
					</au>
					<au>
						<snm>Burge</snm>
						<fnm>CB</fnm>
					</au>
					<au>
						<snm>Bartel</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Bartel</snm>
						<fnm>DP</fnm>
					</au>
				</aug>
				<source>Cell</source>
				<pubdate>2002</pubdate>
				<volume>110</volume>
				<fpage>513</fpage>
				<lpage>520</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0092-8674(02)00863-2</pubid>
						<pubid idtype="pmpid" link="fulltext">12202040</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B16">
				<title>
					<p>The exon theory of genes.</p>
				</title>
				<aug>
					<au>
						<snm>Gilbert</snm>
						<fnm>W</fnm>
					</au>
				</aug>
				<source>Cold Spring Harb Symp Quant Biol</source>
				<pubdate>1987</pubdate>
				<volume>52</volume>
				<fpage>901</fpage>
				<lpage>905</lpage>
				<xrefbib>
					<pubid idtype="pmpid">2456887</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B17">
				<title>
					<p>Introns in gene evolution.</p>
				</title>
				<aug>
					<au>
						<snm>Fedorova</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Fedorov</snm>
						<fnm>A</fnm>
					</au>
				</aug>
				<source>Genetica</source>
				<pubdate>2003</pubdate>
				<volume>118</volume>
				<fpage>123</fpage>
				<lpage>131</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1023/A:1024145407467</pubid>
						<pubid idtype="pmpid">12868603</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B18">
				<title>
					<p>Remarkable interkingdom conservation of intron positions and massive, lineage-specific intron loss and gain in eukaryotic evolution.</p>
				</title>
				<aug>
					<au>
						<snm>Rogozin</snm>
						<fnm>IB</fnm>
					</au>
					<au>
						<snm>Wolf</snm>
						<fnm>YI</fnm>
					</au>
					<au>
						<snm>Sorokin</snm>
						<fnm>AV</fnm>
					</au>
					<au>
						<snm>Mirkin</snm>
						<fnm>BG</fnm>
					</au>
					<au>
						<snm>Koonin</snm>
						<fnm>EV</fnm>
					</au>
				</aug>
				<source>Curr Biol</source>
				<pubdate>2003</pubdate>
				<volume>13</volume>
				<fpage>1512</fpage>
				<lpage>1517</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0960-9822(03)00558-X</pubid>
						<pubid idtype="pmpid" link="fulltext">12956953</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B19">
				<title>
					<p>Maximum likelihood methods reveal conservation of function among closely related kinesin families.</p>
				</title>
				<aug>
					<au>
						<snm>Lawrence</snm>
						<fnm>CJ</fnm>
					</au>
					<au>
						<snm>Malmberg</snm>
						<fnm>RL</fnm>
					</au>
					<au>
						<snm>Muszynski</snm>
						<fnm>MG</fnm>
					</au>
					<au>
						<snm>Dawe</snm>
						<fnm>RK</fnm>
					</au>
				</aug>
				<source>J Mol Evol</source>
				<pubdate>2002</pubdate>
				<volume>54</volume>
				<fpage>42</fpage>
				<lpage>53</lpage>
				<xrefbib>
					<pubid idtype="pmpid">11734897</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B20">
				<title>
					<p>The <it>Arabidopsis </it>basic/helix-loop-helix transcription factor family.</p>
				</title>
				<aug>
					<au>
						<snm>Toledo-Ortiz</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Huq</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Quail</snm>
						<fnm>PH</fnm>
					</au>
				</aug>
				<source>Plant Cell</source>
				<pubdate>2003</pubdate>
				<volume>15</volume>
				<fpage>1749</fpage>
				<lpage>1770</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1105/tpc.013839</pubid>
						<pubid idtype="pmpid" link="fulltext">12897250</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B21">
				<title>
					<p>Insertional mutagenesis of the maize <it>P </it>gene by intragenic transposition of <it>Ac</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Athma</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Grotewold</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Peterson</snm>
						<fnm>T</fnm>
					</au>
				</aug>
				<source>Genetics</source>
				<pubdate>1992</pubdate>
				<volume>131</volume>
				<fpage>199</fpage>
				<lpage>209</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">1317315</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B22">
				<title>
					<p>The <it>En/Spm </it>transposable element of <it>Zea mays </it>contains splice sites at the termini generating a novel intron from a <it>dSpm </it>element in the <it>A2 </it>gene.</p>
				</title>
				<aug>
					<au>
						<snm>Menssen</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Hohmann</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Martin</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Schnable</snm>
						<fnm>PS</fnm>
					</au>
					<au>
						<snm>Peterson</snm>
						<fnm>PA</fnm>
					</au>
					<au>
						<snm>Saedler</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Gierl</snm>
						<fnm>A</fnm>
					</au>
				</aug>
				<source>EMBO J</source>
				<pubdate>1990</pubdate>
				<volume>9</volume>
				<fpage>3051</fpage>
				<lpage>3057</lpage>
				<xrefbib>
					<pubid idtype="pmpid">2170105</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B23">
				<title>
					<p>The <it>PHANTASTICA </it>gene encodes a MYB transcription factor involved in growth and dorsoventrality of lateral organs in <it>Antirrhinum</it>.</p>
				</title>
				<aug>
					<au>
						<snm>Waites</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Selvadurai</snm>
						<fnm>HR</fnm>
					</au>
					<au>
						<snm>Oliver</snm>
						<fnm>IR</fnm>
					</au>
					<au>
						<snm>Hudson</snm>
						<fnm>A</fnm>
					</au>
				</aug>
				<source>Cell</source>
				<pubdate>1998</pubdate>
				<volume>93</volume>
				<fpage>779</fpage>
				<lpage>789</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0092-8674(00)81439-7</pubid>
						<pubid idtype="pmpid" link="fulltext">9630222</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B24">
				<title>
					<p>The maize <it>rough sheath2 </it>gene and leaf development programs in monocot and dicot plants.</p>
				</title>
				<aug>
					<au>
						<snm>Tsiantis</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Schneeberger</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Golz</snm>
						<fnm>JF</fnm>
					</au>
					<au>
						<snm>Freeling</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Langdale</snm>
						<fnm>JA</fnm>
					</au>
				</aug>
				<source>Science</source>
				<pubdate>1999</pubdate>
				<volume>284</volume>
				<fpage>154</fpage>
				<lpage>156</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.284.5411.154</pubid>
						<pubid idtype="pmpid" link="fulltext">10102817</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B25">
				<title>
					<p>A <it>myb </it>gene required for leaf trichome differentiation in <it>Arabidopsis </it>is expressed in stipules.</p>
				</title>
				<aug>
					<au>
						<snm>Oppenheimer</snm>
						<fnm>DG</fnm>
					</au>
					<au>
						<snm>Herman</snm>
						<fnm>PL</fnm>
					</au>
					<au>
						<snm>Sivakumaran</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Esch</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Marks</snm>
						<fnm>MD</fnm>
					</au>
				</aug>
				<source>Cell</source>
				<pubdate>1991</pubdate>
				<volume>67</volume>
				<fpage>483</fpage>
				<lpage>493</lpage>
				<xrefbib>
					<pubid idtype="pmpid">1934056</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B26">
				<title>
					<p>Flower colour intensity depends on specialized cell shape controlled by a Myb-related transcription factor.</p>
				</title>
				<aug>
					<au>
						<snm>Noda</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Glover</snm>
						<fnm>BJ</fnm>
					</au>
					<au>
						<snm>Linstead</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Martin</snm>
						<fnm>C</fnm>
					</au>
				</aug>
				<source>Nature</source>
				<pubdate>1994</pubdate>
				<volume>369</volume>
				<fpage>661</fpage>
				<lpage>664</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/369661a0</pubid>
						<pubid idtype="pmpid" link="fulltext">8208293</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B27">
				<title>
					<p>WEREWOLF, a MYB-related protein in <it>Arabidopsis</it>, is a position-dependent regulator of epidermal cell patterning.</p>
				</title>
				<aug>
					<au>
						<snm>Lee</snm>
						<fnm>MM</fnm>
					</au>
					<au>
						<snm>Schiefelbein</snm>
						<fnm>J</fnm>
					</au>
				</aug>
				<source>Cell</source>
				<pubdate>1999</pubdate>
				<volume>99</volume>
				<fpage>473</fpage>
				<lpage>483</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0092-8674(00)81536-6</pubid>
						<pubid idtype="pmpid" link="fulltext">10589676</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B28">
				<title>
					<p>The regulatory <it>c1 </it>locus of <it>Zea mays </it>encodes a protein with homology to <it>myb </it>proto-oncogene products and with structural similarities to transcriptional activators.</p>
				</title>
				<aug>
					<au>
						<snm>Paz-Ares</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Ghosal</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Wienand</snm>
						<fnm>U</fnm>
					</au>
					<au>
						<snm>Peterson</snm>
						<fnm>PA</fnm>
					</au>
					<au>
						<snm>Saedler</snm>
						<fnm>H</fnm>
					</au>
				</aug>
				<source>EMBO J</source>
				<pubdate>1987</pubdate>
				<volume>6</volume>
				<fpage>3553</fpage>
				<lpage>3558</lpage>
				<xrefbib>
					<pubid idtype="pmpid">3428265</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B29">
				<title>
					<p>The <it>Arabidopsis TT2 </it>gene encodes an R2R3 MYB domain protein that acts as a key determinant for proanthocyanidin accumulation in developing seed.</p>
				</title>
				<aug>
					<au>
						<snm>Nesi</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Jond</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Debeaujon</snm>
						<fnm>I</fnm>
					</au>
					<au>
						<snm>Caboche</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Lepiniec</snm>
						<fnm>L</fnm>
					</au>
				</aug>
				<source>Plant Cell</source>
				<pubdate>2001</pubdate>
				<volume>13</volume>
				<fpage>2099</fpage>
				<lpage>2114</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1105/tpc.13.9.2099</pubid>
						<pubid idtype="pmpid" link="fulltext">11549766</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B30">
				<title>
					<p>Molecular analysis of the <it>anthocyanin2 </it>gene of petunia and its role in the evolution of flower color.</p>
				</title>
				<aug>
					<au>
						<snm>Quattrocchio</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Wing</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>van der Woude</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Souer</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>de Vetten</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Mol</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Koes</snm>
						<fnm>R</fnm>
					</au>
				</aug>
				<source>Plant Cell</source>
				<pubdate>1999</pubdate>
				<volume>11</volume>
				<fpage>1433</fpage>
				<lpage>1444</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1105/tpc.11.8.1433</pubid>
						<pubid idtype="pmpid" link="fulltext">10449578</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B31">
				<title>
					<p>Alternatively spliced products of the maize P gene encode proteins with homology to the DNA-binding domain of myb-like transcription factors.</p>
				</title>
				<aug>
					<au>
						<snm>Grotewold</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Athma</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Peterson</snm>
						<fnm>T</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>1991</pubdate>
				<volume>88</volume>
				<fpage>4587</fpage>
				<lpage>4591</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">2052542</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B32">
				<title>
					<p>A segmental gene duplication generated differentially expressed myb-homologous genes in maize.</p>
				</title>
				<aug>
					<au>
						<snm>Zhang</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Chopra</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Peterson</snm>
						<fnm>T</fnm>
					</au>
				</aug>
				<source>Plant Cell</source>
				<pubdate>2000</pubdate>
				<volume>12</volume>
				<fpage>2311</fpage>
				<lpage>2322</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1105/tpc.12.12.2311</pubid>
						<pubid idtype="pmpid" link="fulltext">11148280</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B33">
				<title>
					<p>The strawberry FaMYB1 transcription factor suppresses anthocyanin and flavonol accumulation in transgenic tobacco.</p>
				</title>
				<aug>
					<au>
						<snm>Aharoni</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>De Vos</snm>
						<fnm>CH</fnm>
					</au>
					<au>
						<snm>Wein</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Sun</snm>
						<fnm>Z</fnm>
					</au>
					<au>
						<snm>Greco</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Kroon</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Mol</snm>
						<fnm>JN</fnm>
					</au>
					<au>
						<snm>O'Connell</snm>
						<fnm>AP</fnm>
					</au>
				</aug>
				<source>Plant J</source>
				<pubdate>2001</pubdate>
				<volume>28</volume>
				<fpage>319</fpage>
				<lpage>332</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1046/j.1365-313X.2001.01154.x</pubid>
						<pubid idtype="pmpid" link="fulltext">11722774</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B34">
				<title>
					<p>Activation tagging identifies a conserved MYB regulator of phenylpropanoid biosynthesis.</p>
				</title>
				<aug>
					<au>
						<snm>Borevitz</snm>
						<fnm>JO</fnm>
					</au>
					<au>
						<snm>Xia</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Blount</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Dixon</snm>
						<fnm>RA</fnm>
					</au>
					<au>
						<snm>Lamb</snm>
						<fnm>C</fnm>
					</au>
				</aug>
				<source>Plant Cell</source>
				<pubdate>2000</pubdate>
				<volume>12</volume>
				<fpage>2383</fpage>
				<lpage>2394</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1105/tpc.12.12.2383</pubid>
						<pubid idtype="pmpid" link="fulltext">11148285</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B35">
				<title>
					<p>Evidence that Myb-related CDC5 proteins are required for pre-mRNA splicing.</p>
				</title>
				<aug>
					<au>
						<snm>Burns</snm>
						<fnm>CG</fnm>
					</au>
					<au>
						<snm>Ohi</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Krainer</snm>
						<fnm>AR</fnm>
					</au>
					<au>
						<snm>Gould</snm>
						<fnm>KL</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>1999</pubdate>
				<volume>96</volume>
				<fpage>13789</fpage>
				<lpage>13794</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1073/pnas.96.24.13789</pubid>
						<pubid idtype="pmpid" link="fulltext">10570151</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B36">
				<title>
					<p>Solution structure of a DNA-binding unit of Myb: a helix-turn-helix-related motif with conserved tryptophans forming a hydrophobic core.</p>
				</title>
				<aug>
					<au>
						<snm>Ogata</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Hojo</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Aimoto</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Nakai</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Nakamura</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Sarai</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Ishii</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Nishimura</snm>
						<fnm>Y</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>1992</pubdate>
				<volume>89</volume>
				<fpage>6428</fpage>
				<lpage>6432</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">1631139</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B37">
				<title>
					<p>Recently duplicated maize R2R3 Myb genes provide evidence for distinct mechanisms of evolutionary divergence after duplication.</p>
				</title>
				<aug>
					<au>
						<snm>Dias</snm>
						<fnm>AP</fnm>
					</au>
					<au>
						<snm>Braun</snm>
						<fnm>EL</fnm>
					</au>
					<au>
						<snm>McMullen</snm>
						<fnm>MD</fnm>
					</au>
					<au>
						<snm>Grotewold</snm>
						<fnm>E</fnm>
					</au>
				</aug>
				<source>Plant Physiol</source>
				<pubdate>2003</pubdate>
				<volume>131</volume>
				<fpage>610</fpage>
				<lpage>620</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1104/pp.012047</pubid>
						<pubid idtype="pmpid" link="fulltext">12586885</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B38">
				<title>
					<p>High intron sequence conservation across three mammalian orders suggests functional constraints.</p>
				</title>
				<aug>
					<au>
						<snm>Hare</snm>
						<fnm>MP</fnm>
					</au>
					<au>
						<snm>Palumbi</snm>
						<fnm>SR</fnm>
					</au>
				</aug>
				<source>Mol Biol Evol</source>
				<pubdate>2003</pubdate>
				<volume>20</volume>
				<fpage>969</fpage>
				<lpage>978</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/molbev/msg111</pubid>
						<pubid idtype="pmpid" link="fulltext">12716984</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B39">
				<title>
					<p>Rice GD</p>
				</title>
				<url>http://btn.genomics.org.cn:8080/rice</url>
			</bibl>
			<bibl id="B40">
				<title>
					<p>Hidden Markov models.</p>
				</title>
				<aug>
					<au>
						<snm>Eddy</snm>
						<fnm>SR</fnm>
					</au>
				</aug>
				<source>Curr Opin Struct Biol</source>
				<pubdate>1996</pubdate>
				<volume>6</volume>
				<fpage>361</fpage>
				<lpage>365</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0959-440X(96)80056-X</pubid>
						<pubid idtype="pmpid">8804822</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B41">
				<title>
					<p>MEGA2: molecular evolutionary genetics analysis software.</p>
				</title>
				<aug>
					<au>
						<snm>Kumar</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Tamura</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Jakobsen</snm>
						<fnm>IB</fnm>
					</au>
					<au>
						<snm>Nei</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>2001</pubdate>
				<volume>17</volume>
				<fpage>1244</fpage>
				<lpage>1245</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/17.12.1244</pubid>
						<pubid idtype="pmpid" link="fulltext">11751241</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B42">
				<title>
					<p>Fitting a mixture model by expectation maximization to discover motifs in biopolymers.</p>
				</title>
				<aug>
					<au>
						<snm>Bailey</snm>
						<fnm>TL</fnm>
					</au>
					<au>
						<snm>Elkan</snm>
						<fnm>C</fnm>
					</au>
				</aug>
				<source>Proc Int Conf Intell Syst Mol Biol</source>
				<pubdate>1994</pubdate>
				<volume>2</volume>
				<fpage>28</fpage>
				<lpage>36</lpage>
				<xrefbib>
					<pubid idtype="pmpid">7584402</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B43">
				<title>
					<p>Estimating synonymous and nonsynonymous substitution rates under realistic evolutionary models.</p>
				</title>
				<aug>
					<au>
						<snm>Yang</snm>
						<fnm>Z</fnm>
					</au>
					<au>
						<snm>Nielsen</snm>
						<fnm>R</fnm>
					</au>
				</aug>
				<source>Mol Biol Evol</source>
				<pubdate>2000</pubdate>
				<volume>17</volume>
				<fpage>32</fpage>
				<lpage>43</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">10666704</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B44">
				<title>
					<p>Context sequences of translation initiation codon in plants.</p>
				</title>
				<aug>
					<au>
						<snm>Joshi</snm>
						<fnm>CP</fnm>
					</au>
					<au>
						<snm>Zhou</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Huang</snm>
						<fnm>X</fnm>
					</au>
					<au>
						<snm>Chiang</snm>
						<fnm>VL</fnm>
					</au>
				</aug>
				<source>Plant Mol Biol</source>
				<pubdate>1997</pubdate>
				<volume>35</volume>
				<fpage>993</fpage>
				<lpage>1001</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1023/A:1005816823636</pubid>
						<pubid idtype="pmpid" link="fulltext">9426620</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
		</refgrp>
	</bm>
</art>
