<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2229-8-124</ui>
   <ji>1471-2229</ji>
   <fm>
      <dochead>Research article</dochead>
      <bibl>
         <title>
            <p>A putative autonomous 20.5 kb-CACTA transposon insertion in an <it>F3'H </it>allele identifies a new CACTA transposon subfamily in <it>Glycine max</it></p>
         </title>
         <aug>
            <au id="A1">
               <snm>Zabala</snm>
               <fnm>Gracia</fnm>
               <insr iid="I1"/>
               <email>g-zabala@uiuc.edu</email>
            </au>
            <au id="A2" ca="yes">
               <snm>Vodkin</snm>
               <fnm>Lila</fnm>
               <insr iid="I1"/>
               <email>l-vodkin@uiuc.edu</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Crop Sciences, University of Illinois, Urbana, Illinois 61801, USA</p>
            </ins>
         </insg>
         <source>BMC Plant Biology</source>
         <issn>1471-2229</issn>
         <pubdate>2008</pubdate>
         <volume>8</volume>
         <issue>1</issue>
         <fpage>124</fpage>
         <url>http://www.biomedcentral.com/1471-2229/8/124</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">19055742</pubid>
               <pubid idtype="doi">10.1186/1471-2229-8-124</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>25</day>
               <month>7</month>
               <year>2008</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>02</day>
               <month>12</month>
               <year>2008</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>02</day>
               <month>12</month>
               <year>2008</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2008</year>
         <collab>Zabala and Vodkin; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>The molecular organization of very few genetically defined CACTA transposon systems has been characterized as thoroughly as that of, for example, <it>Spm/En </it>in maize, <it>Tam1 </it>of <it>Antirrhinum majus</it>, <it>Candystripe1 </it>(<it>Cs1</it>) from <it>Sorghum bicolor </it>and <it>CAC1 </it>from <it>Arabidopsis thaliana</it>. To date, only defective deletion derivatives of CACTA elements have been described for soybean, an economically important plant species whose genome sequence will be completed in 2008.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>We identified a 20.5 kb insertion in a soybean flavonoid 3'-hydroxylase (<it>F3'H</it>) gene representing the <it>t* </it>allele (stable gray trichome color) whose origin traces to a single mutable chimeric plant displaying both tawny and gray trichomes. This 20.5 kb insertion has the molecular structure of a putative autonomous transposon of the CACTA family, designated <it>Tgmt*</it>. It encodes a large gene that was expressed in two sister isolines (<it>T* </it>and <it>t</it><sup><it>m</it></sup>) of the stable gray line (<it>t*</it>) from which <it>Tgmt* </it>was isolated. RT-PCR-derived cDNAs uncovered the structure of a large precursor mRNA as well as alternatively spliced transcripts reminiscent of the TNPA-mRNA generated by the <it>En-1 </it>element of maize but without sequence similarity to the maize TNPA. The larger mRNA encodes a transposase with tnp2 and TNP1-transposase family domains. Because the two soybean lines expressing <it>Tgmt* </it>were derived from the same mutable chimeric plant that created the stable gray trichome <it>t* </it>allele line from which the element was isolated, <it>Tgmt* </it>has the potential to be an autonomous element that was rapidly inactivated in the stable gray trichome <it>t* </it>line. Comparison of <it>Tgmt* </it>to previously described <it>Tgm </it>elements demonstrated that two subtypes of CACTA transposon families exist in soybean based on divergence of their characteristic subterminal repeated motifs and their transposases. In addition, we report the sequence and annotation of a BAC clone containing the <it>F3'H </it>gene (<it>T </it>locus) which was interrupted by the novel <it>Tgmt* </it>element in the gray trichome allele <it>t*</it>.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>The molecular characterization of a 20.5 kb insertion in the flavonoid 3'-hydroxylase (<it>F3'H</it>) gene of a soybean gray pubescence allele (<it>t*</it>) identified the structure of a CACTA transposon designated <it>Tgmt*</it>. Besides the terminal inverted repeats and subterminal repeated motifs, <it>Tgmt* </it>encoded a large gene with two putative functions that are required for excision and transposition of a CACTA element: a transposase and the DNA binding protein known to associate with the subterminal repeated motifs. The degree of dissimilarity of the <it>Tgmt* </it>transposase and subterminal repeated motifs from those of previously characterized defective CACTA elements (<it>Tgm1-7</it>) was evidence of the existence of two subfamilies of CACTA transposons in soybean, an observation not previously reported in other plants. In addition, our analyses of a genetically active and potentially autonomous element shed light on the complete structure of a soybean element that is useful for annotation of the repetitive fraction of the soybean genome sequence and may prove useful for transposon tagging or transposon display experiments in different genetic lines.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>The CACTA transposons are characterized by terminal inverted repeats (TIRs), target-site duplications of conserved length and transposition through a DNA intermediate. One of the best genetically characterized transposon systems is the <it>Suppressor-Mutator </it>(<it>Spm</it>), also named <it>Enhancer </it>(<it>En</it>), system of <it>Z. mays </it><abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr></abbrgrp>. It consists of two components. The <it>Spm/En </it>element is autonomous with regard to excision, transposition and integration, whereas the defective <it>Spm </it>(<it>dSpm</it>) or Inhibitor (<it>I</it>) elements are non-autonomous, generally internal deletion derivatives that transpose only if an active <it>Spm/En </it>is present elsewhere in the genome to provide the <it>trans</it>-active Suppressor and Mutator functions of the element <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>.</p>
         <p>The <it>Spm/En </it>element was molecularly characterized <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr></abbrgrp> and shown to be 8.3 kb with the capacity to encode at least two alternatively spliced transcripts of 5.8 kb and 2.5 kb. The 2.5 kb transcript was 100 times more abundant than the 5.8 kb mRNA and contained 11 exons of the <it>tnpA </it>gene. In addition to the multiple exons of the <it>tnpA </it>gene, the 5.8 kb <it>tnpD </it>transcript (molecularly known as the <it>tnp2 </it>region in similar elements in other species) contains the large orf from the Intron-1 region at the 5'-end of the element. The larger <it>tnpD </it>transcript (with a <it>tnp2 </it>domain) encodes a putative transposase required for the excision/integration in a transposition event whereas <it>tnpA </it>codes for a putative protein of 67 kDa that functions as a DNA binding protein <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr></abbrgrp>. The TNPA protein recognizes the subterminal repeats, which act as <it>cis</it>-determinants for the excision and transposition of the <it>Spm/En </it>element <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>. The Suppressor function of <it>Spm/En </it>was also assigned to the TNPA protein <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. The binding of TNPA to subterminal domains of defective <it>Spm/En </it>elements has been shown to create a steric block to the advancement of RNA polymerase, resulting in prematurely terminated transcripts <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. Both proteins, TNPD and TNPA, are absolutely required for transposition of <it>Spm/En </it><abbrgrp><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr></abbrgrp>. The proteins may be provided <it>in trans </it>from active autonomous (intact) elements to allow the transposition of non-autonomous (deletion derivative) elements, as long as the latter retain the <it>cis</it>-acting target sequences. In addition, TNPA can act as a positive and negative regulator of its own activity <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>.</p>
         <p>A 17 kb <it>Tam1 </it>CACTA-element of <it>Antirrhinum majus </it>also encodes two transcripts with an organization somewhat parallel to that of the <it>Spm/En </it>element, an abundant 2.5 kb and a low abundance 5-kb mRNA <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>. The 2.5 kb transcript of <it>Tam1 </it>(known as <it>Tnp1</it>) is also pasted together from distant exons as is the <it>tnpA </it>mRNA of <it>Spm/En</it>; however, they have no sequence similarity. On the other hand, the larger transcripts of both elements, which contain sequences of the open reading frames (<it>Tnp2 </it>and <it>TnpD</it>), share significant (45%) amino acid homology <abbrgrp><abbr bid="B14">14</abbr></abbrgrp> in the orf region that is found in the large Intron-1 of the maize <it>Spm/En </it>element. Similarly, previously reported non-autonomous CACTA-<it>Tgm</it>-elements of soybean feature a portion of an open reading frame 39% similar to the <it>tnpD </it>gene of <it>Spm/En </it><abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. The conservation of these sequences suggests a common function of these gene products. It has been proposed but not proven that TNPD may interact with the conserved 13-bp TIRs and cleave the transposon termini <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B14">14</abbr></abbrgrp>.</p>
         <p>Other active or full length CACTA elements have been isolated and characterized to some extent from several other species, including <it>Tpn1 </it>from <it>Ipomoea nil </it><abbrgrp><abbr bid="B16">16</abbr></abbrgrp>, <it>Tdc1 </it>from <it>Daucus carota </it><abbrgrp><abbr bid="B17">17</abbr></abbrgrp>, <it>PsI </it>from <it>Petunia hybrida </it><abbrgrp><abbr bid="B18">18</abbr></abbrgrp>, <it>Candystripe1 </it>(<it>Cs1</it>) from <it>Sorghum bicolor </it><abbrgrp><abbr bid="B19">19</abbr></abbrgrp>, <it>Atenspm1 </it>(or <it>CAC1</it>) from <it>Arabidopsis thaliana </it><abbrgrp><abbr bid="B20">20</abbr></abbrgrp>, <it>Tpo1 </it>from <it>Lolium perenne </it><abbrgrp><abbr bid="B21">21</abbr></abbrgrp> and <it>TamRSA1 </it>of <it>A. majus </it><abbrgrp><abbr bid="B22">22</abbr></abbrgrp>.</p>
         <p>Only deletion derivative, non-autonomous CACTA family members (<it>Tgm1-Tgm7</it>) and <it>Tgm-Express1 </it>have been studied to date from soybean <abbrgrp><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr><abbr bid="B15">15</abbr><abbr bid="B25">25</abbr></abbrgrp>. We recently identified the <it>T </it>locus in soybean, which controls the tawny color of the trichome hairs on the plant leaves and stems, as encoding a flavonoid 3'-hydroxylase gene (<it>F3'H</it>) <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>. The stable gray <it>t* </it>allele, which was derived from a single mutable plant displaying both tawny and gray trichomes on the same plant, appeared to contain a large insertion that was not amplifiable by standard PCR conditions. Using PCR conditions designed to amplify large fragments, we here report finding a large 20.5 kb CACTA transposon (designated <it>Tgmt*</it>) inserted in Intron-1 of the <it>F3'H </it>gene, thus defining the novel <it>t* </it>mutant allele. <it>Tgmt* </it>had imperfect 13 bp CACTA TIRs and asymmetric subterminal repeats, and created a three base duplication upon insertion.</p>
         <p>Comparison of the <it>Tgmt* </it>subterminal repeats to those of other reported CACTA elements of soybean and other plant species revealed wide divergence of the repeated motifs and the existence of two distinct CACTA transposon families in soybean. At the same time, all elements' subterminal repeated motifs had some regions of similarity. In addition, <it>Tgmt* </it>encodes a 14.6 kb complex Gene-1 with a transposase and a presumed DNA-binding function. A large precursor transcript encoded a transposase with tnp2 and TNP1 domains. Smaller alternatively spliced transcripts (2 &#8211; 2.6 kb) had little similarity to other CACTA transposon mosaic transcripts such as tnpA of maize and they likely encode a distinct type of binding protein that recognizes the soybean <it>Tgmt* </it>subtype of subterminal inverted repeats. Thus, in addition to broadening our understanding of the CACTA transposon's integral components, these findings point to <it>Tgmt* </it>as a potentially autonomous CACTA element isolated from soybean. Significantly, the data also clearly demonstrate diversification of the subterminal repeats of CACTA families within a species such as soybean (<it>Tgmt* </it>versus <it>Tgm1</it>), as well as between species as has been noted before for <it>Tgm1</it>, <it>En-1</it>, and <it>Tam1</it>, for example. This observation is substantiated not only by the divergence of the subterminal inverted repeats between the <it>Tgmt* </it>and <it>Tgm1-Tgm7 </it>type elements, but also by the divergence (76% similarity over a 1 kb segment) between the tnp2-containing orf of <it>Tgmt* </it>and the previously characterized partial tnp2-containing orf in the <it>Tgm5 </it>element. Thus, our data point to considerable within-species divergence of the CACTA element families. Whether these subtypes of element families originated before or after the origin of soybean as a distinct species is unknown. 
Bioinformatic analyses of these elements in many whole genome sequences within the legume family and other dicotyledonous plants in future years may shed light on how rapidly the DNA in the subterminal repeats and the tnpA-like exons that putatively recognize these subterminal repeats may coevolve and diversify both within and between species.</p>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <sec>
            <st>
               <p>Isolation of a large DNA insertion in the novel mutant allele with gray trichomes (<it>t</it>*)</p>
            </st>
            <p>In a previous study in which the <it>T </it>locus of soybean was identified as the flavonoid 3'-hydroxylase gene (<it>F3'H</it>), we reported the partial genomic sequences of three <it>F3'H </it>alleles of the soybean lines, Williams 43 (<it>T</it>, tawny), XB22A (<it>T*</it>, tawny) and its gray trichome isoline, 37609 (<it>t*</it>). We postulated that a large intron existed in the 5' region of the <it>F3'H </it>genes and that it prevented PCR amplification of the contiguous full length <it>F3'H </it>genomic sequence using standard methods. In order to obtain the full sequence of soybean <it>F3'H</it>, we isolated a BAC clone from cultivar PI437654 with the <it>T/T </it>genotype. The 64 kb of BAC sequence (to be detailed later) revealed a 4.3 kb intron in the <it>F3'H </it>gene. This information was utilized to synthesize primers intended to amplify the promoter and Intron-1 genomic regions of <it>F3'H </it>alleles in XB22A, 37609, 37643, 33745 and Williams 43 soybean lines (Table <tblr tid="T1">1</tblr>). The ~2 kb promoter region of <it>F3'H </it>in all those lines was identical although there were several differences from the PI437654 cultivar. The 4.1 kb Intron-1 sequence was also determined for the <it>F3'H </it>alleles of Williams 43 (<it>T</it>, tawny trichomes), XB22A (<it>T*</it>, tawny trichomes) and the stable gray-trichome isoline, 37609 (<it>t*</it>). The three <it>F3'H </it>full length genomic sequences: Williams 43 (<it>T</it>: 8,680 bp), XB22A (<it>T*</it>: 8,678 bp) and its isoline 37609 (<it>t*</it>: 29,609 bp) were entered in GenBank with accession numbers <ext-link ext-link-type="gen" ext-link-id="EU190438">EU190438</ext-link>, <ext-link ext-link-type="gen" ext-link-id="EU190439">EU190439</ext-link> and <ext-link ext-link-type="gen" ext-link-id="EU190440">EU190440</ext-link>, respectively.</p>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Soybean cultivars and derived isolines: Phenotypes and genotypes</p>
               </caption>
               <tblbdy cols="3">
                  <r>
                     <c ca="center">
                        <p>Cultivars</p>
                     </c>
                     <c ca="center">
                        <p>Genotype</p>
                     </c>
                     <c ca="left">
                        <p>Phenotype</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="3">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>XB22A</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>T*</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Stable tawny trichomes</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>37609</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>t*</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Stable gray trichomes</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>37643</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>t*</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Stable gray trichomes</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>33745</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>t</it>
                           <sup>
                              <it>m</it>
                           </sup>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Mutable tawny/gray trichomes</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Williams 43</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>T</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Tawny trichomes</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <p>The dramatic size difference between the tawny (<it>T </it>and <it>T*</it>) and gray (<it>t*</it>) trichome <it>F3'H </it>alleles above was found to be due to a 20.5 kb insertion in Intron-1 in the 37609 mutant isoline (Figure <figr fid="F1">1</figr>). PCR amplification of the 20.5 kb insertion required the use of higher melting temperature oligonucleotide primers (Z37F and IN663R) and optimized conditions for <it>LA Taq </it>polymerase (TaKaRa Bio Inc.) as described in the Methods section. As shown in Figure <figr fid="F1">1</figr>, a large fragment of ~23 kb was amplified from genomic DNA from both of the stable gray trichome lines 37609 and 37643 as well as from the mutable tawny/gray trichome line, 33745 (Table <tblr tid="T1">1</tblr>). The same primers also amplified a smaller 2.6 kb fragment from the mutable 33745 line and from the lines with the dominant alleles, Williams 43 and XB22A. The 2.6 kb fragment is the portion of Intron-1 lying between the two primers used in the PCR reactions. The insertion maps ~2.4 kb to the left of the IN663R primer (Figure <figr fid="F2">2</figr>).</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Amplification of a large DNA insertion in Intron-1 of <it>F3'H </it>alleles</p>
               </caption>
               <text>
                  <p><b>Amplification of a large DNA insertion in Intron-1 of <it>F3'H </it>alleles</b>. DNA fragments amplified from genomic DNAs isolated from 5 soybean lines with different <it>T </it>locus alleles were visualized in this EtBr-stained gel: Williams 43 (<it>T</it>), XB22A (<it>T*</it>), 33745 (<it>t</it><sup><it>m</it></sup>), 37643 (<it>t*</it>) and 37609 (<it>t*</it>). The ~23 kb fragment contains a large insertion in Intron-1 of the <it>t* </it>allele. The 2.6 kb fragment is the Intron-1 DNA portion between the primers chosen for the PCR reactions (Z37F and IN663R; see Additional file <supplr sid="S5">5</supplr>: Alignment of cDNA sequences of clones 43&#8211;53 to <it>Tgmt* </it>genomic sequence, and Methods). M is a 1 kb lambda DNA ladder. M2 is a <it>HindIII </it>lambda DNA marker.</p>
               </text>
               <graphic file="1471-2229-8-124-1"/>
            </fig>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Schematic of the <it>Tgmt* </it>transposon insertion in the <it>F3'H </it>gene of the <it>t* </it>allele</p>
               </caption>
               <text>
                  <p><b>Schematic of the <it>Tgmt* </it>transposon insertion in the <it>F3'H </it>gene of the <it>t* </it>allele</b>. A 20.5 kb transposon (<it>Tgmt*</it>) inserted near the center of Intron-1 is shown together with the location of primers (Z37F, IN663R) used to amplify it. The approximate location of a third primer (TGM23R) used with Z37F to amplify a 17 kb portion of <it>Tgmt* </it>is also indicated. The three primers are represented by solid black arrows. The parallel lines at the end of <it>Tgmt*</it>, Intron-1 and promoter's 5'-end indicate size reduction to fit the size scale of the drawing. The CACTA ends and the ATA target site duplication are also shown.</p>
               </text>
               <graphic file="1471-2229-8-124-2"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Cloning and sequencing a 20.5 kb CACTA transposon, <it>Tgmt</it>*</p>
            </st>
            <p>The ~23 kb PCR fragment was purified from a 1% SeaPlaque agarose gel with the aid of GELase agarose gel-digesting preparation (Epicentre Biotechnologies), cut with <it>HindIII </it>and cloned in a pGEM vector. Fortunately, two of the resulting clones that hybridized to a labeled probe prepared with an aliquot of the GELase-purified 23 kb fragment, 199 bp and 3,620 bp in size, turned out to be the insertion's borders, each containing portions of intron sequence, a CACTA inverted repeat and the adjacent three base target duplication (ATA).</p>
            <p>A 37-mer reverse primer (TGM23R) designed to the insertion's right border found in the 3.6 kb clone was paired with the Z37F primer upstream of the insertion's left border to amplify a 17 kb DNA fragment (Figure <figr fid="F2">2</figr>). An attempt at cloning the 17 kb fragment into the pJAZZ-OC vector using the BigEasy v2.0 linear cloning kit from Lucigen Corp. failed to yield the fragment in one piece, but many partial clones ranging in size from 1 to 7 kb were recovered. Five different clones were sequenced and assembled to reveal a 20,544 bp CACTA transposable element that was named <it>Tgmt* </it>(Figure <figr fid="F2">2</figr>). This sequence was entered in GenBank as part of the <it>t* </it>allele genomic sequence from the 37609 line (Acc. No. EU190440). Previous studies of soybean clones containing CACTA ends analyzed seven distinct sequences (<it>Tgm1-Tgm7</it>) ranging in size from 1.6 to 12 kb that were believed to be portions of deletion derivatives of a larger active element (Rhodes and Vodkin, 1985 and 1988). Only the <it>Tgm4 </it>and <it>Tgm5 </it>sequences had a 1 kb segment with 39% similarity to the ORF1 of the <it>tnpD </it>gene of the maize <it>Spm/En </it>transposable element. The larger <it>Tgmt* </it>element we have cloned in this report has the capacity to encode all known functions required of an active and autonomous CACTA element, as will be described later.</p>
         </sec>
         <sec>
            <st>
               <p>Molecular structure of <it>Tgmt</it>*: terminal and subterminal repeats</p>
            </st>
            <p>Sequence analyses revealed features that are conserved among transposable elements of the <it>Spm/En </it>family. <it>Tgmt* </it>possesses nearly identical 13 bp CACTA inverted repeats with three reciprocal mismatches and features a three-base target site duplication, ATA. It also contains asymmetric, reiterated direct and inverted sequence motifs in the subterminal regions capable of forming 12 stem-loop structures in the right border and two in the left border (Figure <figr fid="F3">3</figr>). The largest stem-loop structures were formed by a direct and inverted repeated sequence motif 17 bp long. This sequence motif could be divided into two sub-motifs, a 7 bp motif (TTGGCAG) present in all 14 stem-loop structures and a 10 bp motif (AATCTTACAG) that was missing or incomplete in some of the stem-loops. See an alignment of all direct stem repeated sequences (read from 3'-end to 5'-end) in Figure <figr fid="F4">4</figr>.</p>
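            <p>As an illustration only, and not the procedure used in this study, the following minimal Python sketch shows how the two sub-motifs can be located in both direct and inverted (reverse-complement) orientation within a sequence. The function names and the toy sequence are hypothetical; only the motif strings TTGGCAG and AATCTTACAG are taken from the text.</p>
            <p>
```python
# Illustrative sketch: locate a subterminal sub-motif in direct and inverted
# orientation. Function names and the toy sequence are hypothetical; only the
# two motif strings come from the text.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif(seq, motif):
    """Return sorted (position, orientation) pairs for every occurrence.

    Orientation is "direct" for the motif itself and "inverted" for its
    reverse complement. Positions are 0-based on the strand given in seq.
    """
    hits = []
    patterns = (("direct", motif), ("inverted", reverse_complement(motif)))
    for label, pattern in patterns:
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, label))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


SUB_MOTIF_7 = "TTGGCAG"      # 7 bp sub-motif named in the text
SUB_MOTIF_10 = "AATCTTACAG"  # 10 bp sub-motif named in the text

# Toy sequence (an assumption, not Tgmt* sequence): one direct copy of the
# 7 bp sub-motif, a spacer, then its inverted copy, as in a stem-loop.
toy = "AAA" + SUB_MOTIF_7 + "CCCC" + reverse_complement(SUB_MOTIF_7) + "AAA"

print(find_motif(toy, SUB_MOTIF_7))
print(find_motif(toy, SUB_MOTIF_10))
```
            </p>
            <p>A direct copy followed closely by an inverted copy, as in the toy sequence, is the arrangement that could pair to form a stem-loop; applying such a scan to a real border sequence would additionally require tolerance for the mismatches noted in Figure <figr fid="F3">3</figr>.</p>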
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Regions of subterminal repeats in <it>Tgmt* </it>left and right borders</p>
               </caption>
               <text>
                  <p><b>Regions of subterminal repeats in <it>Tgmt* </it>left and right borders</b>. There are 4 subterminal sequence repeats in the left border in a stretch of 159 bp from the 5'-TIR. The right border contains 24 subterminal sequence repeats in a stretch of 732 bp from the 3'-TIR. These sequence repeats are capable of forming two palindromes in the left border and 12 in the right border. Red arrows mark the repeats shown in Figure 4. Red letters indicate base pair mismatches. An additional short stem loop in each border (marked with small solid black arrows) could be folded, but the repeated sequences that form them are not related to the 28 repeats forming the other 14 stem loops. Thin black arrows point to the relative location of <it>Tgmt* </it>Gene-1.</p>
               </text>
               <graphic file="1471-2229-8-124-3"/>
            </fig>
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>Alignment of <it>Tgmt* </it>subterminal direct repeats</p>
               </caption>
               <text>
                  <p><b>Alignment of <it>Tgmt* </it>subterminal direct repeats</b>. The repeated sequences vary in length and they have been organized in this figure starting from the 3'end of the transposon right border. Each direct repeat was read from the 3'-end to the 5'-end. A consensus sequence motif was deduced and is shown boxed. This larger motif can be subdivided into two smaller sub-motifs. One is TTGGCAG that is present in all repeats (shown in red letters in the consensus sequence motif). The second, AATCTTACAG, is more divergent and absent in its entirety in two of the repeats (shown in blue letters in the consensus sequence motif).</p>
               </text>
               <graphic file="1471-2229-8-124-4"/>
            </fig>
            <p>A similarly complex pattern of stem-loop structures was described for <it>Tgm1 </it><abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. In this instance the length of the repeated sequence motif (17&#8211;20 bp) was more uniform (see Additional file <supplr sid="S1">1</supplr>: Alignment of <it>Tgm1 </it>subterminal direct repeats). The consensus sequence of these direct repeats from <it>Tgm1 </it>was compared to those deduced from <it>Tgmt*</it>, <it>Tgm-Express1 </it>(<abbrgrp><abbr bid="B25">25</abbr></abbrgrp>; see Additional file <supplr sid="S2">2</supplr>: Alignment of <it>Tgm-Express1 </it>subterminal direct repeats) and <it>Tgmw4m </it>(GenBank Acc. No.: <ext-link ext-link-type="gen" ext-link-id="EU068464">EU068464</ext-link>; see Additional file <supplr sid="S3">3</supplr>: Alignment of <it>Tgmw4m </it>subterminal direct repeats). Surprisingly, the <it>Tgm1 </it>sequence motif diverged from all three others in 6 nucleotide positions highlighted in red in Figure <figr fid="F5">5</figr>. The subterminal direct repeat motif of <it>En-1 </it>is smaller (12 bp) and, except for a TCTTA domain, its sequence is different from the motif of the soybean <it>Tgmt* </it>transposon family (<it>Tgm-Express1 </it>and <it>Tgmw4m</it>) analyzed. The extended sequence divergence of <it>Tgm1 </it>is intriguing and suggests the existence of a second CACTA transposon family in soybean. This variability of subterminal repeats extends to other motifs reported. Figure <figr fid="F5">5</figr> shows an alignment of subterminal repeats from 12 transposon sequences emphasizing both their divergence and slight commonality. They all seem to have a GC rich and an AT rich portion. Whether or not these are recognition sites for the DNA-binding proteins remains to be determined.</p>
            <suppl id="S1">
               <title>
                  <p>Additional file 1</p>
               </title>
               <text>
                  <p><b>Alignment of <it>Tgm1 </it>subterminal direct repeats</b>. The repeated sequences have been organized in this figure starting from the 3'end of the transposon right border. Each direct repeat was read from the 3'-end to the 5'-end. A consensus sequence motif was deduced and is shown boxed.</p>
               </text>
               <file name="1471-2229-8-124-S1.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S2">
               <title>
                  <p>Additional file 2</p>
               </title>
               <text>
                  <p><b>Alignment of <it>Tgm-Express1 </it>subterminal direct repeats</b>. The repeated sequences are organized in this figure starting from the 3' end of the transposon right border. Each direct repeat was read from the 3' end to the 5' end. A consensus sequence motif was deduced and is shown boxed.</p>
               </text>
               <file name="1471-2229-8-124-S2.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S3">
               <title>
                  <p>Additional file 3</p>
               </title>
               <text>
                  <p><b>Alignment of <it>Tgmw4m </it>subterminal direct repeats</b>. The repeated sequences are organized in this figure starting from the 3' end of the transposon right border. Each direct repeat was read from the 3' end to the 5' end. A consensus sequence motif was deduced and is shown boxed.</p>
               </text>
               <file name="1471-2229-8-124-S3.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>Alignment of multiple CACTA transposon subterminal repeated motifs</p>
               </caption>
               <text>
                  <p><b>Alignment of multiple CACTA transposon subterminal repeated motifs</b>. The consensus sequences of subterminal repeated motifs were aligned to determine the degree of divergence. The <it>Glycine max </it>(<it>Tgm</it>) transposon motifs fell into two subfamilies: <it>Tgm1 </it>had 6 base changes (shown in red) compared to the consensus motif of the <it>Tgmt* </it>family (<it>Tgmt*, Tgmw4m </it>and <it>Tgm-Express1</it>). The consensus sequences of the <it>Tgmt* </it>family members were identical. These <it>Tgm </it>motifs were compared to the subterminal repeats of other published CACTA transposons. Two rectangles were used to demarcate the portions of the motifs with some similarity: a GC-rich portion (left) and an AT-rich portion (right). These sequence motifs are putative binding sites for TNPA-like proteins.</p>
               </text>
               <graphic file="1471-2229-8-124-5"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Molecular prediction of <it>Tgmt</it>* open reading frames</p>
            </st>
            <p>The web-based Softberry FGENESH gene prediction algorithm <url>http://www.softberry.com</url>, using <it>Medicago truncatula </it>as the training set, found one gene with 21 exons on the negative DNA strand relative to the <it>F3'H </it>gene (<it>t*</it>) into which <it>Tgmt* </it>inserted. Thus, all sequence analyses, descriptions and depictions of <it>Tgmt* </it>Gene-1 refer to the reverse complement. The first exon of the predicted Gene-1 starts at base pair 6,123 and its polyA is at 19,332 bp (Table <tblr tid="T2">2</tblr>). This gene could be transcribed into a 6,585 bp mRNA that would be translated into a 2,194 amino acid (aa) product. A BLAST search of the derived aa sequence against the NCBI non-redundant protein sequence database found similarity to many putative CACTA transposon proteins. The highest similarities (59% and 53%) were to <it>Vitis vinifera </it>hypothetical proteins CAN82870 and CAN66891 over lengths of 1,060 and 1,400 aa, respectively. The <it>Tgmt* </it>predicted protein had two putative conserved domains: a transposase family tnp2 domain (pfam02992; E value: 1e-80) located between aa 265 and 493, and a TNP1/EN/SPM transposase domain (pfam03017; E value: 0.006) located between aa 1,493 and 1,534. A search for similar domain architectures with NCBI CDART (Conserved Domain Architecture Retrieval Tool) found two sequences from <it>Oryza sativa </it>(Os03g0714800) <abbrgrp><abbr bid="B27">27</abbr></abbrgrp> to have the most closely related domain architectures. In contrast, the two <it>V. vinifera </it>hypothetical proteins with the highest similarity to the <it>Tgmt* </it>Gene-1 product had only one transposase domain, the tnp2 domain.</p>
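            <p>As a rough consistency check on the figures above, the following minimal Python sketch relates the 6,585 bp length to the 2,194 aa product and computes an exon length from inclusive coordinates, using the first predicted exon in Table 2. It assumes that the 6,585 bp figure is the coding sequence including a terminal stop codon, which the text does not state explicitly. The function names are hypothetical and are not part of the FGENESH output.</p>

```python
def protein_length(cds_bp, includes_stop=True):
    """Number of amino acids encoded by a coding sequence of cds_bp bases.

    Assumption: the coding sequence is in frame, and, when includes_stop
    is True, its last codon is a stop codon that adds no residue.
    """
    if cds_bp % 3 != 0:
        raise ValueError("coding length is not a multiple of 3")
    codons = cds_bp // 3
    return codons - 1 if includes_stop else codons


def exon_length(start, end):
    """Length in bp of an exon given 1-based inclusive coordinates."""
    return end - start + 1


# Values taken from the text and from the first predicted exon in Table 2.
print(protein_length(6585))     # 2195 codons, one of which is the stop codon
print(exon_length(6123, 9287))  # first FGENESH-predicted exon
```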
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Gene exons predicted by Softberry FGENESH and deduced from RT-PCR cDNA clones</p>
               </caption>
               <tblbdy cols="10">
                  <r>
                     <c cspan="5" ca="center">
                        <p>Softberry-FGENESH Prediction</p>
                     </c>
                     <c cspan="5" ca="center">
                        <p>RT-PCR cDNA cloned sequences</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5">
                        <hr/>
                     </c>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>#</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>Start</p>
                     </c>
                     <c ca="center">
                        <p>End</p>
                     </c>
                     <c ca="center">
                        <p>Length</p>
                     </c>
                     <c ca="center">
                        <p>#</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>Start</p>
                     </c>
                     <c ca="center">
                        <p>End</p>
                     </c>
                     <c ca="center">
                        <p>Length</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>TSS</p>
                     </c>
                     <c ca="center">
                        <p>4709</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>TSS</p>
                     </c>
                     <c ca="center">
                        <p>4708</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>CDS f</p>
                     </c>
                     <c ca="center">
                        <p>6123</p>
                     </c>
                     <c ca="center">
                        <p>9287</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>3165</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>CDS f</p>
                     </c>
                     <c ca="center">
                        <p>6143</p>
                     </c>
                     <c ca="center">
                        <p>9319</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>3177</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>9446</p>
                     </c>
                     <c ca="center">
                        <p>9753</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>306</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>9446</p>
                     </c>
                     <c ca="center">
                        <p>9753</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>308</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>9854</p>
                     </c>
                     <c ca="center">
                        <p>10066</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>213</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>9854</p>
                     </c>
                     <c ca="center">
                        <p>10066</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>213</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>10156</p>
                     </c>
                     <c ca="center">
                        <p>10419</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>264</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>10156</p>
                     </c>
                     <c ca="center">
                        <p>10419</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>264</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>10508</p>
                     </c>
                     <c ca="center">
                        <p>11200</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>693</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>10508</p>
                     </c>
                     <c ca="center">
                        <p>11200</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>693</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>11302</p>
                     </c>
                     <c ca="center">
                        <p>11555</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>252</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>11302</p>
                     </c>
                     <c ca="center">
                        <p>11555</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>254</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>11659</p>
                     </c>
                     <c ca="center">
                        <p>11826</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>165</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>11659</p>
                     </c>
                     <c ca="center">
                        <p>11826</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>168</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>8</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>11902</p>
                     </c>
                     <c ca="center">
                        <p>12006</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>102</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>11902</p>
                     </c>
                     <c ca="center">
                        <p>12006</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>105</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>12110</p>
                     </c>
                     <c ca="center">
                        <p>12182</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>72</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>12110</p>
                     </c>
                     <c ca="center">
                        <p>12182</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>73</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>12264</p>
                     </c>
                     <c ca="center">
                        <p>12359</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>96</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>12264</p>
                     </c>
                     <c ca="center">
                        <p>12359</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>96</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>11</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>12928</p>
                     </c>
                     <c ca="center">
                        <p>13028</p>
                     </c>
                     <c ca="center">
                        <p>99</p>
                     </c>
                     <c ca="center">
                        <p>11</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>12446</p>
                     </c>
                     <c ca="center">
                        <p>13028</p>
                     </c>
                     <c ca="center">
                        <p>583</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>13113</p>
                     </c>
                     <c ca="center">
                        <p>13153</p>
                     </c>
                     <c ca="center">
                        <p>41</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>13</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>13254</p>
                     </c>
                     <c ca="center">
                        <p>13331</p>
                     </c>
                     <c ca="center">
                        <p>78</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>13509</p>
                     </c>
                     <c ca="center">
                        <p>13671</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>162</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>14</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>13505</p>
                     </c>
                     <c ca="center">
                        <p>13671</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>167</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>13</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>13844</p>
                     </c>
                     <c ca="center">
                        <p>13948</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>105</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>15</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>13844</p>
                     </c>
                     <c ca="center">
                        <p>13948</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>105</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>16</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>14160</p>
                     </c>
                     <c ca="center">
                        <p>14216</p>
                     </c>
                     <c ca="center">
                        <p>57</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>14</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>14381</p>
                     </c>
                     <c ca="center">
                        <p>14461</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>81</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>17</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>14381</p>
                     </c>
                     <c ca="center">
                        <p>14461</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>81</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>18</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>14711</p>
                     </c>
                     <c ca="center">
                        <p>14745</p>
                     </c>
                     <c ca="center">
                        <p>35</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>15</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>15067</p>
                     </c>
                     <c ca="center">
                        <p>15156</p>
                     </c>
                     <c ca="center">
                        <p>90</p>
                     </c>
                     <c ca="center">
                        <p>19</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>15067</p>
                     </c>
                     <c ca="center">
                        <p>15505</p>
                     </c>
                     <c ca="center">
                        <p>439</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>20</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>15614</p>
                     </c>
                     <c ca="center">
                        <p>15659</p>
                     </c>
                     <c ca="center">
                        <p>46</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>16</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>16207</p>
                     </c>
                     <c ca="center">
                        <p>16482</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>276</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>21</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>16207</p>
                     </c>
                     <c ca="center">
                        <p>16482</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>276</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>17</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>16542</p>
                     </c>
                     <c ca="center">
                        <p>16573</p>
                     </c>
                     <c ca="center">
                        <p>30</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>18</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>17132</p>
                     </c>
                     <c ca="center">
                        <p>17314</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>180</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>22</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>17132</p>
                     </c>
                     <c ca="center">
                        <p>17314</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>183</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>19</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>17573</p>
                     </c>
                     <c ca="center">
                        <p>17681</p>
                     </c>
                     <c ca="center">
                        <p>108</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>20</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>18574</p>
                     </c>
                     <c ca="center">
                        <p>18621</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>48</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>23</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>18574</p>
                     </c>
                     <c ca="center">
                        <p>18621</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>48</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>21</p>
                     </c>
                     <c ca="center">
                        <p>CDS l</p>
                     </c>
                     <c ca="center">
                        <p>19016</p>
                     </c>
                     <c ca="center">
                        <p>19072</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>57</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>24</p>
                     </c>
                     <c ca="center">
                        <p>CDS i</p>
                     </c>
                     <c ca="center">
                        <p>19016</p>
                     </c>
                     <c ca="center">
                        <p>19079</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>64</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>PolA</p>
                     </c>
                     <c ca="center">
                        <p>19332</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>PolA</p>
                     </c>
                     <c ca="center">
                        <p>19335</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <p>The soybean deletion derivative <it>Tgm5 </it>transposon sequence (Acc. No. X13528) has 79% similarity to the <it>Tgmt* </it>sequence stretch from 6,572 to 7,571 bp and contains the entire tnp2 transposase domain. It was previously estimated that <it>Tgm5 </it>was 39% similar to the ORF1 of the <it>Zea mays En-1 </it>transposable element <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>.</p>
            <p>When we applied the Softberry FGeneSH web-based gene prediction algorithm, using <it>monocot plants </it>as the training set <url>http://www.softberry.com</url>, to the autonomous transposable element <it>En-1 </it>sequence <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>, it predicted one gene with 12 exons, an mRNA of 5,190 bp and a protein of 1,729 aa. A BLAST search of the non-redundant protein sequence database with the predicted 1,729 aa sequence found similarities to other maize transposable element proteins and identified two transposase domains: one located at 264&#8211;491 aa, [+]pfam02992, transposase_21 transposase family tnp2 (E value: 7e-98), and the second at 1,391&#8211;1,487 aa, [+]pfam03004, transposase_24 plant transposase ptta/En/Spm family (E value: 8e-05). Thus, it appears that the predicted genes from <it>Tgmt* </it>and <it>En-1 </it>are similar at the 5' end, with the tnp2 transposase domain at essentially the same location (264(265)-491(493) aa). In contrast, the aa sequences of the predicted genes diverge significantly beyond the first 780 aa, with two distinct, non-aligned transposase domains. Alignment of the <it>Tgmt* </it>and <it>En-1 </it>aa sequences using the "Multiple sequence alignment with hierarchical clustering" (<it>MultAlin</it>) program <abbrgrp><abbr bid="B28">28</abbr></abbrgrp> showed the similarities at the N-terminal end of the proteins, with the tnp2 domains aligned and 258 identical aa over the first 780 aa stretch (33% identity). The remaining 1,422 aa of the <it>Tgmt* </it>predicted protein had only 141 aa identities (10% identity) to the <it>En-1 </it>predicted product, and the TNP1/EN/SPM transposase domain of <it>Tgmt* </it>is different from, and did not align with, the ptta/En/Spm domain of <it>En-1</it>. This second stretch of the aligned aa sequences included large gaps that would account for the smaller size of the <it>En-1 </it>predicted gene and its product, which is 465 aa shorter (see Additional file <supplr sid="S4">4</supplr>: <it>Tgmt* </it>and <it>En-1 </it>amino acid sequence alignment).</p>
            <suppl id="S4">
               <title>
                  <p>Additional file 4</p>
               </title>
               <text>
                  <p><b><it>Tgmt* </it>and <it>En-1 </it>amino acid sequence alignment</b>. The transposase amino acid sequences predicted with Softberry FGeneSH and aligned with the <it>MultAlin </it>program (Corpet, 1988) have 258 aa identities (33%) in the 780 aa at the 5'-end. These regions of both transposases contain the tnp2 domains that map at the same location (highlighted in yellow). Beyond the 780 aa stretch the two proteins diverge considerably, with only 141 aa identities (10%), two different conserved domains, TNP1 in <it>Tgmt* </it>(highlighted in green) and ptta in <it>En-1 </it>(highlighted in blue), that map in different locations, and many gaps that reflect their differences in length.</p>
               </text>
               <file name="1471-2229-8-124-S4.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
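            <p>The identity percentages quoted for the <it>Tgmt* </it>and <it>En-1 </it>alignment follow directly from the reported counts. The short Python sketch below recomputes them as an illustration only; the function name is hypothetical, and dividing by the length of the compared stretch is an assumption about how the percentages were derived.</p>
            <p><![CDATA[
```python
def percent_identity(identical: int, stretch_length: int) -> float:
    """Identical residues as a percentage of the compared stretch."""
    if stretch_length == 0:
        raise ValueError("stretch_length must be positive")
    return 100.0 * identical / stretch_length


# Counts reported in the text for the Tgmt* / En-1 alignment.
n_terminal = percent_identity(258, 780)    # first 780 aa (tnp2 region)
remainder = percent_identity(141, 1422)    # remaining 1,422 aa of Tgmt*

print(round(n_terminal), round(remainder))
```
]]></p>
            <p>Under that assumption the two values round to the 33% and 10% given in the text.</p>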
         </sec>
         <sec>
            <st>
               <p>Expression of <it>Tgmt</it>* Gene-1 in mutable and stable trichome color soybean isolines</p>
            </st>
            <p>Based on the molecular analysis of <it>Tgmt* </it>DNA sequence with web-based gene prediction algorithms and the extent of the similarity of the predicted gene to the well characterized gene of the <it>En-1 </it>autonomous CACTA transposable element, it was presumed that <it>Tgmt* </it>Gene-1 should be expressed in soybean lines where the putative autonomous <it>Tgmt* </it>element could be active. An initial attempt at determining Gene-1 expression was a search for RNAs hybridizing to Gene-1 DNA probes on RNA blots with samples extracted from multiple tissues of four soybean lines varying at the <it>T </it>locus, Williams 43 (<it>T</it>), XB22A (<it>T*</it>), 37609 (<it>t*</it>) and 33745 (<it>t</it><sup><it>m</it></sup>) (Table <tblr tid="T1">1</tblr>). No clear, significant hybridization was detected in any of the RNA blots with any of the tested DNA probes which together covered the region of Gene-1 encoding Exons 1&#8211;7, with the tnp2 and TNP1 transposase domains (data not shown).</p>
            <p>These results suggested that if the transposase gene was expressed, it was expressed at very low levels, and thus we opted for the more sensitive reverse transcriptase polymerase chain reaction technique (RT-PCR) to assay Gene-1 expression. Figure <figr fid="F6">6</figr> shows the amplification products obtained using three primer pairs. Each pair amplified a different portion of the Gene-1 region with the tnp2 and TNP1 domains. The RT-PCR reactions shown were carried out with RNAs from three isolines that were derived from a single rogue soybean plant that had shown variegation in trichome color in a field breeding program: XB22A (<it>T*</it>), 37609 (<it>t*</it>) and 33745 (<it>t</it><sup><it>m</it></sup>) (Table <tblr tid="T1">1</tblr>). The negative controls (-) were reactions in which the cDNA synthesis step was carried out in the absence of SuperScript reverse transcriptase (see Methods). Interestingly, the 37609 line with the recessive allele (<it>t*</it>) from which <it>Tgmt* </it>was isolated did not appear to express Gene-1. In contrast, the XB22A (<it>T*</it>) and 33745 (<it>t</it><sup><it>m</it></sup>) isolines seem to have retained a putatively active <it>Tgmt* </it>expressing the transposase region of Gene-1. This differential pattern of RT-PCR amplification among the isolines was reproduced with several other primer sets covering other regions of Gene-1 (data not shown). Figure <figr fid="F7">7</figr> shows additional PCR reactions performed with cDNAs synthesized from RNA of the mutable isoline 33745 (<it>t</it><sup><it>m</it></sup>) that generated larger or more complex DNA fragment patterns. Lane 2 of Figure <figr fid="F7">7A</figr> shows the range of amplification products (~2&#8211;5.5 kb) obtained with two primers (1 and 7), one at either end of the predicted Gene-1. Many of the products amplified from cDNAs of this mutable line, 33745 (<it>t</it><sup><it>m</it></sup>), were cloned and sequenced, and the analysis and assembly of the cDNA sequences provided a clear picture of <it>Tgmt* </it>expression, which is shown in Figure <figr fid="F8">8</figr>. The locations of all primer pairs used in the PCR reactions whose results are shown in Figures <figr fid="F6">6</figr> and <figr fid="F7">7</figr> are marked in Figure <figr fid="F8">8</figr> with small arrows numbered 1&#8211;7. In certain instances the arrows were placed near the clones (Cl. 23, 14 and 47) to simplify the diagram.</p>
            <fig id="F6">
               <title>
                  <p>Figure 6</p>
               </title>
               <caption>
                  <p><it>Tgmt* </it>Gene-1 expression in isolines with differing <it>T</it>-locus alleles</p>
               </caption>
               <text>
                  <p><b><it>Tgmt* </it>Gene-1 expression in isolines with differing <it>T</it>-locus alleles</b>. DNA fragments amplified in RT-PCR reactions using three different sets of primers and RNAs from XB22A (<it>T*</it>), 37609 (<it>t*</it>) and 33745 (<it>t</it><sup><it>m</it></sup>) isolines were visualized in an EtBr stained agarose gel. The primer numbers are indicated at the bottom of the figure for each set of RT-PCR reactions. See Methods for sequences of DNA primers. The primer locations are indicated in Figure 8. The negative controls (-) were reactions in which the cDNA synthesis step was carried out in the absence of SuperScript reverse transcriptase.</p>
               </text>
               <graphic file="1471-2229-8-124-6"/>
            </fig>
            <fig id="F7">
               <title>
                  <p>Figure 7</p>
               </title>
               <caption>
                  <p><it>Tgmt* </it>Gene-1 expression in mutable 33745 (<it>t</it><sup><it>m</it></sup>) line</p>
               </caption>
               <text>
                  <p><b><it>Tgmt* </it>Gene-1 expression in mutable 33745 (<it>t</it><sup><it>m</it></sup>) line</b>. DNA fragments amplified in RT-PCR reactions using four different sets of primers and RNAs from the mutable 33745 (<it>t</it><sup><it>m</it></sup>) line were visualized in EtBr stained agarose gels. The primer numbers are indicated at the bottom of the figure for each set of RT-PCR reactions. A) 1 and 6; 1 and 7; 5 and 7. B) 3 and 6. See Methods for sequences of DNA primers. The negative controls (-) were reactions in which the cDNA synthesis step was carried out in the absence of SuperScript reverse transcriptase.</p>
               </text>
               <graphic file="1471-2229-8-124-7"/>
            </fig>
            <fig id="F8">
               <title>
                  <p>Figure 8</p>
               </title>
               <caption>
                  <p>Schematic representation of <it>Tgmt* </it>Gene-1 and cloned cDNAs</p>
               </caption>
               <text>
                  <p><b>Schematic representation of <it>Tgmt* </it>Gene-1 and cloned cDNAs</b>. The amplification products shown in Figures 6 and 7 were cloned and their sequences aligned to determine the exon-intron boundaries. The larger exons of Gene-1 are shown boxed and numbered on a solid line that represents the introns and the 5'- and 3'-ends of Gene-1. The smallest exons (-9, -12, -16, -18, -20, -23, -24) are represented by two parallel lines. The left (L) and right (R) borders are reduced in size to fit the size scale. L is 6,122 bp and R is 1,465 bp. Solid arrowheads represent the CACTA terminal inverted repeats. The two unfilled arrows above the diagram of the transposon represent ORF1 with the transposase tnp2 domain (tnp2) and ORF2 with the TNP1 transposase domain. Below the <it>Tgmt* </it>diagram are the cDNAs amplified with the different sets of primers shown in Figures 6 and 7. The numbered primers are sketched with small arrows underneath <it>Tgmt* </it>or near the cDNAs to simplify the figure. The splice site of the mosaic cDNAs amplified with primers 1 and 7 is indicated with an asterisk (*). The 1/9 portion of Exon-1 represents the 346 bp of Exon-1 that are spliced to the right border of Intron-10. Cl = clone number.</p>
               </text>
               <graphic file="1471-2229-8-124-8"/>
            </fig>
            <p>The sequences of 17 cDNA fragments were used to map all the exons and introns on the genomic <it>Tgmt* </it>sequence. The exon-intron boundaries obey the canonical GT-AG rule <abbrgrp><abbr bid="B29">29</abbr></abbrgrp> in the majority of the cDNA clones. See Additional files <supplr sid="S5">5</supplr> and <supplr sid="S6">6</supplr>, where exon sequences of cDNA clones have been aligned with the <it>Tgmt* </it>genomic sequence. The sizes and locations of the cDNA exons are listed in Table <tblr tid="T2">2</tblr>; the 17 that are almost identical in size and location to exons predicted by the Softberry FGeneSH program are highlighted in bold. Two of the predicted exons (No. 11 and 15) were portions of larger exons in the cloned cDNAs (No. 11 and 19). However, predicted Exons No. 17 and 19 towards the 3' end of the gene were not part of the cloned cDNAs, and since we recovered 11 different cDNA clones from this region of the gene, it can be assumed with confidence that predicted Exons-17 and -19 were errors of the FGeneSH program. A full length cDNA of 7,554 bp spanning the predicted Gene-1 could be assembled from all the exon sequences of the cDNA clones (see Additional file <supplr sid="S7">7</supplr>: <it>Tgmt* </it>Gene-1 precursor transcripts sequence). Most likely, transcripts of this size were synthesized in vivo, given the RT-PCR results obtained and the locations of the primers used on the gene sequence (Figures <figr fid="F7">7</figr> and <figr fid="F8">8</figr>).</p>
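            <p>The comparison between predicted and cloned exons can be sketched as an interval-overlap check. The Python snippet below is illustrative only: it uses the Table 2 coordinates for the 3' end of Gene-1, assumes that the left-hand columns of the table hold the FGeneSH predictions and the right-hand columns the exons found in cDNA clones, and uses hypothetical helper names. Exon length is computed with the inclusive convention (end - start + 1), which is also an assumption, so computed values need not match every tabulated size.</p>
            <p><![CDATA[
```python
# (start, end) coordinates on the Tgmt* sequence, copied from Table 2.
# Keys are exon numbers; the predicted/cDNA column assignment is assumed.
predicted = {17: (16542, 16573), 18: (17132, 17314), 19: (17573, 17681),
             20: (18574, 18621), 21: (19016, 19072)}
cdna = {21: (16207, 16482), 22: (17132, 17314),
        23: (18574, 18621), 24: (19016, 19079)}


def length(span):
    """Inclusive length of a (start, end) interval."""
    start, end = span
    return end - start + 1


def overlaps(a, b):
    """True if two inclusive intervals share at least one position."""
    return min(a[1], b[1]) >= max(a[0], b[0])


# Predicted exons with no overlapping exon among the cloned cDNAs.
unsupported = sorted(
    number for number, span in predicted.items()
    if not any(overlaps(span, other) for other in cdna.values())
)

print(unsupported)
print({number: length(span) for number, span in cdna.items()})
```
]]></p>
            <p>With these coordinates, the only predicted exons lacking any overlapping cDNA exon are No. 17 and No. 19.</p>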
            <suppl id="S5">
               <title>
                  <p>Additional file 5</p>
               </title>
               <text>
                  <p><b>Alignment of cDNA sequences of clones 43&#8211;53 to <it>Tgmt* </it>genomic sequence</b>. The sequences from RT-PCR derived cDNA clones (43, 44, 45, 47 and 53) were aligned to <it>Tgmt* </it>genomic sequence with <it>MultAlin </it>program (Corpet, 1988) to reveal the exon-intron junctions and the canonical GT-AG splice boundaries. The exon sequences appear in red and blue depending on the number of clones bearing the exon.</p>
               </text>
               <file name="1471-2229-8-124-S5.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S6">
               <title>
                  <p>Additional file 6</p>
               </title>
               <text>
                  <p><b>Alignment of Mosaic-cDNA sequences to <it>Tgmt* </it>genomic sequence</b>. The sequences from RT-PCR derived cDNA clones amplified with primers 1 and 7 (Figure <figr fid="F8">8</figr>) (No. 37, 38, 39, 42 and 56) were aligned to <it>Tgmt* </it>genomic sequence with <it>MultAlin </it>program (Corpet, 1988). The alignment revealed the splicing site (marked with an asterisk) that does not conform to the canonical GT-AG intron-exon splice boundaries (GA, following the 346 bp of Exon-1 and AT, prior to the 23 bp of Intron-10). The 23 bp of Intron-10 have been highlighted in yellow. The exon sequences appear in red and blue depending on the number of clones bearing the exon.</p>
               </text>
               <file name="1471-2229-8-124-S6.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S7">
               <title>
                  <p>Additional file 7</p>
               </title>
               <text>
                  <p><b><it>Tgmt* </it>Gene-1 precursor transcripts sequence</b>. The sequence of a precursor transcript (7,554 bp) expressed by Gene-1 was composed with all 24 exon sequences present in the cDNA clones analyzed (Figure <figr fid="F8">8</figr> and see Additional file <supplr sid="S5">5</supplr>: Alignment of cDNA sequences of clones 43&#8211;53 to <it>Tgmt* </it>genomic sequence).</p>
               </text>
               <file name="1471-2229-8-124-S7.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>Of most relevance were the sequences of the amplified cDNAs with primers at the ends of the gene (1 and 7) because they revealed splicing of the first 346 bp (1/9 portion) of Exon-1 with Exon-11 to generate transcripts 2 &#8211; 2.5 kb in size (Figure <figr fid="F8">8</figr>, Mosaic cDNAs, Cl. 37 &#8211; 39; see Additional file <supplr sid="S8">8</supplr>: <it>Tgmt* </it>Mosaic-transcript sequence). The splicing of the 346 bp of Exon-1, ending in ATAAT, did not occur at the intron-exon junction of Exon-11 but rather 23 bp into Intron-10 right border, adding the 23 bp of intron sequence in the synthesis of mosaic transcripts. The Intron-10 sequence up-stream of the splice site also ends on ATAAT, the same motif found up-stream of Exon-1 splice site (see Additional file <supplr sid="S9">9</supplr>: <it>Tgmt* </it>genomic sequence (20,544 bp)). Whether or not this sequence motif is involved in the recognition and splicing mechanism to create the mosaic gene is not known.</p>
            <suppl id="S8">
               <title>
                  <p>Additional file 8</p>
               </title>
               <text>
                  <p><b><it>Tgmt* </it>Mosaic-transcript sequence</b>. The sequence of the largest (2,572 bp) cDNA clone, No. 37, amplified with primers -1 and -7 contains all 14 exons of the Gene-1 3'-end and the 346 bp (1/9) portion of Exon-1. Purple letters represent Exon-1 bases, highlighted yellow are Intron-10 bases and orange letters are Exons 11&#8211;24 bases. Highlighted pink and green are primer 1 and 7 sequences respectively.</p>
               </text>
               <file name="1471-2229-8-124-S8.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S9">
               <title>
                  <p>Additional file 9</p>
               </title>
               <text>
                  <p><b><it>Tgmt* </it>genomic sequence (20,544 bp)</b>. The reverse complement of <it>Tgmt* </it>genomic sequence (20,544 bp) is shown. Exons are in orange letters. Purple letters are the portion of Exon-1 that splices with Intron-10 to form the mosaic-transcripts. The 23 bases of Intron-10 that are part of Mosaic transcripts are shown in yellow. Also in yellow are the bases of Intron-19 and -20 that are not spliced out in cDNA clone 47. The direct and reverse subterminal repeats are highlighted in yellow and green respectively. The 13 bp CACTA terminal inverted repeats are highlighted in blue. Sequences of Primers 1, 3 and 5 are highlighted in pink and those of primers 2, 4, 6 and 7 are highlighted in green.</p>
               </text>
               <file name="1471-2229-8-124-S9.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>The splicing of the many exons to form the mosaic transcripts as well as the full length precursor transcripts must be cumbersome, judging by the multiplicity of related cDNAs amplified with a given set of primers (Figure <figr fid="F7">7</figr> A, lanes 2 and 3; Figure <figr fid="F8">8</figr>, mosaic gene transcripts (Cl. 37&#8211;39) and transcripts amplified with primers 5 and 7, Cl. 47-43). Some of the cDNAs retained entire introns, such as Cl. 47, or intron portions, as in Cl. 44 and 53 (Figure <figr fid="F8">8</figr>). Exon-19 was spliced out in six of the cDNA clones sequenced (Cl. 44-43 and 38 and 39). This erratic splicing of exons was observed earlier in transcripts from another complex soybean CACTA transposon, <it>Tgm-Express1</it>, which contains multiple host-gene fragments <abbrgrp><abbr bid="B25">25</abbr><abbr bid="B30">30</abbr></abbrgrp>.</p>
            <p>Nevertheless, a putative full length mosaic cDNA of 2,572 bp was cloned and sequenced (Figure <figr fid="F8">8</figr> and see Additional file <supplr sid="S8">8</supplr>: <it>Tgmt* </it>Mosaic-transcript sequence). The splice site between Exon-1 (first 1/9 portion of the exon) and Exon-11 is indicated with an asterisk in Figure <figr fid="F8">8</figr>. This mosaic transcript may encode the DNA-binding protein that is required for excision of these CACTA transposons in soybean, in the same way that TNPA does for the <it>En-1 </it>element of maize. An NCBI/BLAST/blastx (version 2.2.18) search with this transcript found similarities to some predicted proteins from <it>V. vinifera</it>, <it>Arabidopsis </it>and <it>O. sativa </it>in the non-redundant protein database but not to maize proteins. This result supports previous observations that proteins that share little similarity, such as TNPA and TNP1, may have a similar function, that of binding to the subterminal repeats, regions that vary considerably among the different CACTA transposons <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>.</p>
            <p>In summary, <it>Tgmt* </it>expresses the predicted Gene-1 in XB22A (<it>T*</it>) and the mutable 33745 (<it>t</it><sup><it>m</it></sup>) lines, but not in the stable, gray trichome 37609 (<it>t*</it>) isoline from which it was cloned. A multiplicity of mRNAs were transcribed from Gene-1, and the putative largest transcript assembled was 7,554 bp in length. In addition, mosaic transcripts of different sizes were also synthesized, and the largest was 2,572 bp. The precursor mRNA codes for a transposase with tnp2 and TNP1 domains in Exons-1 and 5, respectively. The 2,572 bp mosaic transcript may encode a DNA-binding protein with a function similar to that of TNPA of the maize <it>En-1 </it>and TNP1 of <it>A. majus</it> <it>Tam3</it>, but with very limited amino acid sequence similarity.</p>
         </sec>
         <sec>
            <st>
               <p>Redundancy of <it>Tgmt</it>*-like sequences in the soybean genome</p>
            </st>
            <p>In order to determine the abundance of <it>Tgmt* </it>sequences in the soybean genome, an initial NCBI/BLAST/blastn search of the nucleotide collection (nr/nt) database optimized for highly similar sequences was run using either the entire 20,544 bp <it>Tgmt* </it>sequence, the 5,078 bp comprising Exon-1 through Exon-5 with the transposases ORF1 and ORF2, or the 7,778 bp starting at Exon-11 through Exon-24. The transposase portion of the element had 84% identity with 98% coverage to a <it>Glycine max </it>clone gmw1-45m6 (<ext-link ext-link-type="gen" ext-link-id="AC166742.25">AC166742.25</ext-link>), 75% identity over 58% coverage to two <it>Glycine tomentella </it>clones gtt1-62b6 (<ext-link ext-link-type="gen" ext-link-id="AC188785.18">AC188785.18</ext-link>) and gtt1-188p11 (<ext-link ext-link-type="gen" ext-link-id="AC183809.15">AC183809.15</ext-link>), 75% identity over 26% coverage to a <it>Glycine max </it>clone BAC GM_WBb080D (<ext-link ext-link-type="gen" ext-link-id="EF533700.1">EF533700.1</ext-link>), and finally, 76% identity over 17% coverage to the soybean <it>Tgm5 </it>transposable element (<ext-link ext-link-type="gen" ext-link-id="X13528.1">X13528.1</ext-link>). However, no highly similar sequences were found when the BLAST search was performed with the second half of the Gene-1 sequence (7,778 bp). 
Two additional sequences, one from <it>Glycine max </it>cultivar T322 dihydroflavonol-4-reductase 2 (<it>DFR2</it>) gene, Intron II and transposon <it>Tgmw4m </it>(<ext-link ext-link-type="gen" ext-link-id="EU068464.1">EU068464.1</ext-link>) and a second one from <it>Glycine max </it>cultivar T321 dihydroflavonol-4-reductase 2 (<it>DFR2</it>) gene, DFR2-w4-dp allele, disrupted promoter region and transposon <it>Tgmw4m </it>(<ext-link ext-link-type="gen" ext-link-id="EU068463.1">EU068463.1</ext-link>) had 99% identity to the right and left border of <it>Tgmt* </it>over 1,325 and 975 bases, respectively, indicating that these may be deletion derivatives from an element similar to <it>Tgmt*</it>.</p>
            <p>The search was extended to the 7&#215; draft sequence assembly Glyma0 from the Joint Genome Institute (JGI version 12/6/07, see Methods). The blastn search with the entire <it>Tgmt* </it>sequence (20.5 kb) found a high level of similarity to 52 scaffolds (E value = 0.0). To determine how many of the 52 scaffolds had the transposase and the exons of the mosaic transcript, two blastn searches with the 5,078 bp (Exon-1 through Exon-5) and the 7,778 bp (Exon-11 through Exon-24) were performed. The transposase exons found high similarity (E value = 0.0) to 45 scaffolds, while the exons of the mosaic transcript found high similarity (E value = 0.0) to 15 scaffolds. The scaffolds with high similarities to both halves of <it>Tgmt* </it>Gene-1 were Scaffold_4, _81, _24 and _5. Thus, it appears that the soybean genome of cultivar Williams has at least 4 regions with extensive sequence similarity to <it>Tgmt*</it>.</p>
            <p>We also examined the regions of similarity in the Glyma0 soybean genome (JGI 7&#215; draft sequence assembly) to the <it>Tgm5 </it>transposase sequence (Acc. No. X13528.1) and the equivalent sequence region of <it>Tgmt* </it>(932 bp) using the Phytozome <it>Glycine max </it>(v3.0) BLAST search <url>http://www.phytozome.net/soybean.php</url>. We found that the two transposons had similarities to two different sets of scaffolds. The <it>Tgm5 </it>transposase had the highest similarities to scaffolds_38, _20, _10, _229, _7, _12, _97 and _31, while the <it>Tgmt* </it>transposase had the highest similarities to scaffolds_24, _4, _81, _6, _5 and _15. This distinction is additional supporting evidence that <it>Tgm5 </it>and <it>Tgmt* </it>are elements representing two different CACTA families in soybean.</p>
            <p>A closer examination of the regions of similarity in each of the scaffolds revealed a large number of partial transposase sequence copies in each scaffold when the 932 bp segment carrying the tnp2 domain was used as query. For <it>Tgm5 </it>scaffolds_38, _20, _10, _229, _7, _12, _97 and _31 there were 50, 37, 47, 6, 31, 23, 6 and 31 copies, respectively. For <it>Tgmt* </it>scaffolds_24, _4, _81, _6, _5 and _15 there were 27, 55, 13, 53, 55 and 3 copies, respectively. However, most of these were relatively short regions not containing the element ends. Thus it appears that some regions of the 932 bp ORF that carries the conserved tnp2 domain are widely dispersed in the soybean genome. This observation of moderately high copy number is supported by the intense hybridization signals of varying sizes obtained when this region of the element is used as a probe on genomic DNA blots. Furthermore, this region also hybridized to many clones when used to screen genomic libraries (unpublished observations).</p>
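            <p>The scaffold lists and copy counts reported above can be tabulated to check that the two sets do not overlap and to total the partial copies. The Python sketch below is a simple illustration using only the numbers given in the text; the variable names are hypothetical and the totals are not values reported by the BLAST searches themselves.</p>
            <p><![CDATA[
```python
# Partial transposase copies per scaffold, as listed in the text
# for the 932 bp tnp2-containing query of each element.
tgm5_copies = {
    "scaffold_38": 50, "scaffold_20": 37, "scaffold_10": 47,
    "scaffold_229": 6, "scaffold_7": 31, "scaffold_12": 23,
    "scaffold_97": 6, "scaffold_31": 31,
}
tgmt_copies = {
    "scaffold_24": 27, "scaffold_4": 55, "scaffold_81": 13,
    "scaffold_6": 53, "scaffold_5": 55, "scaffold_15": 3,
}

# Scaffolds appearing in both top-hit lists (expected to be none).
shared = set(tgm5_copies).intersection(tgmt_copies)

tgm5_total = sum(tgm5_copies.values())
tgmt_total = sum(tgmt_copies.values())

print(sorted(shared), tgm5_total, tgmt_total)
```
]]></p>
            <p>No scaffold appears in both lists, and the listed scaffolds carry 231 partial copies for <it>Tgm5 </it>and 206 for <it>Tgmt*</it>.</p>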
         </sec>
         <sec>
            <st>
               <p>Isolation and characterization of a <it>T </it>locus BAC clone</p>
            </st>
            <p>As previously discussed, fully determining the molecular defect of the <it>F3'H </it>allele (<it>t*</it>) with the gray trichome phenotype required cloning a full length <it>F3'H </it>gene including its promoter. To that end, two <it>Glycine max </it>BAC libraries, one from Williams 82 (GMWBa library) and the other from the more distantly related plant introduction line PI437654 (Clemson University Genomics Institute), were screened. No clones were obtained from the Williams 82 GMWBa library, but a BAC clone (71B1) containing a full length <it>F3'H </it>gene copy was isolated from the PI437654 library. A partial sequence of the <it>F3'H </it>gene in the 71B1 clone was determined initially from PCR amplified fragments and later confirmed and extended with sequences from a shotgun library of the entire BAC clone. These sequences were assembled into three contigs, most likely arranged as shown in Figure <figr fid="F9">9</figr> (Acc. No.: EU721743). The Softberry FGeneSH web-based gene prediction and annotation program identified the full <it>F3'H </it>copy (1,590 bp) of the gene and placed it in contig-2. In addition, two smaller fragments with <it>F3'H </it>similarity (504 and 354 bp) mapped to contig-1. All three <it>F3'H </it>sequences are marked with red arrows in Figure <figr fid="F9">9</figr>. All other genes found in the 71B1 clone were also annotated, and their sizes and relative locations in the three contigs are depicted in Figure <figr fid="F9">9</figr> and listed in the annotation Table <tblr tid="T3">3</tblr>. The other full copy genes in this BAC clone were three genes located in tandem in contig-3 that encode retrotransposon functions (gag-pol polyprotein and envelope-like protein). The three contigs shown in Figure <figr fid="F9">9</figr> did not overlap in the assembly of the shotgun library sequences, and their order could not be determined. 
With the recent release of the 7&#215; draft sequence assembly (JGI version 12/6/07) (see Methods) of the soybean genome of cultivar Williams 82, the sequences of the three contigs from BAC clone 71B1 were used as queries in a BLASTn search. Contigs-1 and -2 were found to be adjacent in scaffold_83, and the contig-3 sequence corresponds to an insertion in the middle of contig-1. For that reason, contig-3 was drawn above contig-1 in Figure <figr fid="F9">9</figr>. Because the sequences with high similarity to the three contigs mapped closely together in scaffold_83, we can presume that this region is conserved between the two lines, Williams 82 and PI437654 (see Additional file <supplr sid="S10">10</supplr>: Alignment of sequences of 3 contigs from 71B1 BAC clone in the 7&#215; draft sequence assembly (JGI)).</p>
            <suppl id="S10">
               <title>
                  <p>Additional file 10</p>
               </title>
               <text>
                  <p>Alignment of sequences of 3 contigs from 71B1 BAC clone in the 7&#215; draft sequence assembly (JGI).</p>
               </text>
               <file name="1471-2229-8-124-S10.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <tbl id="T3">
               <title>
                  <p>Table 3</p>
               </title>
               <caption>
                  <p>Gene annotation of BAC clone 71B1 from the <it>Glycine max </it>PI437654 library</p>
               </caption>
               <tblbdy cols="10">
                  <r>
                     <c ca="center">
                        <p>
                           <b>Contig</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p><b>Gene No</b>.</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Strand</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Exons</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p><b>Nt</b>.</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>AA</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Annotation</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p><b>Acc. No</b>.</p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>E-value</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Organism</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="10">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>(+)</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>351</p>
                     </c>
                     <c ca="center">
                        <p>116</p>
                     </c>
                     <c ca="center">
                        <p>Unknown Protein</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>(+)</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>459</p>
                     </c>
                     <c ca="center">
                        <p>152</p>
                     </c>
                     <c ca="center">
                        <p>Reverse transcriptase</p>
                     </c>
                     <c ca="center">
                        <p>ABB00038</p>
                     </c>
                     <c ca="center">
                        <p>2e<sup>-79</sup></p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>Glycine max</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>(+)</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>504</p>
                     </c>
                     <c ca="center">
                        <p>167</p>
                     </c>
                     <c ca="center">
                        <p>F3'H</p>
                     </c>
                     <c ca="center">
                        <p>BAB83261</p>
                     </c>
                     <c ca="center">
                        <p>4e<sup>-47</sup></p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>Glycine max</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>(+)</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>290</p>
                     </c>
                     <c ca="center">
                        <p>156</p>
                     </c>
                     <c ca="center">
                        <p>Ovarian tumor otubain</p>
                     </c>
                     <c ca="center">
                        <p>ABN05752</p>
                     </c>
                     <c ca="center">
                        <p>1e<sup>-28</sup></p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>M. truncatula</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>(+)</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>387</p>
                     </c>
                     <c ca="center">
                        <p>128</p>
                     </c>
                     <c ca="center">
                        <p>Ovarian tumor otubain</p>
                     </c>
                     <c ca="center">
                        <p>ABN05752</p>
                     </c>
                     <c ca="center">
                        <p>3e<sup>-19</sup></p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>M. truncatula</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>(-)</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>1407</p>
                     </c>
                     <c ca="center">
                        <p>468</p>
                     </c>
                     <c ca="center">
                        <p>Hypothetical protein</p>
                     </c>
                     <c ca="center">
                        <p>CAN67561</p>
                     </c>
                     <c ca="center">
                        <p>8e<sup>-14</sup></p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>Vitis vinifera</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>(-)</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>693</p>
                     </c>
                     <c ca="center">
                        <p>230</p>
                     </c>
                     <c ca="center">
                        <p>Hypothetical protein</p>
                     </c>
                     <c ca="center">
                        <p>CAN70327</p>
                     </c>
                     <c ca="center">
                        <p>4e<sup>-34</sup></p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>Vitis vinifera</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                     <c ca="center">
                        <p>(-)</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>432</p>
                     </c>
                     <c ca="center">
                        <p>143</p>
                     </c>
                     <c ca="center">
                        <p>Unknown protein</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>(+)</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>354</p>
                     </c>
                     <c ca="center">
                        <p>117</p>
                     </c>
                     <c ca="center">
                        <p>F3'H</p>
                     </c>
                     <c ca="center">
                        <p>AA047853</p>
                     </c>
                     <c ca="center">
                        <p>5e<sup>-43</sup></p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>Glycine max</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>(+)</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>723</p>
                     </c>
                     <c ca="center">
                        <p>240</p>
                     </c>
                     <c ca="center">
                        <p>Unknown protein</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>11</p>
                     </c>
                     <c ca="center">
                        <p>(-)</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>801</p>
                     </c>
                     <c ca="center">
                        <p>266</p>
                     </c>
                     <c ca="center">
                        <p>Unknown protein</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>(-)</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>381</p>
                     </c>
                     <c ca="center">
                        <p>126</p>
                     </c>
                     <c ca="center">
                        <p>MuRD transposase</p>
                     </c>
                     <c ca="center">
                        <p>AAC26234</p>
                     </c>
                     <c ca="center">
                        <p>2e<sup>-06</sup></p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>Arabidopsis</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>(+)</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>1590</p>
                     </c>
                     <c ca="center">
                        <p>529</p>
                     </c>
                     <c ca="center">
                        <p>F3'H</p>
                     </c>
                     <c ca="center">
                        <p>BAB83261</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>Glycine max</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>(-)</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>309</p>
                     </c>
                     <c ca="center">
                        <p>102</p>
                     </c>
                     <c ca="center">
                        <p>Unknown protein</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>(-)</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>435</p>
                     </c>
                     <c ca="center">
                        <p>144</p>
                     </c>
                     <c ca="center">
                        <p>Unknown protein</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>(-)</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>2181</p>
                     </c>
                     <c ca="center">
                        <p>726</p>
                     </c>
                     <c ca="center">
                        <p>Gag-pol polyprotein</p>
                     </c>
                     <c ca="center">
                        <p>AA073523</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>Glycine max</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>(-)</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1851</p>
                     </c>
                     <c ca="center">
                        <p>616</p>
                     </c>
                     <c ca="center">
                        <p>Envelope-like protein</p>
                     </c>
                     <c ca="center">
                        <p>AA073528</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>Glycine max</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>(-)</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>3018</p>
                     </c>
                     <c ca="center">
                        <p>1005</p>
                     </c>
                     <c ca="center">
                        <p>Gag-pol polyprotein</p>
                     </c>
                     <c ca="center">
                        <p>AA073521</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>Glycine max</it>
                        </p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <fig id="F9">
               <title>
                  <p>Figure 9</p>
               </title>
               <caption>
                  <p>Schematic representation of <it>F3'H </it>gene sequences in the PI437654 line BAC clone</p>
               </caption>
               <text>
                  <p><b>Schematic representation of <it>F3'H </it>gene sequences in the PI437654 line BAC clone</b>. Three non-overlapping sequence contigs of BAC clone 71B1, contig-1 (40,518 nt), contig-2 (23,207 nt) and contig-3 (9,467 nt), were arranged based on the organization of the corresponding sequences in scaffold_83 of the 7&#215; Glyma0 sequence assembly (JGI) of cultivar Williams 82. Contig-1 and contig-2 sequences are adjacent in the Glyma0 assembly in the order shown. The contig-3 sequence is an insertion in the center of contig-1 in the Glyma0 assembly; here it is displayed above the center of contig-1 to indicate its likely approximate location in BAC clone 71B1. The sizes and orientations of genes are represented by gray arrows with their respective annotations. <it>F3'H </it>sequences are shown in red. The full-length (1,590 bp) <it>F3'H </it>gene maps to contig-2, and the smaller (504 and 354 bp) fragments with homology to <it>F3'H </it>map to contig-1. Contig-3 encodes functions of a retrotransposon. The introns of genes are not displayed.</p>
               </text>
               <graphic file="1471-2229-8-124-9"/>
            </fig>
            <p>We reported earlier that the <it>F3'H </it>gene appeared to be single copy in soybean <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>. BLAST searches of the 7&#215; draft sequence assembly (JGI version 12/16/07, see Methods) of the soybean genome of cultivar Williams 82, using as queries the Williams <it>F3'H </it>gene sequence (Acc. No.: EU190438) and the sequence of BAC clone 71B1, all pointed to a single scaffold (scaffold_83) containing only one full <it>F3'H </it>gene copy. This is further evidence that <it>F3'H </it>is a single-copy gene in soybean. These results also explain the gray trichome phenotype of the soybean line in which the large <it>Tgmt* </it>element is inserted in Intron-1 of the <it>F3'H </it>gene: because the gene is single copy, no second copy is available to compensate.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <p>Although the molecular structures of seven deletion-derivative CACTA transposable elements (<it>Tgm1-7</it>; <abbrgrp><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr><abbr bid="B15">15</abbr></abbrgrp>), as well as that of a 5.7 kb <it>Tgm-Express1 </it>CACTA transposon that carries five gene fragments <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>, have been described in soybean, an autonomous element has remained elusive. A recent study that helped identify the <it>T </it>locus as a flavonoid 3'-hydroxylase (<it>F3'H</it>) gene was based on the sequence and expression of two different recessive alleles <abbrgrp><abbr bid="B26">26</abbr></abbrgrp> that specify gray instead of tawny trichomes. The <it>t </it>allele of Richland was characterized molecularly and found to have a single-base deletion that creates a frameshift and terminates the F3'H open reading frame prematurely. The second gray trichome allele studied (<it>t*</it>) came from a different genetic stock, line 37609, which had undetectable F3'H mRNA levels but no significant differences in the genomic sequence of the two F3'H fragments we had isolated and analyzed. The stable gray-trichome line 37609 was originally derived from a single rogue plant, with chimeric sectors of both tawny and gray trichomes, that appeared spontaneously in the field of a breeding program. A stable tawny line, XB22A (<it>T*</it>), and the gray/tawny mutable line 33745 (<it>t</it><sup><it>m</it></sup>) were also derived from that initial rogue plant. The continued mutability of this isoline and the spontaneous appearance of the rogue progenitor plant suggested that this soybean genetic stock carries an autonomous transposable element, manifested in the mixed trichome phenotype.</p>
         <p>As we had predicted in the above-mentioned study, the portion of the <it>F3'H </it>genomic sequence that eluded us at that time was a 4.1 kb Intron-1, in which a 20.5 kb CACTA transposon had inserted in the gray trichome allele (<it>t*</it>) (Figure <figr fid="F2">2</figr>). This large insertion could have prevented proper splicing of Intron-1 and assembly of a functional F3'H mRNA. The mutable line 33745 carries both the gray (<it>t*</it>) and tawny (<it>T*</it>) alleles in its genetic make-up (Figure <figr fid="F1">1</figr>). The 20.5 kb element was named <it>Tgmt*</it>, and its isolation through long-distance PCR amplification permitted its sequencing, its molecular characterization and the study of its expression in the mutable and the stable gray or tawny trichome isolines.</p>
         <p><it>Tgmt* </it>has imperfect 13-base CACTA inverted repeats, a target site duplication (ATA) and the asymmetric, highly structured subterminal regions characteristic of the CACTA transposon family. The 13 bp CACTA inverted repeats and the reiterated subterminal sequence motif are <it>cis</it>-determinants for transposition. Unlike the 13 bp CACTA inverted repeats, which are conserved, the subterminal repeats vary in size and sequence among the different CACTA transposons. The subterminal repeated motifs of the soybean CACTA elements analyzed (<it>Tgm1</it>, <it>Tgmt*</it>, <it>Tgm-Express1 </it>and <it>Tgmw4m</it>) were more complex than the <it>En-1 </it>motif. It has been proposed that a DNA-binding protein encoded by the transposon attaches to these subterminal motifs to help or suppress the element's excision <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. Two well-characterized DNA-binding proteins, TNPA of <it>En-1 </it>and TNP1 of <it>Tam1</it>, share little similarity, and it was suggested that non-homologous DNA-binding proteins may recognize diverse subterminal DNA-binding motifs <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. The <it>Tgmt* </it>subterminal repeated sequence motif not only differs from that of the <it>En-1 </it>transposon but has also diverged from the repeated motif in another soybean CACTA element, <it>Tgm1 </it><abbrgrp><abbr bid="B24">24</abbr><abbr bid="B32">32</abbr></abbrgrp> (Figure <figr fid="F5">5</figr>). Our results thus reinforce the notion that the subterminal regions of CACTA elements are variable, and that this variability is likely paralleled by DNA-binding protein diversity. In addition, the difference observed between the <it>Tgm1 </it>subterminal repeat motif and that of the <it>Tgmt* </it>family (<it>Tgm-Express1 </it>and <it>Tgmw4m</it>) suggests that <it>Tgm1 </it>is a deletion derivative of another CACTA element distinct from <it>Tgmt*</it>. 
Could the TNPA-like DNA-binding protein encoded in <it>Tgmt* </it>help in the excision or suppression of <it>Tgm1</it>? Although the repeated sequence motifs differ among all CACTA transposons characterized, it is possible that they share some common feature that is recognized by the dissimilar TNPA-like proteins. The alignment of all subterminal repeated motifs in Figure <figr fid="F5">5</figr> showed some broad similarities, with GC-rich and AT-rich domains. The GC-rich domain may also be a site susceptible to methylation, which could hinder binding of the TNPA-like protein and consequently inhibit transposon excision <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>.</p>
         <p>The large and complex Gene-1, found on the negative strand by web-based gene prediction algorithms, was confirmed by RT-PCR-amplified cDNAs. It is a dual-function gene with 24 exons spanning 14,627 bp, from the start site (4,708 bp) to the polyA site (19,335 bp), of the 20,544 bp <it>Tgmt* </it>element. A precursor transcript of ~7.5 kb codes for a putative transposase with a tnp2 and a TNP1 domain. A smaller mRNA of ~2.5 kb results from alternative splicing of the 5' end (346 bp) of Exon-1 to the 3' end of Intron-10, 23 bp upstream of Exon-11 (Figure <figr fid="F8">8</figr>; see Additional file <supplr sid="S9">9</supplr>: <it>Tgmt* </it>genomic sequence (20,544 bp)). The product of this mosaic transcript had no conserved domains, and in an NCBI basic blastx search the highest similarity was to a hypothetical <it>V. vinifera </it>protein (CA048701), with 79% similarity over a stretch of 428 aa. Likewise, the transposase mRNA in a similar blastx search had 64% similarity to a <it>V. vinifera </it>hypothetical protein (CAN82870). In addition, the domain architecture of the transposase is most similar to that of an <it>O. sativa </it>protein (Os03g0714800) with a tnp2 motif (pfam02992) and a TNP1/EN/SPM motif (pfam03017). These results suggest that the soybean <it>Tgmt* </it>Gene-1 is more closely related to genes in <it>V. vinifera</it>, <it>O. sativa </it>and <it>Arabidopsis </it>than to those of <it>Z. mays </it>(<it>En-1</it>) and <it>A. majus </it>(<it>Tam1</it>). The <it>En-1 </it>transposase also has two domains: the tnp2 domain (pfam02992), which seems to be conserved in all CACTA element transposases, and a second domain of the Ptta/En/Spm family (pfam03004).</p>
         <p>Another distinction of <it>Tgmt* </it>is that it lacks the high (82%) GC content found in Exon-1 of the <it>En-1 </it>element (between positions 300 and 550). Because CpG residues are sensitive to methylation, that high-GC region of <it>En-1 </it>was proposed as a potential site for gene regulation. The GC content of <it>Tgmt* </it>Gene-1 is uniformly lower, averaging 40% throughout. Thus, transposon inactivation by methylation may not require such a large concentration of CpG residues.</p>
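         <p>A regional GC comparison of this kind can be sketched with a simple sliding-window calculation. The Python example below is a minimal sketch under stated assumptions: the toy sequence, window size and function names are hypothetical and are not taken from the <it>Tgmt* </it>or <it>En-1 </it>sequences or from the analysis performed in this study.</p>

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence (0.0 for empty input)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq, window, step):
    """Yield (start, gc_fraction) for each full window along seq."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, gc_fraction(seq[start:start + window])


# Hypothetical toy sequence: a GC-rich block flanked by AT-rich sequence.
toy = "ATATATATAT" + "GCGCGCGCGC" + "ATTATTATTA"
for start, gc in windowed_gc(toy, window=10, step=10):
    print(start, round(gc, 2))
```

         <p>Applied to a real element, a localized peak in the windowed values would correspond to a GC-rich region such as the one described for <it>En-1</it>, whereas a flat profile would correspond to the uniform GC content described for <it>Tgmt* </it>Gene-1.</p>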
         <p>Although the majority of intron-exon splicing occurs at the canonical GT-AG boundaries, it is of interest to point out that the splice signals used to create the ~2.5 kb mosaic transcripts do not appear to be either the GT-AG for intron splicing or the CTPuAPy branch site signal typically located 20&#8211;50 bp upstream of the acceptor site where Pu = A or G and Py = C or T <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>. We do not know if the ATAAT motifs adjacent to both the donor and acceptor splice sites are used as recognition signals in <it>Tgmt*</it>. Alternatively, the splicing mechanism in this transposon may resemble the sex lethal gene model of the fruit fly where splicing signals may be masked by a regulatory protein <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>. This type of mechanism could have evolved to help reduce the number of transposition events. A regulatory protein that binds to Exon-1 blocking transcription of the transposase gene Orf1 and Orf2 would nonetheless allow splicing of the mosaic transcript. This blockage could result in reduction of transposon excision while allowing the suppressor function of the mosaic protein that will bind to the subterminal repeats of the element when possible.</p>
         <p>The copy number of CACTA elements has been estimated at 50&#8211;100 in <it>Z. mays </it><abbrgrp><abbr bid="B8">8</abbr></abbrgrp> and at 4 (20) in <it>A. thaliana </it><abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. The soybean <it>Tgm </it>family is also repetitive: earlier estimates from DNA blots, using the element ends as probes, indicated that the family of deletion derivatives numbered fewer than 50 copies per genome <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. That figure may have been an underestimate, as it likely included only the <it>Tgm1 </it>family of elements and excluded all elements of the <it>Tgmt* </it>family. Our BLAST search also showed that the first ORF of the transposase is highly repetitive throughout the soybean genome. However, based on RNA blots probed with Exon-1 sequence, as well as the under-representation in the soybean expressed sequence tag (EST) collections, the number of active elements seems to be very low. It is possible that there are multiple autonomous <it>Tgm </it>elements in soybean but that they are inactivated by methylation under optimal growth conditions and reactivated in stressful environments. A precedent for this has been shown with rice retrotransposons, where the copy number increased from 2 to 30 in some strains as measured by DNA blots <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>.</p>
         <p>In support of the possible existence of multiple autonomous elements in the soybean genome, a BLAST search of the 7&#215; draft sequence assembly Glyma0 (JGI) produced 4 regions (Scaffold_4, _81, _24, and _5) with significant similarity to the <it>Tgmt* </it>Gene-1 sequence.</p>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>We have determined the molecular basis of the soybean gray trichome phenotype of the mutant <it>t* </it>allele to be a 20.5 kb putative CACTA transposon (<it>Tgmt*</it>) inserted in Intron-1 of the single-copy <it>F3'H </it>gene. <it>Tgmt* </it>has conserved 13 bp TIRs, the 3 bp target site duplication and an asymmetric subterminal repeated motif. The latter distinguished two CACTA transposon families in soybean and defined regions of some similarity among the subterminal motifs of other CACTA transposons. <it>Tgmt* </it>expressed a 14.6 kb Gene-1 that encodes the two functions required of an active, autonomous element: a transposase with a conserved Tnp2 domain and a mosaic transcript bearing little homology to other transposon mosaic gene products. Thus, <it>Tgmt* </it>has the potential to be an active and autonomous transposon expressed in two isolines of the genetic stock studied. The <it>Tgmt* </it>transposase is more closely related to the <it>O. sativa </it>and <it>A. thaliana </it>transposases, which have a TNP1 (pfam03017) domain, than to the <it>En-1 </it>transposase, which has a Ptta (pfam03004) domain.</p>
         <p>In addition, our results support previous assertions that CACTA transposases are conserved, most likely because they associate with the conserved CACTA TIRs during scission and insertion, while the DNA-binding proteins, the products of mosaic transcripts that bind to the less-conserved subterminal repeated motifs, share little similarity. The divergence of the subterminal repeats within soybean, as exemplified by the two subtypes <it>Tgm1 </it>and <it>Tgmt*</it>, indicates that within-species diversification into subtypes of subterminal repeats, and of the mosaic transcript products that bind them, will be found more extensively as more genomes are sequenced and analyzed.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>Plant Material and Genotypes</p>
            </st>
            <p>The <it>Glycine max </it>cultivars and isolines used for this study were: Williams (<it>T</it>, tawny trichomes), XB22A (<it>T*</it>, tawny trichomes), 37609 (<it>t*</it>, gray trichomes), 37643 (<it>t*</it>, gray trichomes) and 37345 (<it>t</it><sup><it>m</it></sup>, with variegated hilum tawny and gray trichomes). Each is homozygous for the indicated allele of the <it>T </it>locus. The origin, genetics, and discovery of the <it>T </it>allele as <it>F3'H </it>have been described previously <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>. Plants were grown in the greenhouse. Shoot tips (meristems surrounded by primordial leaves), seed coats and cotyledons dissected from seeds at varying stages of development were frozen in liquid nitrogen, freeze dried (Multi-dry lyophilizer; FTS systems), and stored at -20&#176;C. The seed coats and cotyledons used in this study were from seeds with a whole-seed fresh weight of 25&#8211;50 mg.</p>
         </sec>
         <sec>
            <st>
               <p>RNA Extraction, Purification and cDNA Synthesis</p>
            </st>
            <p>Total RNA was isolated from shoot tips, seed coats and cotyledons using a phenol-chloroform and lithium chloride precipitation method <abbrgrp><abbr bid="B36">36</abbr><abbr bid="B37">37</abbr></abbrgrp>. RNA was stored at -70&#176;C until used.</p>
            <p>cDNA copies of the <it>Tgmt* </it>predicted gene(s) from three of the isolines (XB22A, 37609, 33745) and Williams 43 were amplified from a first-strand cDNA pool synthesized using 1 &#956;g of cotyledon total RNA and the Superscript first strand synthesis system for reverse transcriptase (RT)-PCR (Invitrogen, San Diego). The total RNAs used for these RT-PCR reactions were treated with DNaseI using Ambion's DNA-free kit and concentrated in Microcon YM-30 columns (Millipore, Bedford, MA). For each RNA sample, parallel reactions were run in the absence of Superscript (- controls) to assess the extent of DNA contamination. The sequences of the primer pairs used in the RT-PCR reactions shown in Figure <figr fid="F6">6</figr> were:</p>
            <p>(1) Primer No. 3: 5'-GGGACATCTGAGAATGAC-3' (TGME3-43F)</p>
            <p>Primer No. 4: 5'-AACAACATACAATCCCAT-3' (TGME3-36R)</p>
            <p>(2) Primer No. 1: 5'-AACAGCACGCATAACTGAAGA-3' (TGM8-14677R)</p>
            <p>Primer No. 2: 5'-AACTGGTGTCGTACTCCC-3' (43-43-FR)</p>
            <p>(3) Primer No. 5: 5'-GGCTGAAATAATATCAGG-3' (TGME-36NZ-F)</p>
            <p>Primer No. 6: 5'-CATCATATTTAGCATCTG-3' (43-43-R2)</p>
            <p>Primer pairs of RT-PCR reactions shown in Figure <figr fid="F7">7A</figr> and <figr fid="F7">7B</figr> were:</p>
            <p>(A1) Primer No. 1: 5'-AACAGCACGCATAACTGAAGA-3' (TGM8-14677R)</p>
            <p>Primer No. 6: 5'-TTAGCATCTGTTATTTCTAATAGC-3' (TRP-1-F) &#8773; (43-43-R2)</p>
            <p>(A2) Primer No. 1: 5'-AACAGCACGCATAACTGAAGA-3' (TGM8-14677R)</p>
            <p>Primer No. 7: 5'-TGTGAAGTCATATAGAGTGGC-3' (TGM8-1741F)</p>
            <p>(A3) Primer No. 5: 5'-GGCTGAAATAATATCAGG-3' (TGME-36NZ-F)</p>
            <p>Primer No. 7: 5'-TGTGAAGTCATATAGAGTGGC-3' (TGM8-1741F)</p>
            <p>(B) Primer No. 3: 5'-GGGACATCTGAGAATGAC-3' (TGME3-43F)</p>
            <p>Primer No. 6: 5'-TTAGCATCTGTTATTTCTAATAGC-3' (TRP-1-F) &#8773; (43-43-R2)</p>
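            <p>For readers who wish to sanity-check primer properties, the following minimal Python sketch computes the length, GC fraction and a rough melting temperature for two of the primers listed above (Primer No. 3 and Primer No. 4). The Wallace rule used here, Tm = 2(A+T) + 4(G+C), is a generic rule of thumb for short oligonucleotides and is an assumption of this sketch; it is not the method used to design these primers, and the function names are hypothetical.</p>

```python
def base_counts(primer):
    """Return (A+T count, G+C count) for a primer sequence."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return at, gc


def wallace_tm(primer):
    """Rough melting temperature in degrees C by the Wallace rule,
    2*(A+T) + 4*(G+C); only an approximation for short oligos."""
    at, gc = base_counts(primer)
    return 2 * at + 4 * gc


# Primer sequences copied from the list above.
PRIMERS = {
    "Primer No. 3 (TGME3-43F)": "GGGACATCTGAGAATGAC",
    "Primer No. 4 (TGME3-36R)": "AACAACATACAATCCCAT",
}

for name, seq in PRIMERS.items():
    at, gc = base_counts(seq)
    print(name, len(seq), round(gc / len(seq), 2), wallace_tm(seq))
```

            <p>Such estimates are only a rough guide; they do not account for salt concentration, primer concentration or nearest-neighbor effects.</p>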
         </sec>
         <sec>
            <st>
               <p>Primer synthesis, PCR reaction conditions and DNA cloning</p>
            </st>
            <p>Oligonucleotide primers were synthesized on an Applied Biosystems (Foster City, CA) model 394A DNA synthesizer at the Keck Center, a unit of the University of Illinois Biotechnology Center. For small DNA fragment amplification, PCR reactions were performed with an initial denaturation step at 94&#176;C for 2 min followed by 30 cycles of denaturing at 94&#176;C for 30 sec, annealing at 56&#176;C for 1 min and extension at 68&#176;C for 9 min, ending with a 10 min extension at 72&#176;C. High-fidelity, high-efficiency <it>Ex Taq </it>polymerase (Takara Bio Inc., Otsu, Japan) was used at 0.75 units per 50 &#956;l reaction. To amplify the larger DNA fragments, PCR reaction conditions were as follows: an initial denaturation step at 94&#176;C for 2 min followed by 30 cycles of 94&#176;C for 30 sec and 68&#176;C for 10 min, ending with a 10 min extension at 72&#176;C. <it>LA Taq </it>polymerase (Takara Bio Inc., Otsu, Japan) was used at 0.75 units per 50 &#956;l reaction.</p>
            <p>In most instances, amplified DNAs were separated from oligonucleotides with a QIAquick PCR Purification kit (QIAGEN), cloned into pGem-T-easy and sequenced in an ABI 3730xl (Applied Biosystems, Inc., Foster City, CA) at the Keck Center. However, the larger, 23 kb PCR product (primers: Z37F, 5'-ATTTGAAACGCGTGGTGCCTGCATTTAAAGACAATTT-3' and IN663R, 5'-ATCACCTCCATCACCATAGCCTTAAACTCATCAGCCC-3') was extracted from a 1% Seaplaque agarose gel in TA buffer with the aid of GELase Agarose Gel-Digesting Preparation (Epicenter Biot. Madison, WI) following the manufacturer's protocol. Two equal fractions (200 ng) of the 23 kb DNA fragment extracted in this fashion were used to prepare a radiolabeled probe and for <it>HindIII </it>restriction digests. The resulting <it>HindIII </it>fragments were cleaned with a QIAquick PCR Purification kit (QIAGEN) and cloned into CIP-dephosphorylated pGem vector cut with <it>HindIII</it>. A DNA ligation kit Ver.2.1 (TaKaRa) aided the cloning step. The resulting <it>HindIII </it>clones of the 23 kb fragment were verified in DNA blots hybridized to the radiolabeled 23 kb probe.</p>
            <p>The conditions used to optimize the cloning of a 17 kb DNA fragment included the phosphorylation of primers Z37F (sequence above) and TGM23R (5'-GTGATTTGATAGAACAAGGTACGTAAAAGCTGAAAC-3') with T4-polynucleotide kinase (Lucigen Co. Middleton, WI) prior to the PCR amplification reaction. The products of this reaction were extracted from a 1% low-melt Seaplaque agarose gel with QIAEX II gel extraction Kit (QIAGEN) and cloned into pJAZZ-OC vector using the BigEasy v2.0 Linear Cloning Kit (Lucigen Co. Middleton, WI). Multiple size fragments were cloned and verified by sequencing.</p>
         </sec>
         <sec>
            <st>
               <p>RNA gel-blot analysis and synthesis of DNA probes</p>
            </st>
            <p>RNA (10 &#956;g/sample) was electrophoresed in a 1.2% agarose-3% formaldehyde gel <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>. Size-fractionated RNAs were transferred to Optitran-supported nitrocellulose membrane (Midwest Scientific, Valley Park, MO) by capillary action as described in Sambrook et al., (1989) <abbrgrp><abbr bid="B38">38</abbr></abbrgrp> and cross-linked with UV light (Stratagene, La Jolla, CA). Nitrocellulose RNA blots were prehybridized, hybridized, washed, and exposed to Hyperfilm (Amersham, Arlington Heights, IL) as described by Todd and Vodkin (1996) <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>.</p>
            <p>Cloned DNAs used as probes were PCR amplified, electrophoresed, and purified from the agarose using the QIAquick gel extraction kit (QIAGEN, Valencia, CA). The DNA concentration of the final eluate was determined with a NanoDrop (NanoDrop Technologies, Inc. Rockland, DE). Purified DNA fragments (25&#8211;250 ng) were labeled with [&#945;-<sup>32</sup>P]dATP by random primer reaction <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>.</p>
         </sec>
         <sec>
            <st>
               <p>Sequencing and sequence annotation</p>
            </st>
            <p>A shotgun library of BAC clone 71B1 from <it>G. max </it>PI437654 cultivar was constructed at the Keck Center for Comparative and Functional Genomics (University of Illinois). BAC DNA was randomly sheared and cloned into the pCR4Blunt-TOPO vector using the Topo Shotgun Subcloning Kit (Invitrogen). Sequencing of the clones was done using the Big Dye Terminator chemistry (Applied Biosystems, Foster City, CA). Base-calling and quality assessment were performed automatically using PHRED <abbrgrp><abbr bid="B41">41</abbr></abbrgrp>. High-quality sequences were assembled with PHRAP to order the contigs. PCR amplification and direct sequencing of subclones was used to close gaps with an end result of three contigs that could not be overlapped.</p>
            <p>Sequencing of all other clones was done at the Keck Center (University of Illinois Biotechnology Center).</p>
            <p>BLAST searches with sequences of the <it>Tgmt* </it>transposon and the three contigs of the 71B1 BAC clone were extended to the 7&#215; draft sequence assembly <it>Glycine max </it>0 (Glyma0) from the Joint Genome Institute (JGI), released 12/06/07. This 7&#215; assembly consisted of 3,317 sequences with a total of 996,173,606 letters and was made accessible to us by the Matt Hudson Laboratory (U. of Illinois at Urbana/Champaign) <url>http://stan.cropsci.uiuc.edu/blast/blast.html</url>. We also searched the JGI <b>Glyma0 </b>soybean genome assembly using the Phytozome: <it>Glycine max </it>(v3.0) BLAST web-site <url>http://www.phytozome.net/search.php?show=blast</url>. Note that the scaffold numbers are predicted to change once the final 8&#215; soybean genome assembly is completed.</p>
         </sec>
         <sec>
            <st>
               <p>Accession Numbers</p>
            </st>
            <p>Sequence data from this article can be found in the EMBL/GenBank data libraries under accession numbers: [<ext-link ext-link-type="gen" ext-link-id="EU190438">EU190438</ext-link>: <it>F3'H </it>(<it>T</it>) allele in Williams43]; [<ext-link ext-link-type="gen" ext-link-id="EU190439">EU190439</ext-link>: <it>F3'H </it>(<it>T*</it>) allele in XB22A]; [<ext-link ext-link-type="gen" ext-link-id="EU190440">EU190440</ext-link>: <it>F3'H </it>(<it>t*</it>) mutant allele in 37609 isoline]; [<ext-link ext-link-type="gen" ext-link-id="EU721743">EU721743</ext-link>: 71B1 BAC clone].</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>GZ carried out the design of the study, performed and analyzed the results of all the experimental work described including the comparative analysis and drafted the manuscript. LV initiated, coordinated, and led the project and edited the manuscript. Both authors read and approved the final manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We thank Laura Guest, who directed sequencing of all cloned DNAs, and Virginia Lukas for the synthesis of the oligonucleotide primers. Special thanks to Alvaro G. Hernandez for directing the construction and sequencing of the 71B1 BAC clone shotgun library and to Jyothi Thimmapuram, who assembled the BAC sequences. This technical assistance was provided at the Roy J. Carver Biotechnology Center, W. M. Keck Center for Comparative and Functional Genomics at the University of Illinois, Urbana-Champaign. We gratefully acknowledge support from grants of the Illinois Soybean Association, USDA, and United Soybean Board.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Mutations in maize and chromosomal aberrations in Neurospora</p>
            </title>
            <aug>
               <au>
                  <snm>McClintock</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Carnegie Inst Washington Year Book</source>
            <pubdate>1954</pubdate>
            <volume>53</volume>
            <fpage>254</fpage>
            <lpage>260</lpage>
         </bibl>
         <bibl id="B2">
            <title>
               <p>A mutable pale green locus in maize</p>
            </title>
            <aug>
               <au>
                  <snm>Peterson</snm>
                  <fnm>PA</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>1953</pubdate>
            <volume>38</volume>
            <fpage>682</fpage>
            <lpage>683</lpage>
         </bibl>
         <bibl id="B3">
            <title>
               <p>The control of gene action in maize</p>
            </title>
            <aug>
               <au>
                  <snm>McClintock</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Brookhaven Symp Biol</source>
            <pubdate>1965</pubdate>
            <volume>18</volume>
            <fpage>162</fpage>
            <lpage>184</lpage>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Molecular analysis of the En/Spm transposable element system of <it>Zea mays</it></p>
            </title>
            <aug>
               <au>
                  <snm>Pereira</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Cuypers</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Gierl</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Schwarz-Sommer</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Saedler</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>EMBO J</source>
            <pubdate>1986</pubdate>
            <volume>5</volume>
            <fpage>835</fpage>
            <lpage>841</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1166871</pubid>
                  <pubid idtype="pmpid">15957213</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Genetic and molecular analysis of the <it>Spm-dependent a-m2 </it>alleles of the maize <it>a </it>locus</p>
            </title>
            <aug>
               <au>
                  <snm>Masson</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Surosky</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Kingsbury</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Fedoroff</snm>
                  <fnm>NV</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>1987</pubdate>
            <volume>117</volume>
            <fpage>117</fpage>
            <lpage>137</lpage>
         </bibl>
         <bibl id="B6">
            <title>
               <p><it>TnpA </it>product encoded by the transposable element <it>En-1 </it>of <it>Zea mays </it>is a DNA binding protein</p>
            </title>
            <aug>
               <au>
                  <snm>Gierl</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Lutticke</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Saedler</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>EMBO J</source>
            <pubdate>1988</pubdate>
            <volume>7</volume>
            <fpage>4045</fpage>
            <lpage>4053</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">455112</pubid>
                  <pubid idtype="pmpid">2854053</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>The hAT and CACTA superfamilies of plant transposons</p>
            </title>
            <aug>
               <au>
                  <snm>Kunze</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Weil</snm>
                  <fnm>CF</fnm>
               </au>
            </aug>
            <publisher>Mobile DNA II ASM Press, Washington DC</publisher>
            <editor>Craigie R, Gellert M, Lambowitz</editor>
            <pubdate>2002</pubdate>
            <fpage>565</fpage>
            <lpage>610</lpage>
         </bibl>
         <bibl id="B8">
            <title>
               <p>The <it>En/Spm </it>transposable element of maize</p>
            </title>
            <aug>
               <au>
                  <snm>Gierl</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Curr Top Microbiol Immunol</source>
            <pubdate>1996</pubdate>
            <volume>204</volume>
            <fpage>145</fpage>
            <lpage>159</lpage>
            <xrefbib>
               <pubid idtype="pmpid">8556865</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p><it>En/Spm</it>-encoded TNPA protein requires a specific target sequence for suppression</p>
            </title>
            <aug>
               <au>
                  <snm>Grant</snm>
                  <fnm>SR</fnm>
               </au>
               <au>
                  <snm>Gierl</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Saedler</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>EMBO J</source>
            <pubdate>1990</pubdate>
            <volume>9</volume>
            <fpage>2029</fpage>
            <lpage>2035</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">551919</pubid>
                  <pubid idtype="pmpid">2162760</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Molecular interactions between the components of the En-1 transposable element system of Zea mays</p>
            </title>
            <aug>
               <au>
                  <snm>Gierl</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Schwarz-Sommer</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Saedler</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>EMBO J</source>
            <pubdate>1985</pubdate>
            <volume>4</volume>
            <fpage>579</fpage>
            <lpage>583</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">554228</pubid>
                  <pubid idtype="pmpid">15926216</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Excision of <it>En/Spm </it>transposable element of <it>Zea Mays </it>requires two element-encoded proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Frey</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Reinicke</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Grant</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Saedler</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Gierl</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>EMBO J</source>
            <pubdate>1990</pubdate>
            <volume>9</volume>
            <fpage>4037</fpage>
            <lpage>4044</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">552176</pubid>
                  <pubid idtype="pmpid">2174354</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>The <it>tnpA </it>and <it>tnpD </it>gene products of the <it>Spm </it>element are required for transposition in tobacco</p>
            </title>
            <aug>
               <au>
                  <snm>Masson</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Strem</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Fedoroff</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>Plant Cell</source>
            <pubdate>1991</pubdate>
            <volume>3</volume>
            <fpage>73</fpage>
            <lpage>85</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1105/tpc.3.1.73</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>The transposable element <it>Tam1 </it>from <it>Antirrhinum majus </it>shows structural homology to the maize transposon <it>En/Spm </it>and has no sequence specificity of insertion</p>
            </title>
            <aug>
               <au>
                  <snm>Nacken</snm>
                  <fnm>WKF</fnm>
               </au>
               <au>
                  <snm>Piotrowiak</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Saedler</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Sommer</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Mol Gen Genet</source>
            <pubdate>1991</pubdate>
            <volume>228</volume>
            <fpage>201</fpage>
            <lpage>208</lpage>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Transposable elements of <it>Antirrhinum majus</it></p>
            </title>
            <aug>
               <au>
                  <snm>Sommer</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Hehl</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Krebbers</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Piotrowiak</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Lonnig</snm>
                  <fnm>W-E</fnm>
               </au>
               <au>
                  <snm>Saedler</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Proc Internatl Symp Plant Transposable Elements</source>
            <publisher>New York: Plenum</publisher>
            <editor>Nelson O</editor>
            <pubdate>1988</pubdate>
            <fpage>227</fpage>
            <lpage>36</lpage>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Organization of the <it>Tgm </it>family of transposable elements in soybean</p>
            </title>
            <aug>
               <au>
                  <snm>Rhodes</snm>
                  <fnm>PR</fnm>
               </au>
               <au>
                  <snm>Vodkin</snm>
                  <fnm>LO</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>1988</pubdate>
            <volume>120</volume>
            <fpage>597</fpage>
            <lpage>604</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1203536</pubid>
                  <pubid idtype="pmpid" link="fulltext">2848748</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Isolation of a <it>Suppressor-Mutator/Enhancer</it>-like transposable element, <it>Tnp </it>1, from Japanese morning glory bearing variegated flowers</p>
            </title>
            <aug>
               <au>
                  <snm>Inagaki</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Hisatomi</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Suzuki</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Kasahara</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Iida</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Plant Cell</source>
            <pubdate>1994</pubdate>
            <volume>6</volume>
            <fpage>375</fpage>
            <lpage>383</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">160440</pubid>
                  <pubid idtype="pmpid" link="fulltext">8180498</pubid>
                  <pubid idtype="doi">10.1105/tpc.6.3.375</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Somatic variation during long term subculturing of plant cells caused by insertion of a transposable element in a phenylalanine ammonia-lyase (PAL) gene</p>
            </title>
            <aug>
               <au>
                  <snm>Ozeki</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Davies</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Takeda</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Mol Gen Genet</source>
            <pubdate>1997</pubdate>
            <volume>254</volume>
            <fpage>407</fpage>
            <lpage>416</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s004380050433</pubid>
                  <pubid idtype="pmpid">9180694</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p><it>PsI</it>: a novel <it>Spm</it>-like transposable element from <it>Petunia hybrida</it></p>
            </title>
            <aug>
               <au>
                  <snm>Snowden</snm>
                  <fnm>KC</fnm>
               </au>
               <au>
                  <snm>Napoli</snm>
                  <fnm>CA</fnm>
               </au>
            </aug>
            <source>Plant J</source>
            <pubdate>1998</pubdate>
            <volume>14</volume>
            <fpage>43</fpage>
            <lpage>54</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1365-313X.1998.00098.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">9681025</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Molecular characterization of a mutable pigmentation phenotype and isolation of the first active transposable element from <it>Sorghum bicolor</it></p>
            </title>
            <aug>
               <au>
                  <snm>Chopra</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Brendel</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Axtell</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Peterson</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci</source>
            <pubdate>1999</pubdate>
            <volume>96</volume>
            <fpage>15330</fpage>
            <lpage>15335</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">24819</pubid>
                  <pubid idtype="pmpid" link="fulltext">10611384</pubid>
                  <pubid idtype="doi">10.1073/pnas.96.26.15330</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Mobilization of transposons by a mutation abolishing full DNA methylation in <it>Arabidopsis</it></p>
            </title>
            <aug>
               <au>
                  <snm>Miura</snm>
                  <fnm>A</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>2001</pubdate>
            <volume>411</volume>
            <fpage>212</fpage>
            <lpage>214</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35075612</pubid>
                  <pubid idtype="pmpid" link="fulltext">11346800</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>A high-copy-number CACTA family transposon in temperate grasses and cereals</p>
            </title>
            <aug>
               <au>
                  <snm>Langdon</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Jenkins</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Hasterok</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Jones</snm>
                  <fnm>RN</fnm>
               </au>
               <au>
                  <snm>King</snm>
                  <fnm>IP</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>2003</pubdate>
            <volume>163</volume>
            <fpage>1097</fpage>
            <lpage>1108</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1462479</pubid>
                  <pubid idtype="pmpid" link="fulltext">12663547</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>ROSINA (RSI) is part of a CACTA transposable element, <it>TamRSI</it>, and links flower development to transposon activity</p>
            </title>
            <aug>
               <au>
                  <snm>Roccaro</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Sommer</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Saedler</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Mol Genet Genomics</source>
            <pubdate>2007</pubdate>
            <volume>278</volume>
            <fpage>243</fpage>
            <lpage>254</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s00438-007-0245-x</pubid>
                  <pubid idtype="pmpid" link="fulltext">17588178</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>A lectin gene insertion has the structural features of a transposable element</p>
            </title>
            <aug>
               <au>
                  <snm>Vodkin</snm>
                  <fnm>LO</fnm>
               </au>
               <au>
                  <snm>Rhodes</snm>
                  <fnm>PR</fnm>
               </au>
               <au>
                  <snm>Goldberg</snm>
                  <fnm>RB</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>1983</pubdate>
            <volume>34</volume>
            <fpage>1023</fpage>
            <lpage>1031</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0092-8674(83)90560-3</pubid>
                  <pubid idtype="pmpid" link="fulltext">6313203</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Highly structured sequence homology between an insertion element and the gene in which it resides</p>
            </title>
            <aug>
               <au>
                  <snm>Rhodes</snm>
                  <fnm>PR</fnm>
               </au>
               <au>
                  <snm>Vodkin</snm>
                  <fnm>LO</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci</source>
            <pubdate>1985</pubdate>
            <volume>82</volume>
            <fpage>493</fpage>
            <lpage>497</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">397065</pubid>
                  <pubid idtype="pmpid" link="fulltext">16593538</pubid>
                  <pubid idtype="doi">10.1073/pnas.82.2.493</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>The <it>wp </it>mutation of <it>Glycine max </it>carries a gene-fragment-rich transposon of the CACTA superfamily</p>
            </title>
            <aug>
               <au>
                  <snm>Zabala</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Vodkin</snm>
                  <fnm>LO</fnm>
               </au>
            </aug>
            <source>Plant Cell</source>
            <pubdate>2005</pubdate>
            <volume>17</volume>
            <fpage>2619</fpage>
            <lpage>2632</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1242261</pubid>
                  <pubid idtype="pmpid" link="fulltext">16141454</pubid>
                  <pubid idtype="doi">10.1105/tpc.105.033506</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Cloning of the pleiotropic <it>T </it>locus in soybean and two recessive alleles that differentially affect structure and expression of the encoded flavonoid 3' hydroxylase</p>
            </title>
            <aug>
               <au>
                  <snm>Zabala</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Vodkin</snm>
                  <fnm>LO</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>2003</pubdate>
            <volume>163</volume>
            <fpage>295</fpage>
            <lpage>309</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1462420</pubid>
                  <pubid idtype="pmpid" link="fulltext">12586717</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>The rice annotation project database (RAP-DB): hub for <it>Oryza sativa </it>ssp. <it>japonica </it>genome information</p>
            </title>
            <aug>
               <au>
                  <snm>Ohyanagi</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Tanaka</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Sakai</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Shigemoto</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Yamaguchi</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Habara</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Fujii</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Antonio</snm>
                  <fnm>BA</fnm>
               </au>
               <au>
                  <snm>Nagamura</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Imanishi</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Ikeo</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Itoh</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Gojobori</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Sasaki</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <volume>34</volume>
            <fpage>D741</fpage>
            <lpage>D744</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1347456</pubid>
                  <pubid idtype="pmpid" link="fulltext">16381971</pubid>
                  <pubid idtype="doi">10.1093/nar/gkj094</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Multiple sequence alignment with hierarchical clustering</p>
            </title>
            <aug>
               <au>
                  <snm>Corpet</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1988</pubdate>
            <volume>16</volume>
            <issue>22</issue>
            <fpage>10881</fpage>
            <lpage>10890</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">338945</pubid>
                  <pubid idtype="pmpid" link="fulltext">2849754</pubid>
                  <pubid idtype="doi">10.1093/nar/16.22.10881</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Organization and expression of eucaryotic split genes coding for proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Breathnach</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Chambon</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Annu Rev Biochem</source>
            <pubdate>1981</pubdate>
            <volume>50</volume>
            <fpage>349</fpage>
            <lpage>383</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1146/annurev.bi.50.070181.002025</pubid>
                  <pubid idtype="pmpid" link="fulltext">6791577</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Novel exon combinations generated by alternative splicing of gene fragments mobilized by a CACTA transposon in <it>Glycine max</it></p>
            </title>
            <aug>
               <au>
                  <snm>Zabala</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Vodkin</snm>
                  <fnm>LO</fnm>
               </au>
            </aug>
            <source>BMC Plant Biology</source>
            <pubdate>2007</pubdate>
            <volume>7</volume>
            <fpage>38</fpage>
            <note>PMCID: PMC1947982</note>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1947982</pubid>
                  <pubid idtype="pmpid" link="fulltext">17629935</pubid>
                  <pubid idtype="doi">10.1186/1471-2229-7-38</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Plant transposable elements and gene tagging</p>
            </title>
            <aug>
               <au>
                  <snm>Gierl</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Saedler</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Plant Mol Biol</source>
            <pubdate>1992</pubdate>
            <volume>19</volume>
            <fpage>39</fpage>
            <lpage>49</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/BF00015605</pubid>
                  <pubid idtype="pmpid">1318114</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Maize transposable elements</p>
            </title>
            <aug>
               <au>
                  <snm>Gierl</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Saedler</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Peterson</snm>
                  <fnm>PA</fnm>
               </au>
            </aug>
            <source>Annu Rev Genet</source>
            <pubdate>1989</pubdate>
            <volume>23</volume>
            <fpage>71</fpage>
            <lpage>85</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1146/annurev.ge.23.120189.000443</pubid>
                  <pubid idtype="pmpid" link="fulltext">2559653</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>AT-AC Pre-mRNA Splicing Mechanisms and Conservation of Minor Introns in Voltage-Gated Ion Channel Genes</p>
            </title>
            <aug>
               <au>
                  <snm>Wu</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Krainer</snm>
                  <fnm>AR</fnm>
               </au>
            </aug>
            <source>Mol Cell Biol</source>
            <pubdate>1999</pubdate>
            <volume>19</volume>
            <issue>5</issue>
            <fpage>3225</fpage>
            <lpage>3236</lpage>
         </bibl>
         <bibl id="B34">
            <title>
               <p>Binding of the Drosophila sex-lethal gene product to the alternative splice site of transformer primary transcript</p>
            </title>
            <aug>
               <au>
                  <snm>Inoue</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Hoshijima</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Sakamoto</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Shimura</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>1990</pubdate>
            <volume>344</volume>
            <fpage>461</fpage>
            <lpage>463</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1038/344461a0</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Retrotransposons of rice involved in mutations induced by tissue culture</p>
            </title>
            <aug>
               <au>
                  <snm>Hirochika</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Sugimoto</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Otsuki</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Tsugawa</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Kanda</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci</source>
            <pubdate>1996</pubdate>
            <volume>93</volume>
            <fpage>7783</fpage>
            <lpage>7788</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">38825</pubid>
                  <pubid idtype="pmpid" link="fulltext">8755553</pubid>
                  <pubid idtype="doi">10.1073/pnas.93.15.7783</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>A simple method for extraction of RNA from maize tissue</p>
            </title>
            <aug>
               <au>
                  <snm>McCarty</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Maize Genet Coop Newsl</source>
            <pubdate>1986</pubdate>
            <volume>60</volume>
            <fpage>61</fpage>
         </bibl>
         <bibl id="B37">
            <title>
               <p>Chalcone synthase mRNA and activity are reduced in yellow soybean seed coats with dominant I alleles</p>
            </title>
            <aug>
               <au>
                  <snm>Wang</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Todd</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Vodkin</snm>
                  <fnm>LO</fnm>
               </au>
            </aug>
            <source>Plant Physiol</source>
            <pubdate>1994</pubdate>
            <volume>105</volume>
            <fpage>739</fpage>
            <lpage>748</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">159416</pubid>
                  <pubid idtype="pmpid" link="fulltext">8066134</pubid>
                  <pubid idtype="doi">10.1104/pp.105.2.739</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>Molecular cloning: A laboratory manual</p>
            </title>
            <aug>
               <au>
                  <snm>Sambrook</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Fritsch</snm>
                  <fnm>EF</fnm>
               </au>
               <au>
                  <snm>Maniatis</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <publisher>Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY</publisher>
            <edition>2</edition>
            <pubdate>1989</pubdate>
         </bibl>
         <bibl id="B39">
            <title>
               <p>Duplications that suppress and deletions that restore expression from a chalcone synthase multigene family</p>
            </title>
            <aug>
               <au>
                  <snm>Todd</snm>
                  <fnm>JJ</fnm>
               </au>
               <au>
                  <snm>Vodkin</snm>
                  <fnm>LO</fnm>
               </au>
            </aug>
            <source>Plant Cell</source>
            <pubdate>1996</pubdate>
            <volume>8</volume>
            <fpage>687</fpage>
            <lpage>699</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">161129</pubid>
                  <pubid idtype="pmpid" link="fulltext">12239396</pubid>
                  <pubid idtype="doi">10.1105/tpc.8.4.687</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>A technique for radiolabeling DNA restriction fragments to high specific activity</p>
            </title>
            <aug>
               <au>
                  <snm>Feinberg</snm>
                  <fnm>AP</fnm>
               </au>
               <au>
                  <snm>Vogelstein</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Anal Biochem</source>
            <pubdate>1983</pubdate>
            <volume>132</volume>
            <fpage>6</fpage>
            <lpage>13</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0003-2697(83)90418-9</pubid>
                  <pubid idtype="pmpid" link="fulltext">6312838</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <title>
               <p>Base-calling of automated sequencer traces using phred. II. Error probabilities</p>
            </title>
            <aug>
               <au>
                  <snm>Ewing</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Green</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>1998</pubdate>
            <volume>8</volume>
            <fpage>186</fpage>
            <lpage>194</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9521922</pubid>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
