<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2164-9-282</ui>
   <ji>1471-2164</ji>
   <fm>
      <dochead>Research article</dochead>
      <bibl>
         <title>
            <p>The highest-copy repeats are methylated in the small genome of the early divergent vascular plant <it>Selaginella moellendorffii</it></p>
         </title>
         <aug>
            <au id="A1">
               <snm>Chan</snm>
               <mi>P</mi>
               <fnm>Agnes</fnm>
               <insr iid="I1"/>
               <email>achan@jcvi.org</email>
            </au>
            <au id="A2">
               <snm>Melake-Berhan</snm>
               <fnm>Admasu</fnm>
               <insr iid="I1"/>
               <email>amelake@jcvi.org</email>
            </au>
            <au id="A3">
               <snm>O'Brien</snm>
               <fnm>Kimberly</fnm>
               <insr iid="I1"/>
               <insr iid="I3"/>
               <email>kobrien@som.umaryland.edu</email>
            </au>
            <au id="A4">
               <snm>Buckley</snm>
               <fnm>Stephanie</fnm>
               <insr iid="I1"/>
               <insr iid="I4"/>
               <email>sbuckley@som.umaryland.edu</email>
            </au>
            <au id="A5">
               <snm>Quan</snm>
               <fnm>Hui</fnm>
               <insr iid="I1"/>
               <insr iid="I5"/>
               <email>hui.quan@gmail.com</email>
            </au>
            <au id="A6">
               <snm>Chen</snm>
               <fnm>Dan</fnm>
               <insr iid="I1"/>
               <email>danchen@jcvi.org</email>
            </au>
            <au id="A7">
               <snm>Lewis</snm>
               <fnm>Matthew</fnm>
               <insr iid="I1"/>
               <email>MLewis@jcvi.org</email>
            </au>
            <au id="A8">
               <snm>Banks</snm>
               <mnm>Ann</mnm>
               <fnm>Jo</fnm>
               <insr iid="I2"/>
               <email>banksj@purdue.edu</email>
            </au>
            <au id="A9" ca="yes">
               <snm>Rabinowicz</snm>
               <mi>D</mi>
               <fnm>Pablo</fnm>
               <insr iid="I1"/>
               <insr iid="I3"/>
               <insr iid="I6"/>
               <email>prabinowicz@som.umaryland.edu</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>J. Craig Venter Institute (JCVI), Rockville, MD 20850, USA</p>
            </ins>
            <ins id="I2">
               <p>Department of Botany and Plant Pathology, Lilly Hall, Purdue University, West Lafayette, IN 47907, USA</p>
            </ins>
            <ins id="I3">
               <p>Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA</p>
            </ins>
            <ins id="I4">
               <p>Program in Oncology, University of Maryland School of Medicine, Baltimore, MD 21201, USA</p>
            </ins>
            <ins id="I5">
               <p>Applied Biosystems, 2130 Woodward St., Austin, TX 78744, USA</p>
            </ins>
            <ins id="I6">
               <p>Department of Biochemistry and Molecular Biology, University of Maryland School of Medicine, Baltimore, MD 21201, USA</p>
            </ins>
         </insg>
         <source>BMC Genomics</source>
         <issn>1471-2164</issn>
         <pubdate>2008</pubdate>
         <volume>9</volume>
         <issue>1</issue>
         <fpage>282</fpage>
         <url>http://www.biomedcentral.com/1471-2164/9/282</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">18549478</pubid>
               <pubid idtype="doi">10.1186/1471-2164-9-282</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>14</day>
               <month>2</month>
               <year>2008</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>12</day>
               <month>6</month>
               <year>2008</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>12</day>
               <month>6</month>
               <year>2008</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2008</year>
         <collab>Chan et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>The lycophyte <it>Selaginella moellendorffii </it>is a vascular plant that diverged from the fern/seed plant lineage at least 400 million years ago. Although genomic information for <it>S. moellendorffii </it>is starting to be produced, little is known about basic aspects of its molecular biology. To provide a first glimpse into the epigenetic landscape of this early divergent vascular plant, we used the methylation filtration technique. Methylation filtration genomic libraries select unmethylated DNA clones due to the presence of the methylation-dependent restriction endonuclease McrBC in the bacterial host.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>We conducted a characterization of the DNA methylation patterns of the <it>S. moellendorffii </it>genome by sequencing a set of <it>S. moellendorffii </it>shotgun genomic clones, along with a set of methylation filtered clones. Chloroplast DNA, which is typically unmethylated, was enriched in the filtered library relative to the shotgun library, showing that there is DNA methylation in the extremely small <it>S. moellendorffii </it>genome. The filtered library also showed enrichment in expressed and gene-like sequences, while the highest-copy repeats were largely under-represented in this library. These results show that genes and repeats are differentially methylated in the <it>S</it>. <it>moellendorffii </it>genome, as occurs in other plants studied.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>Our results shed light on the genome methylation pattern in a member of a relatively unexplored plant lineage. The DNA methylation data reported here will help in understanding the involvement of this epigenetic mark in fundamental biological processes, as well as the evolutionary aspects of epigenetics in land plants.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>DNA methylation has been found throughout the plant kingdom, typically in cytosines, forming part of symmetric (CpNpG and CpG) and asymmetric (CpNpN) sites <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr></abbrgrp>. The proportion of methylated cytosine in plants is variable, ranging from 6% in <it>Arabidopsis </it><abbrgrp><abbr bid="B3">3</abbr></abbrgrp> to 25% in maize <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. DNA methylation has been associated with the inactivation of transposons and silencing of genes <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr></abbrgrp>, and it has also been proposed that the function of DNA methylation is to decrease transcriptional "noise" <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>.</p>
         <p>In plants, most DNA methylation is found in repetitive elements, while genes and other low copy sequences are generally hypomethylated <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr></abbrgrp>.</p>
         <p>Because of the large size of many plant genomes, particularly those of important crops <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>, gene-enriched sequencing strategies have been designed as an alternative to whole genome sequencing in an attempt to capture the so-called gene-space of such genomes. One of these gene-enrichment techniques, called methylation filtration (MF), takes advantage of the difference in methylation between plant genes and repeats <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. MF exploits the methylation-dependent restriction endonuclease McrBC (modified cytosine restriction) from <it>E. coli </it><abbrgrp><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr></abbrgrp>. This enzyme digests DNA in sequences that contain two sites, each one consisting of a purine and a cytosine methylated at carbon 5, separated by 40&#8211;3000 bp <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>. Therefore, when an <it>mcrBC</it><sup>+ </sup><it>E. coli </it>strain is used as the host to construct a genomic shotgun library, heavily methylated repetitive DNA is efficiently counter-selected, while hypomethylated low-copy (<it>i.e</it>. genic) sequences are substantially over-represented. MF was first tested in maize, where it yielded a 6-fold enrichment for genes relative to a whole genome shotgun (WGS) library used as a control <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. Subsequently, MF was applied at large scale in maize <abbrgrp><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr></abbrgrp> and in sorghum <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>, showing that approximately 95% of the genes in each genome were tagged (A. Chan <it>et al.</it>, unpublished) and that most genes and regulatory elements are unmethylated in these two species. These results led to the suggestion that gene-enrichment and traditional genome sequencing techniques could be combined to efficiently sequence large plant genomes <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>.
Further analyses of the large-scale MF data in maize and sorghum also provided insights into the biology of transposable element methylation and activity <abbrgrp><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr><abbr bid="B25">25</abbr></abbrgrp>. Pilot MF studies of several monocot, dicot, and non-angiosperm plants (such as pine, fern, and moss) were also conducted <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. These analyses determined that MF enriches for genes in all plants tested, although to different levels, and that it can be an effective approach to selectively clone and sequence genes from some large plant genomes, where the majority of the DNA is composed of methylated repetitive elements.</p>
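         <p>As a rough illustration of the McrBC recognition rule described above, the following Python sketch checks whether a fragment carries two methylated purine-cytosine half-sites at a permissive spacing. It is not part of the original analysis: the function name and the input format are hypothetical, the 40&#8211;3000 bp window is taken from the text and assumed to be inclusive, and real cleavage efficiency depends on factors not modeled here.</p>
         <p><![CDATA[
```python
from itertools import combinations

# Spacing between two half-sites, in bp, as stated in the text
# (treated as inclusive bounds, which is an assumption).
MIN_SPACING = 40
MAX_SPACING = 3000


def is_mcrbc_substrate(half_site_positions):
    """Return True if any two half-sites lie 40 to 3000 bp apart.

    half_site_positions: iterable of 0-based coordinates of methylated
    purine-cytosine half-sites on one fragment (hypothetical input format).
    """
    positions = sorted(set(half_site_positions))
    for left, right in combinations(positions, 2):
        if MIN_SPACING <= right - left <= MAX_SPACING:
            return True
    return False
```
]]></p>
         <p>Under this simplified rule, a fragment with a single methylated half-site, or with half-sites closer than 40 bp, would be expected to survive in an <it>mcrBC</it><sup>+ </sup>host.</p>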
         <p>In this study we performed an MF analysis of the lycophyte <it>Selaginella moellendorffii </it>(family Selaginellaceae), representing a clade not included in previous MF studies. The lycophyte clade diverged from the fern/seed-plant lineage about 400 million years ago <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>.</p>
         <p>The <it>S. moellendorffii </it>sporophyte is diploid and consists of dichotomously branching shoot and root systems. The shoot frequently terminates in arrested buds or bulbils that dehisce and allow clonal propagation. The reproductive structures are the strobili, which form toward the tip of the shoot, each one with either one micro- or megasporangium that produces micro- or megaspores, which in turn germinate and divide mitotically to form either the male or female gametophytes, respectively. The gametophyte produces either motile sperm or egg-forming archegonia. After fertilization of the egg, the new sporophyte remains dependent upon the female gametophyte for a short period of time. <it>S. moellendorffii </it>is an excellent model system to study some developmental processes, such as sporogenesis and gametophyte development, which are difficult to study in angiosperms because their spores and gametophytes are dependent upon and surrounded by sporophytic tissues. Seedless plants provide an excellent opportunity to study the epigenetics of these processes, but little is known about DNA methylation and other epigenetic marks in early vascular plants, except for the presence of heterochromatic bands identified by cytological staining <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>. Ferns have been used in attempts to address the methylation of the haploid and diploid generations <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>, but their genomes are usually large and only specific sequences were analyzed. The extremely small genome of <it>S. moellendorffii </it>(90&#8211;130 Mbp; <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>) and its available 8&#215; coverage, high-quality draft genome assembly generated by the Joint Genome Institute of the U.S. Department of Energy (JGI-DOE), will facilitate the study of <it>S. moellendorffii</it>'s epigenome and its involvement in the alternation of generations.
Due to its small genome size, several transposon families, which are common targets of epigenetic modifications, may be low copy in <it>S. moellendorffii </it>and their sequence and epigenetics can be studied without the complications of high copy numbers, allowing the unequivocal identification of individual transposon loci.</p>
         <p>Sequences from this study have been deposited in NCBI GenBank under the accession numbers [<ext-link ext-link-type="gen" ext-link-id="ET218553">ET218553</ext-link>&#8211;<ext-link ext-link-type="gen" ext-link-id="ET221769">ET221769</ext-link>].</p>
      </sec>
      <sec>
         <st>
            <p>Results and Discussion</p>
         </st>
         <sec>
            <st>
               <p>Sequence data and chloroplast content</p>
            </st>
            <p>We constructed MF and WGS libraries from <it>S. moellendorffii </it>and produced 1,621 and 1,598 high-quality paired sequence reads, respectively, each set representing approximately 1% of the genome. We did not expect a substantial difference in the proportion of gene-like sequences in the MF library relative to the WGS library in the small genome of <it>S. moellendorffii </it>because previous studies showed that in the ~400 Mbp genomes of rice and <it>Ceratodon purpureus </it>(the smallest genomes in which MF has been tested) the gene enrichment factors (GEF, calculated as the ratio between the MF and WGS proportion of non-repetitive, gene-like sequences) were 1.9 and 2.5, respectively <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. As the chloroplast genome is typically non-methylated, we prepared total <it>S. moellendorffii </it>DNA to construct the MF and WGS libraries in order to retain the chloroplast DNA in both libraries and used it to verify that methylated sequences exist in <it>S. moellendorffii </it>and are counter-selected by MF. High-stringency alignments against the <it>Selaginella uncinata </it>chloroplast genome <abbrgrp><abbr bid="B32">32</abbr></abbrgrp> identified 14.9% and 7.8% chloroplast DNA sequences in the MF and WGS datasets, respectively (Figure <figr fid="F1">1A</figr>), thus demonstrating that the <it>S. moellendorffii </it>genome is methylated and that MF selects for non-methylated sequences as expected.</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Proportion of repetitive and low-copy sequences in the MF and WGS libraries</p>
               </caption>
               <text>
                  <p><b>Proportion of repetitive and low-copy sequences in the MF and WGS libraries</b>. <b>A: </b>The proportions of all low-copy sequences (transcribed, gene-like and anonymous) are shown together (LC). The proportion of all repeats (All Rpts) and its breakdown into known (Known Rpts) and <it>de novo </it>repeats (de novo Rpts) are shown separately. The percentage of chloroplast (Chlor) sequences is calculated relative to the total number of sequences in each library. All the other percentages are calculated relative to the total of non-chloroplast reads in each library. <b>B: </b>Percentages of MF and WGS reads matching the reference genome, classified by the number of hits. Any read showing 20 or more hits in the reference genome is considered a <it>de novo </it>identified repeat.</p>
               </text>
               <graphic file="1471-2164-9-282-1"/>
            </fig>
            <p>The chloroplast reads identified in this way were not analyzed further and, therefore, a total of 1,379 MF and 1,471 WGS non-chloroplast reads were used in the following analyses.</p>
            <p>Overall, the C+G content is slightly higher in MF than in WGS data (47.9% vs. 46.2%, respectively), probably due to the higher C+G content of gene sequences, which are predominant in the MF set (Table <tblr tid="T1">1</tblr>).</p>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>C+G content in different sequence classes</p>
               </caption>
               <tblbdy cols="5">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>%C+G total</p>
                     </c>
                     <c ca="center">
                        <p>%C+G in repeats</p>
                     </c>
                     <c ca="center">
                        <p>% C+G in low-copy DNA</p>
                     </c>
                     <c ca="center">
                        <p>% C+G in genes and EST hits</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>MF</p>
                     </c>
                     <c ca="center">
                        <p>47.9</p>
                     </c>
                     <c ca="center">
                        <p>50.9</p>
                     </c>
                     <c ca="center">
                        <p>47.8</p>
                     </c>
                     <c ca="center">
                        <p>49.7</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>WGS</p>
                     </c>
                     <c ca="center">
                        <p>46.2</p>
                     </c>
                     <c ca="center">
                        <p>47.5</p>
                     </c>
                     <c ca="center">
                        <p>45.8</p>
                     </c>
                     <c ca="center">
                        <p>49.6</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <p>Approximately 13% of the sequences could not be aligned to the reference genome assembly at the stringency used in this study. This discrepancy may be due to the exclusion of sequence assemblies shorter than 1 kbp from the reference genome sequence.</p>
         </sec>
         <sec>
            <st>
               <p>Repetitive Sequences</p>
            </st>
            <p>To identify repetitive sequences we used nucleotide and amino acid databases of plant repetitive sequences <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. Consistent with the notion that repetitive elements are methylated in plants, only 2.9% of the MF sequences had a match in either of the repeat databases, while matches in the WGS set reached 8.1% (Figure <figr fid="F1">1A</figr>). As sequences from early vascular plants are underrepresented in available databases, it is possible that many <it>S. moellendorffii </it>repetitive elements will not be identified by comparison with previously annotated plant repeats. To better estimate the repeat content of each dataset, we attempted to identify repeats <it>de novo </it>by aligning our MF and WGS reads to the draft genome assembly produced by JGI-DOE. Any sequence that had 20 or more high-stringency matches in the reference genome was considered repetitive (Figure <figr fid="F1">1B</figr> and Additional files <supplr sid="S1">1</supplr> and <supplr sid="S2">2</supplr>). A 10-fold reduction in the percentage of these <it>de novo </it>repeats was observed in the MF vs. the WGS data set, showing that the highest-copy elements (largest number of hits in the reference genome) are methylated in <it>S. moellendorffii </it>(Figure <figr fid="F1">1A</figr>). Taken together, the known and <it>de novo </it>repeats account for 3.1% and 20.4% of the MF and WGS reads, respectively (Figure <figr fid="F1">1A</figr>). Among the repetitive MF reads, most were identified as known repeats by database searches, and nearly half were also identified <it>de novo </it>(Figure <figr fid="F2">2</figr>), although no MF repeat shows more than 42 copies in the <it>S. moellendorffii </it>reference genome sequence, and many are ribosomal RNA sequences (see Additional file <supplr sid="S1">1</supplr>). Interestingly, all MF sequences matching known transposable elements are low-copy in <it>S. moellendorffii </it>(<it>i.e. 
</it>have fewer than 20 hits in the genome; Additional file <supplr sid="S1">1</supplr>). In contrast, over 60% of the WGS repeats were identified <it>de novo </it>and do not have a database match, while among those that have similarity to known repeats, nearly 1/3 are high-copy (Figure <figr fid="F2">2</figr>). Furthermore, known WGS repeats show a maximum of 234 hits in the genome, but 1/3 of the WGS repeats have more than 234 copies, the highest having over 500 (see Additional file <supplr sid="S2">2</supplr>). The prevalence of low-copy repeats detected by MF suggests that low-copy transposons are unmethylated and, therefore, potentially active <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B33">33</abbr><abbr bid="B34">34</abbr></abbrgrp>. The observed substantial number of WGS unknown high-copy elements highlights the diversity of transposable elements throughout the plant kingdom.</p>
            <suppl id="S1">
               <title>
                  <p>Additional file 1</p>
               </title>
               <text>
                  <p><b>Complete MF analysis results</b>. An excel file listing BLAST hits of all MF sequences against all databases used, classified as "repeats", "low-copy sequences", and "chloroplast sequences".</p>
               </text>
               <file name="1471-2164-9-282-S1.xls">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S2">
               <title>
                  <p>Additional file 2</p>
               </title>
               <text>
                  <p><b>Complete WGS analysis results</b>. An excel file listing BLAST hits of all WGS sequences against all databases used, classified as "repeats", "low-copy sequences", and "chloroplast sequences".</p>
               </text>
               <file name="1471-2164-9-282-S2.xls">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Proportions of repetitive sequences</p>
               </caption>
               <text>
                  <p><b>Proportions of repetitive sequences</b>. Repeats are classified as matching the repeat databases (Known), identified <it>de novo</it>, matching the repeat databases but not identified <it>de novo </it>(Known only), identified <it>de novo </it>(de novo) but with no match in the repeat databases (de novo only), or that were identified <it>de novo </it>and also have a match in the repeat databases (de novo &amp; known). Percentages are calculated relative to the total number of repetitive sequences.</p>
               </text>
               <graphic file="1471-2164-9-282-2"/>
            </fig>
            <p>Sequence composition analysis of the repetitive sequences showed that MF repeats are richer in C+G than those in the WGS set, probably due to the abundance of conserved, non-methylated ribosomal RNA sequences among the MF repeats (Table <tblr tid="T1">1</tblr>).</p>
         </sec>
         <sec>
            <st>
               <p>Gene sequences, expressed sequences and gene enrichment</p>
            </st>
            <p>Using BLASTX, the MF and WGS non-repetitive sequences were compared to a partially curated, non-identical amino acid sequence database (NIAA) maintained at JCVI, containing most proteins available from GenBank. The percentages of BLASTX matches against this database were 35% and 22% in MF and WGS sequences, respectively (Figure <figr fid="F3">3</figr>), representing a 1.6-fold enrichment in MF relative to WGS sequences, indicating that protein-encoding genes are frequently hypomethylated. Therefore, MF enriches for genes even in the minute genome of <it>S. moellendorffii</it>. We also performed high stringency alignments of our sequences to the <it>S. moellendorffii </it>assembled ESTs <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>, which showed that MF enriches for transcribed sequences to a similar level as it does for protein sequences, suggesting hypomethylation of expressed sequences. Combining the protein and transcribed sequences alignments, 49% of the MF and 31% of the WGS sequences matched either database, while sequences with no database match represented 48% of each dataset (Figure <figr fid="F3">3</figr>).</p>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Gene and transcribed sequence content</p>
               </caption>
               <text>
                  <p><b>Gene and transcribed sequence content</b>. Percentages of matches to the database of curated genes, matches to the NIAA protein database, matches to the <it>S. moellendorffii </it>EST assemblies (ESTs), the combination of matches to NIAA protein and EST assemblies databases, and the anonymous low-copy sequences are shown.</p>
               </text>
               <graphic file="1471-2164-9-282-3"/>
            </fig>
            <p>In order to estimate the level of gene enrichment achieved with MF in <it>S. moellendorffii </it>in comparison with previous studies done in other plants <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>, all sequences that were not identified as repeats or chloroplast were compared to the same curated database of known gene sequences used in those studies. In this way, 12.8% of the WGS sequences and 21.5% of the MF sequences had a match in the known gene database, resulting in a GEF of 1.7 (Figure <figr fid="F3">3</figr>).</p>
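            <p>The GEF reported above follows directly from the two percentages. The short Python sketch below reproduces the arithmetic; the function name is illustrative and is not taken from the original analysis pipeline.</p>
            <p><![CDATA[
```python
def gene_enrichment_factor(mf_percent, wgs_percent):
    """Ratio of the MF to the WGS percentage of gene-like sequences."""
    if wgs_percent <= 0:
        raise ValueError("WGS percentage must be positive")
    return mf_percent / wgs_percent


# Percentages of matches to the curated known-gene database, from the text.
gef = gene_enrichment_factor(21.5, 12.8)
print(round(gef, 1))  # prints 1.7
```
]]></p>
            <p>Applied to the BLASTX matches against the NIAA database (35% and 22%), the same ratio rounds to the 1.6-fold enrichment quoted for protein-encoding genes.</p>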
            <p>We attempted to confirm that the exclusion of high-copy sequences in the MF library was due to DNA methylation using an independent assay. To do this, we digested <it>Selaginella </it>genomic DNA with the methylation-sensitive restriction enzyme <it>Hpa</it>II, whose restriction target site (CCGG) includes the frequently methylated CpG motif. We then selected 5 of the highest-copy sequences in the WGS library with no match in the databases (<it>de novo </it>highest-copy repeats), as well as 5 low-copy MF sequences showing similarity to ESTs and known genes. We designed polymerase chain reaction (PCR) primer pairs so that the expected amplification product would include at least one <it>Hpa</it>II site, and carried out PCR reactions with each primer pair using <it>Hpa</it>II-digested and undigested DNA as template. The results in Figure <figr fid="F4">4A</figr> show that the amount of amplification product obtained with <it>Hpa</it>II-digested template was substantially reduced relative to the undigested control in the 5 low-copy MF sequences, indicating that these sequences are not methylated in the genome, allowing digestion by <it>Hpa</it>II and cleavage of the target template sequence. On the other hand, no difference in amplification was observed between the <it>Hpa</it>II-digested and undigested high-copy templates. As each of these PCR products corresponds to a mixture of sequences from multiple repeated loci, it is possible that some copies of these repeats show variation with respect to the presence of <it>Hpa</it>II recognition sites, due to polymorphisms relative to the sequence used for PCR primer design. Thus, the absence of digestion may reflect a lack of <it>Hpa</it>II sites or the presence of methylated <it>Hpa</it>II sites. To test whether <it>Hpa</it>II sites were present in the high-copy WGS PCR products amplified from <it>Hpa</it>II-digested DNA, we treated these 5 PCR products with <it>Hpa</it>II.
We observed digestion in all PCR products, indicating that <it>Hpa</it>II sites were present and thus methylated in the genomic DNA (Figure <figr fid="F4">4B</figr>). Nevertheless, we also observed a low proportion of undigested PCR product in addition to the expected digestion products, as well as additional bands, suggesting that multiple polymorphic copies of each repeat were amplified in all cases.</p>
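            <p>The primer-design criterion used in this assay (each expected amplicon must contain at least one <it>Hpa</it>II site) can be checked with a simple sequence scan. The Python sketch below is illustrative only: the function name and the example sequence are hypothetical, and it tests sequence content, not methylation status.</p>
            <p><![CDATA[
```python
HPAII_SITE = "CCGG"  # HpaII recognition sequence given in the text


def hpaii_site_positions(amplicon):
    """Return 0-based start positions of every CCGG in the amplicon.

    CCGG is palindromic, so scanning one strand covers both.
    """
    sequence = amplicon.upper()
    positions = []
    start = sequence.find(HPAII_SITE)
    while start != -1:
        positions.append(start)
        start = sequence.find(HPAII_SITE, start + 1)
    return positions


# Made-up example sequence, not a sequence from this study.
print(hpaii_site_positions("atCCGGttccggCCGG"))  # prints [2, 8, 12]
```
]]></p>
            <p>An amplicon for which this list is empty would not be informative in the digestion assay, because <it>Hpa</it>II could not cut the template regardless of methylation.</p>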
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p><it>Hpa</it>II-digestion and PCR amplification of low- and high-copy sequences</p>
               </caption>
               <text>
                  <p><b><it>Hpa</it>II-digestion and PCR amplification of low- and high-copy sequences</b>. <b>A: </b>PCR products run on agarose gels are shown. Five highly repeated WGS sequences (left panels) and 5 MF sequences (right panels) were amplified from <it>Hpa</it>II-digested (bottom panels) or undigested (top panels) <it>S. moellendorffii </it>genomic DNA. <b>B: </b><it>Hpa</it>II digestion of the 5 PCR products obtained with <it>Hpa</it>II-digested genomic DNA from panel A. From left to right, digested and undigested PCR products (in the same order as panel A, bottom left).</p>
               </text>
               <graphic file="1471-2164-9-282-4"/>
            </fig>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>Our results show that even in the small genome of <it>S. moellendorffii</it>, MF sequences display much lower repeat content than WGS sequences, and that each of the identified MF repeats has no more than 42 copies in the genome. If the MF repeat sequences are aligned to the reference genome at higher stringency, the number of hits for each repeat decreases, indicating that polymorphisms can be found inside families of repetitive elements (data not shown). Therefore, by sequencing the hypomethylated fraction of the <it>S. moellendorffii </it>genome using MF it would be possible to identify which copies of these repetitive elements are methylated. MF of the <it>S. moellendorffii </it>genome can be used to obtain information on gene methylation as well, as has been shown in <it>Arabidopsis</it>, where a fraction of the genes do contain cytosine methylation (although at a lower level than repeats and pseudogenes) and this methylation is predominant in particular regions of the genes <abbrgrp><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr></abbrgrp>. Consequently, a genome-wide DNA methylation profile can be generated by comprehensive MF sequencing of this genome. Furthermore, combining MF with ultra-high throughput next-generation sequencing techniques will facilitate this kind of analysis using the sequenced genome as a reference. As the variety of <it>S. moellendorffii </it>whose genome was sequenced by JGI-DOE has two distinct haplotypes that differ in nucleotide sequence by ~2&#8211;5% (J. Banks, unpublished), it will be possible to determine if there is haplotype-specific DNA methylation using MF sequencing. Genome-wide epigenetic studies of early-diverging land plants will provide the foundation to broaden our understanding of the evolution of epigenetic regulation of developmental processes in plant biology.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <p>Total DNA was purified using DNeasy kits (Qiagen, CA) from green tissues of <it>S. moellendorffii </it>plants kept in a growth chamber. The DNA was mechanically sheared using a Hydroshear device (Genomic Solutions, MI) and fragments ranging from 3 to 4 kb were eluted from an agarose gel after electrophoresis, end-repaired, and ligated into a cloning vector. DNA ligation reactions were transformed into <it>E. coli </it>DH5&#945; (<it>mcrBC</it><sup>+</sup>) to construct the MF library. The WGS library was constructed by introducing the same ligation reaction into <it>E. coli </it>GC10 (<it>mcrBC</it><sup>-</sup>). Recombinant clones were sequenced using Big Dye Terminator chemistry and ABI 3730xl sequencers (Applied Biosystems, CA), and vector and low-quality sequences were electronically trimmed.</p>
         <p>Chloroplast sequences were identified by BLASTN alignment to the <it>S. uncinata </it>chloroplast genome (GenBank accession <ext-link ext-link-type="gen" ext-link-id="AB197035">AB197035</ext-link>) at high stringency (E value smaller than 10<sup>-56</sup>). The chloroplast sequences were excluded from any further sequence analyses. Protein sequence alignments against the NIAA database were done using BLAT. Alignments with at least 70% similarity over at least 40 amino acids were recorded as matches.</p>
         <p>Alignments to assembled EST sequences were done using BLASTN at high stringency. Matches showing an E value smaller than 10<sup>-56 </sup>were recorded.</p>
         <p><it>De novo </it>repeats were identified by aligning MF and WGS reads to the JGI-DOE <it>S. moellendorffii </it>genome assembly using BLASTN; matches covering at least 50% of the read with at least 95% identity were recorded.</p>
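         <p>The read-level filter described above can be sketched in a few lines of Python. This is a minimal illustration and not the pipeline used in this study: it assumes BLASTN results in the standard 12-column tabular format, and the function name, the read_lengths dictionary and the treatment of both thresholds as minimum values are assumptions made for the example.</p>
         <p>
```python
from collections import Counter


def count_qualifying_hits(blast_lines, read_lengths,
                          min_coverage=0.5, min_identity=95.0):
    """Count, per read, the alignments that pass both thresholds.

    blast_lines  -- iterable of tab-separated BLASTN lines (12 columns:
                    query, subject, percent identity, alignment length,
                    mismatches, gap opens, q.start, q.end, s.start,
                    s.end, E value, bit score)
    read_lengths -- hypothetical dict mapping read id to read length
    """
    counts = Counter()
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        read_id = fields[0]
        identity = float(fields[2])
        q_start, q_end = int(fields[6]), int(fields[7])
        # Fraction of the read spanned by this alignment.
        aligned = abs(q_end - q_start) + 1
        coverage = aligned / read_lengths[read_id]
        if coverage >= min_coverage and identity >= min_identity:
            counts[read_id] += 1
    return counts
```
         </p>
         <p>Under these assumptions, the number of qualifying hits returned for a read approximates the copy number of the corresponding sequence in the assembly, and raising min_identity corresponds to aligning at higher stringency.</p>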
         <p>Alignments to the curated database of known genes were done as previously reported <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>, using BLASTX and recording matches with an E value better than 10<sup>-7</sup>.</p>
         <p>Known repeats were identified using a nucleotide database and a protein database of known repetitive elements described earlier <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. These databases do not contain simple sequence repeats. Repetitive element proteins were identified by searching the protein database of repeats with the same criteria used to identify known genes, while repetitive nucleotide sequences were identified using BLASTN with an E value smaller than 10<sup>-10</sup>.</p>
         <p>DNA digestion with <it>Hpa</it>II was performed following the manufacturer's recommendations. PCR assays were carried out using 50 ng of <it>Hpa</it>II-digested or undigested genomic DNA as template, with an initial denaturation of 3 minutes at 94&#176;C followed by 25 amplification cycles of 30 seconds at 94&#176;C, 30 seconds at 59&#176;C, and 60 seconds at 72&#176;C. A final elongation step of 10 minutes at 72&#176;C followed amplification. Target and primer sequences are shown in Additional file <supplr sid="S3">3</supplr>.</p>
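         <p>The reasoning behind this assay can be illustrated with a short Python sketch. <it>Hpa</it>II recognizes CCGG and does not cut when the site is methylated, so a product is expected from digested template only if every <it>Hpa</it>II site between the primers is protected. The function names and the representation of methylated sites as a set of positions are hypothetical and serve only to make the logic explicit.</p>
         <p>
```python
def find_hpaii_sites(sequence):
    """Return the 0-based start positions of every CCGG site."""
    seq = sequence.upper()
    sites = []
    start = seq.find("CCGG")
    while start != -1:
        sites.append(start)
        start = seq.find("CCGG", start + 1)
    return sites


def product_expected_after_digestion(sequence, methylated_sites):
    """Predict whether PCR across `sequence` should still yield a product
    after HpaII digestion.

    methylated_sites -- hypothetical set of CCGG start positions assumed
                        to be methylated and therefore protected.
    A single unprotected site is enough to cut the template.
    """
    return all(pos in methylated_sites
               for pos in find_hpaii_sites(sequence))
```
         </p>
         <p>In this simplified model, a target without any CCGG site amplifies regardless of methylation, which is why targets containing <it>Hpa</it>II sites are needed for the assay to be informative.</p>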
         <suppl id="S3">
            <title>
               <p>Additional file 3</p>
            </title>
            <text>
               <p><b>PCR primers and target sequences</b>. A Word document with the selected WGS and MF sequences that were checked by <it>Hpa</it>II digestion and subsequent PCR. Primers are shown as underlined sequence and <it>Hpa</it>II sites are shown in red.</p>
            </text>
            <file name="1471-2164-9-282-S3.doc">
               <p>Click here for file</p>
            </file>
         </suppl>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>APC and HQ performed the sequence analysis; AM-B, DC and SB worked on DNA preparation and library construction; KO'B performed PCR assays; ML was in charge of sequencing; JAB contributed to the intellectual design of the study; PDR conceived the study, participated in the analysis, and drafted the manuscript. All authors have read and approved the final manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>This work was funded by The Institute for Genomic Research (TIGR, now called J. Craig Venter Institute or JCVI, Rockville, MD).</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Sequence specificity of methylation in higher plant DNA</p>
            </title>
            <aug>
               <au>
                  <snm>Gruenbaum</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Naveh-Many</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Cedar</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Razin</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>1981</pubdate>
            <volume>292</volume>
            <fpage>860</fpage>
            <lpage>862</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/292860a0</pubid>
                  <pubid idtype="pmpid">6267477</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Evidence for cytosine methylation of non-symmetrical sequences in transgenic Petunia hybrida</p>
            </title>
            <aug>
               <au>
                  <snm>Meyer</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Niedenhof</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>ten Lohuis</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>EMBO J</source>
            <pubdate>1994</pubdate>
            <volume>13</volume>
            <fpage>2084</fpage>
            <lpage>2088</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">395059</pubid>
                  <pubid idtype="pmpid">8187761</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Meiotically and mitotically stable inheritance of DNA hypomethylation induced by ddm1 mutation of Arabidopsis thaliana</p>
            </title>
            <aug>
               <au>
                  <snm>Kakutani</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Munakata</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Richards</snm>
                  <fnm>EJ</fnm>
               </au>
               <au>
                  <snm>Hirochika</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>1999</pubdate>
            <volume>151</volume>
            <fpage>831</fpage>
            <lpage>838</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1460490</pubid>
                  <pubid idtype="pmpid" link="fulltext">9927473</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Maize chromomethylase Zea methyltransferase2 is required for CpNpG methylation</p>
            </title>
            <aug>
               <au>
                  <snm>Papa</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Springer</snm>
                  <fnm>NM</fnm>
               </au>
               <au>
                  <snm>Muszynski</snm>
                  <fnm>MG</fnm>
               </au>
               <au>
                  <snm>Meeley</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Kaeppler</snm>
                  <fnm>SM</fnm>
               </au>
            </aug>
            <source>Plant Cell</source>
            <pubdate>2001</pubdate>
            <volume>13</volume>
            <fpage>1919</fpage>
            <lpage>1928</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">139128</pubid>
                  <pubid idtype="pmpid" link="fulltext">11487702</pubid>
                  <pubid idtype="doi">10.1105/tpc.13.8.1919</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>DNA methylation patterns and epigenetic memory</p>
            </title>
            <aug>
               <au>
                  <snm>Bird</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Genes Dev</source>
            <pubdate>2002</pubdate>
            <volume>16</volume>
            <fpage>6</fpage>
            <lpage>21</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gad.947102</pubid>
                  <pubid idtype="pmpid" link="fulltext">11782440</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>DNA modification of a maize transposable element correlates with loss of activity</p>
            </title>
            <aug>
               <au>
                  <snm>Chandler</snm>
                  <fnm>VL</fnm>
               </au>
               <au>
                  <snm>Walbot</snm>
                  <fnm>V</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1986</pubdate>
            <volume>83</volume>
            <fpage>1767</fpage>
            <lpage>1771</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">323165</pubid>
                  <pubid idtype="pmpid" link="fulltext">3006070</pubid>
                  <pubid idtype="doi">10.1073/pnas.83.6.1767</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Eukaryotic DNA methylation as an evolutionary device</p>
            </title>
            <aug>
               <au>
                  <snm>Colot</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Rossignol</snm>
                  <fnm>JL</fnm>
               </au>
            </aug>
            <source>Bioessays</source>
            <pubdate>1999</pubdate>
            <volume>21</volume>
            <fpage>402</fpage>
            <lpage>411</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/(SICI)1521-1878(199905)21:5&lt;402::AID-BIES7>3.0.CO;2-B</pubid>
                  <pubid idtype="pmpid" link="fulltext">10376011</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>DNA methylation and epigenetic inheritance in plants and filamentous fungi</p>
            </title>
            <aug>
               <au>
                  <snm>Martienssen</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Colot</snm>
                  <fnm>V</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2001</pubdate>
            <volume>293</volume>
            <fpage>1070</fpage>
            <lpage>1074</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.293.5532.1070</pubid>
                  <pubid idtype="pmpid" link="fulltext">11498574</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Inactivation of gene expression in plants as a consequence of specific sequence duplication</p>
            </title>
            <aug>
               <au>
                  <snm>Flavell</snm>
                  <fnm>RB</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1994</pubdate>
            <volume>91</volume>
            <fpage>3490</fpage>
            <lpage>3496</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">43606</pubid>
                  <pubid idtype="pmpid" link="fulltext">8170935</pubid>
                  <pubid idtype="doi">10.1073/pnas.91.9.3490</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Transposons, DNA methylation and gene control</p>
            </title>
            <aug>
               <au>
                  <snm>Martienssen</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>1998</pubdate>
            <volume>14</volume>
            <fpage>263</fpage>
            <lpage>264</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(98)01518-2</pubid>
                  <pubid idtype="pmpid" link="fulltext">9676527</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Gene number, noise reduction and biological complexity</p>
            </title>
            <aug>
               <au>
                  <snm>Bird</snm>
                  <fnm>AP</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>1995</pubdate>
            <volume>11</volume>
            <fpage>94</fpage>
            <lpage>100</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(00)89009-5</pubid>
                  <pubid idtype="pmpid" link="fulltext">7732579</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Active maize genes are unmodified and flanked by diverse classes of modified, highly repetitive DNA</p>
            </title>
            <aug>
               <au>
                  <snm>Bennetzen</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Schrick</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Springer</snm>
                  <fnm>PS</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>WE</fnm>
               </au>
               <au>
                  <snm>SanMiguel</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Genome</source>
            <pubdate>1994</pubdate>
            <volume>37</volume>
            <fpage>565</fpage>
            <lpage>576</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1139/g94-081</pubid>
                  <pubid idtype="pmpid">7958822</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Genes and transposons are differentially methylated in plants, but not in mammals</p>
            </title>
            <aug>
               <au>
                  <snm>Rabinowicz</snm>
                  <fnm>PD</fnm>
               </au>
               <au>
                  <snm>Palmer</snm>
                  <fnm>LE</fnm>
               </au>
               <au>
                  <snm>May</snm>
                  <fnm>BP</fnm>
               </au>
               <au>
                  <snm>Hemann</snm>
                  <fnm>MT</fnm>
               </au>
               <au>
                  <snm>Lowe</snm>
                  <fnm>SW</fnm>
               </au>
               <au>
                  <snm>McCombie</snm>
                  <fnm>WR</fnm>
               </au>
               <au>
                  <snm>Martienssen</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2003</pubdate>
            <volume>13</volume>
            <fpage>2658</fpage>
            <lpage>2664</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">403807</pubid>
                  <pubid idtype="pmpid" link="fulltext">14656970</pubid>
                  <pubid idtype="doi">10.1101/gr.1784803</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Role of transposable elements in heterochromatin and epigenetic control</p>
            </title>
            <aug>
               <au>
                  <snm>Lippman</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Gendrel</snm>
                  <fnm>AV</fnm>
               </au>
               <au>
                  <snm>Black</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Vaughn</snm>
                  <fnm>MW</fnm>
               </au>
               <au>
                  <snm>Dedhia</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>McCombie</snm>
                  <fnm>WR</fnm>
               </au>
               <au>
                  <snm>Lavine</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Mittal</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>May</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Kasschau</snm>
                  <fnm>KD</fnm>
               </au>
               <au>
                  <snm>Carrington</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Doerge</snm>
                  <fnm>RW</fnm>
               </au>
               <au>
                  <snm>Colot</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Martienssen</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2004</pubdate>
            <volume>430</volume>
            <fpage>471</fpage>
            <lpage>476</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature02651</pubid>
                  <pubid idtype="pmpid" link="fulltext">15269773</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Epigenetic Natural Variation in Arabidopsis thaliana</p>
            </title>
            <aug>
               <au>
                  <snm>Vaughn</snm>
                  <fnm>MW</fnm>
               </au>
               <au>
                  <snm>Tanurdzic</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Lippman</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Jiang</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Carrasquillo</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Rabinowicz</snm>
                  <fnm>PD</fnm>
               </au>
               <au>
                  <snm>Dedhia</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>McCombie</snm>
                  <fnm>WR</fnm>
               </au>
               <au>
                  <snm>Agier</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Bulski</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Colot</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Doerge</snm>
                  <fnm>RW</fnm>
               </au>
               <au>
                  <snm>Martienssen</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>PLoS Biol</source>
            <pubdate>2007</pubdate>
            <volume>5</volume>
            <fpage>e174</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1892575</pubid>
                  <pubid idtype="pmpid" link="fulltext">17579518</pubid>
                  <pubid idtype="doi">10.1371/journal.pbio.0050174</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Genome-wide analysis of Arabidopsis thaliana DNA methylation uncovers an interdependence between methylation and transcription</p>
            </title>
            <aug>
               <au>
                  <snm>Zilberman</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Gehring</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Tran</snm>
                  <fnm>RK</fnm>
               </au>
               <au>
                  <snm>Ballinger</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Henikoff</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2007</pubdate>
            <volume>39</volume>
            <fpage>61</fpage>
            <lpage>69</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng1929</pubid>
                  <pubid idtype="pmpid" link="fulltext">17128275</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Genome-wide high-resolution mapping and functional analysis of DNA methylation in Arabidopsis</p>
            </title>
            <aug>
               <au>
                  <snm>Zhang</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Yazaki</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Sundaresan</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Cokus</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Chan</snm>
                  <fnm>SW</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Henderson</snm>
                  <fnm>IR</fnm>
               </au>
               <au>
                  <snm>Shinn</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Pellegrini</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Jacobsen</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Ecker</snm>
                  <fnm>JR</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>2006</pubdate>
            <volume>126</volume>
            <fpage>1189</fpage>
            <lpage>1201</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.cell.2006.08.003</pubid>
                  <pubid idtype="pmpid" link="fulltext">16949657</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Nuclear DNA content of some important plant species</p>
            </title>
            <aug>
               <au>
                  <snm>Arumuganathan</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Earle</snm>
                  <fnm>ED</fnm>
               </au>
            </aug>
            <source>Plant Mol Biol Rep</source>
            <pubdate>1991</pubdate>
            <volume>9</volume>
            <fpage>208</fpage>
            <lpage>218</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1007/BF02672069</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Differential methylation of genes and retrotransposons facilitates shotgun sequencing of the maize genome</p>
            </title>
            <aug>
               <au>
                  <snm>Rabinowicz</snm>
                  <fnm>PD</fnm>
               </au>
               <au>
                  <snm>Schutz</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Dedhia</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Yordan</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Parnell</snm>
                  <fnm>LD</fnm>
               </au>
               <au>
                  <snm>Stein</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>McCombie</snm>
                  <fnm>WR</fnm>
               </au>
               <au>
                  <snm>Martienssen</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>1999</pubdate>
            <volume>23</volume>
            <fpage>305</fpage>
            <lpage>308</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/15479</pubid>
                  <pubid idtype="pmpid" link="fulltext">10545948</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Genetic and sequence organization of the mcrBC locus of Escherichia coli K-12</p>
            </title>
            <aug>
               <au>
                  <snm>Dila</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Sutherland</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Moran</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Slatko</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Raleigh</snm>
                  <fnm>EA</fnm>
               </au>
            </aug>
            <source>J Bacteriol</source>
            <pubdate>1990</pubdate>
            <volume>172</volume>
            <fpage>4888</fpage>
            <lpage>4900</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">213143</pubid>
                  <pubid idtype="pmpid" link="fulltext">2203735</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Escherichia coli K-12 restricts DNA containing 5-methylcytosine</p>
            </title>
            <aug>
               <au>
                  <snm>Raleigh</snm>
                  <fnm>EA</fnm>
               </au>
               <au>
                  <snm>Wilson</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1986</pubdate>
            <volume>83</volume>
            <fpage>9070</fpage>
            <lpage>9074</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">387076</pubid>
                  <pubid idtype="pmpid" link="fulltext">3024165</pubid>
                  <pubid idtype="doi">10.1073/pnas.83.23.9070</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>McrBC: a multisubunit GTP-dependent restriction endonuclease</p>
            </title>
            <aug>
               <au>
                  <snm>Sutherland</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Coe</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Raleigh</snm>
                  <fnm>EA</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1992</pubdate>
            <volume>225</volume>
            <fpage>327</fpage>
            <lpage>348</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0022-2836(92)90925-A</pubid>
                  <pubid idtype="pmpid" link="fulltext">1317461</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Maize genome sequencing by methylation filtration</p>
            </title>
            <aug>
               <au>
                  <snm>Palmer</snm>
                  <fnm>LE</fnm>
               </au>
               <au>
                  <snm>Rabinowicz</snm>
                  <fnm>PD</fnm>
               </au>
               <au>
                  <snm>O'Shaughnessy</snm>
                  <fnm>AL</fnm>
               </au>
               <au>
                  <snm>Balija</snm>
                  <fnm>VS</fnm>
               </au>
               <au>
                  <snm>Nascimento</snm>
                  <fnm>LU</fnm>
               </au>
               <au>
                  <snm>Dike</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>de la Bastide</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Martienssen</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>McCombie</snm>
                  <fnm>WR</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2003</pubdate>
            <volume>302</volume>
            <fpage>2115</fpage>
            <lpage>2117</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1091265</pubid>
                  <pubid idtype="pmpid" link="fulltext">14684820</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Enrichment of gene-coding sequences in maize by genome filtration</p>
            </title>
            <aug>
               <au>
                  <snm>Whitelaw</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Barbazuk</snm>
                  <fnm>WB</fnm>
               </au>
               <au>
                  <snm>Pertea</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Chan</snm>
                  <fnm>AP</fnm>
               </au>
               <au>
                  <snm>Cheung</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Zheng</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>van Heeringen</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Karamycheva</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Bennetzen</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>SanMiguel</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Lakey</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Bedell</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Yuan</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Budiman</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Resnick</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Van Aken</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Utterback</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Riedmuller</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Williams</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Feldblyum</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Schubert</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Beachy</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Fraser</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Quackenbush</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2003</pubdate>
            <volume>302</volume>
            <fpage>2118</fpage>
            <lpage>2120</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1090047</pubid>
                  <pubid idtype="pmpid" link="fulltext">14684821</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Sorghum genome sequencing by methylation filtration</p>
            </title>
            <aug>
               <au>
                  <snm>Bedell</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Budiman</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Nunberg</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Citek</snm>
                  <fnm>RW</fnm>
               </au>
               <au>
                  <snm>Robbins</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Jones</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Flick</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Rholfing</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Fries</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Bradford</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>McMenamy</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Holeman</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Roe</snm>
                  <fnm>BA</fnm>
               </au>
               <au>
                  <snm>Wiley</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Korf</snm>
                  <fnm>IF</fnm>
               </au>
               <au>
                  <snm>Rabinowicz</snm>
                  <fnm>PD</fnm>
               </au>
               <au>
                  <snm>Lakey</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>McCombie</snm>
                  <fnm>WR</fnm>
               </au>
               <au>
                  <snm>Jeddeloh</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Martienssen</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>PLoS Biol</source>
            <pubdate>2005</pubdate>
            <volume>3</volume>
            <fpage>e13</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">539327</pubid>
                  <pubid idtype="pmpid" link="fulltext">15660154</pubid>
                  <pubid idtype="doi">10.1371/journal.pbio.0030013</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>The maize genome as a model for efficient sequence analysis of large plant genomes</p>
            </title>
            <aug>
               <au>
                  <snm>Rabinowicz</snm>
                  <fnm>PD</fnm>
               </au>
               <au>
                  <snm>Bennetzen</snm>
                  <fnm>JL</fnm>
               </au>
            </aug>
            <source>Curr Opin Plant Biol</source>
            <pubdate>2006</pubdate>
            <volume>9</volume>
            <fpage>149</fpage>
            <lpage>156</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.pbi.2006.01.015</pubid>
                  <pubid idtype="pmpid" link="fulltext">16459129</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Differential methylation of genes and repeats in land plants</p>
            </title>
            <aug>
               <au>
                  <snm>Rabinowicz</snm>
                  <fnm>PD</fnm>
               </au>
               <au>
                  <snm>Citek</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Budiman</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Nunberg</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Bedell</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Lakey</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>O'Shaughnessy</snm>
                  <fnm>AL</fnm>
               </au>
               <au>
                  <snm>Nascimento</snm>
                  <fnm>LU</fnm>
               </au>
               <au>
                  <snm>McCombie</snm>
                  <fnm>WR</fnm>
               </au>
               <au>
                  <snm>Martienssen</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2005</pubdate>
            <volume>15</volume>
            <fpage>1431</fpage>
            <lpage>1440</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1240086</pubid>
                  <pubid idtype="pmpid" link="fulltext">16204196</pubid>
                  <pubid idtype="doi">10.1101/gr.4100405</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>The origin and early evolution of plants on land</p>
            </title>
            <aug>
               <au>
                  <snm>Kenrick</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Crane</snm>
                  <fnm>PR</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>1997</pubdate>
            <volume>389</volume>
            <fpage>33</fpage>
            <lpage>39</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1038/37918</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Variation in chromosome numbers, CMA bands and 45S rDNA sites in species of Selaginella (Pteridophyta)</p>
            </title>
            <aug>
               <au>
                  <snm>Marcon</snm>
                  <fnm>AB</fnm>
               </au>
               <au>
                  <snm>Barros</snm>
                  <fnm>IC</fnm>
               </au>
               <au>
                  <snm>Guerra</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Ann Bot (Lond)</source>
            <pubdate>2005</pubdate>
            <volume>95</volume>
            <fpage>271</fpage>
            <lpage>276</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15567808</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Methylation of somatic and sperm DNA in the homosporous fern Ceratopteris richardii</p>
            </title>
            <aug>
               <au>
                  <snm>McGrath</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Pichersky</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Plant Mol Biol</source>
            <pubdate>1997</pubdate>
            <volume>35</volume>
            <fpage>1023</fpage>
            <lpage>1027</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1023/A:1005962520544</pubid>
                  <pubid idtype="pmpid" link="fulltext">9426624</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Construction of a bacterial artificial chromosome library from the spikemoss Selaginella moellendorffii: a new resource for plant comparative genomics</p>
            </title>
            <aug>
               <au>
                  <snm>Wang</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Tanurdzic</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Luo</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sisneros</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Kim</snm>
                  <fnm>HR</fnm>
               </au>
               <au>
                  <snm>Weng</snm>
                  <fnm>JK</fnm>
               </au>
               <au>
                  <snm>Kudrna</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Mueller</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Arumuganathan</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Carlson</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Chapple</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>de Pamphilis</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Mandoli</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Tomkins</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Wing</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Banks</snm>
                  <fnm>JA</fnm>
               </au>
            </aug>
            <source>BMC Plant Biol</source>
            <pubdate>2005</pubdate>
            <volume>5</volume>
            <fpage>10</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1177970</pubid>
                  <pubid idtype="pmpid" link="fulltext">15955246</pubid>
                  <pubid idtype="doi">10.1186/1471-2229-5-10</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>The chloroplast genome from a lycophyte (microphyllophyte), Selaginella uncinata, has a unique inversion, transpositions and many gene losses</p>
            </title>
            <aug>
               <au>
                  <snm>Tsuji</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ueda</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Nishiyama</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Hasebe</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Yoshikawa</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Konagaya</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Nishiuchi</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Yamaguchi</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>J Plant Res</source>
            <pubdate>2007</pubdate>
            <volume>120</volume>
            <fpage>281</fpage>
            <lpage>290</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s10265-006-0055-y</pubid>
                  <pubid idtype="pmpid" link="fulltext">17297557</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Robertson's Mutator transposons in A. thaliana are regulated by the chromatin-remodeling gene Decrease in DNA Methylation (DDM1)</p>
            </title>
            <aug>
               <au>
                  <snm>Singer</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Yordan</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Martienssen</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>Genes Dev</source>
            <pubdate>2001</pubdate>
            <volume>15</volume>
            <fpage>591</fpage>
            <lpage>602</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">312647</pubid>
                  <pubid idtype="pmpid" link="fulltext">11238379</pubid>
                  <pubid idtype="doi">10.1101/gad.193701</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>Mobilization of transposons by a mutation abolishing full DNA methylation in Arabidopsis</p>
            </title>
            <aug>
               <au>
                  <snm>Miura</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Yonebayashi</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Watanabe</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Toyama</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Shimada</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Kakutani</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2001</pubdate>
            <volume>411</volume>
            <fpage>212</fpage>
            <lpage>214</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/35075612</pubid>
                  <pubid idtype="pmpid" link="fulltext">11346800</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Purdue Selaginella Genomics</p>
            </title>
            <url>http://selaginella.genomics.purdue.edu/data.html</url>
         </bibl>
      </refgrp>
   </bm>
</art>
