<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2164-9-614</ui>
   <ji>1471-2164</ji>
   <fm>
      <dochead>Software</dochead>
      <bibl>
         <title>
            <p>Automated paleontology of repetitive DNA with RE<smcaps>ANNOTATE</smcaps></p>
         </title>
         <aug>
            <au id="A1" ca="yes">
               <snm>Pereira</snm>
               <fnm>Vini</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
               <email>vini.pereira@bbsrc.ac.uk</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Life Sciences, Imperial College London, Silwood Park campus, Ascot, Berkshire SL5 7PY, UK</p>
            </ins>
            <ins id="I2">
               <p>Theoretical Systems Biology, Institute of Food Research, Norwich Research Park, Colney, Norwich NR4 7UA, UK</p>
            </ins>
         </insg>
         <source>BMC Genomics</source>
         <issn>1471-2164</issn>
         <pubdate>2008</pubdate>
         <volume>9</volume>
         <issue>1</issue>
         <fpage>614</fpage>
         <url>http://www.biomedcentral.com/1471-2164/9/614</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">19094224</pubid>
               <pubid idtype="doi">10.1186/1471-2164-9-614</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>05</day>
               <month>3</month>
               <year>2008</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>18</day>
               <month>12</month>
               <year>2008</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>18</day>
               <month>12</month>
               <year>2008</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2008</year>
         <collab>Pereira; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Dispersed repeats are a major component of eukaryotic genomes and drivers of genome evolution. Annotation of DNA sequences homologous to known repetitive elements has been mainly performed with the program R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps>. Sequences annotated by R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> often correspond to fragments of repetitive elements resulting from the insertion of younger elements or other rearrangements. Although R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> annotation is indispensable for studying genome biology, this annotation does not contain much information on the common origin of fossil fragments that share an insertion event, especially where clusters of nested insertions of repetitive elements have occurred.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>Here I present RE<smcaps>ANNOTATE</smcaps>, a computational tool to process R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> annotation for automated i) defragmentation of dispersed repetitive elements, ii) resolution of the temporal order of insertions in clusters of nested elements, and iii) estimation of the age of elements that have long terminal repeats. I have re-annotated the repetitive content of human chromosomes, providing evidence for a recent expansion of satellite repeats on the Y chromosome and, from the retroviral age distribution, for a higher rate of evolution on the Y relative to autosomes.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>RE<smcaps>ANNOTATE</smcaps> is ready to process existing annotation for automated evolutionary analysis of all types of complex repeats in any genome. The tool is freely available under the GPL at <url>http://www.bioinformatics.org/reannotate</url>.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Repeats of high sequence complexity &#8211; mostly transposable elements (TEs) &#8211; account for a large portion of many eukaryotic genomes. In humans they comprise almost half of the (cytologically euchromatic) genome <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>, and in some plants (e.g. maize) most of the DNA (>70%) is repetitive <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp>. In addition to genome size evolution, repetitive sequences are fundamentally implicated in structural and functional genome evolution. Sequence similarity and complementarity form the basis of many biochemical reactions involving nucleic acids. Hence the occurrence of repeated sequences may mediate genome, epigenome and transcriptome interactions, such as chromosomal rearrangements <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr></abbrgrp>, centromere <abbrgrp><abbr bid="B8">8</abbr></abbrgrp> and telomere <abbrgrp><abbr bid="B9">9</abbr></abbrgrp> function, and chromatin remodelling and gene silencing mediated by repeat-induced small RNAs <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr></abbrgrp>. In addition to the effects of sequence repetition, TE insertions may have sequence-specific phenotypic consequences. For instance, they encode regulatory signals that can potentially affect gene expression <abbrgrp><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr></abbrgrp>.</p>
         <p>Given the importance of repetitive DNA sequences for genome structure and evolution, systematic annotation of repeats is essential for inferring biological organisation and function from genomic sequences. When large amounts of sequence data are analysed, automation of the annotation procedure is indispensable. R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps><abbrgrp><abbr bid="B20">20</abbr></abbrgrp> has become the default computational tool for automated repeat annotation. Despite the current indispensability and efficiency of R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> in annotating genomic regions similar to known families of repeats, the R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> annotation contains relatively little information on the origin and evolution of repeats. For example, if a given TE insertion is subsequently targeted by further insertions, the original TE sequence will be interrupted and fragmented by sequences of later origin. In such a situation R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> may annotate multiple sequence similarity hits to the given TE family without establishing the common origin of these sequences <abbrgrp><abbr bid="B21">21</abbr><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr></abbrgrp>, and no information on the temporal order of insertion of overlapping repetitive elements is obtained without human analysis. This situation is common as TEs are non-randomly distributed within genomes and are often 'nested', i.e. inserted into another TE. Nesting has been observed in diverse genomes across the eukaryotic kingdom <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B24">24</abbr><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr><abbr bid="B27">27</abbr><abbr bid="B28">28</abbr><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr><abbr bid="B31">31</abbr><abbr bid="B32">32</abbr></abbrgrp>. 
Thus, post-processing is necessary to improve the biological interpretation of R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> annotations (reviewed in ref. <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>).</p>
         <p>Here I describe RE<smcaps>ANNOTATE</smcaps> (<it>R</it>epetitive <it>E</it>lement <it>re-annotation</it>), a computational tool for automated defragmentation and evolutionary analysis of (high complexity) repetitive DNA elements (mainly TEs). The term <it>re-annotation </it>reflects the use of R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> annotation as input to RE<smcaps>ANNOTATE</smcaps>, and in this context it means neither the prediction of previously unannotated sequence features (similarity hits) nor the detection of false positives in the original annotation. Rather, it means adding "layers" of information that contain new <it>kinds </it>of inferences not present in the original annotation. RE<smcaps>ANNOTATE</smcaps> automatically generates up to three layers of re-annotation: <it>i) defragmentation</it>, via construction of repetitive element "models" consisting of sequence features originally annotated as separate similarity hits; <it>ii) order of insertion</it>, where TE models constructed in <it>(i) </it>overlap; and <it>iii) age</it>, for long terminal repeat (LTR)-elements in particular (dating of their insertion events is performed for structurally complete elements). In addition to annotation, RE<smcaps>ANNOTATE</smcaps> can output the sequences of defragmented repetitive element models, with appropriate gaps so that elements classified in a given family are all pairwise aligned to the family reference sequence; in this form they can easily be multiply aligned (which would be non-trivial with ungapped sequences if they have large indels relative to one another) and are ready for phylogenetic analysis.</p>
         <p>RE<smcaps>ANNOTATE</smcaps> is ready to re-annotate existing R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> annotation &#8211; either in its original format or as R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> track tables from the UCSC Genome Browser web site <url>http://genome.ucsc.edu</url>. The first two layers of automated re-annotation can be visualised in the A<smcaps>POLLO</smcaps> genome browser <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>, and in addition they can be combined with other kinds of annotation (e.g. non-repetitive genes), facilitating direct human analysis when required.</p>
         <p>Among the functions performed by RE<smcaps>ANNOTATE</smcaps>, defragmentation of genomic sequence regions homologous to TEs has been previously addressed by other tools <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B24">24</abbr><abbr bid="B35">35</abbr></abbrgrp>. M<smcaps>ATCHER</smcaps><abbrgrp><abbr bid="B22">22</abbr></abbrgrp> uses a dynamic programming algorithm to defragment TEs, but does not include biological constraints that can assist the defragmentation process. PLOTREP <abbrgrp><abbr bid="B35">35</abbr></abbrgrp> is an interactive tool that assists manual defragmentation but cannot provide fully automated defragmentation necessary for genome scale analysis. Recently, TCF <abbrgrp><abbr bid="B24">24</abbr></abbrgrp> has become available for automated identification and defragmentation of TE clusters, but TCF does not attempt to defragment all TEs &#8211; notably pieces of LTR-elements found in clusters, and fragments of TEs nested within another TE. RE<smcaps>ANNOTATE</smcaps> provides fully automated defragmentation of any kind of TE using biologically informed constraints. Importantly, RE<smcaps>ANNOTATE</smcaps> is able to defragment LTR-elements, which are often represented by separate query sequences for LTRs and internal regions, and to estimate the age of LTR-element insertion events. Other important differences between RE<smcaps>ANNOTATE</smcaps> and TCF include: <it>i) </it>TCF was developed for mammalian genomes and visualisation of annotation is limited to genomes available in the UCSC genome browser web site, whilst RE<smcaps>ANNOTATE</smcaps> is ready for the re-annotation of any genome and its visualisation (using A<smcaps>POLLO</smcaps>); <it>ii) </it>TCF effectively defragments only TEs that are interrupted by other previously characterised TE sequences, whilst RE<smcaps>ANNOTATE</smcaps> allows interruptions by any kind of sequence (e.g. 
unknown TEs) and also re-annotates complex repeats other than TEs; <it>iii) </it>parameters in the RE<smcaps>ANNOTATE</smcaps> defragmentation algorithm may be set by users to adapt the algorithm to the repetitive content of a particular genome; and <it>iv) </it>the annotation produced by RE<smcaps>ANNOTATE</smcaps> has been <it>validated </it>by comparison with manually curated annotation of sequences containing highly nested clusters of TEs. During the write-up of this paper, another tool, TE<smcaps>NEST</smcaps><abbrgrp><abbr bid="B36">36</abbr></abbrgrp>, became available that has similar functionality to RE<smcaps>ANNOTATE</smcaps>. However, there are also important differences between RE<smcaps>ANNOTATE</smcaps> and TE<smcaps>NEST</smcaps>: <it>i) </it>TE<smcaps>NEST</smcaps> currently uses plant repeat databases, and is therefore designed to annotate plant genomes, whilst RE<smcaps>ANNOTATE</smcaps> is ready to annotate any genome. <it>ii) </it>TE<smcaps>NEST</smcaps> itself manages the sequence similarity searches (using WU<smcaps>BLAST</smcaps><abbrgrp><abbr bid="B37">37</abbr></abbrgrp> and LALIGN <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>) against the repeat library, whilst RE<smcaps>ANNOTATE</smcaps> processes the similarity annotation produced by R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps>. The similarity search is the most computationally expensive step in the annotation process, and because R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> annotation is already available for many genomic sequences, the re-annotation of such sequences is computationally cheaper with RE<smcaps>ANNOTATE</smcaps>. <it>iii) </it>The visualisation method employed by RE<smcaps>ANNOTATE</smcaps> allows the repetitive DNA annotation to be combined with other kinds of genome annotation (see below).</p>
         <p>To illustrate the kinds of analysis that become possible with RE<smcaps>ANNOTATE</smcaps>, whole-chromosome re-annotation is applied to a human sex chromosome and two autosomes, yielding analyses of <it>i) </it>repetitive element patterns of nesting (including evidence for recent expansion of satellite repeats on the Y chromosome), and <it>ii) </it>the age distribution of endogenous retroviruses.</p>
         <p>RE<smcaps>ANNOTATE</smcaps> is open source and freely available under the GNU General Public License at <url>http://www.bioinformatics.org/reannotate</url>.</p>
      </sec>
      <sec>
         <st>
            <p>Implementation</p>
         </st>
         <sec>
            <st>
               <p>Repetitive element model construction and re-annotation algorithm</p>
            </st>
            <sec>
               <st>
                  <p>A. Input <it>R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps></it>annotation</p>
               </st>
               <p>R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> annotation either in its original format or as a UCSC table is input to RE<smcaps>ANNOTATE</smcaps>. If this annotation reports similarity to <it>N </it>different reference repetitive elements, let <it>R </it>= {<it>r</it><sup>1</sup>, ... <it>r</it><sup><it>N</it></sup>} denote this library set of <it>N </it>reference elements. Here I call a <it>hit </it>to the reference element <it>r </it>a query sequence region homologous to <it>r </it>and with higher sequence similarity to <it>r </it>than to any other reference element in <it>R</it>, annotated by R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps>. As an example, a visual representation of the input annotation is given in Figure <figr fid="F1">1A</figr>.</p>
               <fig id="F1">
                  <title>
                     <p>Figure 1</p>
                  </title>
                  <caption>
                     <p>Re-annotation algorithm</p>
                  </caption>
                  <text>
                     <p><b>Re-annotation algorithm</b>. A. Graphic representation of the input R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> annotation of the first 100 Kb of genomic sequence [G<smcaps>EN</smcaps>B<smcaps>ANK</smcaps>:<ext-link ext-link-type="gen" ext-link-id="AF123535.1">AF123535.1</ext-link>] around the <it>adh </it>gene of the maize cultivar <it>LH_82 </it>(this refers to the same maize sequence that was manually annotated in <abbrgrp><abbr bid="B45">45</abbr></abbrgrp> and used to validate RE<smcaps>ANNOTATE</smcaps>'s predictions in Results, Table <tblr tid="T1">1</tblr>, and Figure <figr fid="F3">3</figr>). B. Boxes highlighted in magenta on the bottom tier represent hits to the reference element <smcaps>PREM2_ZM_I</smcaps> (the internal region of an LTR-retrotransposon in R<smcaps>EP</smcaps>B<smcaps>ASE</smcaps>), of which the three innermost hits, shown again in red on the top tier united by a horizontal line, were defragmented by RE<smcaps>ANNOTATE</smcaps> into a repetitive element model. The black arrows show the orientation of the hits on the chromosome, and the three hits shown in red are colinear with the reference <smcaps>PREM2_ZM_I</smcaps> sequence. C. Boxes highlighted in blue on the bottom tier represent hits to the reference LTR sequence <smcaps>PREM2_ZM_LTR</smcaps> (in R<smcaps>EP</smcaps>B<smcaps>ASE</smcaps>). Above, two (single-hit) LTR models (shown in orange) flank an IR model (in red): these three models have been assembled into a higher-order model of an element of the <smcaps>PREM2_ZM</smcaps> family. D. The chromosomal span of the defragmented <smcaps>PREM2_ZM</smcaps> element (red and orange) is within the span of another element (bottom model in black); the <smcaps>PREM2_ZM</smcaps> element is inferred to have inserted into the element shown below it. Two other elements (black boxes on top tier) are inferred to have inserted into the <smcaps>PREM2_ZM</smcaps> element. E. 
Pairs of intra-element LTR sequences are output, aligned with <smcaps>CLUSTAL</smcaps>W, and the number of point substitutions between them estimated.</p>
                  </text>
                  <graphic file="1471-2164-9-614-1"/>
               </fig>
            </sec>
            <sec>
               <st>
                  <p>B. Defragmentation of repetitive elements</p>
               </st>
               <p>RE<smcaps>ANNOTATE</smcaps> constructs repetitive element models assigned to the different families in the reference library <it>R</it>. A given reference element <it>r </it>defines a set of hits along the query sequence. RE<smcaps>ANNOTATE</smcaps> will search for subsets of hits to <it>r </it>that are to be defragmented into an element model if they satisfy the criteria:</p>
               <p>(i) <it>Colinearity with the reference element</it>.</p>
               <p>(ii) <it>Maximum span</it>.</p>
               <p>Criterion (i) requires that the defragmented hits be in the same orientation on the query sequence, and that they match consecutive (though not necessarily contiguous) regions of the reference element <it>r</it>. (Note that along the query sequence these defragmented hits need not be consecutive, in the sense that there may be other hits to <it>r </it>nested between them.) An example is illustrated in Figure <figr fid="F1">1B</figr>. The reference element in question is <smcaps>PREM2_ZM_I</smcaps> (from R<smcaps>EP</smcaps>B<smcaps>ASE</smcaps> U<smcaps>PDATE</smcaps>), and hits to this reference element along the maize sequence are highlighted in magenta. RE<smcaps>ANNOTATE</smcaps> constructed a model defragmenting the three innermost hits to <smcaps>PREM2_ZM_I</smcaps>, shown in red, which respectively match nucleotide positions 1-787, 783-4958, and 4956-6864 along the reference <smcaps>PREM2_ZM_I</smcaps> sequence. Note that there is a small overlap between the matching coordinates along the reference sequence (5 nucleotides between the first two matches, and 3 nucleotides between the last two), which is an apparent deviation from strict colinearity between the three hits on the maize chromosome and the reference sequence. This situation is common and is due to uncertainty about the ends of local alignments. RE<smcaps>ANNOTATE</smcaps> has a user-definable tolerance parameter <b><it>&#1013; </it></b>in the requirement for colinearity between the element model and the reference element that allows for an overlap (<it>o</it>) between the matching coordinates of two defragmented hits along the reference sequence, if <it>o </it>&#8804; <b><it>&#1013;</it></b>. (Default <b><it>&#1013; </it></b>= 40 nucleotides; if <b><it>&#1013; </it></b>&gt; <it>L</it>/10, where <it>L </it>is the length of a given reference sequence <it>r</it>, then the tolerance margin for that family is automatically set to <it>L</it>/10.)</p>
               <p>Criterion (ii) requires that the span of a candidate repetitive element model (i.e. the query sequence length from the end of the first to the beginning of the last defragmented hit) does not exceed a user-definable length <b><it>&#948; </it></b>(default <b><it>&#948; </it></b>= 40 Kb).</p>
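               <p>Criteria (i) and (ii) can be illustrated with a minimal sketch. This is not the RE<smcaps>ANNOTATE</smcaps> source code: the <it>Hit </it>record, the function names, and the handling of minus-strand hits are assumptions made for illustration. The reference coordinates are those quoted above for <smcaps>PREM2_ZM_I</smcaps>, whereas the query coordinates and the reference length used here are hypothetical.</p>
               <p><![CDATA[
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    q_start: int  # start of the hit on the query sequence
    q_end: int    # end of the hit on the query sequence
    r_start: int  # start of the match on the reference element
    r_end: int    # end of the match on the reference element
    strand: str   # "+" or "-" orientation on the query


def effective_tolerance(epsilon, ref_length):
    # The tolerance is capped at L/10 for short reference sequences.
    return min(epsilon, ref_length / 10)


def is_colinear(hits, ref_length, epsilon=40):
    # Criterion (i): same orientation, and consecutive reference regions
    # whose overlap o does not exceed the tolerance.
    if len({h.strand for h in hits}) > 1:
        return False
    ordered = sorted(hits, key=lambda h: h.q_start)
    if ordered and ordered[0].strand == "-":
        ordered.reverse()  # assumption: minus-strand hits run backwards on the query
    tol = effective_tolerance(epsilon, ref_length)
    for a, b in zip(ordered, ordered[1:]):
        overlap = a.r_end - b.r_start + 1  # zero or negative means a gap
        if b.r_start < a.r_start or overlap > tol:
            return False
    return True


def within_max_span(hits, delta=40_000):
    # Criterion (ii): distance from the end of the first hit to the
    # beginning of the last hit on the query must not exceed delta.
    ordered = sorted(hits, key=lambda h: h.q_start)
    return ordered[-1].q_start - ordered[0].q_end <= delta


# Reference coordinates are those quoted for PREM2_ZM_I; the query
# coordinates and the reference length (6864) are made up for illustration.
hits = [
    Hit(1_000, 1_786, 1, 787, "+"),
    Hit(9_000, 13_175, 783, 4958, "+"),
    Hit(20_000, 21_908, 4956, 6864, "+"),
]
print(is_colinear(hits, ref_length=6864), within_max_span(hits))
```
]]></p>
               <p>With the default tolerance of 40 nucleotides, the overlaps of 5 and 3 nucleotides between consecutive matches are accepted; lowering the tolerance below 5 would reject the same set of hits.</p>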
               <p>For each reference family in the library <it>R</it>, RE<smcaps>ANNOTATE</smcaps> searches for candidate repetitive element models that satisfy (i) and (ii), but only constructs models that additionally satisfy:</p>
               <p>(iii) <it>Uniqueness</it>.</p>
               <p>(iv) <it>Maximum defragmentation</it>.</p>
               <p>(v) <it>Recursive model nesting</it>.</p>
               <p>Criterion (iii) requires that constructed models comprise mutually exclusive sets of hits.</p>
               <p>If two candidate models are such that all the hits defragmented in one model are present in the other model that defragments a higher number of hits, then criterion (iv) requires that only the model defragmenting the maximum number of hits be constructed.</p>
               <p>If two candidate models (assigned to the same family) comprise different sets of hits with at least one hit in common, then criterion (v) requires that the candidate model whose hits span the narrower region of the query sequence be constructed. If a hierarchy of such models exists then this pairwise criterion is repeated recursively. Additionally, if two candidate models (assigned to any families) are such that <it>a) </it>they comprise non-intersecting sets of hits and <it>b) </it>they span overlapping regions of the query sequence, then criterion (v) requires that they cannot both be constructed unless the span of one candidate model lies entirely within the span of the other.</p>
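               <p>Two of these selection rules lend themselves to a short sketch: criterion (iv), which discards a candidate whose hits all belong to a larger candidate, and the second part of criterion (v), which allows two models with overlapping spans only if one span lies entirely within the other. The function names and the representation of candidates as sets of hit identifiers are hypothetical, and the recursive resolution of model hierarchies is not shown.</p>
               <p><![CDATA[
```python
def drop_submodels(candidates):
    # Criterion (iv): keep a candidate only if its set of hits is not a
    # proper subset of the hits of another candidate.
    hit_sets = [frozenset(c) for c in candidates]
    return [
        c
        for c, s in zip(candidates, hit_sets)
        if not any(s < other for other in hit_sets)
    ]


def spans_compatible(span_a, span_b):
    # Second part of criterion (v): two models with non-intersecting hits
    # may coexist if their query spans are disjoint or strictly nested.
    a_start, a_end = span_a
    b_start, b_end = span_b
    disjoint = a_end < b_start or b_end < a_start
    a_in_b = b_start <= a_start and a_end <= b_end
    b_in_a = a_start <= b_start and b_end <= a_end
    return disjoint or a_in_b or b_in_a


# Hypothetical hit identifiers and query spans.
print(drop_submodels([{1, 2}, {1, 2, 3}, {4}]))
print(spans_compatible((100, 500), (200, 300)))
print(spans_compatible((100, 500), (400, 800)))
```
]]></p>
               <p>In this example the candidate with hits 1 and 2 is dropped in favour of the candidate that also includes hit 3, and the pair of spans that overlap only partially is reported as incompatible.</p>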
            </sec>
            <sec>
               <st>
                  <p>Defragmentation of chromosomal elements matching multiple reference sequences</p>
               </st>
               <p>If, upon human inspection of either the original R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> annotation or the automated re-annotation, the occurrence of fragments of a given repetitive element matching multiple reference sequences is detected, a user may supply as input to RE<smcaps>ANNOTATE</smcaps> a text file containing lists of "related" reference elements, so that hits to different reference elements within one such list may be considered for defragmentation into a repetitive element model. The use of this option was essential for the re-annotation and analysis of human endogenous retroviruses in Results; the ERV names equivalence lists are provided in Additional file <supplr sid="S1">1</supplr>.</p>
               <suppl id="S1">
                  <title>
                     <p>Additional file 1</p>
                  </title>
                  <text>
                     <p><b>Equivalence table of different reference sequences corresponding to the same families of ERVs.</b> Names of reference LTRs and internal regions of primate ERVs in R<smcaps>EP</smcaps>B<smcaps>ASE</smcaps> U<smcaps>PDATE</smcaps> that correspond to the same ERV family may be dissimilar. This text file contains lists (one per line) of equivalent (i.e. assigned to the same ERV family) names. This file must be input to RE<smcaps>ANNOTATE</smcaps> for defragmentation of primate ERVs if the input annotation was performed with the R<smcaps>EP</smcaps>B<smcaps>ASE</smcaps> U<smcaps>PDATE</smcaps> library. The first name in each line is used by RE<smcaps>ANNOTATE</smcaps> to assign a family to an element model that may contain hits matching different reference sequences within an equivalence class.</p>
                  </text>
                  <file name="1471-2164-9-614-S1.txt">
                     <p>Click here for file</p>
                  </file>
               </suppl>
            </sec>
            <sec>
               <st>
                  <p>C. Defragmentation of LTR-elements</p>
               </st>
               <p>The defragmentation procedure described in step <it>B </it>above applies to any high-complexity repetitive element. Both the LTR and the internal region (i.e. the sequence between the two LTRs of a complete element) of a given LTR-element family may be found as dispersed repeats. Fossil remains of a given insertion may contain only LTR sequence or only internal region (IR) sequence. RE<smcaps>ANNOTATE</smcaps> performs additional analysis of LTR-elements (i.e. LTR-retrotransposons and retroviruses) if the reference library contains separate entries for the LTR and IR sequences. Names of reference LTR and IR elements of the same family should be identical (unless a name equivalence list is input to RE<smcaps>ANNOTATE</smcaps>) apart from a suffix. Reference LTR names do not need a suffix, but they may be suffixed with either the string '-LTR' or '_LTR' (case-insensitive). Reference IR names should be suffixed with either '-int', '_int', '-I', or '_I' (case-insensitive). This is the naming convention used in R<smcaps>EP</smcaps>B<smcaps>ASE</smcaps> U<smcaps>PDATE</smcaps> for LTR-elements in most genomes, though currently with the notable exception of human/primate endogenous retroviruses (for the solution to this problem adopted in this study see below and Additional files <supplr sid="S1">1</supplr> and <supplr sid="S2">2</supplr>).</p>
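               <p>The suffix convention described above can be expressed as a pair of regular expressions. The following is a sketch under the stated convention only; the function name and the label 'unsuffixed' are hypothetical, and an unsuffixed name is reported as such because, by itself, it cannot be distinguished from the name of a repeat that is not an LTR-element.</p>
               <p><![CDATA[
```python
import re

# Suffixes from the naming convention, matched case-insensitively.
IR_SUFFIX = re.compile(r"[-_](int|I)$", re.IGNORECASE)
LTR_SUFFIX = re.compile(r"[-_]LTR$", re.IGNORECASE)


def split_reference_name(name):
    # Returns (family, part), where part is "IR", "LTR" or "unsuffixed".
    # An unsuffixed name may be a reference LTR, since LTR names do not
    # need a suffix.
    if IR_SUFFIX.search(name):
        return IR_SUFFIX.sub("", name), "IR"
    if LTR_SUFFIX.search(name):
        return LTR_SUFFIX.sub("", name), "LTR"
    return name, "unsuffixed"


for ref in ("PREM2_ZM_I", "PREM2_ZM_LTR", "PREM2_ZM"):
    print(ref, split_reference_name(ref))
```
]]></p>
               <p>Both <smcaps>PREM2_ZM_I</smcaps> and <smcaps>PREM2_ZM_LTR</smcaps> resolve to the family name <smcaps>PREM2_ZM</smcaps>, which is what allows their models to be combined in a higher-order model.</p>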
               <suppl id="S2">
                  <title>
                     <p>Additional file 2</p>
                  </title>
                  <text>
                     <p><b>Script for pre-processing primate ERV names in R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> annotation</b>. A shell script for pre-processing R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> annotation so that appropriate suffixes are added to the names of reference primate ERV internal regions (and LTRs). This is necessary for defragmentation of primate ERVs if the input annotation was performed with the R<smcaps>EP</smcaps>B<smcaps>ASE</smcaps> U<smcaps>PDATE</smcaps> library.</p>
                  </text>
                  <file name="1471-2164-9-614-S2.sh">
                     <p>Click here for file</p>
                  </file>
               </suppl>
               <p>Models of LTRs and IRs are separately constructed with the <it>Defragmentation </it>algorithm in step <it>B</it>. RE<smcaps>ANNOTATE</smcaps> then constructs higher-order models combining models of defragmented LTRs and defragmented IRs. Higher-order model construction proceeds with requirements analogous to those in step <it>B</it>, but now the colinearity criterion refers to the <it>LTR-IR-LTR </it>structure of a canonical LTR-element. An example is illustrated in Figure <figr fid="F1">1C</figr>. Recursive model nesting (as in B) is used to defragment LTR-element structures within structures. Where nested structures corresponding to the same family are identified, these are resolved by mapping the coordinates of the component LTR or IR fragments onto their respective reference sequences. A nested element should interrupt the nesting element. (If there is ambiguity in the sense that a given LTR could be paired with two different LTRs and IRs, a model of a 'complete' element is constructed with the two LTRs whose sequences are most similar to each other).</p>
               <p>If a candidate higher-order model contains only one LTR and one IR model, a higher-order model for a truncated LTR-element is constructed only if the LTR and IR are separated on the query by less than a (user-definable) distance <b><it>&#963; </it></b>(RE<smcaps>ANNOTATE</smcaps> default <b><it>&#963; </it></b>= 15 Kb). Furthermore, truncated higher-order models are only constructed when the constituent LTR and IR models cannot be accommodated in a model of a 'complete' LTR-element.</p>
               <p>RE<smcaps>ANNOTATE</smcaps> classifies LTR-element models as <it>(i) </it>'complete', if they contain at least part of the sequence of <it>both </it>their original LTRs and of the IR; or <it>(ii) </it>'truncated', if they are not 'complete' and not a 'solo' LTR; or <it>(iii) </it>a 'solo' LTR, if a model contains only sequence corresponding to a single LTR, and if this is separated from the nearest LTR or IR model &#8211; of the same family and inserted in the same orientation &#8211; by a distance greater than <b><it>&#963;</it></b>. (Note that the term 'solo LTR' is often used to mean an element that resulted from a deletion event that occurred via recombination between intra-element LTRs. Such an event would preserve the target site duplications (TSDs) flanking the original element. However, RE<smcaps>ANNOTATE</smcaps> currently does not check for TSDs, and such a check would only be possible if the termini of the original element had not been truncated.)</p>
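               <p>Read literally, the three classes amount to a small decision rule, sketched below. The function and parameter names are hypothetical; the distance argument stands for the separation from the nearest LTR or IR model of the same family and orientation, and defaults to infinity when no such neighbour exists.</p>
               <p><![CDATA[
```python
import math


def classify_ltr_model(n_ltr, has_ir, nearest_neighbour_distance=math.inf,
                       sigma=15_000):
    # 'complete': sequence from both LTRs and from the internal region.
    if n_ltr == 2 and has_ir:
        return "complete"
    # 'solo': a single LTR whose nearest same-family, same-orientation
    # LTR or IR model lies further away than sigma.
    if n_ltr == 1 and not has_ir and nearest_neighbour_distance > sigma:
        return "solo"
    # 'truncated': everything that is neither complete nor solo.
    return "truncated"


print(classify_ltr_model(2, True))
print(classify_ltr_model(1, True))
print(classify_ltr_model(1, False, nearest_neighbour_distance=50_000))
```
]]></p>
               <p>The default of 15 Kb for <b><it>&#963; </it></b>is the value given above; the example distances are made up.</p>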
            </sec>
            <sec>
               <st>
                  <p>DNA rearrangements other than transposition</p>
               </st>
               <p>RE<smcaps>ANNOTATE</smcaps> will flag the possibility that LTR-element sequences have been involved in DNA rearrangements other than transposition of an entire element when an <it>IR-LTR-IR </it>structure is detected, i.e. two IR models flanking an LTR model of the same family, in the same orientation. Here "flanking" means that one of the following holds: <it>i) </it>the LTR model and the IR models are contiguous on the query sequence (within a tolerance margin <b><it>&#1013;</it></b>), and neighbouring ends have no missing sequence (within <b><it>&#1013;</it></b>) &#8211; this excludes the possibility of a nested insertion of an IR or LTR of the same family being considered; or <it>ii) </it>the LTR and IR models are not contiguous on the query and they are not complete, but the length of the gaps between them equals the amount of sequence they are missing (within <b><it>&#1013;</it></b>) &#8211; this includes structures in which the <it>IR-LTR </it>boundaries have been obliterated (possibly prior to the rearrangement), or structures with segments that are not homologous to the closest library sequence; or <it>iii) </it>the structure is highly symmetrical, i.e. the two IR models are equidistant from the LTR model between them (within <b><it>&#1013;</it></b>), provided that this separation on the query is less than <b><it>&#963;</it></b>.</p>
            </sec>
            <sec>
               <st>
                  <p>D. Inference of nesting order</p>
               </st>
               <p>After all hits to high-complexity repetitive elements have been defragmented into element models, any nested structures can be resolved by comparing the coordinate ranges of the models along the query sequence. If the span of a given model is contained within the span of another model, the former is classified as 'nested' in the latter (inferred to have inserted into the latter). An example is continued from step <it>C </it>and shown in Figure <figr fid="F1">1D</figr>.</p>
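               <p>The containment test in this step can be sketched as follows. The function name, the model names, and all coordinates are hypothetical; the example only mirrors the arrangement described for Figure 1D, in which one element lies within a host element and itself hosts two further insertions.</p>
               <p><![CDATA[
```python
def infer_hosts(spans):
    # spans maps a model name to its (start, end) range on the query.
    # Returns, for each model, the name of the narrowest model whose span
    # contains it (its inferred insertion target), or None if not nested.
    hosts = {}
    for name, (start, end) in spans.items():
        containing = [
            (o_end - o_start, other)
            for other, (o_start, o_end) in spans.items()
            if other != name
            and o_start <= start
            and end <= o_end
            and (o_start, o_end) != (start, end)
        ]
        hosts[name] = min(containing)[1] if containing else None
    return hosts


# Hypothetical spans arranged like the example in Figure 1D.
spans = {
    "host": (0, 60_000),
    "PREM2_ZM": (10_000, 40_000),
    "insert_1": (15_000, 18_000),
    "insert_2": (25_000, 30_000),
}
print(infer_hosts(spans))
```
]]></p>
               <p>Choosing the narrowest containing model gives the immediate host of each element, so the relative order of insertion can be read off by following the chain of hosts outwards.</p>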
               <p>RE<smcaps>ANNOTATE</smcaps> also provides an <it>optional </it>algorithm for the identification of 'truncated nesting': if one terminus of a given element model interrupts another model, the interrupting element may be classified as nested even if the interrupted element does not contain detectable sequence on both sides of the interrupting element. Truncated nesting is annotated only if there is no sequence missing from the interrupting terminus of the interrupting element (so that a deletion spanning part of the truncated element and part of the interrupting element is not considered).</p>
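               <p>The containment rule described at the start of this step can be sketched in a few lines of Python. This is a minimal illustration and not the RE<smcaps>ANNOTATE</smcaps> source code: the class <it>ElementModel</it>, the function names and the example coordinates are hypothetical, each model is reduced to a single coordinate span on the query, and the optional 'truncated nesting' case is not handled.</p>

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ElementModel:
    """Hypothetical stand-in for a defragmented element model."""
    name: str
    start: int  # first query coordinate covered by the model
    end: int    # last query coordinate covered by the model


def contains(outer: ElementModel, inner: ElementModel) -> bool:
    """True if the span of `inner` lies within the (longer) span of `outer`."""
    within = inner.start >= outer.start and outer.end >= inner.end
    longer = (outer.end - outer.start) > (inner.end - inner.start)
    return within and longer


def infer_hosts(models: List[ElementModel]) -> Dict[str, Optional[str]]:
    """Map each model to its tightest enclosing model (its inferred host)."""
    hosts: Dict[str, Optional[str]] = {}
    for inner in models:
        enclosing = [m for m in models if m is not inner and contains(m, inner)]
        host = min(enclosing, key=lambda m: m.end - m.start, default=None)
        hosts[inner.name] = host.name if host is not None else None
    return hosts


def nesting_level(name: str, hosts: Dict[str, Optional[str]]) -> int:
    """Number of enclosing models; level 0 means the model is not nested."""
    level = 0
    while hosts[name] is not None:
        name = hosts[name]
        level += 1
    return level


if __name__ == "__main__":
    # Made-up coordinates: C is nested in B, which is nested in A.
    example = [
        ElementModel("A", 100, 9000),
        ElementModel("B", 2000, 5000),
        ElementModel("C", 2500, 3000),
        ElementModel("D", 9500, 9900),
    ]
    hosts = infer_hosts(example)
    for model in example:
        print(model.name, hosts[model.name], nesting_level(model.name, hosts))
```

               <p>Choosing the shortest enclosing model as the host means that the nesting level of a model equals the number of models whose spans contain it, which is how the order of insertion is read from a nested cluster in this sketch.</p>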
            </sec>
            <sec>
               <st>
                  <p>E. Dating of LTR-elements</p>
               </st>
               <p>For each (structurally) complete LTR-element model constructed, RE<smcaps>ANNOTATE</smcaps> outputs the gapped sequence (see below) of each intra-element LTR separately. RE<smcaps>ANNOTATE</smcaps> then generates automated intra-element LTR alignments using the <smcaps>CLUSTAL</smcaps>W(2) <abbrgrp><abbr bid="B39">39</abbr></abbrgrp> alignment program. The number of nucleotide substitutions per site (<it>K</it>) between intra-element LTR sequences (and its variance) is then estimated (Figure <figr fid="F1">1E</figr>) using the Kimura 2-parameter model <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>. If a rate of substitutions per site (<it>s</it>) is provided, the time elapsed since the insertion of a 'complete' LTR-element (<it>t</it>) is estimated as <inline-formula><m:math name="1471-2164-9-614-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mi>t</m:mi><m:mo>=</m:mo><m:mfrac><m:mi>K</m:mi><m:mrow><m:mn>2</m:mn><m:mi>s</m:mi></m:mrow></m:mfrac></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaemiDaqNaeyypa0tcfa4aaSaaaeaacqWGlbWsaeaacqaIYaGmcqWGZbWCaaaaaa@326A@</m:annotation></m:semantics></m:math></inline-formula>. In addition to the variance propagated from the estimation of <it>K</it>, the variance of the time estimate (<inline-formula><m:math name="1471-2164-9-614-i2" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#963;</m:mi><m:mi>t</m:mi><m:mn>2</m:mn></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaeq4Wdm3aa0baaSqaaiabdsha0bqaaiabikdaYaaaaaa@3028@</m:annotation></m:semantics></m:math></inline-formula>) accounts for the fact that the accumulation of nucleotide substitutions occurs stochastically over time, which can be modeled as a Poisson process; the variance in <it>t </it>is then estimated as <inline-formula><m:math name="1471-2164-9-614-i3" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>&#963;</m:mi><m:mi>t</m:mi><m:mn>2</m:mn></m:msubsup><m:mo>=</m:mo><m:mfrac><m:mrow><m:msup><m:mrow><m:mo stretchy="false">(</m:mo><m:msub><m:mi>&#963;</m:mi><m:mi>K</m:mi></m:msub><m:mi>L</m:mi><m:mo stretchy="false">)</m:mo></m:mrow><m:mn>2</m:mn></m:msup><m:mo>+</m:mo><m:mi>K</m:mi><m:mi>L</m:mi></m:mrow><m:mrow><m:mn>4</m:mn><m:msup><m:mrow><m:mo stretchy="false">(</m:mo><m:mi>s</m:mi><m:mi>L</m:mi><m:mo stretchy="false">)</m:mo></m:mrow><m:mn>2</m:mn></m:msup></m:mrow></m:mfrac></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaGaeq4Wdm3aa0baaSqaaiabdsha0bqaaiabikdaYaaakiabg2da9KqbaoaalaaabaGaeiikaGIaeq4Wdm3aaSbaaeaacqWGlbWsaeqaaiabdYeamjabcMcaPmaaCaaabeqaaiabikdaYaaacqGHRaWkcqWGlbWscqWGmbataeaacqaI0aancqGGOaakcqWGZbWCcqWGmbatcqGGPaqkdaahaaqabeaacqaIYaGmaaaaaaaa@422E@</m:annotation></m:semantics></m:math></inline-formula>, where <it>&#963;</it><sub><it>K </it></sub>is the standard deviation of the estimate of <it>K</it>, and <it>L </it>is the length of the intra-element LTR alignment.</p>
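               <p>As a worked illustration of the formulas in this step, the following Python sketch estimates <it>K </it>under the Kimura 2-parameter model from a pairwise alignment and converts it into an insertion time with its standard deviation. It is not the RE<smcaps>ANNOTATE</smcaps> source code: the function names are hypothetical, alignment columns containing a gap or an ambiguous base are skipped (an assumption), the number of compared sites is used as <it>L </it>in the example, and the substitution rate in the example is an arbitrary placeholder.</p>

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned sequences.

    Returns (K, variance of K, number of sites compared).
    """
    transitions = transversions = sites = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: skip gaps and ambiguous bases
        sites += 1
        if a == b:
            continue
        if {a, b} == PURINES or {a, b} == PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites in the alignment")
    p = transitions / sites      # proportion of transitional differences
    q = transversions / sites    # proportion of transversional differences
    w1 = 1 - 2 * p - q
    w2 = 1 - 2 * q
    if not (w1 > 0 and w2 > 0):
        raise ValueError("sequences too divergent for the K2P correction")
    k = -0.5 * math.log(w1 * math.sqrt(w2))
    a_coef = 1 / w1
    b_coef = 0.5 * (1 / w1 + 1 / w2)
    var_k = (a_coef ** 2 * p + b_coef ** 2 * q
             - (a_coef * p + b_coef * q) ** 2) / sites
    return k, var_k, sites


def insertion_time(k, sigma_k, s, length):
    """Return (t, standard deviation of t) using t = K / (2 s)."""
    if not s > 0 or not length > 0:
        raise ValueError("rate and alignment length must be positive")
    t = k / (2 * s)
    # Variance propagated from K plus the Poisson term K * L.
    var_t = ((sigma_k * length) ** 2 + k * length) / (4 * (s * length) ** 2)
    return t, math.sqrt(var_t)


if __name__ == "__main__":
    ltr_5prime = "ACGTTGCAAGGCTTAACCGT"   # made-up aligned LTR sequences
    ltr_3prime = "ACGTTGCAGGGCTT-ACCGT"
    k, var_k, sites = k2p_distance(ltr_5prime, ltr_3prime)
    t, sd_t = insertion_time(k, math.sqrt(var_k), s=1.0e-8, length=sites)
    print(k, t, sd_t)
```

               <p>The factor of 2 in the denominator reflects that substitutions accumulate independently in both LTRs after integration, so the divergence between them grows at twice the per-lineage rate.</p>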
            </sec>
         </sec>
         <sec>
            <st>
               <p>Gapped sequence output</p>
            </st>
            <p>Each repetitive element model constructed by RE<smcaps>ANNOTATE</smcaps> is associated with a given reference element <it>r</it>; each similarity hit defragmented into a given model is locally aligned to <it>r </it>by the sequence similarity search engine used with R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps>, either WU<smcaps>BLAST</smcaps><abbrgrp><abbr bid="B37">37</abbr></abbrgrp> or <smcaps>CROSS_MATCH</smcaps><abbrgrp><abbr bid="B41">41</abbr></abbrgrp>. For each model RE<smcaps>ANNOTATE</smcaps> outputs the chromosomal sequences (that have been locally aligned to <it>r</it>) of its hits, separated by gaps if necessary. The gaps refer to local alignment positions along <it>r</it>: in the output model sequence the gap length between two hits does not correspond to their distance along the chromosomal sequence, but rather to the distance between the hits' terminal alignment positions along <it>r</it>. Terminal gaps are also included if the hits do not align as far as the termini of <it>r</it>. An example is given in Figure <figr fid="F2">2</figr>.</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Sequence output</p>
               </caption>
               <text>
                  <p><b>Sequence output</b>. The element model shown in green (in A) defragmented three regions of human chromosome Y homologous to segments of the reference sequence of the DNA transposon family <smcaps>MER58B</smcaps> (<smcaps>CHESHIRE_B</smcaps>), shown in B. Hits marked 1 and 2 (in A) are separated on the chromosome by only 26 bp, but in the output model sequence (shown in C) their respective sequences are separated by an internal gap of length 79 &#8211; this is the distance along the reference <smcaps>MER58B</smcaps> separating its segments that are aligned to hits 1 and 2. In contrast, the sequences of hits 2 and 3 are output contiguously (without an intervening gap) because they match contiguous segments of <smcaps>MER58B</smcaps> &#8211; even though the corresponding chromosomal regions are separated by an A<smcaps>LU</smcaps>S<smcaps>X</smcaps> SINE insertion (blue box above the green model in A). The terminal gap in the model sequence is added to indicate that the annotated alignment of hit 3 ends five nucleotide positions short of the 5' terminus of the <smcaps>MER58B</smcaps> sequence.</p>
               </text>
               <graphic file="1471-2164-9-614-2"/>
            </fig>
            <p>If RE<smcaps>ANNOTATE</smcaps> is run with the option to estimate the age of structurally complete LTR-element models then the <smcaps>CLUSTAL</smcaps>W2 pairwise alignments of gapped intra-element LTR sequences are also output.</p>
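            <p>The construction of a gapped model sequence described in this section can be sketched as follows. This is a simplified illustration under stated assumptions, not the RE<smcaps>ANNOTATE</smcaps> implementation: the function name and the tuple layout are hypothetical, reference coordinates are taken to be 1-based and inclusive, all hits are assumed to be in the orientation of <it>r</it>, and hits that overlap on <it>r </it>are simply output contiguously.</p>

```python
from typing import List, Tuple

# (start on r, end on r, chromosomal sequence of the hit)
Hit = Tuple[int, int, str]


def gapped_model_sequence(hits: List[Hit], ref_length: int,
                          gap_char: str = "-") -> str:
    """Join hit sequences, padding with gaps measured along the reference r."""
    ordered = sorted(hits, key=lambda hit: hit[0])
    parts = []
    previous_end = 0  # last reference position covered so far
    for ref_start, ref_end, sequence in ordered:
        # Gap length is the distance along r, not along the chromosome.
        gap = max(0, ref_start - previous_end - 1)
        parts.append(gap_char * gap)
        parts.append(sequence)
        previous_end = max(previous_end, ref_end)
    # Terminal gap if the last hit stops short of the end of r.
    parts.append(gap_char * max(0, ref_length - previous_end))
    return "".join(parts)


if __name__ == "__main__":
    # Made-up hits: the second and third match contiguous segments of r.
    example_hits = [(1, 4, "ACGT"), (8, 10, "TTG"), (11, 13, "CAA")]
    print(gapped_model_sequence(example_hits, ref_length=15))
```

            <p>Because every model of a family is padded against the same reference, the resulting sequences share a common coordinate frame, which is what allows elements with large indels relative to one another to be compared column by column.</p>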
         </sec>
         <sec>
            <st>
               <p>Evaluation of RE<smcaps>ANNOTATE</smcaps>'s predictions</p>
            </st>
            <p>Accuracy of the defragmentation layer of re-annotation was assessed by its sensitivity and specificity at the element level. A high sensitivity would indicate that most of the hits in the input R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> annotation that correspond to fragmented repetitive elements in the manually annotated query sequence have been correctly assembled into multi-hit element models by RE<smcaps>ANNOTATE</smcaps>. A high specificity would indicate that few hits that are not part of fragmented elements have been incorrectly included into multi-hit element models. The sensitivity and specificity were respectively calculated according to the formulas <inline-formula><m:math name="1471-2164-9-614-i4" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mfrac><m:mrow><m:mi>T</m:mi><m:mi>P</m:mi></m:mrow><m:mrow><m:mi>T</m:mi><m:mi>P</m:mi><m:mo>+</m:mo><m:mi>F</m:mi><m:mi>N</m:mi></m:mrow></m:mfrac></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaqcfa4aaSaaaeaacqWGubavcqWGqbauaeaacqWGubavcqWGqbaucqGHRaWkcqWGgbGrcqWGobGtaaaaaa@3443@</m:annotation></m:semantics></m:math></inline-formula> and <inline-formula><m:math name="1471-2164-9-614-i5" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mfrac><m:mrow><m:mi>T</m:mi><m:mi>N</m:mi></m:mrow><m:mrow><m:mi>T</m:mi><m:mi>N</m:mi><m:mo>+</m:mo><m:mi>F</m:mi><m:mi>P</m:mi></m:mrow></m:mfrac></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaagaart1ev2aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacPC6xNi=xH8viVGI8Gi=hEeeu0xXdbba9frFj0xb9qqpG0dXdb9aspeI8k8fiI+fsY=rqGqVepae9pg0db9vqaiVgFr0xfr=xfr=xc9adbaqaaeGaciGaaiaabeqaaeqabiWaaaGcbaqcfa4aaSaaaeaacqWGubavcqWGobGtaeaacqWGubavcqWGobGtcqGHRaWkcqWGgbGrcqWGqbauaaaaaa@343F@</m:annotation></m:semantics></m:math></inline-formula>, where <it>TP </it>(count of true positive predictions) is the number of hits correctly assembled into multi-hit repetitive element models; <it>FN </it>(count of false negatives) is the number of separate element models constructed by RE<smcaps>ANNOTATE</smcaps> that correspond to the same elements in the manual annotation; <it>TN </it>(count of true negatives) is the number of hits correctly modeled as single-hit elements; and <it>FP </it>(count of false positives) is the number of hits incorrectly assembled into multi-hit models. Accuracy of both the nesting structure and the time layers of re-annotation was calculated as the proportion of RE<smcaps>ANNOTATE</smcaps> predictions in agreement with manually curated annotation.</p>
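            <p>For concreteness, the two accuracy measures defined above can be computed as in the short Python sketch below. The function names and the counts in the example are hypothetical and are not results from this study.</p>

```python
def sensitivity(tp: int, fn: int) -> float:
    """TP / (TP + FN): share of fragmented elements correctly assembled."""
    if tp + fn == 0:
        raise ValueError("sensitivity is undefined when TP + FN is zero")
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """TN / (TN + FP): share of single-hit elements correctly left alone."""
    if tn + fp == 0:
        raise ValueError("specificity is undefined when TN + FP is zero")
    return tn / (tn + fp)


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(sensitivity(tp=45, fn=5))
    print(specificity(tn=90, fp=10))
```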
         </sec>
         <sec>
            <st>
               <p>Input to RE<smcaps>ANNOTATE</smcaps></p>
            </st>
            <sec>
               <st>
                  <p>Analysis of maize and wheat sequences</p>
               </st>
               <p>Both the maize [G<smcaps>EN</smcaps>B<smcaps>ANK</smcaps>:<ext-link ext-link-type="gen" ext-link-id="AF123535.1">AF123535.1</ext-link>] and wheat [G<smcaps>EN</smcaps>B<smcaps>ANK</smcaps>:<ext-link ext-link-type="gen" ext-link-id="AF459639.1">AF459639.1</ext-link>] sequences were annotated with R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> using a cutoff score of 200. Maize repeat sequences in the R<smcaps>EP</smcaps>B<smcaps>ASE</smcaps> U<smcaps>PDATE</smcaps><abbrgrp><abbr bid="B42">42</abbr></abbrgrp> repeatmaskerlibrary version 20050112 were used as a reference library, supplemented with sequences from the TIGR Z<smcaps>EA</smcaps> R<smcaps>EPEAT</smcaps> D<smcaps>ATABASE</smcaps> v3.0 <abbrgrp><abbr bid="B43">43</abbr></abbrgrp>. Library elements <smcaps>CINFUL1_ZM</smcaps> and <smcaps>CINFUL2_ZM</smcaps> are closely related, and hits to these elements were considered together for defragmentation into element models. In the diploid wheat analysis the monocot library from the R<smcaps>EP</smcaps>B<smcaps>ASE</smcaps> U<smcaps>PDATE</smcaps> repeatmaskerlibrary version 20050523 was used.</p>
            </sec>
            <sec>
               <st>
                  <p>Analysis of human and fly sequences</p>
               </st>
               <p>R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> annotation was downloaded directly from the University of California at Santa Cruz (UCSC) Genome Browser web site <url>http://www.genome.ucsc.edu</url>.</p>
               <p>Annotation of the human genome sequences was further processed with a custom script (Additional file <supplr sid="S2">2</supplr>) to suffix the names of reference LTR and IR elements for defragmentation. Additionally, a file with ERV reference name equivalence lists (Additional file <supplr sid="S1">1</supplr>) was input to RE<smcaps>ANNOTATE</smcaps>. This is because many HERV reference sequences in R<smcaps>EP</smcaps>B<smcaps>ASE</smcaps> U<smcaps>PDATE</smcaps> (which were used in the R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> annotation) are closely related but may have disparate names, and for most ERV families the corresponding LTR and IR reference sequences have different names.</p>
            </sec>
         </sec>
         <sec>
            <st>
               <p>Re-annotation output</p>
            </st>
            <p>The main annotation is output to a tab-delimited text file. As an example, the annotation of the maize sequence analysed here is given in Additional file <supplr sid="S3">3</supplr>, and its data fields are described in Additional file <supplr sid="S4">4</supplr>.</p>
            <suppl id="S3">
               <title>
                  <p>Additional file 3</p>
               </title>
               <text>
                  <p><b>Output re-annotation of dispersed repeats in the maize accession [G<smcaps>EN</smcaps>B<smcaps>ANK</smcaps>: </b><ext-link ext-link-type="gen" ext-link-id="AF123535.1">AF123535.1</ext-link><b>].</b> This is the main repeat re-annotation file output by RE<smcaps>ANNOTATE</smcaps> as a tab delimited text file. Each line corresponds to a repetitive element model.</p>
               </text>
               <file name="1471-2164-9-614-S3.txt">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S4">
               <title>
                  <p>Additional file 4</p>
               </title>
               <text>
                  <p><b>Description of data fields in the main re-annotation file.</b> Data fields in Additional file <supplr sid="S3">3</supplr>.</p>
               </text>
               <file name="1471-2164-9-614-S4.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <p>A copy of the input R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> annotation is also output, with the original "ID" column replaced with identifiers of defragmented elements in the main RE<smcaps>ANNOTATE</smcaps> annotation file.</p>
         </sec>
         <sec>
            <st>
               <p>Benchmarking of processing time</p>
            </st>
            <p>In order to illustrate the computational cost of re-annotation, I have benchmarked the CPU time used by RE<smcaps>ANNOTATE</smcaps> for processing the entire <it>Arabidopsis thaliana </it>genome. R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> annotation of the <it>A. thaliana </it>genome was downloaded from <url>http://www.repeatmasker.org/PreMaskedGenomes.html</url>, and the corresponding genome sequence was downloaded from <url>ftp://ftp.arabidopsis.org/home/tair/Genes/TIGR5_genome_release/</url>.</p>
            <p>Running (under GNU/L<smcaps>INUX</smcaps>) on a 3.16 GHz I<smcaps>NTEL</smcaps> E8500 processor, RE<smcaps>ANNOTATE</smcaps> took:</p>
            <p>&#8226; 157 CPU <it>seconds </it>to re-annotate the entire genome and output sequence. Out of these 157 seconds, 62 seconds were used by <smcaps>CLUSTAL</smcaps>W2 to produce alignments. (19213 repetitive element models were constructed, 3619 nested insertions were inferred, and 294 intra-element LTR pairs were aligned).</p>
            <p>&#8226; 107 CPU seconds (57 seconds used by <smcaps>CLUSTAL</smcaps>W2) in total, to re-annotate the five <it>A. thaliana </it>chromosomes if the input R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> annotation was processed separately for each chromosome (i.e. the input annotation for each chromosome was stored in a separate file).</p>
            <p>&#8226; 195 CPU seconds (86 seconds used by <smcaps>CLUSTAL</smcaps>W2) to re-annotate the entire genome when equivalence lists of reference repeats were used to defragment chromosomal elements matching multiple reference sequences. (Nine equivalence lists were used: 1. <it>athila4 athila4A athila4B athila4C</it>; 2. <it>athila8A athila8B</it>; 3. <it>athila6 athila athila2 athila0</it>; 4. <it>athila6A athila6B athila6</it>; 5. <it>athila athila5 athila2</it>; 6. <it>athila0 athila3</it>; 7. <it>atgp3 atgp5 atgp7</it>; 8. <it>atgp1 atgp2N</it>; 9. <it>ATREP3 ATREP4</it>). (18880 element models were constructed, 3673 nested insertions were inferred, and 393 intra-element LTR pairs were aligned).</p>
            <p>Processing times will be much longer if many equivalence lists (such as those in Additional file <supplr sid="S1">1</supplr>, which number more than 100) are input to RE<smcaps>ANNOTATE</smcaps>.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Results and discussion</p>
         </st>
         <sec>
            <st>
               <p>Re-annotation</p>
            </st>
            <p>RE<smcaps>ANNOTATE</smcaps> creates up to three new 'layers' of repetitive DNA annotation over existing R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps><abbrgrp><abbr bid="B20">20</abbr></abbrgrp> annotation taken as input. The new layers of annotation are:</p>
            <sec>
               <st>
                  <p>Layer 1 &#8211; Defragmentation</p>
               </st>
               <p>After integration, the sequence of a repetitive element may become fragmented by subsequent insertion/deletion events or other rearrangements. RE<smcaps>ANNOTATE</smcaps> generates models of repetitive element insertion events into an ancestral state of a query genomic sequence, in order to identify query sequence regions that originated in the same insertion event and that correspond to (possibly) multiple hits in the input R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> annotation. Therefore for a given re-annotated query sequence the number of TE hits in the R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> annotation is usually greater than the total number of repetitive element models obtained from the defragmentation of hits performed by RE<smcaps>ANNOTATE</smcaps> (the inferred number of insertion events). Figure <figr fid="F3">3A</figr> shows a visual representation of the R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> annotation of multiple similarity hits to TEs along a DNA sequence. Figure <figr fid="F3">3B</figr> shows, as an example, the re-annotation of 6 similarity hits that were defragmented into the same TE model.</p>
               <fig id="F3">
                  <title>
                     <p>Figure 3</p>
                  </title>
                  <caption>
                     <p>Layers of re-annotation: maize adh1 locus</p>
                  </caption>
                  <text>
                      <p><b>Layers of re-annotation: maize adh1 locus. </b>A. Representation of the input REPEATMASKER annotation of 160 Kb of sequence around the adh1 locus of maize cultivar LH82. In the bottom tier pale yellow boxes correspond to un-masked sequence (no similarity to known repeats), and dark vertical lines indicate low-complexity repeats. Boxes in the top tier represent similarity hits to dispersed repeats, separated by vertical lines. Red boxes represent hits to the IR of LTR-elements, orange boxes LTRs, lilac boxes non-LTR retrotransposons, dark green boxes DNA transposons, and the pink box an unknown type of repeat. B. Detail of re-annotation layer 1: Defragmentation. The two bottom tiers represent a portion of the REPEATMASKER annotation shown in A, whilst boxes above indicate 3 IR hits (third tier from bottom) and 3 LTR hits (top tier) defragmented into a single model, and therefore inferred to share an insertion event. This element is labeled 'i' in C and in Table <tblr tid="T1">1</tblr>, and corresponds to element Victim in <abbr bid="B45">45</abbr>. (Three hits modeled as part of the same IR are linked by a red line, two hits modeled as part of the second LTR linked by an orange line). C. Re-annotation layer 2: Nesting Structure. Overlapping element models are shown in their order of insertion as resolved by REANNOTATE. (Figure created by rendering in APOLLO the GFF annotation automatically generated by REANNOTATE). Letters label LTR-elements in Table 1 that were annotated in <abbr bid="B45">45</abbr>. D. Re-annotation layer 3: Time. For 'complete' LTR-elements the number of substitutions (vertical scale) between intra-element LTRs and the time since insertion have been automatically estimated. Upper bounds on the ages of two solo LTRs (labeled 'j' and 'm') could be placed as the elements are inserted into complete LTR-elements. Double-headed arrows span two standard deviations around estimates of K. This figure may be compared to Figures 1 and 2 in <abbr bid="B45">45</abbr>.</p>
                  </text>
                  <graphic file="1471-2164-9-614-3"/>
               </fig>
               <p>Fragmentation of repetitive element sequences in the R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> annotation may also result from divergent or chimeric elements relative to the reference sequences used in the similarity searches, or from matches to closely related sequences in the reference library, rather than sequence evolution of a repetitive element since its integration into the genome. RE<smcaps>ANNOTATE</smcaps> provides a facility to defragment hits in this situation as well (see section "Defragmentation of chromosomal elements matching multiple reference sequences" in Implementation). Additionally, DNA re-arrangements other than transposition of an entire element (e.g. segmental duplication) may occur involving repetitive elements after their integration; in this situation a single insertion model is not adequate to describe all of a query sequence that is homologous to a given repetitive element in the re-arranged region. Currently for LTR-elements (only) RE<smcaps>ANNOTATE</smcaps> checks for the possibility of re-arrangements that result in multiplication of segments of elements, and generates models of such re-arrangements (see below and Implementation).</p>
               <p>R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> annotation implicitly contains the chromosomal <it>distribution of sequence similarity hits </it>to dispersed repeats along the query sequence. This first layer of re-annotation consequently contains the chromosomal <it>distribution of repetitive element integration events</it>.</p>
            </sec>
            <sec>
               <st>
                  <p>Layer 2 &#8211; Nesting Structure</p>
               </st>
               <p>As the genomic distribution of repetitive elements is not random, families of elements may vary in their distribution patterns, which may reflect biases towards chromatin states or proximity to particular classes of genes or other sequence elements <abbrgrp><abbr bid="B44">44</abbr></abbrgrp>. Clustering of repeats occurs not only at the chromosomal level, but also at smaller scales. In eukaryotic genomes, repeat deserts are commonly punctuated by regions of high repeat density, where subsequent insertions have occurred into previous repetitive element insertions ('nested' elements) <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B24">24</abbr><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr><abbr bid="B27">27</abbr><abbr bid="B28">28</abbr><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr><abbr bid="B31">31</abbr><abbr bid="B32">32</abbr></abbrgrp>. Building upon the insertion models of the <it>Defragmentation </it>layer of re-annotation, RE<smcaps>ANNOTATE</smcaps> resolves any nested structures among the repetitive elements identified. Figure <figr fid="F3">3C</figr> shows resolved clusters of re-annotated TEs. The temporal <it>order of insertion events in nested structures </it>is reflected by the nesting level of each element model.</p>
            </sec>
            <sec>
               <st>
                  <p>Layer 3 &#8211; Time</p>
               </st>
               <p>The direct, long terminal repeats (LTRs) of any individual LTR-retrotransposon or retrovirus are created from the same parent template, and are therefore identical at the time the element integrates into the host molecule. Nucleotide substitutions in either of the two intra-element LTR sequences can accumulate as time passes, so that their sequence divergence works as a molecular timer, set to zero at the time of integration <abbrgrp><abbr bid="B45">45</abbr></abbrgrp>. RE<smcaps>ANNOTATE</smcaps> can generate automated intra-element LTR alignments and then estimate the number of nucleotide substitutions per site (<it>K</it>) that have occurred between the intra-element LTRs of a given element since its insertion event. Estimates of <it>K </it>are a relative measure of time; but if the user can provide a rate of nucleotide substitution, estimates of <it>K </it>can be converted to estimates of absolute ages of (structurally) 'complete' LTR-elements. (Here by 'complete' I refer to elements for which at least a portion of each LTR is still detectable). Although direct dating of insertions is only available for 'complete' LTR-elements (and made possible by the previous <it>Defragmentation </it>layer of re-annotation), in nested structures bounds on the ages of other kinds of sequence elements can be estimated by using the age estimate for an overlapping LTR-element. If a 'complete' LTR-element is found in a nested cluster of repetitive elements, the previous <it>Nesting Structure </it>layer of re-annotation allows the placement of <it>i) </it>an upper bound on the age of any elements nested within, and <it>ii) </it>a lower bound on the age of any elements nesting, the LTR-element. Figure <figr fid="F3">3D</figr> shows examples of both direct dating of insertion events and indirect dating using the nesting structure. Additionally, indirect dating of host molecules is possible in cases such as <it>i) </it>segmental duplications with a differential content of, and <it>ii) </it>haplotypes bearing insertional polymorphism for, 'complete' LTR-elements. Here a paleontological analogy is particularly apt, as the dating of the molecular fossil and that of its sequence context are interrelated, just as real fossils can be dated by either absolute (e.g. radiocarbon) or relative (e.g. stratigraphic) methods. In eukaryotic genomes nesting of repetitive elements (see Background) and insertional polymorphisms (<abbrgrp><abbr bid="B44">44</abbr><abbr bid="B46">46</abbr><abbr bid="B47">47</abbr><abbr bid="B48">48</abbr><abbr bid="B49">49</abbr><abbr bid="B50">50</abbr></abbrgrp>, for example) are common &#8211; in plants particularly for LTR-elements.</p>
               <p>This layer of re-annotation implicitly contains the <it>age distribution </it>of 'complete' LTR-elements, and bounds on the ages of other kinds of sequences that may be found overlapping LTR-elements.</p>
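               <p>The indirect dating logic of this layer can be sketched in Python as follows. This is an illustrative sketch rather than the RE<smcaps>ANNOTATE</smcaps> implementation: the function name, the dictionary layout and the example element names are hypothetical, <it>hosts </it>maps each element to the element it is nested in (or None), and the uncertainty of the age estimates is ignored.</p>

```python
from typing import Dict, Optional, Tuple

Bounds = Tuple[Optional[float], Optional[float]]  # (lower, upper) age bound


def age_bounds(hosts: Dict[str, Optional[str]],
               ages: Dict[str, float]) -> Dict[str, Bounds]:
    """Bound the ages of undated elements from dated LTR-elements.

    An element is younger than every element it is nested in (upper bound)
    and older than every element nested inside it (lower bound).
    """
    bounds: Dict[str, Bounds] = {}
    for name in hosts:
        if name in ages:
            continue  # directly dated elements need no bounds
        upper = None
        ancestor = hosts[name]
        while ancestor is not None:
            if ancestor in ages:
                age = ages[ancestor]
                upper = age if upper is None else min(upper, age)
            ancestor = hosts[ancestor]
        lower = None
        for dated, age in ages.items():
            ancestor = hosts.get(dated)
            while ancestor is not None:
                if ancestor == name:
                    lower = age if lower is None else max(lower, age)
                    break
                ancestor = hosts[ancestor]
        bounds[name] = (lower, upper)
    return bounds


if __name__ == "__main__":
    # Made-up cluster: a solo LTR inside a dated element inside a host.
    example_hosts = {"host": None, "complete_ltr": "host", "solo": "complete_ltr"}
    example_ages = {"complete_ltr": 2.0}  # arbitrary age, in million years
    print(age_bounds(example_hosts, example_ages))
```

               <p>In this made-up cluster the solo LTR receives only an upper bound and the outermost element only a lower bound, mirroring the two cases <it>i) </it>and <it>ii) </it>described above.</p>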
            </sec>
            <sec>
               <st>
                  <p>Visualisation of re-annotation</p>
               </st>
               <p>In addition to the main re-annotation output to tab-delimited text files, for visualisation and human analysis of the automated annotation RE<smcaps>ANNOTATE</smcaps> generates a General Feature Format (GFF) annotation file. The GFF annotation can be visualised (using a configuration file distributed with RE<smcaps>ANNOTATE</smcaps>) in the A<smcaps>POLLO</smcaps> genome browser <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>. Figure <figr fid="F3">3C</figr> is taken from a screenshot of such a visualisation. The GFF output can be combined with any other kinds of annotation of the query sequences that are available in the GFF format.</p>
            </sec>
         </sec>
         <sec>
            <st>
               <p>Sequence output</p>
            </st>
            <p>RE<smcaps>ANNOTATE</smcaps> retrieves (from the query) and assembles the sequence of all repetitive element models constructed. For each element model, it outputs the sequences of all the fossil fragments associated with a given model without intervening (unrelated) regions of the query sequence (Figure <figr fid="F2">2</figr>). Each pair of neighbouring fragment sequences is either <it>i) </it>output contiguously if they match contiguous segments of the reference sequence, or <it>ii) </it>output separated by a gap of length equal to the distance between the respective matching segments of the reference sequence (see Figure <figr fid="F2">2</figr> and Implementation). Thus these gapped sequences reflect the element insertion model and exclude the sequence of subsequent insertions into the element. For each 'family' of repeats (where 'family' refers to the set of elements in the query sequence that share a closest homologue in the reference library): <it>gapped element sequences are all aligned relative to the reference sequence</it>. Alignment of all the elements within a family, whose sequences may contain large indels relative to one another, is a powerful feature for the evolutionary (phylogenetic) analysis of repetitive elements.</p>
         </sec>
         <sec>
            <st>
               <p>Validation: nested transposable elements in cereal genomes</p>
            </st>
            <p>In order to validate annotation and evolutionary analysis generated by RE<smcaps>ANNOTATE</smcaps>, I have re-annotated repetitive elements in maize and wheat sequences, and then compared their automated re-annotation with published human annotation of these sequences. The term 'molecular paleontology' was coined by SanMiguel <it>et al</it>. <abbrgrp><abbr bid="B45">45</abbr></abbrgrp>, who produced manual annotation of nested clusters of LTR-retrotransposons within 240 Kb around the <it>adh1 </it>locus of a particular maize cultivar and inferred a doubling of the maize genome size due to LTR-retrotransposon activity over a period of three million years &#8211; an inference made possible by dating insertion events using intra-element LTR sequence divergence. Approximately 160 Kb of contiguous sequence and annotation for this locus are available. Detailed annotation of highly nested clusters of LTR-retrotransposons (including age estimates for complete elements) and other types of TEs is also available for a 215 Kb segment of chromosome 5A<sup><it>m </it></sup>of the diploid wheat <it>Triticum monococcum </it><abbrgrp><abbr bid="B51">51</abbr></abbrgrp>. Both the 160 Kb maize sequence and the 215 Kb diploid wheat sequence were annotated with R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> and re-annotated with RE<smcaps>ANNOTATE</smcaps> (see Implementation for details). The automated re-annotation and evolutionary analyses are illustrated in Figure <figr fid="F3">3</figr> (maize) and Figure <figr fid="F4">4</figr> (wheat), and compared to the manually curated annotation in Table <tblr tid="T1">1</tblr>.</p>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Comparison between automated and human annotation of TEs</p>
               </caption>
               <tblbdy cols="11">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="2" ca="center">
                        <p>repeat<sup><it>a</it></sup></p>
                     </c>
                     <c ca="center">
                        <p>hits<sup><it>b</it></sup></p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>nests<sup><it>c</it></sup></p>
                     </c>
                     <c cspan="2" ca="center">
                        <p><it>K </it>&#177; s.d. (&#215; 10<sup>3</sup>)<sup><it>d</it></sup></p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>time &#177; s.d. (Mya)<sup><it>e</it></sup></p>
                     </c>
                     <c ca="left">
                        <p>type</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>a</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Ji-6</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>PREM2_ZM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>2</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="left">
                        <p>...</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>...</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>b</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Tekay</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>TEKAY_ZM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>1</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>...</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>...</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>c</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Rle</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>REINA</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>0</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>d</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Cinful-2</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>CINFUL2_ZM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>0</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>e</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Milt</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>00081</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>0</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>20.3 &#177; 5.5</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>>2.4 &#177; 1.4</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>1.56 &#177; 0.42</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>> .18 &#177; .15</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>*</p>
                     </c>
                     <c ca="left">
                        <p>00081</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>0</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>f</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Opie-2</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>OPIE2_ZM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>1</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>2.4 &#177; 1.6</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>2.4 &#177; 1.4</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>0.18 &#177; 0.11</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>0.18 &#177; 0.15</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>g</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Fourf</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>00098</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>0</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>18.1 &#177; 4.1</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>18.2 &#177; 4.1</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>1.39 &#177; 0.32</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>1.40 &#177; 0.44</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>h</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Huck-2</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>HUCK1</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>1</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>12.3 &#177; 2.9</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>15.3 &#177; 3.1</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>0.95 &#177; 0.22</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>1.18 &#177; 0.34</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>i</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Victim</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>00093</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>0</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>31.4 &#177; 19</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>30.7 &#177; 18</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>2.42 &#177; 1.44</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>2.36 &#177; 1.92</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>j</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Ji-2</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>PREM2_ZM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>1</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>&lt; 31 &#177; 18</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>&lt; 2.4 &#177; 1.9</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>k</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Ji-3</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>PREM2_ZM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>1</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>24.2 &#177; 4.8</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>24.7 &#177; 4.7</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>1.86 &#177; 0.37</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>1.90 &#177; 0.51</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>l</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Opie-3</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>OPIE2_ZM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>2</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>6.4 &#177; 2.3</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>6.4 &#177; 2.3</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>0.49 &#177; 0.18</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>0.49 &#177; 0.25</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>m</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Ji-5</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>PREM2_ZM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>2</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>&lt; 25 &#177; 5</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>&lt; 1.9 &#177; 0.5</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>n</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Ji-4</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>PREM2_ZM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>1</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>21.1 &#177; 4.2</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>20.8 &#177; 4.1</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>1.62 &#177; 0.32</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>1.60 &#177; 0.44</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>o</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Reina</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>REINA</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>0</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>27.0 &#177; 9.8</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>26.4 &#177; 9.4</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>2.08 &#177; 0.75</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>2.03 &#177; 1.02</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>p</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Cinful-1</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>CINFUL1/2_ZM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>1</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>3.4 &#177; 2.4</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>3.4 &#177; 2.4</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>0.26 &#177; 0.18</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>0.26 &#177; 0.26</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>q</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Kake-1</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>00243</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>1</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>...</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>...</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Angela_F2-2</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>ANGELA1_TM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>1</it>
                           <sup>&#8224;</sup>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>2</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>RIRE2 (rice)</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>SABRINA2_TM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>0</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>3</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>SabrinaF_2-2</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>SABRINA2_TM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>0</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>25.9 &#177; 4.2</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>26.6 &#177; 4.2</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>1.99 &#177; 0.32</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>2.04 &#177; 0.46</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>SABRINA3_TM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>&lt; 27 &#177; 4</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>&lt; 2.0 &#177; 0.5</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>SABRINA_HV</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>&lt; 27 &#177; 4</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>&lt; 2.0 &#177; 0.5</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>4</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Nusif_F2-1</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>NUSIF1_TM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>1</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>&lt; 27 &#177; 4</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>&lt; 2.0 &#177; 0.5</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>5</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>RIRE2 (rice)</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>RIRE2</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>0</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>MITE 1-4</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>THALOS_HV</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>0</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>MITE</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>7</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>MITE 2-5</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>TREP220</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>0</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>MITE</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>8</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Veju_F2-1</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>VEJU1_TM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>0</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>10.8 &#177; 5.5</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>10.8 &#177; 5.4</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>0.83 &#177; 0.42</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>0.83 &#177; 0.59</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>9</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Claudia_F2-1</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>CLAUDIA1_TM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>0</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>&gt; 41 &#177; 6</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>> 3.2 &#177; 0.6</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>10</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Latidu_F2-1</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>LATIDU2_TM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>1</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>13.3 &#177; 5.3</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>13.1 &#177; 5.4</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>1.01 &#177; 0.41</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>1.01 &#177; 0.58</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>11</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Wham_F2-1</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>WHAM3_TM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>1</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>40.6 &#177; 5.6</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>41.4 &#177; 5.6</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>3.12 &#177; 0.43</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>3.18 &#177; 0.60</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>12</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Fatima_F2-1</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>FATIMA_TM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>0</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>&gt; 31 &#177; 4</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>> 2.4 &#177; 0.4</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>13</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Sukkula_F2-1</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>SUKKULA3_TM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>1</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>29.9 &#177; 2.6</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>2.30 &#177; 0.20</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>SUKKULA3_TM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>1</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>29.9 &#177; 2.6</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>30.7 &#177; 3.5</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>2.30 &#177; 0.20</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>2.36 &#177; 0.37</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>14</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Angela_F2-3</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>ANGELA1_TM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>2</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>&lt; 31 &#177; 4</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>&lt; 2.4 &#177; 0.4</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>15</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Angela_F2-1</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>ANGELA1_TM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>2</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>19.9 &#177; 3.4</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>19.9 &#177; 3.4</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>1.53 &#177; 0.26</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>1.53 &#177; 0.37</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>16</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Sabrina_F2-1</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>SABRINA3_TM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>0</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>17</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Wis_F2-1</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>WIS4_TM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>0</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>58.1 &#177; 6.0</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>57.0 &#177; 6.0</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>4.47 &#177; 0.46</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>4.38 &#177; 0.64</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>18</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Sabrina_G1-1</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>SABRINA1_TM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>0</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>55.8 &#177; 6.1</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>&gt; 39 &#177; 5</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>4.29 &#177; 0.47</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>> 3.0 &#177; 0.6</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>SABRINA1_TM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>0</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>55.8 &#177; 6.1</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>4.29 &#177; 0.47</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>19</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Wham_G1-2</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>WHAM2_TM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>1</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>39.1 &#177; 5.5</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>39.1 &#177; 5.4</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>3.01 &#177; 0.42</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>3.01 &#177; 0.58</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>20</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Sabrina_G1-2</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>SABRINA2_TM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>2</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>34.7 &#177; 4.8</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>35.9 &#177; 4.9</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>2.67 &#177; 0.37</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>2.76 &#177; 0.52</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>21</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Wham_G1-1</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>WHAM1_TM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>3</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>32.2 &#177; 4.9</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>31.6 &#177; 4.8</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>2.48 &#177; 0.38</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>2.43 &#177; 0.52</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>22</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Miuse_G1-1</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>MIUSE1_TM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>2</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>&lt; 39 &#177; 5</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>&lt; 3.0 &#177; 0.6</p>
                     </c>
                     <c ca="left">
                        <p>LINE</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>23</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Latidu_G1-1</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>LATIDU2_TM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>1</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>24</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Eway_G1-1</it>
                           <sup>&#8225;</sup>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>EWAY1_TM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>0</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>0</it>
                           <sup>&#8225;</sup>
                        </p>
                     </c>
                     <c ca="left">
                        <p>73.1 &#177; 18</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>0</it>
                           <sup>&#8225;</sup>
                        </p>
                     </c>
                     <c ca="left">
                        <p>5.62 &#177; 1.87</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>25</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it><smcaps>MITE</smcaps> 4A-10</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>TREP216</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>0</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>MITE</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>26</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it><smcaps>MITE</smcaps> 4A-4B</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>TREP216</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>0</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>MITE</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>27</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Barbara</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>BARBARA_TM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>0</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="11">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>28</p>
                     </c>
                     <c ca="left">
                        <p>
                           <it>Angela_G1-1</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <smcaps>ANGELA6_TM</smcaps>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>0</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1<sup>&#8224;</sup></p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>-</p>
                     </c>
                     <c ca="left">
                        <p>LTR</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>Manual annotation results of maize <abbrgrp><abbr bid="B45">45</abbr></abbrgrp> and diploid wheat <abbrgrp><abbr bid="B51">51</abbr></abbrgrp> sequences are shown in <it>italics</it>. RE<smcaps>ANNOTATE</smcaps> results are shown in regular font style. Only elements spanning sequences that were annotated as TEs both in the manual annotation and in the input (R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps>) to the automated re-annotation are listed. In the first column, letters indicate maize elements and correspond to labels in Figure <figr fid="F3">3C</figr>; numbers indicate wheat elements and correspond to labels in Figure <figr fid="F4">4</figr>.</p>
                  <p><sup><it>a </it></sup>Uppercase names correspond to reference element sequences in R<smcaps>EP</smcaps>B<smcaps>ASE</smcaps> U<smcaps>PDATE</smcaps> (RU), numbers correspond to reference sequences in the TIGR Z<smcaps>EA</smcaps> R<smcaps>EPEAT</smcaps> D<smcaps>ATABASE</smcaps>. Rows without an entry for the manually annotated repeat name indicate that RE<smcaps>ANNOTATE</smcaps> constructed multiple models (one model per row) corresponding to a single element in the manual annotation: for instance, <it>Sabrina_F2-2 </it>corresponds to three automated models, a result due to the fact that (parts of) different RU reference elements, <smcaps>SABRINA2_TM</smcaps>, <smcaps>SABRINA3_TM</smcaps> and <smcaps>SABRINA_HV</smcaps> are closely related, and were best matches (annotated by R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps>) to different segments of <it>Sabrina_F2-2</it>.</p>
                  <p><sup><it>b </it></sup>Number of similarity hits reported by R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> that were defragmented into a single element model by RE<smcaps>ANNOTATE</smcaps>.</p>
                  <p><sup><it>c </it></sup>Number of repetitive elements nesting a given element. (<sup>&#8224;</sup>) The first wheat element listed was annotated in <abbrgrp><abbr bid="B51">51</abbr></abbrgrp> to be inserted into a TE sequence with no detectable similarity to reference elements in RU (absent from the input R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> annotation); the last wheat element was annotated by RE<smcaps>ANNOTATE</smcaps> to be interrupting a fragment of an element homologous to <smcaps>CLAUDIA1_TM</smcaps>, which is not present in the manual annotation.</p>
                  <p><sup><it>d </it></sup>Estimated number of nucleotide substitutions between intra-element LTRs. (*) RE<smcaps>ANNOTATE</smcaps> did not date <it>Milt </it>because the 3' LTR is in inverse orientation relative to the rest of the element: one element model was built from the <it>Milt </it>5' LTR and internal region, and a separate model from the 3' LTR. (<sup>&#8225;</sup>) <it>Eway_G1-1 </it>was originally annotated as having identical LTRs, but they are in fact quite divergent. (...) Elements <it>Ji-6</it>, <it>Tekay </it>and <it>Kake-1 </it>were dated in the original annotation, but these elements are truncated at the ends of the available 160 Kb of contiguous sequence re-annotated here.</p>
                  <p><sup><it>e </it></sup>Estimated time of insertion (million years ago), obtained with the substitution rate for the <it>adh </it>loci of grasses <abbrgrp><abbr bid="B66">66</abbr></abbrgrp>. The standard deviations computed by RE<smcaps>ANNOTATE</smcaps> are larger than in the manual annotation: in the latter the variance in time was propagated from the variance in <it>K</it>, whilst RE<smcaps>ANNOTATE</smcaps> additionally accounts for the Poisson variance (stochasticity) in the accumulation of nucleotide substitutions.</p>
               </tblfn>
            </tbl>
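            <p>The conversion from LTR divergence to insertion time in Table <tblr tid="T1">1</tblr> can be sketched as follows. This is a minimal illustration rather than RE<smcaps>ANNOTATE</smcaps>'s own code. It assumes that the tabulated <it>K </it>values are in units of 10<sup>-3</sup> substitutions per site, that <it>T </it>= <it>K</it>/(2<it>r</it>), and that <it>r </it>= 6.5 &#215; 10<sup>-9</sup> substitutions per site per year (the rate commonly quoted for the <it>adh </it>loci of grasses); the function name is hypothetical. Only the variance in <it>K </it>is propagated, as in the manual annotation; the additional Poisson variance that RE<smcaps>ANNOTATE</smcaps> accounts for is not modelled here.</p>

```python
def insertion_time_mya(k_per_thousand, sd_k_per_thousand, rate=6.5e-9):
    """Return (T, SD) in million years from the LTR divergence K.

    Assumptions (not taken from the article): K is given in units of
    1e-3 substitutions per site, and rate is in substitutions per site
    per year. Only the variance in K is propagated.
    """
    # Both LTRs diverge independently, hence the factor of 2.
    to_myr = 1e-3 / (2.0 * rate) / 1e6
    return k_per_thousand * to_myr, sd_k_per_thousand * to_myr


# Manual-annotation values for Wham_G1-2 in Table 1: K = 39.1 +/- 5.5
t, sd = insertion_time_mya(39.1, 5.5)
print(f"{t:.2f} +/- {sd:.2f} Mya")  # prints "3.01 +/- 0.42 Mya"
```

            <p>Under these assumptions the sketch reproduces the manually annotated value for <it>Wham_G1-2 </it>(3.01 &#177; 0.42). The larger standard deviation reported by RE<smcaps>ANNOTATE</smcaps> for the same element (0.58) reflects the extra Poisson term described in footnote <it>e</it>.</p>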
            <p>Although the analyses above focused on the annotation of LTR-elements, they also validate the automated re-annotation of other kinds of repetitive elements and were chosen for two main reasons: <it>i) </it>Models of LTR-elements in the first (<it>Defragmentation</it>) layer of re-annotation are more complex than models of other repeats. LTR-element models involve defragmentation of LTRs (which are themselves repetitive elements) and defragmentation of the internal regions (which are also repetitive) using the same algorithm as for other repeats, and an additional algorithm to defragment LTRs and internal regions modeled as parts of the structure of the same element (see Implementation). Therefore, validation of the automated defragmentation of LTR-elements can be extended to other repetitive elements. The second layer (<it>Nesting Structure</it>) of re-annotation builds on the first layer and uses the same algorithm for all elements. Finally, the third layer (<it>Time</it>) of re-annotation is directly applicable only to (structurally) complete LTR-elements (though secondary inferences can be made for other kinds of elements if they overlap with a complete LTR-element). Note that results for the <it>Time </it>layer are obtained independently of any information on <it>Nesting Structure </it>(except for the secondary inferences of bounds on the ages of overlapping elements). The <it>Time </it>layer annotation of the LTR-elements in the maize and wheat sequences analysed is completely consistent with the <it>Nesting Structure </it>annotation (Figures <figr fid="F3">3</figr> and <figr fid="F4">4</figr>, and Table <tblr tid="T1">1</tblr>), adding support to the method for dating LTR-elements.
Consistency between nesting structure and age estimates (obtained from a method different from the one used here) has also been shown for mammalian TEs using TCF <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>, suggesting that this result is a general feature of the molecular paleontology of TEs. <it>ii) </it>For the maize and wheat sequences analysed, manually curated annotation of transposable elements was available that contained detailed information on the nesting order of repeats and dating of LTR-elements, providing a standard against which automated inferences can be compared.</p>
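            <p>The consistency criterion between the <it>Time </it>and <it>Nesting Structure </it>layers can be stated as a simple check: an element inserted into another should not be estimated to be older than the element it interrupts. The sketch below is illustrative only; the function, the element names and the ages are hypothetical, and it compares point estimates without taking standard deviations into account.</p>

```python
def nesting_time_conflicts(ages_mya, nested_in):
    """Return (inner, outer) pairs where the inserted element looks older.

    ages_mya maps an element name to its estimated insertion time (Mya);
    nested_in maps an inserted element to the element it interrupts.
    Pairs with a missing age on either side are skipped.
    """
    conflicts = []
    for inner, outer in nested_in.items():
        if inner in ages_mya and outer in ages_mya:
            if ages_mya[inner] > ages_mya[outer]:
                conflicts.append((inner, outer))
    return conflicts


# Hypothetical elements and ages, for illustration only.
ages = {"elem_A": 3.0, "elem_B": 2.7, "elem_C": 2.4}
nesting = {"elem_B": "elem_A", "elem_C": "elem_B"}
print(nesting_time_conflicts(ages, nesting))  # prints []
```

            <p>An empty list corresponds to the situation described above, in which every dated element is younger than the element that nests it.</p>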
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>Re-annotation of highly nested TEs in the diploid wheat genome</p>
               </caption>
               <text>
                  <p><b>Re-annotation of highly nested TEs in the diploid wheat genome</b>. Re-annotation of repeats in a 215 Kb region of <it>Triticum monococcum </it>chromosome 5A<sup><it>m</it></sup>. Numbers label elements listed in Table <tblr tid="T1">1</tblr>. Colour scheme follows Figure <figr fid="F3">3</figr> (except that thin green boxes are specifically MITEs, and on the bottom tier unique genes are shown as light blue boxes). Label "18" appears twice and corresponds to element <it>Sabrina_G1-1 </it>annotated in <abbrgrp><abbr bid="B51">51</abbr></abbrgrp> (Fig. <figr fid="F1">1</figr> in this reference may be compared with this figure); RE<smcaps>ANNOTATE</smcaps> constructed two separate models because when run with default parameters the maximum span of a model is 40 Kb, which is exceeded by the chromosomal span of <it>Sabrina_G1-1</it>. Element models marked with a "*" above a horizontal bar were annotated as part of element <it>Sabrina_F2-2 </it>in <abbrgrp><abbr bid="B51">51</abbr></abbrgrp>, which corresponds to label "3" in this figure (see section 'Limitations and scope for development'). This figure was rendered in A<smcaps>POLLO</smcaps> from a GFF annotation file generated by RE<smcaps>ANNOTATE</smcaps>.</p>
               </text>
               <graphic file="1471-2164-9-614-4"/>
            </fig>
            <p>Here I compared (Table <tblr tid="T1">1</tblr>) only TE sequences that are present both in the manual and in the R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> annotation input to RE<smcaps>ANNOTATE</smcaps>, because only those are relevant for evaluating RE<smcaps>ANNOTATE</smcaps>'s inferences. RE<smcaps>ANNOTATE</smcaps>'s predictions in the three layers of re-annotation achieved high accuracy relative to manual annotation, at 90.9% or above in every layer (Table <tblr tid="T2">2</tblr>; see Implementation for calculation). The actual accuracy of predictions may be higher: for some of the discrepancies, inspection suggests that the automated re-annotation is the correct one, despite the generally high quality of the human-curated annotation. For instance, the given accuracy of the <it>Time </it>layer predictions for the wheat sequence was obtained from 10 correct predictions out of 11 complete LTR-elements dated (90.9%). The one disagreement concerns the element <it>Eway_G1-1 </it>(Table <tblr tid="T1">1</tblr>), which was originally misannotated as having identical intra-element LTRs; the LTR sequences are actually quite divergent, and RE<smcaps>ANNOTATE</smcaps> predicts the element to be the oldest of all complete LTR-elements found in the analysed query sequence.</p>
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Accuracy of RE<smcaps>ANNOTATE</smcaps>'s inferences relative to manual annotation</p>
               </caption>
               <tblbdy cols="5">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="2" ca="center">
                        <p>Defragmentation</p>
                     </c>
                     <c ca="center">
                        <p>Nesting</p>
                     </c>
                     <c ca="center">
                        <p>Time</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>Sensitivity</p>
                     </c>
                     <c ca="center">
                        <p>Specificity</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>maize (AF123535.1)</p>
                     </c>
                     <c ca="center">
                        <p>97.8%</p>
                     </c>
                     <c ca="center">
                        <p>100.0%</p>
                     </c>
                     <c ca="center">
                        <p>100.0%</p>
                     </c>
                     <c ca="center">
                        <p>100.0%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>wheat (AF459639.1)</p>
                     </c>
                     <c ca="center">
                        <p>96.0%</p>
                     </c>
                     <c ca="center">
                        <p>100.0%</p>
                     </c>
                     <c ca="center">
                        <p>93.3%</p>
                     </c>
                     <c ca="center">
                        <p>90.9%</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>Accuracy of predictions in the Defragmentation layer of re-annotation is given by their sensitivity and specificity according to the formulas <inline-formula><m:math name="1471-2164-9-614-i4" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mfrac><m:mrow><m:mi>T</m:mi><m:mi>P</m:mi></m:mrow><m:mrow><m:mi>T</m:mi><m:mi>P</m:mi><m:mo>+</m:mo><m:mi>F</m:mi><m:mi>N</m:mi></m:mrow></m:mfrac></m:mrow></m:semantics></m:math></inline-formula> and <inline-formula><m:math name="1471-2164-9-614-i5" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mfrac><m:mrow><m:mi>T</m:mi><m:mi>N</m:mi></m:mrow><m:mrow><m:mi>T</m:mi><m:mi>N</m:mi><m:mo>+</m:mo><m:mi>F</m:mi><m:mi>P</m:mi></m:mrow></m:mfrac></m:mrow></m:semantics></m:math></inline-formula> respectively, where <it>TP </it>is the count of true positives, <it>TN </it>true negatives, <it>FP </it>false positives, and <it>FN </it>the count of false negatives (see Implementation) relative to the manual annotations. Here a 'prediction' refers to a sequence similarity hit reported by R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> that has been defragmented into a TE model by RE<smcaps>ANNOTATE</smcaps>. For the Nesting Structure and Time layers accuracy is given as the proportion of predictions in agreement with the original manual annotation <abbrgrp><abbr bid="B45">45</abbr><abbr bid="B51">51</abbr></abbrgrp>.</p>
               </tblfn>
            </tbl>
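            <p>The accuracy measures used in Table <tblr tid="T2">2</tblr> can be written out as a short sketch. The function names are hypothetical, and the counts passed to <it>sensitivity </it>are illustrative only (the underlying counts are not given here); the 10-out-of-11 figure for the wheat <it>Time </it>layer is the one quoted in the text.</p>

```python
def sensitivity(tp, fn):
    """TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """TN / (TN + FP)."""
    return tn / (tn + fp)


def agreement(correct, total):
    """Proportion of predictions agreeing with the manual annotation."""
    return correct / total


# Time layer, wheat: 10 of 11 dated LTR-elements agree (from the text).
print(f"{100 * agreement(10, 11):.1f}%")  # prints "90.9%"
# Hypothetical counts, for illustration only.
print(f"{100 * sensitivity(tp=9, fn=1):.1f}%")  # prints "90.0%"
```

            <p>A specificity of 100.0%, as reported for both sequences, corresponds to <it>FP </it>= 0 in the second function.</p>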
         </sec>
         <sec>
            <st>
               <p>Repetitive DNA rearrangements other than transposition</p>
            </st>
            <p>To demonstrate that RE<smcaps>ANNOTATE</smcaps> can correctly detect and annotate DNA rearrangement events (involving TE sequences) that have occurred after integration, I have re-annotated a region of the <it>D. melanogaster </it>genome that has previously been shown to contain a large number of TE fragments that have arisen by the joint effects of integration and duplication <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. This region lies between the <it>Hsp70 </it>genes on chromosome arm 3R and is denoted in ref. <abbrgrp><abbr bid="B52">52</abbr></abbrgrp> as <it>NEST_FBti0020655 </it>(Release 3), and in ref. <abbrgrp><abbr bid="B32">32</abbr></abbrgrp> as "high density region 16 (HDR16)" (Release 4). The re-annotation of <it>NEST_FBti0020655 </it>is shown in Figure <figr fid="F5">5</figr>, which can be compared to Figure <figr fid="F2">2</figr> in ref. <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. This serves to illustrate an additional type of inference that can be generated by RE<smcaps>ANNOTATE</smcaps>: the identification of segmental duplications contained in dispersed repeats (currently for LTR-elements only). Such a rearrangement could result from tandem duplication of a segment of an element, or from inter-element recombination events (e.g. between LTRs). Note that here RE<smcaps>ANNOTATE</smcaps> attempts to construct an explicit model of sequence duplication, so that repetitive sequences arising from duplication can be distinguished from those arising from independent transposition events.</p>
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>Segmental multiplication within a TE cluster in the fly genome</p>
               </caption>
               <text>
                  <p><b>Segmental multiplication within a TE cluster in the fly genome</b>. Re-annotation of a cluster of repeats in the <it>Drosophila melanogaster </it>chromosome arm 3R. The scale shows chromosomal coordinates (Release 3.1 genome sequence). Visualisation scheme as in Figure <figr fid="F3">3</figr>, except that "element" models &#8211; displayed as boxes united by horizontal lines &#8211; no longer indicate sequences sharing an insertion (transposition) event; here a model indicates sequences that resulted from segmental multiplication subsequent to an original insertion. Note the high periodicity of the arrangement. The LTR-elements displayed immediately above the bottom tier all belong to the <smcaps>COPIA2</smcaps> family, the sequences marked with a '*' are all <smcaps>INVADER1</smcaps> LTRs, and the ones marked with a black bar are <smcaps>MICROPIA</smcaps> elements. RE<smcaps>ANNOTATE</smcaps> infers that the <smcaps>COPIA2</smcaps> sequences have been involved in DNA rearrangements other than transposition of an entire element. It is likely that the <smcaps>INVADER1</smcaps> LTR was inserted in a <smcaps>COPIA2</smcaps> LTR prior to the multiplication of the latter. The green box to the left indicates (subsequent) insertion of a <smcaps>PROTOP_A</smcaps> element, and the ones on the right S-elements. (All family names given as in the RU database). This figure was rendered in A<smcaps>POLLO</smcaps> from a GFF annotation file generated by RE<smcaps>ANNOTATE</smcaps>, and it may be compared with Figure <figr fid="F2">2</figr> in ref. <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>.</p>
               </text>
               <graphic file="1471-2164-9-614-5"/>
            </fig>
            <p>The periodic structure of the LTR-element sequences in Figure <figr fid="F5">5</figr> strongly suggests tandem multiplication. The multiple similarity hits (annotated by R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps>) to LTR-elements (in which other repeats are nested) in the structure (inferred by RE<smcaps>ANNOTATE</smcaps>) all belong to the same family (<smcaps>COPIA2</smcaps>), periodically map to the same regions of the reference sequence, and were re-annotated as elements that have been involved in a DNA rearrangement other than transposition. The 'solo' LTRs re-annotated as <it>nested in </it>the <smcaps>COPIA2</smcaps> sequences all belong to the same family (<smcaps>INVADER1</smcaps>), they are also periodically arrayed (with the same spacing as the nesting sequences), and are inserted at the same position within the reference <smcaps>COPIA2</smcaps> LTR.</p>
            <p>Hence it is evident that the <smcaps>INVADER1</smcaps> LTR was inserted in the <smcaps>COPIA2</smcaps> LTR prior to tandem multiplication. Arrangements of the kind found in the <it>D. melanogaster </it>cluster <it>NEST_FBti0020655</it>/HDR16 (Figure <figr fid="F5">5</figr>) have also been reported for LTR-elements in yeast <abbrgrp><abbr bid="B53">53</abbr></abbrgrp>, for human ERVs <abbrgrp><abbr bid="B54">54</abbr></abbrgrp>, and have been detected with RE<smcaps>ANNOTATE</smcaps> in the <it>Arabidopsis thaliana </it>genome (data not shown). Thus it is possible that expansion of clusters of TE sequences via mechanisms in addition to transposition is a common phenomenon in eukaryotic genomes <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>.</p>
            <p>Although DNA rearrangements other than transposition are under-annotated by RE<smcaps>ANNOTATE</smcaps> (only certain kinds of rearrangement are currently detected &#8211; see Implementation), these annotations have value in <it>i) </it>cautioning the user that age estimates may be invalid for LTR-elements that may have been involved in post-integration recombination events; and <it>ii) </it>marking structures with unusual features for human inspection of their annotation.</p>
         </sec>
         <sec>
            <st>
               <p>Genome Paleontology</p>
            </st>
            <p>The advent of automated defragmentation and sequence analysis of fossil fragments of dispersed repeats makes possible the study of the evolutionary dynamics of these elements (and their host molecules) at the scale of entire chromosomes or genomes <abbrgrp><abbr bid="B44">44</abbr><abbr bid="B54">54</abbr></abbrgrp>.</p>
            <p>In order to further illustrate evolutionary analyses that become possible with RE<smcaps>ANNOTATE</smcaps>, here I highlight the re-annotation of dispersed repeats in a human sex chromosome (Y) and in two autosomes (chromosomes 2 and 1).</p>
            <sec>
               <st>
                  <p>Nesting of repeats in the human genome</p>
               </st>
               <p>Nesting patterns and counts of insertion events of repeats re-annotated on human chromosomes Y and 2 are summarised in Table <tblr tid="T3">3</tblr>. One striking result is the scarcity of TE insertions nested in satellite repeats, especially on chromosome Y. Even though, from these data, the possibility of functional constraint on satellite arrays cannot be ruled out, it is plausible that this result reflects a recent expansion of satellite repeats on human chromosome Y. Taking chromosome 2 for comparison, not only is the density of TE insertions into satellite sequence almost four times that on the Y chromosome (0.18 vs 0.05 insertions/Kb), but there are also about ten times as many satellite repeats on the Y as on chromosome 2 (2144 vs 222; Table <tblr tid="T3">3</tblr>). The density of TE insertions into satellite arrays has recently been used to infer an age gradient for domains of such arrays around the primate X chromosome centromere <abbrgrp><abbr bid="B55">55</abbr></abbrgrp>.</p>
               <tbl id="T3">
                  <title>
                     <p>Table 3</p>
                  </title>
                  <caption>
                     <p>Nesting of repeats in human chromosomes Y and 2</p>
                  </caption>
                  <tblbdy cols="9">
                     <r>
                        <c>
                           <p/>
                        </c>
                        <c cspan="2" ca="center">
                           <p>no. of elements<sup><it>a</it></sup></p>
                        </c>
                        <c cspan="2" ca="center">
                           <p>% chromosome<sup><it>b</it></sup></p>
                        </c>
                        <c cspan="2" ca="center">
                           <p>% nested<sup><it>c</it></sup></p>
                        </c>
                        <c cspan="2" ca="center">
                           <p>insertions/Kb<sup><it>d</it></sup></p>
                        </c>
                     </r>
                     <r>
                        <c>
                           <p/>
                        </c>
                        <c ca="right">
                           <p>Y</p>
                        </c>
                        <c ca="left">
                           <p>2</p>
                        </c>
                        <c ca="right">
                           <p>Y</p>
                        </c>
                        <c ca="left">
                           <p>2</p>
                        </c>
                        <c ca="right">
                           <p>Y</p>
                        </c>
                        <c ca="left">
                           <p>2</p>
                        </c>
                        <c ca="right">
                           <p>Y</p>
                        </c>
                        <c ca="left">
                           <p>2</p>
                        </c>
                     </r>
                     <r>
                        <c cspan="9">
                           <hr/>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>SINE</p>
                        </c>
                        <c ca="right">
                           <p>10068</p>
                        </c>
                        <c ca="left">
                           <p>119383</p>
                        </c>
                        <c ca="right">
                           <p>10%</p>
                        </c>
                        <c ca="left">
                           <p>12%</p>
                        </c>
                        <c ca="right">
                           <p>46%</p>
                        </c>
                        <c ca="left">
                           <p>51%</p>
                        </c>
                        <c ca="right">
                           <p>0.48</p>
                        </c>
                        <c ca="left">
                           <p>0.63</p>
                        </c>
                     </r>
                     <r>
                        <c cspan="9">
                           <hr/>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>LINE</p>
                        </c>
                        <c ca="right">
                           <p>6207</p>
                        </c>
                        <c ca="left">
                           <p>65661</p>
                        </c>
                        <c ca="right">
                           <p>25%</p>
                        </c>
                        <c ca="left">
                           <p>21%</p>
                        </c>
                        <c ca="right">
                           <p>32%</p>
                        </c>
                        <c ca="left">
                           <p>39%</p>
                        </c>
                        <c ca="right">
                           <p>0.81</p>
                        </c>
                        <c ca="left">
                           <p>1.66</p>
                        </c>
                     </r>
                     <r>
                        <c cspan="9">
                           <hr/>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>LTR</p>
                        </c>
                        <c ca="right">
                           <p>5200</p>
                        </c>
                        <c ca="left">
                           <p>38395</p>
                        </c>
                        <c ca="right">
                           <p>18%</p>
                        </c>
                        <c ca="left">
                           <p>8%</p>
                        </c>
                        <c ca="right">
                           <p>43%</p>
                        </c>
                        <c ca="left">
                           <p>52%</p>
                        </c>
                        <c ca="right">
                           <p>0.57</p>
                        </c>
                        <c ca="left">
                           <p>0.68</p>
                        </c>
                     </r>
                     <r>
                        <c cspan="9">
                           <hr/>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>satellite</p>
                        </c>
                        <c ca="right">
                           <p>2144</p>
                        </c>
                        <c ca="left">
                           <p>222</p>
                        </c>
                        <c ca="right">
                           <p>5%</p>
                        </c>
                        <c ca="left">
                           <p>0%</p>
                        </c>
                        <c ca="right">
                           <p>4%</p>
                        </c>
                        <c ca="left">
                           <p>37%</p>
                        </c>
                        <c ca="right">
                           <p>0.05</p>
                        </c>
                        <c ca="left">
                           <p>0.18</p>
                        </c>
                     </r>
                     <r>
                        <c cspan="9">
                           <hr/>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>DNA</p>
                        </c>
                        <c ca="right">
                           <p>1464</p>
                        </c>
                        <c ca="left">
                           <p>26886</p>
                        </c>
                        <c ca="right">
                           <p>2%</p>
                        </c>
                        <c ca="left">
                           <p>3%</p>
                        </c>
                        <c ca="right">
                           <p>35%</p>
                        </c>
                        <c ca="left">
                           <p>46%</p>
                        </c>
                        <c ca="right">
                           <p>1.12</p>
                        </c>
                        <c ca="left">
                           <p>0.88</p>
                        </c>
                     </r>
                     <r>
                        <c cspan="9">
                           <hr/>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>RNA</p>
                        </c>
                        <c ca="right">
                           <p>55</p>
                        </c>
                        <c ca="left">
                           <p>778</p>
                        </c>
                        <c ca="right">
                           <p>0%</p>
                        </c>
                        <c ca="left">
                           <p>0%</p>
                        </c>
                        <c ca="right">
                           <p>54%</p>
                        </c>
                        <c ca="left">
                           <p>56%</p>
                        </c>
                        <c ca="right">
                           <p>0.36</p>
                        </c>
                        <c ca="left">
                           <p>0.26</p>
                        </c>
                     </r>
                     <r>
                        <c cspan="9">
                           <hr/>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>repeats total</p>
                        </c>
                        <c ca="right">
                           <p>25157</p>
                        </c>
                        <c ca="left">
                           <p>251667</p>
                        </c>
                        <c ca="right">
                           <p>59%</p>
                        </c>
                        <c ca="left">
                           <p>45%</p>
                        </c>
                        <c ca="right">
                           <p>38%</p>
                        </c>
                        <c ca="left">
                           <p>48%</p>
                        </c>
                        <c ca="right">
                           <p>0.65</p>
                        </c>
                        <c ca="left">
                           <p>1.19</p>
                        </c>
                     </r>
                     <r>
                        <c cspan="9">
                           <hr/>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>non-repetitive<sup>&#8225;</sup></p>
                        </c>
                        <c ca="right">
                           <p>-</p>
                        </c>
                        <c ca="left">
                           <p>-</p>
                        </c>
                        <c ca="right">
                           <p>41%</p>
                        </c>
                        <c ca="left">
                           <p>55%</p>
                        </c>
                        <c ca="right">
                           <p>-</p>
                        </c>
                        <c ca="left">
                           <p>-</p>
                        </c>
                        <c ca="right">
                           <p>1.33<sup>&#8224;</sup></p>
                        </c>
                        <c ca="left">
                           <p>1.00</p>
                        </c>
                     </r>
                  </tblbdy>
                  <tblfn>
                     <p><sup><it>a</it></sup>Number of repetitive elements (models) in each group. <sup><it>b</it></sup>Percentage of available (cytologically euchromatic) chromosome sequence. <sup><it>c</it></sup>Percentage of elements in each group that are inserted into any other repetitive element. <sup><it>d</it></sup>Density of repetitive element insertions (of any kind) per kilo base pairs of sequence within each group. <sup>&#8225;</sup>Non-repetitive sequence here means that similarity to known families of high complexity repeats was not detected, but it includes low-complexity repeats. <sup>&#8224;</sup>Excludes satellite repeats arrayed in tandem. (Satellites considered are high-complexity repeats).</p>
                  </tblfn>
               </tbl>
               <p>Another noteworthy nesting pattern in the re-annotation is the difference between LINEs on the one hand, and SINEs and LTR-elements on the other. On chromosome 2, LINEs harbour on average over twice the density of TE insertions (of any kind) per unit of LINE sequence as do SINEs per unit of SINE sequence or LTR-elements per unit of LTR-element sequence. In addition, on chromosome 2 (and on the Y) the proportion of SINE and LTR-element insertions nested into other repetitive elements is higher than that of LINEs (Table <tblr tid="T3">3</tblr>). Multiple factors such as age, base composition, and insertional biases may contribute to these differences, although automated analysis of such factors is beyond what is currently implemented in RE<smcaps>ANNOTATE</smcaps>.</p>
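               <p>The ratios quoted in this section can be checked directly against Table <tblr tid="T3">3</tblr>. The following is a minimal sketch that only re-derives them from the tabulated values; the dictionary layout and variable names are illustrative assumptions and are not part of RE<smcaps>ANNOTATE</smcaps>.</p>

```python
# Values transcribed from Table 3 as (chromosome Y, chromosome 2).
# The data layout and names below are illustrative assumptions.
TABLE3 = {
    "SINE":      {"elements": (10068, 119383), "ins_per_kb": (0.48, 0.63)},
    "LINE":      {"elements": (6207, 65661),   "ins_per_kb": (0.81, 1.66)},
    "LTR":       {"elements": (5200, 38395),   "ins_per_kb": (0.57, 0.68)},
    "satellite": {"elements": (2144, 222),     "ins_per_kb": (0.05, 0.18)},
}
Y, CHR2 = 0, 1  # positions of each chromosome within the tuples

sat = TABLE3["satellite"]
# Density of insertions into satellite sequence: chromosome 2 relative to Y.
sat_density_ratio = sat["ins_per_kb"][CHR2] / sat["ins_per_kb"][Y]
# Number of satellite repeats: Y relative to chromosome 2.
sat_count_ratio = sat["elements"][Y] / sat["elements"][CHR2]

# Insertion density in LINEs relative to SINEs and LTR-elements (chromosome 2).
line_density = TABLE3["LINE"]["ins_per_kb"][CHR2]
line_vs_sine = line_density / TABLE3["SINE"]["ins_per_kb"][CHR2]
line_vs_ltr = line_density / TABLE3["LTR"]["ins_per_kb"][CHR2]

print(f"satellite insertion density, chr2 / Y: {sat_density_ratio:.1f}")
print(f"satellite repeat count, Y / chr2:      {sat_count_ratio:.1f}")
print(f"LINE / SINE insertion density (chr2):  {line_vs_sine:.2f}")
print(f"LINE / LTR insertion density (chr2):   {line_vs_ltr:.2f}")
```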
            </sec>
            <sec>
               <st>
                  <p>Autosomal vs Y chromosome comparison of endogenous retroviral ages</p>
               </st>
               <p>The age distribution of endogenous retroviruses (ERVs) on human chromosomes 1, 2, and Y was obtained from their automated re-annotation, and is shown in Figure <figr fid="F6">6</figr>. ERVs on both chromosomes 1 and 2 have less divergent intra-element LTRs than those on chromosome Y (Wilcoxon rank sum tests, <it>p </it>&lt; 10<sup>-6</sup>), whilst there is no significant age difference between chromosomes 1 and 2 (<it>p </it>= 0.6). The main period(s) of retroviral activity over evolutionary time must have generated most ERV insertions on all chromosomes; therefore the older estimated ages are consistent with a faster rate of evolution on the Y than on chromosomes 1 and 2. Given that the "old" tails of the age distributions on those three chromosomes are similarly shaped, the age difference is not purely an effect of longer persistence of ERVs on the Y. Evidence (from different methods) for a faster rate of evolution of the human Y chromosome relative to autosomes has been previously reported (e.g. <abbrgrp><abbr bid="B56">56</abbr></abbrgrp>).</p>
               <fig id="F6">
                  <title>
                     <p>Figure 6</p>
                  </title>
                  <caption>
                     <p>Age distribution of endogenous retroviruses in the human genome</p>
                  </caption>
                  <text>
                     <p><b>Age distribution of endogenous retroviruses in the human genome</b>. Age distributions of endogenous retroviruses (ERVs) from the automated re-annotation of three human chromosomes. ERV intra-element LTRs on chromosome Y are significantly more divergent than those on chromosomes 2 and 1. Top axis shows the number of substitutions per kilo base pairs of intra-element LTR alignments. Age estimates obtained with a rate of 2.1 &#215; 10<sup>-3 </sup>substitutions per site per million years <abbrgrp><abbr bid="B57">57</abbr></abbrgrp>.</p>
                  </text>
                  <graphic file="1471-2164-9-614-6"/>
               </fig>
               <p>In contrast to plant (cereal) genomes, where LTR-element polymorphisms make the dating of these elements potentially useful for dating haplotypes from different lines within the same species, most primate ERV insertions are ancient. Using the rate of evolution for mammalian repeats estimated in <abbrgrp><abbr bid="B57">57</abbr></abbrgrp>, 2.1 &#215; 10<sup>-3 </sup>substitutions per site per million years, the age distribution of ERVs in the lineage leading to humans peaks at around 40 Myr ago.</p>
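               <p>As an illustration of how intra-element LTR divergence can be converted into an age, the sketch below computes a Kimura two-parameter distance between the two aligned LTRs of one element and divides it by twice the substitution rate. Both choices are assumptions made for this example: the relation T = K / (2r) is the one commonly used for LTR dating, and this section does not state which distance correction or formula RE<smcaps>ANNOTATE</smcaps> applies. The function names are hypothetical.</p>

```python
import math

RATE = 2.1e-3  # substitutions per site per Myr (rate quoted in the text)

def k2p_distance(ltr5, ltr3):
    """Kimura two-parameter distance between two aligned sequences."""
    purines = {"A", "G"}
    sites = transitions = transversions = 0
    for a, b in zip(ltr5.upper(), ltr3.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gap columns and ambiguous bases
        sites += 1
        if a == b:
            continue
        if (a in purines) == (b in purines):
            transitions += 1      # purine to purine, or pyrimidine to pyrimidine
        else:
            transversions += 1    # purine to pyrimidine, or the reverse
    if sites == 0:
        raise ValueError("no comparable sites in the alignment")
    p = transitions / sites
    q = transversions / sites
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))

def age_myr(k, rate=RATE):
    """Age in Myr, ASSUMING both LTRs diverged independently: T = K / (2r)."""
    return k / (2 * rate)

# Toy alignment (hypothetical): one transition in ten sites.
k = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
print(f"K = {k:.4f}, age = {age_myr(k):.1f} Myr")
```

               <p>Under the assumed relation, a divergence of K = 0.168 substitutions per site corresponds to 0.168 / (2 &#215; 2.1 &#215; 10<sup>-3</sup>) = 40 Myr. The sketch ignores sources of error such as gene conversion or inter-element recombination between LTRs.</p>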
            </sec>
         </sec>
         <sec>
            <st>
               <p>Discussion of potential applications</p>
            </st>
            <p>Re-annotation of repetitive elements via automated defragmentation, resolution of nesting structures, and (in some cases) dating of LTR-elements form the basis for the evolutionary analyses exemplified above. Re-annotation also makes possible other kinds of analyses that were not explored in this report. For example, <it>i) </it>analysis of the insertion sites of TEs, which would require proper defragmentation of fossil sequence fragments to identify both termini of the ancestral TE sequence at the time of integration; <it>ii) </it>dating of LTR-elements could be useful for dating events on their host molecules, for instance when comparing haplotypes from cereal genomes where insertional polymorphism is common; and <it>iii) </it>analysis of global patterns of TE family nesting using network or interruption matrix analysis <abbrgrp><abbr bid="B24">24</abbr><abbr bid="B32">32</abbr></abbrgrp>.</p>
            <p>Defragmentation of repeats performed by RE<smcaps>ANNOTATE</smcaps> could also solve a problem that has plagued automated annotation of complex repeats: low-complexity regions <it>within </it>library sequences of reference high-complexity (dispersed) repeats, which results in low-complexity repeats in chromosomal sequence being annotated by R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> as high-complexity repeats <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>. If these regions were masked on the reference sequences <it>prior </it>to their use in similarity searches, multiple hits might be reported even when the chromosomal sequence of a repetitive element is intact &#8211; this artefactual fragmentation is resolved by RE<smcaps>ANNOTATE</smcaps> into an element model, and masking of low-complexity regions in the reference library would be recommended in order to avoid low-complexity sequences being annotated as dispersed repeats. In addition to the re-annotation, the sequence output from RE<smcaps>ANNOTATE</smcaps> also has potential uses that were not explored in this report. Here this output was only utilised for dating LTR-elements, but alignment of all copies within a given family of repeats (which is non-trivial as copies typically have large indels relative to each other) is a powerful resource for evolutionary studies of repetitive elements. For instance, alignment of human ERV sequences (obtained with an early precursor to RE<smcaps>ANNOTATE</smcaps>) supported an analysis showing that members of the HERV-K family have been re-infecting the germline of the human lineage for 30 million years &#8211; from the inference of selective constraint on the HERV-K envelope gene <abbrgrp><abbr bid="B58">58</abbr></abbrgrp>. 
Further examples of potential applications aided by multiple alignments of TE sequences are analyses of <it>i) </it>transition/transversion ratios in tests for the detection of hypermutability associated with cytosine methylation <abbrgrp><abbr bid="B59">59</abbr></abbrgrp>, <it>ii) </it>insertion/deletion spectra, used to estimate rates of spontaneous DNA loss <abbrgrp><abbr bid="B60">60</abbr></abbrgrp>, and <it>iii) </it>evolutionary relationships among individual elements or families via phylogeny re-construction <abbrgrp><abbr bid="B58">58</abbr><abbr bid="B61">61</abbr></abbrgrp>.</p>
         </sec>
         <sec>
            <st>
               <p>Limitations and scope for development</p>
            </st>
            <sec>
               <st>
                  <p>Reference library issues</p>
               </st>
               <p>As RE<smcaps>ANNOTATE</smcaps> takes as input similarity-based (R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps>) annotation, only elements homologous to known families of repeats can be re-annotated. For genomes with poorly characterised repeat families, any similarity-based detection methodology needs to be complemented with <it>de novo </it>repeat discovery [<abbrgrp><abbr bid="B62">62</abbr><abbr bid="B63">63</abbr><abbr bid="B64">64</abbr><abbr bid="B65">65</abbr></abbrgrp>, for example]. Even though <it>de novo </it>discovery is useful for identifying uncharacterised families of repeats, it is not appropriate for genomic sequence annotation as it will fail to predict many repetitive elements that are nested or fragmented. New families should be added to form an augmented reference library, and query sequences annotated with R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> and RE<smcaps>ANNOTATE</smcaps>. Generally, sequence divergence between repetitive element lineages exposes the dependence of similarity-based annotation on the quality and comprehensiveness of the reference library. For defragmentation, the most difficult situation occurs when a chromosomal element is divergent from all available library elements. For example, if the chromosomal element is in parts homologous to two different library elements, then RE<smcaps>ANNOTATE</smcaps> would construct two element models instead of one. Note that RE<smcaps>ANNOTATE</smcaps> does provide a facility to correct defragmentation, provided that this situation is noticed through human inspection in the first place (see below).</p>
               <p>Chimeric elements also present a challenging form of sequence divergence. They arise, for instance, when <it>i) </it>a transposable element nesting an unrelated element is mobilised, transducing the nested element; or <it>ii) </it>recombination between TE or ERV sequences leads to a new replication-competent element. If the progenitors of a chimeric element, but not the element itself, are both represented in the reference library, then R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> will report separate hits to each progenitor reference, and RE<smcaps>ANNOTATE</smcaps> will construct separate element models for segments of the chimeric element. However, if upon human inspection of the automated annotation such chimerism is noticed, RE<smcaps>ANNOTATE</smcaps> does provide an option to combine hits to <it>prescribed </it>reference elements into element models, so that a new round of re-annotation will construct models capturing the full sequence of chimeric elements.</p>
               <p>The optional RE<smcaps>ANNOTATE</smcaps> facility to defragment elements matching multiple reference sequences is also useful for solving another problem &#8211; <it>overrepresentation </it>of a lineage in the reference library, when the library redundantly contains very closely related sequences. Again this may cause R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> to report matches to different reference elements that correspond to segments of the same chromosomal element. Finally, user-prescribed association of reference elements is essential for the construction of LTR-element models if the library entries for the LTR and IR representing the same family of elements have different names (as is the case for primate ERVs in R<smcaps>EP</smcaps>B<smcaps>ASE</smcaps> U<smcaps>PDATE</smcaps>).</p>
               <p>However, the ideal approach for optimum similarity-based annotation is the construction of a reference library that non-redundantly represents repeat lineages as comprehensively as possible. In some cases, even when it is noticed (on human inspection) that chromosomal elements match multiple library sequences, the RE<smcaps>ANNOTATE</smcaps> facility to defragment such elements will not help. This is the case when the rules for element model construction and defragmentation are violated. For example, the element labeled "3" in Table <tblr tid="T1">1</tblr> (first column) and in Figure <figr fid="F4">4</figr> was present in the original human annotation <abbrgrp><abbr bid="B51">51</abbr></abbrgrp> with the name "<it>Sabrina_F2-2</it>", an LTR-retrotransposon. The IR of this element was matched by three distinct reference elements in the library used (Table <tblr tid="T1">1</tblr>: "<smcaps>SABRINA2_TM</smcaps>", "<smcaps>SABRINA3_TM</smcaps>", and "<smcaps>SABRINA_HV</smcaps>"). So in this case RE<smcaps>ANNOTATE</smcaps> constructed three separate element models, which correspond to the element "<it>Sabrina_F2-2</it>" in the human annotation. The longest of these three models (labeled "3" in Figure <figr fid="F4">4</figr>) defragmented four hits to the reference element <smcaps>SABRINA2_TM</smcaps> (one IR and three LTR hits). The other two short element models are marked in Figure <figr fid="F4">4</figr> by a horizontal bar and a "*". They correspond to the hits to <smcaps>SABRINA3_TM</smcaps> and <smcaps>SABRINA_HV</smcaps>. In the current defragmentation algorithm, these shorter hits could never be defragmented into the longer model containing the four <smcaps>SABRINA2_TM</smcaps> hits. This is because the <smcaps>SABRINA2_TM</smcaps> hits in fact have the opposite orientation on the chromosome relative to the <smcaps>SABRINA3_TM</smcaps> and <smcaps>SABRINA_HV</smcaps> hits.</p>
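               <p>The interplay between user-prescribed association of reference elements and the orientation constraint can be sketched as follows. This is a deliberately simplified illustration and not the RE<smcaps>ANNOTATE</smcaps> algorithm: it only pools hits by a user-supplied family label and by strand, and ignores distance and colinearity criteria. The function name, the hit records, the coordinates, and the strand assignments are all hypothetical; only the three reference names are taken from the text.</p>

```python
from collections import defaultdict

# Hypothetical user-prescribed associations: reference name -> family label.
ASSOCIATIONS = {
    "SABRINA2_TM": "SABRINA",
    "SABRINA3_TM": "SABRINA",
    "SABRINA_HV": "SABRINA",
}

def group_hits(hits, associations):
    """Pool hits by (family label, strand); unlisted names keep their own label."""
    groups = defaultdict(list)
    for hit in hits:
        family = associations.get(hit["ref"], hit["ref"])
        groups[(family, hit["strand"])].append(hit)
    return dict(groups)

# Made-up hits; the text states only that the orientations are opposite.
hits = [
    {"ref": "SABRINA2_TM", "start": 1000, "end": 2500, "strand": "+"},
    {"ref": "SABRINA2_TM", "start": 2600, "end": 6000, "strand": "+"},
    {"ref": "SABRINA2_TM", "start": 6100, "end": 6900, "strand": "+"},
    {"ref": "SABRINA2_TM", "start": 7000, "end": 7600, "strand": "+"},
    {"ref": "SABRINA3_TM", "start": 3000, "end": 3400, "strand": "-"},
    {"ref": "SABRINA_HV", "start": 4000, "end": 4300, "strand": "-"},
]

groups = group_hits(hits, ASSOCIATIONS)
for key in sorted(groups):
    print(key, len(groups[key]))
```

               <p>Even with all three reference names associated with one family label, hits on opposite strands remain in separate groups, which mirrors why the shorter hits cannot be merged into the longer model.</p>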
            </sec>
            <sec>
               <st>
                  <p>DNA re-arrangements</p>
               </st>
               <p>RE<smcaps>ANNOTATE</smcaps> assembles fossil sequence fragments colinear with a given reference element into an element model, and the model assumes that such fragments are also colinear with the ancestral sequence of a repetitive element at the time of integration. DNA rearrangements involving a TE after its insertion into the genome may disrupt colinearity of its sequence with a reference element. The issue then arises as to whether to classify the re-arranged sequence as a single repetitive element. RE<smcaps>ANNOTATE</smcaps> will normally construct a separate model for any sequence segment violating colinearity with the reference. DNA rearrangements other than transposition (of an entire element) pose challenges for TE annotation; for example, if a segmental duplication has occurred within an element that remained replication-competent and that <it>subsequently </it>generated new copies, then for these new copies, colinearity with the reference element (which does not contain the duplication) is violated and two separate models are constructed. RE<smcaps>ANNOTATE</smcaps> does construct models of DNA rearrangements involving LTR-elements (which are based on re-arrangements of the common structure of these elements). One of the purposes of such annotation is cautioning against the validity of dating such an element, as the rearranged structure may be the result of inter-element recombination. The occurrence of post-integration, inter-element LTR recombination would invalidate the use of intra-element LTR sequence divergence as a molecular timer. It is also possible that inter-element LTR recombination may occur without altering the structure of the elements involved &#8211; in this case the re-arrangement will remain undetected by the algorithm currently implemented in RE<smcaps>ANNOTATE</smcaps>. 
Nevertheless, the aligned sequence data output by RE<smcaps>ANNOTATE</smcaps> can be used to reconstruct the phylogeny of all LTRs within a family: inter-element recombination would be detected if intra-element LTRs did not cluster on the phylogeny. There is scope for future implementation of algorithms for improving the detection and annotation of recombination, segmental duplication, and inversion events involving repetitive elements.</p>
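               <p>The colinearity requirement discussed above can be illustrated with a small sketch. It is an assumption-laden simplification and not the RE<smcaps>ANNOTATE</smcaps> implementation: fragments are assumed to be on the same strand and in forward orientation, no overlap tolerance is applied, and the function name and coordinates are hypothetical.</p>

```python
def split_colinear(fragments):
    """Split fragments into runs that are colinear with the reference.

    Each fragment is (chrom_start, ref_start, ref_end); all fragments are
    assumed to lie on the same strand, in the forward orientation.
    """
    models = []
    current = []
    for frag in sorted(fragments):  # walk along the chromosome
        # Colinearity is violated when the reference coordinates step back.
        if current and current[-1][2] > frag[1]:
            models.append(current)
            current = []
        current.append(frag)
    if current:
        models.append(current)
    return models

# Hypothetical copy interrupted by a nested insertion: still colinear.
interrupted = [(1000, 1, 500), (4200, 501, 900)]
# Hypothetical copy with an internal duplication of reference positions 300-500.
duplicated = [(1000, 1, 500), (1500, 300, 900)]

print(len(split_colinear(interrupted)), "model(s) for the interrupted copy")
print(len(split_colinear(duplicated)), "model(s) for the duplicated copy")
```

               <p>In this sketch the copy carrying the internal duplication is split into two models, in line with the behaviour described for copies derived from an element that contains a segmental duplication absent from the reference.</p>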
            </sec>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>RE<smcaps>ANNOTATE</smcaps> improves repetitive element annotation of genomic sequences by constructing models of evolutionary events involving dispersed repeats. Currently, automated repetitive element annotation is largely limited by default use of R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps> output, which reports genomic regions that have sequence similarity to known repeats. RE<smcaps>ANNOTATE</smcaps> is ready to post-process existing annotation or to be incorporated into annotation pipelines that use R<smcaps>EPEAT</smcaps>M<smcaps>ASKER</smcaps>. RE<smcaps>ANNOTATE</smcaps> processes the similarity annotation to infer the common origin of dispersed repetitive sequences, resolve complex nesting patterns, and date insertion events of LTR-elements with a detectable structure. These analyses become possible even in genomic regions with a high density of repeats, such as heterochromatin. The annotation and repetitive element model sequences output by RE<smcaps>ANNOTATE</smcaps> therefore provide automated paleontology of complex repeats and, consequently, their host genomes, as the evolution (and possibly some function) of genomes is linked to their repetitive content.</p>
      </sec>
      <sec>
         <st>
            <p>Availability and requirements</p>
         </st>
         <p><b>Project name: </b>REannotate</p>
         <p>
            <b>Project home page: </b>
            <url>http://www.bioinformatics.org/reannotate</url>
         </p>
         <p><b>Operating system(s): </b>GNU/Linux or any other UNIX-like environment</p>
         <p><b>Programming language: </b>Perl</p>
         <p><b>License: </b>GNU GPL</p>
         <p><b>Any restrictions to use by non-academics: </b>None</p>
         <p>The current version of RE<smcaps>ANNOTATE</smcaps> is also available as Additional file <supplr sid="S5">5</supplr>.</p>
         <suppl id="S5">
            <title>
               <p>Additional file 5</p>
            </title>
            <text>
               <p><b>RE<smcaps>ANNOTATE</smcaps>.</b> Current version of RE<smcaps>ANNOTATE</smcaps>.</p>
            </text>
            <file name="1471-2164-9-614-S5.pl">
               <p>Click here for file</p>
            </file>
         </suppl>
         <p>Usage, input, and output are described in the user manual, which is available as Additional file <supplr sid="S6">6</supplr> and also at <url>http://www.bioinformatics.org/reannotate/manual/user_manual.pdf</url>.</p>
         <suppl id="S6">
            <title>
               <p>Additional file 6</p>
            </title>
            <text>
               <p><b>RE<smcaps>ANNOTATE</smcaps> user manual.</b> User manual.</p>
            </text>
            <file name="1471-2164-9-614-S6.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <p>The programme <smcaps>CLUSTAL</smcaps>W <abbrgrp><abbr bid="B39">39</abbr></abbrgrp> or <smcaps>CLUSTAL</smcaps>W2 <url>http://www.clustal.org</url> is currently required for dating 'complete' LTR-elements.</p>
      </sec>
      <sec>
         <st>
            <p>Abbreviations</p>
         </st>
         <p>
ERV: endogenous retrovirus; HERV: human ERV; IR: internal region (of an LTR-element); LINE: long interspersed nuclear element; LTR: long terminal repeat; SINE: short interspersed nuclear element; TE: transposable element.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>I thank Casey Bergman and Adam Eyre-Walker for useful comments.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Interspersed repeats and other mementos of transposable elements in mammalian genomes</p>
            </title>
            <aug>
               <au>
                  <snm>Smit</snm>
                  <fnm>AF</fnm>
               </au>
            </aug>
            <source>Curr Opin Genet Dev</source>
            <pubdate>1999</pubdate>
            <volume>9</volume>
            <issue>6</issue>
            <fpage>657</fpage>
            <lpage>63</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0959-437X(99)00031-3</pubid>
                  <pubid idtype="pmpid" link="fulltext">10607616</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Genome size and the proportion of repeated nucleotide sequence DNA in plants</p>
            </title>
            <aug>
               <au>
                  <snm>Flavell</snm>
                  <fnm>RB</fnm>
               </au>
               <au>
                  <snm>Bennett</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>JB</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>DB</fnm>
               </au>
            </aug>
            <source>Biochem Genet</source>
            <pubdate>1974</pubdate>
            <volume>12</volume>
            <issue>4</issue>
            <fpage>257</fpage>
            <lpage>69</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/BF00485947</pubid>
                  <pubid idtype="pmpid">4441361</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Evidence that a Recent Increase in Maize Genome Size was Caused by the Massive Amplification of Intergene Retrotransposons</p>
            </title>
            <aug>
               <au>
                  <snm>SanMiguel</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Bennetzen</snm>
                  <fnm>JL</fnm>
               </au>
            </aug>
            <source>Ann Bot (Lond)</source>
            <pubdate>1998</pubdate>
            <volume>82</volume>
            <issue>Suppl 1</issue>
            <fpage>37</fpage>
            <lpage>44</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1006/anbo.1998.0746</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Structure, functionality, and evolution of the BARE-1 retrotransposon of barley</p>
            </title>
            <aug>
               <au>
                  <snm>Vicient</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Kalendar</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Anamthawat-J&#243;nsson</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Schulman</snm>
                  <fnm>AH</fnm>
               </au>
            </aug>
            <source>Genetica</source>
            <pubdate>1999</pubdate>
            <volume>107</volume>
            <issue>1&#8211;3</issue>
            <fpage>53</fpage>
            <lpage>63</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1023/A:1003929913398</pubid>
                  <pubid idtype="pmpid" link="fulltext">10952197</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Unequal homologous recombination between LINE-1 elements as a mutational mechanism in human genetic disease</p>
            </title>
            <aug>
               <au>
                  <snm>Burwinkel</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Kilimann</snm>
                  <fnm>MW</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1998</pubdate>
            <volume>277</volume>
            <issue>3</issue>
            <fpage>513</fpage>
            <lpage>7</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.1998.1641</pubid>
                  <pubid idtype="pmpid" link="fulltext">9533876</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Formation of chromosome rearrangements by P factors in <it>Drosophila</it></p>
            </title>
            <aug>
               <au>
                  <snm>Engels</snm>
                  <fnm>WR</fnm>
               </au>
               <au>
                  <snm>Preston</snm>
                  <fnm>CR</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>1984</pubdate>
            <volume>107</volume>
            <issue>4</issue>
            <fpage>657</fpage>
            <lpage>78</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1202383</pubid>
                  <pubid idtype="pmpid" link="fulltext">6086453</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Reconstructing hominid Y evolution: X-homologous block, created by X-Y transposition, was disrupted by Yp inversion through LINE-LINE recombination</p>
            </title>
            <aug>
               <au>
                  <snm>Schwartz</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Chan</snm>
                  <fnm>DC</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>LG</fnm>
               </au>
               <au>
                  <snm>Alagappan</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Pettay</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Disteche</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>McGillivray</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>de la Chapelle</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Page</snm>
                  <fnm>DC</fnm>
               </au>
            </aug>
            <source>Hum Mol Genet</source>
            <pubdate>1998</pubdate>
            <volume>7</volume>
            <fpage>1</fpage>
            <lpage>11</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/hmg/7.1.1</pubid>
                  <pubid idtype="pmpid" link="fulltext">9384598</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>RNA interference, transposons, and the centromere</p>
            </title>
            <aug>
               <au>
                  <snm>Dawe</snm>
                  <fnm>RK</fnm>
               </au>
            </aug>
            <source>Plant Cell</source>
            <pubdate>2003</pubdate>
            <volume>15</volume>
            <issue>2</issue>
            <fpage>297</fpage>
            <lpage>301</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">526033</pubid>
                  <pubid idtype="pmpid" link="fulltext">12566573</pubid>
                  <pubid idtype="doi">10.1105/tpc.150230</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>TAHRE, a novel telomeric retrotransposon from Drosophila melanogaster, reveals the origin of Drosophila telomeres</p>
            </title>
            <aug>
               <au>
                  <snm>Abad</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Pablos</snm>
                  <fnm>BD</fnm>
               </au>
               <au>
                  <snm>Osoegawa</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Jong</snm>
                  <fnm>PJD</fnm>
               </au>
               <au>
                  <snm>Mart&#237;n-Gallardo</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Villasante</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2004</pubdate>
            <volume>21</volume>
            <issue>9</issue>
            <fpage>1620</fpage>
            <lpage>4</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/molbev/msh180</pubid>
                  <pubid idtype="pmpid" link="fulltext">15175413</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Role of transposable elements in heterochromatin and epigenetic control</p>
            </title>
            <aug>
               <au>
                  <snm>Lippman</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Gendrel</snm>
                  <fnm>AV</fnm>
               </au>
               <au>
                  <snm>Black</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Vaughn</snm>
                  <fnm>MW</fnm>
               </au>
               <au>
                  <snm>Dedhia</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>McCombie</snm>
                  <fnm>WR</fnm>
               </au>
               <au>
                  <snm>Lavine</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Mittal</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>May</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Kasschau</snm>
                  <fnm>KD</fnm>
               </au>
               <au>
                  <snm>Carrington</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Doerge</snm>
                  <fnm>RW</fnm>
               </au>
               <au>
                  <snm>Colot</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Martienssen</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2004</pubdate>
            <volume>430</volume>
            <issue>6998</issue>
            <fpage>471</fpage>
            <lpage>6</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature02651</pubid>
                  <pubid idtype="pmpid" link="fulltext">15269773</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Small RNAs correspond to centromere heterochromatic repeats</p>
            </title>
            <aug>
               <au>
                  <snm>Reinhart</snm>
                  <fnm>BJ</fnm>
               </au>
               <au>
                  <snm>Bartel</snm>
                  <fnm>DP</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2002</pubdate>
            <volume>297</volume>
            <issue>5588</issue>
            <fpage>1831</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1077183</pubid>
                  <pubid idtype="pmpid" link="fulltext">12193644</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>A Distinct Small RNA Pathway Silences Selfish Genetic Elements in the Germline</p>
            </title>
            <aug>
               <au>
                  <snm>Vagin</snm>
                  <fnm>VV</fnm>
               </au>
               <au>
                  <snm>Sigova</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Seitz</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Gvozdev</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Zamore</snm>
                  <fnm>PD</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2006</pubdate>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16809489</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>A role for Piwi and piRNAs in germ cell maintenance and transposon silencing in Zebrafish</p>
            </title>
            <aug>
               <au>
                  <snm>Houwing</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kamminga</snm>
                  <fnm>LM</fnm>
               </au>
               <au>
                  <snm>Berezikov</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Cronembold</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Girard</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Elst</snm>
                  <mnm>van den</mnm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Filippov</snm>
                  <fnm>DV</fnm>
               </au>
               <au>
                  <snm>Blaser</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Raz</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Moens</snm>
                  <fnm>CB</fnm>
               </au>
               <au>
                  <snm>Plasterk</snm>
                  <fnm>RHA</fnm>
               </au>
               <au>
                  <snm>Hannon</snm>
                  <fnm>GJ</fnm>
               </au>
               <au>
                  <snm>Draper</snm>
                  <fnm>BW</fnm>
               </au>
               <au>
                  <snm>Ketting</snm>
                  <fnm>RF</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>2007</pubdate>
            <volume>129</volume>
            <fpage>69</fpage>
            <lpage>82</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.cell.2007.03.026</pubid>
                  <pubid idtype="pmpid" link="fulltext">17418787</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Discrete small RNA-generating loci as master regulators of transposon activity in Drosophila</p>
            </title>
            <aug>
               <au>
                  <snm>Brennecke</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Aravin</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Stark</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Dus</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kellis</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sachidanandam</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Hannon</snm>
                  <fnm>GJ</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>2007</pubdate>
            <volume>128</volume>
            <issue>6</issue>
            <fpage>1089</fpage>
            <lpage>103</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.cell.2007.01.043</pubid>
                  <pubid idtype="pmpid" link="fulltext">17346786</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Developmentally regulated piRNA clusters implicate MILI in transposon control</p>
            </title>
            <aug>
               <au>
                  <snm>Aravin</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Sachidanandam</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Girard</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Fejes-Toth</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Hannon</snm>
                  <fnm>GJ</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2007</pubdate>
            <volume>316</volume>
            <issue>5825</issue>
            <fpage>744</fpage>
            <lpage>7</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1142612</pubid>
                  <pubid idtype="pmpid" link="fulltext">17446352</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Origin of a substantial fraction of human regulatory sequences from transposable elements</p>
            </title>
            <aug>
               <au>
                  <snm>Jordan</snm>
                  <fnm>IK</fnm>
               </au>
               <au>
                  <snm>Rogozin</snm>
                  <fnm>IB</fnm>
               </au>
               <au>
                  <snm>Glazko</snm>
                  <fnm>GV</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <issue>2</issue>
            <fpage>68</fpage>
            <lpage>72</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(02)00006-9</pubid>
                  <pubid idtype="pmpid" link="fulltext">12547512</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Transposable elements in mammals promote regulatory variation and diversification of genes with specialized functions</p>
            </title>
            <aug>
               <au>
                  <snm>Lagemaat</snm>
                  <mnm>van de</mnm>
                  <fnm>LN</fnm>
               </au>
               <au>
                  <snm>Landry</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Mager</snm>
                  <fnm>DL</fnm>
               </au>
               <au>
                  <snm>Medstrand</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <issue>10</issue>
            <fpage>530</fpage>
            <lpage>6</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.tig.2003.08.004</pubid>
                  <pubid idtype="pmpid" link="fulltext">14550626</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Developmentally Regulated Activation of a SINE B2 Repeat as a Domain Boundary in Organogenesis</p>
            </title>
            <aug>
               <au>
                  <snm>Lunyak</snm>
                  <fnm>VV</fnm>
               </au>
               <au>
                  <snm>Prefontaine</snm>
                  <fnm>GG</fnm>
               </au>
               <au>
                  <snm>Nunez</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Cramer</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Ju</snm>
                  <fnm>BG</fnm>
               </au>
               <au>
                  <snm>Ohgi</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Hutt</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Roy</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Garcia-Diaz</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Zhu</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Yung</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Montoliu</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Glass</snm>
                  <fnm>CK</fnm>
               </au>
               <au>
                  <snm>Rosenfeld</snm>
                  <fnm>MG</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2007</pubdate>
            <volume>317</volume>
            <issue>5835</issue>
            <fpage>248</fpage>
            <lpage>251</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1140871</pubid>
                  <pubid idtype="pmpid" link="fulltext">17626886</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>A distal enhancer and an ultraconserved exon are derived from a novel retroposon</p>
            </title>
            <aug>
               <au>
                  <snm>Bejerano</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Lowe</snm>
                  <fnm>CB</fnm>
               </au>
               <au>
                  <snm>Ahituv</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>King</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Siepel</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Salama</snm>
                  <fnm>SR</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>Kent</snm>
                  <fnm>WJ</fnm>
               </au>
               <au>
                  <snm>Haussler</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2006</pubdate>
            <volume>441</volume>
            <issue>7089</issue>
            <fpage>87</fpage>
            <lpage>90</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature04696</pubid>
                  <pubid idtype="pmpid" link="fulltext">16625209</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>RepeatMasker Open-3.0</p>
            </title>
            <aug>
               <au>
                  <snm>Smit</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Hubley</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Green</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <pubdate>1996</pubdate>
            <url>http://www.repeatmasker.org</url>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Transposable element annotation of the rice genome</p>
            </title>
            <aug>
               <au>
                  <snm>Juretic</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Bureau</snm>
                  <fnm>TE</fnm>
               </au>
               <au>
                  <snm>Bruskiewich</snm>
                  <fnm>RM</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>20</volume>
            <issue>2</issue>
            <fpage>155</fpage>
            <lpage>60</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bth019</pubid>
                  <pubid idtype="pmpid" link="fulltext">14734305</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Combined evidence annotation of transposable elements in genome sequences</p>
            </title>
            <aug>
               <au>
                  <snm>Quesneville</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Bergman</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Andrieu</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Autard</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Nouaud</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Ashburner</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Anxolabehere</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>PLoS Comput Biol</source>
            <pubdate>2005</pubdate>
            <volume>1</volume>
            <issue>2</issue>
            <fpage>e22</fpage>
            <xrefbib>
               <pubid idtype="doi">10.1371/journal.pcbi.0010022</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Molecular archeology of L1 insertions in the human genome</p>
            </title>
            <aug>
               <au>
                  <snm>Szak</snm>
                  <fnm>ST</fnm>
               </au>
               <au>
                  <snm>Pickeral</snm>
                  <fnm>OK</fnm>
               </au>
               <au>
                  <snm>Makalowski</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Boguski</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Landsman</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Boeke</snm>
                  <fnm>JD</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2002</pubdate>
            <volume>3</volume>
            <issue>10</issue>
            <fpage>research0052</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">134481</pubid>
                  <pubid idtype="pmpid" link="fulltext">12372140</pubid>
                  <pubid idtype="doi">10.1186/gb-2002-3-10-research0052</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Evolutionary History of Mammalian Transposons Determined by Genome-Wide Defragmentation</p>
            </title>
            <aug>
               <au>
                  <snm>Giordano</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Ge</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Gelfand</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Abrus&#225;n</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Benson</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Warburton</snm>
                  <fnm>PE</fnm>
               </au>
            </aug>
            <source>PLoS Comput Biol</source>
            <pubdate>2007</pubdate>
            <volume>3</volume>
            <issue>7</issue>
            <fpage>e137</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1914374</pubid>
                  <pubid idtype="pmpid" link="fulltext">17630829</pubid>
                  <pubid idtype="doi">10.1371/journal.pcbi.0030137</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Nested retrotransposons on the W chromosome of the wild silkworm <it>Bombyx mandarina</it></p>
            </title>
            <aug>
               <au>
                  <snm>Abe</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Sugasaki</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Terada</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Kanehara</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Ohbayashi</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Shimada</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Kawai</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Mita</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Oshiki</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Insect Mol Biol</source>
            <pubdate>2002</pubdate>
            <volume>11</volume>
            <issue>4</issue>
            <fpage>307</fpage>
            <lpage>14</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1365-2583.2002.00339.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">12144695</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Diverse DNA transposons in rotifers of the class Bdelloidea</p>
            </title>
            <aug>
               <au>
                  <snm>Arkhipova</snm>
                  <fnm>IR</fnm>
               </au>
               <au>
                  <snm>Meselson</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2005</pubdate>
            <volume>102</volume>
            <issue>33</issue>
            <fpage>11781</fpage>
            <lpage>6</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1188004</pubid>
                  <pubid idtype="pmpid" link="fulltext">16081532</pubid>
                  <pubid idtype="doi">10.1073/pnas.0505333102</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>The genome sequence of the rice blast fungus <it>Magnaporthe grisea</it></p>
            </title>
            <aug>
               <au>
                  <snm>Dean</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Talbot</snm>
                  <fnm>NJ</fnm>
               </au>
               <au>
                  <snm>Ebbole</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Farman</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Mitchell</snm>
                  <fnm>TK</fnm>
               </au>
               <au>
                  <snm>Orbach</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Thon</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kulkarni</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Xu</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Pan</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Read</snm>
                  <fnm>ND</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>YH</fnm>
               </au>
               <au>
                  <snm>Carbone</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Oh</snm>
                  <fnm>YY</fnm>
               </au>
               <au>
                  <snm>Donofrio</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Jeong</snm>
                  <fnm>JS</fnm>
               </au>
               <au>
                  <snm>Soanes</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Djonovic</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kolomiets</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Rehmeyer</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Harding</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kim</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Lebrun</snm>
                  <fnm>MH</fnm>
               </au>
               <au>
                  <snm>Bohnert</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Coughlan</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Butler</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Calvo</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ma</snm>
                  <fnm>LJ</fnm>
               </au>
               <au>
                  <snm>Nicol</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Purcell</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Nusbaum</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Galagan</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Birren</snm>
                  <fnm>BW</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2005</pubdate>
            <volume>434</volume>
            <issue>7036</issue>
            <fpage>980</fpage>
            <lpage>6</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature03449</pubid>
                  <pubid idtype="pmpid" link="fulltext">15846337</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Insertion preference of maize and rice miniature inverted repeat transposable elements as revealed by the analysis of nested elements</p>
            </title>
            <aug>
               <au>
                  <snm>Jiang</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Wessler</snm>
                  <fnm>SR</fnm>
               </au>
            </aug>
            <source>Plant Cell</source>
            <pubdate>2001</pubdate>
            <volume>13</volume>
            <issue>11</issue>
            <fpage>2553</fpage>
            <lpage>64</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">139471</pubid>
                  <pubid idtype="pmpid" link="fulltext">11701888</pubid>
                  <pubid idtype="doi">10.1105/tpc.13.11.2553</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Retrotransposon-like nature of Tp1 elements: implications for the organisation of highly repetitive, hypermethylated DNA in the genome of <it>Physarum polycephalum</it></p>
            </title>
            <aug>
               <au>
                  <snm>Rothnie</snm>
                  <fnm>HM</fnm>
               </au>
               <au>
                  <snm>McCurrach</snm>
                  <fnm>KJ</fnm>
               </au>
               <au>
                  <snm>Glover</snm>
                  <fnm>LA</fnm>
               </au>
               <au>
                  <snm>Hardman</snm>
                  <fnm>N</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1991</pubdate>
            <volume>19</volume>
            <issue>2</issue>
            <fpage>279</fpage>
            <lpage>86</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">333591</pubid>
                  <pubid idtype="pmpid" link="fulltext">1707520</pubid>
                  <pubid idtype="doi">10.1093/nar/19.2.279</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Nested retrotransposons in the intergenic regions of the maize genome</p>
            </title>
            <aug>
               <au>
                  <snm>SanMiguel</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Tikhonov</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Jin</snm>
                  <fnm>YK</fnm>
               </au>
               <au>
                  <snm>Motchoulskaia</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Zakharov</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Melake-Berhan</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Springer</snm>
                  <fnm>PS</fnm>
               </au>
               <au>
                  <snm>Edwards</snm>
                  <fnm>KJ</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Avramova</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Bennetzen</snm>
                  <fnm>JL</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1996</pubdate>
            <volume>274</volume>
            <issue>5288</issue>
            <fpage>765</fpage>
            <lpage>8</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.274.5288.765</pubid>
                  <pubid idtype="pmpid" link="fulltext">8864112</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Analysis of a contiguous 211 kb sequence in diploid wheat (<it>Triticum monococcum </it>L.) reveals multiple mechanisms of genome evolution</p>
            </title>
            <aug>
               <au>
                  <snm>Wicker</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Stein</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Albar</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Feuillet</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Schlagenhauf</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Keller</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Plant J</source>
            <pubdate>2001</pubdate>
            <volume>26</volume>
            <issue>3</issue>
            <fpage>307</fpage>
            <lpage>16</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1365-313X.2001.01028.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">11439119</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Recurrent insertion and duplication generate networks of transposable element sequences in the <it>Drosophila melanogaster </it>genome</p>
            </title>
            <aug>
               <au>
                  <snm>Bergman</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Quesneville</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Anxolab&#233;h&#232;re</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Ashburner</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2006</pubdate>
            <volume>7</volume>
            <issue>11</issue>
            <fpage>R112</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1794594</pubid>
                  <pubid idtype="pmpid" link="fulltext">17134480</pubid>
                  <pubid idtype="doi">10.1186/gb-2006-7-11-r112</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Discovering and detecting transposable elements in genome sequences</p>
            </title>
            <aug>
               <au>
                  <snm>Bergman</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Quesneville</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Brief Bioinform</source>
            <pubdate>2007</pubdate>
            <volume>8</volume>
            <issue>6</issue>
            <fpage>382</fpage>
            <lpage>92</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bib/bbm048</pubid>
                  <pubid idtype="pmpid" link="fulltext">17932080</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>Apollo: a sequence annotation editor</p>
            </title>
            <aug>
               <au>
                  <snm>Lewis</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Searle</snm>
                  <fnm>SMJ</fnm>
               </au>
               <au>
                  <snm>Harris</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Gibson</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Iyer</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Richter</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Wiel</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Bayraktaroglu</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Birney</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Crosby</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Kaminker</snm>
                  <fnm>JS</fnm>
               </au>
               <au>
                  <snm>Matthews</snm>
                  <fnm>BB</fnm>
               </au>
               <au>
                  <snm>Prochnik</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>CD</fnm>
               </au>
               <au>
                  <snm>Tupy</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>GM</fnm>
               </au>
               <au>
                  <snm>Misra</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Mungall</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Clamp</snm>
                  <fnm>ME</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2002</pubdate>
            <volume>3</volume>
            <issue>12</issue>
            <fpage>RESEARCH0082</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">151184</pubid>
                  <pubid idtype="pmpid" link="fulltext">12537571</pubid>
                  <pubid idtype="doi">10.1186/gb-2002-3-12-research0082</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>PLOTREP: a web tool for defragmentation and visual analysis of dispersed genomic repeats</p>
            </title>
            <aug>
               <au>
                  <snm>T&#243;th</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>De&#225;k</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Barta</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Kiss</snm>
                  <fnm>GB</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <volume>34</volume>
            <fpage>W708</fpage>
            <lpage>13</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1538846</pubid>
                  <pubid idtype="pmpid" link="fulltext">16845104</pubid>
                  <pubid idtype="doi">10.1093/nar/gkl263</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>TEnest: automated chronological annotation and visualization of nested plant transposable elements</p>
            </title>
            <aug>
               <au>
                  <snm>Kronmiller</snm>
                  <fnm>BA</fnm>
               </au>
               <au>
                  <snm>Wise</snm>
                  <fnm>RP</fnm>
               </au>
            </aug>
            <source>Plant Physiol</source>
            <pubdate>2008</pubdate>
            <volume>146</volume>
            <fpage>45</fpage>
            <lpage>59</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2230558</pubid>
                  <pubid idtype="pmpid" link="fulltext">18032588</pubid>
                  <pubid idtype="doi">10.1104/pp.107.110353</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>BLAST</p>
            </title>
            <aug>
               <au>
                  <snm>Gish</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <pubdate>1996</pubdate>
            <url>http://www.advbiocomp.com/</url>
         </bibl>
         <bibl id="B38">
            <title>
               <p>A time-efficient, linear-space local similarity algorithm</p>
            </title>
            <aug>
               <au>
                  <snm>Huang</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>Adv Appl Math</source>
            <pubdate>1991</pubdate>
            <volume>12</volume>
            <fpage>337</fpage>
            <lpage>57</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1016/0196-8858(91)90017-D</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice</p>
            </title>
            <aug>
               <au>
                  <snm>Thompson</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Higgins</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Gibson</snm>
                  <fnm>TJ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1994</pubdate>
            <volume>22</volume>
            <issue>22</issue>
            <fpage>4673</fpage>
            <lpage>80</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">308517</pubid>
                  <pubid idtype="pmpid" link="fulltext">7984417</pubid>
                  <pubid idtype="doi">10.1093/nar/22.22.4673</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences</p>
            </title>
            <aug>
               <au>
                  <snm>Kimura</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>J Mol Evol</source>
            <pubdate>1980</pubdate>
            <volume>16</volume>
            <issue>2</issue>
            <fpage>111</fpage>
            <lpage>20</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/BF01731581</pubid>
                  <pubid idtype="pmpid">7463489</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <title>
               <p>Laboratory of Phil Green</p>
            </title>
            <aug>
               <au>
                  <snm>Green</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <url>http://www.phrap.org</url>
         </bibl>
         <bibl id="B42">
            <title>
               <p>Repbase Update: a database and an electronic journal of repetitive elements</p>
            </title>
            <aug>
               <au>
                  <snm>Jurka</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2000</pubdate>
            <volume>16</volume>
            <issue>9</issue>
            <fpage>418</fpage>
            <lpage>20</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(00)02093-X</pubid>
                  <pubid idtype="pmpid" link="fulltext">10973072</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B43">
            <title>
               <p>The Institute for Genomic Research (TIGR) Plant Repeat Databases</p>
            </title>
            <aug>
               <au>
                  <cnm>TIGR</cnm>
               </au>
            </aug>
            <pubdate>2005</pubdate>
            <url>http://www.tigr.org/tdb/e2k1/plant.repeats/index.shtml</url>
         </bibl>
         <bibl id="B44">
            <title>
               <p>Insertion bias and purifying selection of retrotransposons in the <it>Arabidopsis thaliana </it>genome</p>
            </title>
            <aug>
               <au>
                  <snm>Pereira</snm>
                  <fnm>V</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <issue>10</issue>
            <fpage>R79</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">545599</pubid>
                  <pubid idtype="pmpid" link="fulltext">15461797</pubid>
                  <pubid idtype="doi">10.1186/gb-2004-5-10-r79</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B45">
            <title>
               <p>The paleontology of intergene retrotransposons of maize</p>
            </title>
            <aug>
               <au>
                  <snm>SanMiguel</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Gaut</snm>
                  <fnm>BS</fnm>
               </au>
               <au>
                  <snm>Tikhonov</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Nakajima</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Bennetzen</snm>
                  <fnm>JL</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>1998</pubdate>
            <volume>20</volume>
            <fpage>43</fpage>
            <lpage>5</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/1695</pubid>
                  <pubid idtype="pmpid" link="fulltext">9731528</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B46">
            <title>
               <p>Natural genetic variation caused by transposable elements in humans</p>
            </title>
            <aug>
               <au>
                  <snm>Bennett</snm>
                  <fnm>EA</fnm>
               </au>
               <au>
                  <snm>Coleman</snm>
                  <fnm>LE</fnm>
               </au>
               <au>
                  <snm>Tsui</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Pittard</snm>
                  <fnm>WS</fnm>
               </au>
               <au>
                  <snm>Devine</snm>
                  <fnm>SE</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>2004</pubdate>
            <volume>168</volume>
            <issue>2</issue>
            <fpage>933</fpage>
            <lpage>51</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1448813</pubid>
                  <pubid idtype="pmpid" link="fulltext">15514065</pubid>
                  <pubid idtype="doi">10.1534/genetics.104.031757</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B47">
            <title>
               <p>Evolution of DNA sequence nonhomologies among maize inbreds</p>
            </title>
            <aug>
               <au>
                  <snm>Brunner</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Fengler</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Morgante</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Tingey</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Rafalski</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Plant Cell</source>
            <pubdate>2005</pubdate>
            <volume>17</volume>
            <issue>2</issue>
            <fpage>343</fpage>
            <lpage>60</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">548811</pubid>
                  <pubid idtype="pmpid" link="fulltext">15659640</pubid>
                  <pubid idtype="doi">10.1105/tpc.104.025627</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B48">
            <title>
               <p>Intraspecific violation of genetic colinearity and its implications in maize</p>
            </title>
            <aug>
               <au>
                  <snm>Fu</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Dooner</snm>
                  <fnm>HK</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2002</pubdate>
            <volume>99</volume>
            <issue>14</issue>
            <fpage>9573</fpage>
            <lpage>8</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">123182</pubid>
                  <pubid idtype="pmpid" link="fulltext">12060715</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B49">
            <title>
               <p>Genome evolution of wild barley (<it>Hordeum spontaneum</it>) by BARE-1 retrotransposon dynamics in response to sharp microclimatic divergence</p>
            </title>
            <aug>
               <au>
                  <snm>Kalendar</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Tanskanen</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Immonen</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Nevo</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Schulman</snm>
                  <fnm>AH</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2000</pubdate>
            <volume>97</volume>
            <issue>12</issue>
            <fpage>6603</fpage>
            <lpage>7</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">18673</pubid>
                  <pubid idtype="pmpid" link="fulltext">10823912</pubid>
                  <pubid idtype="doi">10.1073/pnas.110587497</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B50">
            <title>
               <p>Gene movement by Helitron transposons contributes to the haplotype variability of maize</p>
            </title>
            <aug>
               <au>
                  <snm>Lai</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Messing</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Dooner</snm>
                  <fnm>HK</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2005</pubdate>
            <volume>102</volume>
            <issue>25</issue>
            <fpage>9068</fpage>
            <lpage>73</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1157042</pubid>
                  <pubid idtype="pmpid" link="fulltext">15951422</pubid>
                  <pubid idtype="doi">10.1073/pnas.0502923102</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B51">
            <title>
               <p>Transposable elements, genes and recombination in a 215-kb contig from wheat chromosome 5A<sup><it>m</it></sup></p>
            </title>
            <aug>
               <au>
                  <snm>SanMiguel</snm>
                  <fnm>PJ</fnm>
               </au>
               <au>
                  <snm>Ramakrishna</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Bennetzen</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Busso</snm>
                  <fnm>CS</fnm>
               </au>
               <au>
                  <snm>Dubcovsky</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Funct Integr Genomics</source>
            <pubdate>2002</pubdate>
            <volume>2</volume>
            <issue>1-2</issue>
            <fpage>70</fpage>
            <lpage>80</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s10142-002-0056-4</pubid>
                  <pubid idtype="pmpid" link="fulltext">12021845</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B52">
            <title>
               <p>The transposable elements of the <it>Drosophila melanogaster</it> euchromatin: a genomics perspective</p>
            </title>
            <aug>
               <au>
                  <snm>Kaminker</snm>
                  <fnm>JS</fnm>
               </au>
               <au>
                  <snm>Bergman</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Kronmiller</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Carlson</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Svirskas</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Patel</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Frise</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Wheeler</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Lewis</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Rubin</snm>
                  <fnm>GM</fnm>
               </au>
               <au>
                  <snm>Ashburner</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Celniker</snm>
                  <fnm>SE</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2002</pubdate>
            <volume>3</volume>
            <issue>12</issue>
            <fpage>RESEARCH0084</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">151186</pubid>
                  <pubid idtype="pmpid" link="fulltext">12537573</pubid>
                  <pubid idtype="doi">10.1186/gb-2002-3-12-research0084</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B53">
            <title>
               <p>Transposable elements and genome organization: a comprehensive survey of retrotransposons revealed by the complete <it>Saccharomyces cerevisiae</it> genome sequence</p>
            </title>
            <aug>
               <au>
                  <snm>Kim</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Vanguri</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Boeke</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Gabriel</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Voytas</snm>
                  <fnm>DF</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>1998</pubdate>
            <volume>8</volume>
            <issue>5</issue>
            <fpage>464</fpage>
            <lpage>78</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">9582191</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B54">
            <title>
               <p>Effects of recombination rate on human endogenous retrovirus fixation and persistence</p>
            </title>
            <aug>
               <au>
                  <snm>Katzourakis</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Pereira</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Tristem</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>J Virol</source>
            <pubdate>2007</pubdate>
            <volume>81</volume>
            <issue>19</issue>
            <fpage>10712</fpage>
            <lpage>7</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2045447</pubid>
                  <pubid idtype="pmpid" link="fulltext">17634225</pubid>
                  <pubid idtype="doi">10.1128/JVI.00410-07</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B55">
            <title>
               <p>Progressive proximal expansion of the primate X chromosome centromere</p>
            </title>
            <aug>
               <au>
                  <snm>Schueler</snm>
                  <fnm>MG</fnm>
               </au>
               <au>
                  <snm>Dunn</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Bird</snm>
                  <fnm>CP</fnm>
               </au>
               <au>
                  <snm>Ross</snm>
                  <fnm>MT</fnm>
               </au>
               <au>
                  <snm>Viggiano</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Rocchi</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Willard</snm>
                  <fnm>HF</fnm>
               </au>
               <au>
                  <snm>Green</snm>
                  <fnm>ED</fnm>
               </au>
               <au>
                  <cnm>NISC Comparative Sequencing Program</cnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2005</pubdate>
            <volume>102</volume>
            <issue>30</issue>
            <fpage>10563</fpage>
            <lpage>8</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1180780</pubid>
                  <pubid idtype="pmpid" link="fulltext">16030148</pubid>
                  <pubid idtype="doi">10.1073/pnas.0503346102</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B56">
            <title>
               <p>Genomewide comparison of DNA sequences between humans and chimpanzees</p>
            </title>
            <aug>
               <au>
                  <snm>Ebersberger</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Metzler</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Schwarz</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>P&#228;&#228;bo</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Am J Hum Genet</source>
            <pubdate>2002</pubdate>
            <volume>70</volume>
            <issue>6</issue>
            <fpage>1490</fpage>
            <lpage>7</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">379137</pubid>
                  <pubid idtype="pmpid" link="fulltext">11992255</pubid>
                  <pubid idtype="doi">10.1086/340787</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B57">
            <title>
               <p>Covariation in frequencies of substitution, deletion, transposition, and recombination during eutherian evolution</p>
            </title>
            <aug>
               <au>
                  <snm>Hardison</snm>
                  <fnm>RC</fnm>
               </au>
               <au>
                  <snm>Roskin</snm>
                  <fnm>KM</fnm>
               </au>
               <au>
                  <snm>Yang</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Diekhans</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kent</snm>
                  <fnm>WJ</fnm>
               </au>
               <au>
                  <snm>Weber</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Elnitski</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>O'Connor</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kolbe</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Schwartz</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Furey</snm>
                  <fnm>TS</fnm>
               </au>
               <au>
                  <snm>Whelan</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Goldman</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Smit</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Chiaromonte</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Haussler</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2003</pubdate>
            <volume>13</volume>
            <fpage>13</fpage>
            <lpage>26</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">430971</pubid>
                  <pubid idtype="pmpid" link="fulltext">12529302</pubid>
                  <pubid idtype="doi">10.1101/gr.844103</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B58">
            <title>
               <p>Long-term reinfection of the human genome by endogenous retroviruses</p>
            </title>
            <aug>
               <au>
                  <snm>Belshaw</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Pereira</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Katzourakis</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Talbot</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Paces</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Burt</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Tristem</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2004</pubdate>
            <volume>101</volume>
            <issue>14</issue>
            <fpage>4894</fpage>
            <lpage>9</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">387345</pubid>
                  <pubid idtype="pmpid" link="fulltext">15044706</pubid>
                  <pubid idtype="doi">10.1073/pnas.0307800101</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B59">
            <title>
               <p>Cytosine methylation and the ecology of intragenomic parasites</p>
            </title>
            <aug>
               <au>
                  <snm>Yoder</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Walsh</snm>
                  <fnm>CP</fnm>
               </au>
               <au>
                  <snm>Bestor</snm>
                  <fnm>TH</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>1997</pubdate>
            <volume>13</volume>
            <issue>8</issue>
            <fpage>335</fpage>
            <lpage>40</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(97)01181-5</pubid>
                  <pubid idtype="pmpid" link="fulltext">9260521</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B60">
            <title>
               <p>High intrinsic rate of DNA loss in <it>Drosophila</it></p>
            </title>
            <aug>
               <au>
                  <snm>Petrov</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Lozovskaya</snm>
                  <fnm>ER</fnm>
               </au>
               <au>
                  <snm>Hartl</snm>
                  <fnm>DL</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>1996</pubdate>
            <volume>384</volume>
            <issue>6607</issue>
            <fpage>346</fpage>
            <lpage>9</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/384346a0</pubid>
                  <pubid idtype="pmpid" link="fulltext">8934517</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B61">
            <title>
               <p>Definition and variation of human endogenous retrovirus H</p>
            </title>
            <aug>
               <au>
                  <snm>Jern</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Sperber</snm>
                  <fnm>GO</fnm>
               </au>
               <au>
                  <snm>Blomberg</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Virology</source>
            <pubdate>2004</pubdate>
            <volume>327</volume>
            <fpage>93</fpage>
            <lpage>110</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.virol.2004.06.023</pubid>
                  <pubid idtype="pmpid" link="fulltext">15327901</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B62">
            <title>
               <p>Automated de novo identification of repeat sequence families in sequenced genomes</p>
            </title>
            <aug>
               <au>
                  <snm>Bao</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Eddy</snm>
                  <fnm>SR</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <issue>8</issue>
            <fpage>1269</fpage>
            <lpage>76</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">186642</pubid>
                  <pubid idtype="pmpid" link="fulltext">12176934</pubid>
                  <pubid idtype="doi">10.1101/gr.88502</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B63">
            <title>
               <p>LTR_STRUC: a novel search and identification program for LTR retrotransposons</p>
            </title>
            <aug>
               <au>
                  <snm>McCarthy</snm>
                  <fnm>EM</fnm>
               </au>
               <au>
                  <snm>McDonald</snm>
                  <fnm>JF</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <issue>3</issue>
            <fpage>362</fpage>
            <lpage>7</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btf878</pubid>
                  <pubid idtype="pmpid" link="fulltext">12584121</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B64">
            <title>
               <p>Detection of transposable elements by their compositional bias</p>
            </title>
            <aug>
               <au>
                  <snm>Andrieu</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Fiston</snm>
                  <fnm>AS</fnm>
               </au>
               <au>
                  <snm>Anxolab&#233;h&#234;re</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Quesneville</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <fpage>94</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">497039</pubid>
                  <pubid idtype="pmpid" link="fulltext">15251040</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-5-94</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B65">
            <title>
               <p>Automated recognition of retroviral sequences in genomic data &#8211; RetroTector</p>
            </title>
            <aug>
               <au>
                  <snm>Sperber</snm>
                  <fnm>GO</fnm>
               </au>
               <au>
                  <snm>Airola</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Jern</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Blomberg</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2007</pubdate>
            <volume>35</volume>
            <issue>15</issue>
            <fpage>4964</fpage>
            <lpage>76</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1976444</pubid>
                  <pubid idtype="pmpid" link="fulltext">17636050</pubid>
                  <pubid idtype="doi">10.1093/nar/gkm515</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B66">
            <title>
               <p>Substitution rate comparisons between grasses and palms: synonymous rate differences at the nuclear gene <it>Adh</it> parallel rate differences at the plastid gene <it>rbcL</it></p>
            </title>
            <aug>
               <au>
                  <snm>Gaut</snm>
                  <fnm>BS</fnm>
               </au>
               <au>
                  <snm>Morton</snm>
                  <fnm>BR</fnm>
               </au>
               <au>
                  <snm>McCaig</snm>
                  <fnm>BC</fnm>
               </au>
               <au>
                  <snm>Clegg</snm>
                  <fnm>MT</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1996</pubdate>
            <volume>93</volume>
            <issue>19</issue>
            <fpage>10274</fpage>
            <lpage>9</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">38374</pubid>
                  <pubid idtype="pmpid" link="fulltext">8816790</pubid>
                  <pubid idtype="doi">10.1073/pnas.93.19.10274</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
