<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
<ui>1471-2148-11-364</ui>
<ji>1471-2148</ji>
<fm>
<dochead>Research article</dochead>
<bibl>
<title><p>Mechanisms of intron gain and loss in Drosophila</p></title>
<aug>
<au id="A1"><snm>Yenerall</snm><fnm>Paul</fnm><insr iid="I1"/><email>pmy3@pitt.edu</email></au>
<au id="A2"><snm>Krupa</snm><fnm>Bradlee</fnm><insr iid="I2"/><email>bkrupa21@gmail.com</email></au>
<au id="A3" ca="yes"><snm>Zhou</snm><fnm>Leming</fnm><insr iid="I3"/><insr iid="I4"/><email>lmzhou@gmail.com</email></au>
</aug>
<insg>
<ins id="I1"><p>Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260, USA</p></ins>
<ins id="I2"><p>Department of Computer Science, University of Pittsburgh, Pittsburgh, PA 15260, USA</p></ins>
<ins id="I3"><p>Department of Health Information Management, University of Pittsburgh, Pittsburgh, PA 15260, USA</p></ins>
<ins id="I4"><p>Department of Bioengineering, University of Pittsburgh, Pittsburgh, PA 15260, USA</p></ins>
</insg>
<source>BMC Evolutionary Biology</source>
<issn>1471-2148</issn>
<pubdate>2011</pubdate>
<volume>11</volume>
<issue>1</issue>
<fpage>364</fpage>
<url>http://www.biomedcentral.com/1471-2148/11/364</url>
<xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2148-11-364</pubid><pubid idtype="pmpid">22182367</pubid></pubidlist></xrefbib></bibl>
<history><rec><date><day>7</day><month>9</month><year>2011</year></date></rec><acc><date><day>19</day><month>12</month><year>2011</year></date></acc><pub><date><day>19</day><month>12</month><year>2011</year></date></pub></history><cpyrt><year>2011</year><collab>Yenerall et al; licensee BioMed Central Ltd.</collab><note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note></cpyrt>
<abs>
<sec><st><p>Abstract</p></st>
<sec><st><p>Background</p></st>
<p>It is widely accepted that orthologous genes have lost or gained introns throughout evolution. However, the specific mechanisms that generate these changes have proved elusive. Introns are known to affect nearly every level of gene expression. Therefore, understanding their mechanism of evolution after their initial fixation in eukaryotes is pertinent to understanding the means by which organisms develop greater regulation and complexity.</p>
</sec>
<sec><st><p>Results</p></st>
<p>To investigate possible mechanisms of intron gain and loss, we identified 189 intron gain and 287 intron loss events among 11 Drosophila species. We then investigated these events for signatures of previously proposed mechanisms of intron gain and loss. This work constitutes the first comprehensive study into the specific mechanisms that may generate intron gains and losses in Drosophila. We report evidence of intron gain via transposon insertion; the first intron loss that may have occurred via non-homologous end joining; intron gains via the repair of a double strand break; evidence of intron sliding; and evidence that internal or 5' introns may not frequently be deleted via the self-priming of reverse transcription during mRNA-mediated intron loss. Our data also suggest that the transcription process may promote or result in intron gain.</p>
</sec>
<sec><st><p>Conclusion</p></st>
<p>Our findings support the occurrence of intron gain via transposon insertion and the repair of double strand breaks, as well as intron loss via non-homologous end joining. Furthermore, our data suggest that intron gain may be enabled by or due to transcription, and we shed further light on the exact mechanism of mRNA-mediated intron loss.</p>
</sec>
</sec>
</abs>
</fm>
<bdy>
<sec><st><p>Background</p></st>
<p>Spliceosomal introns, segments of RNA that are excised by the spliceosome during the processing of pre-mRNA in eukaryotes, are found in varying quantities and positions among orthologous genes. By identifying orthologs, aligning gene sequences, and coupling intron absences/presences with known species phylogenies, numerous studies have identified the number of intron gains and losses that have occurred among species throughout evolution <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp>. However, very little is known about the molecular mechanisms underlying these changes <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>.</p>
<p>As a deeper understanding of gene expression emerges, it is evident that introns not only increase proteome diversity through their well known role in alternative splicing <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>, but also influence every stage of pre-translational gene expression <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. Important regulatory elements such as miRNAs and snoRNAs are commonly found within introns in animals <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>, and recently introns in the human genome have been shown to harbor thousands of non-coding RNAs, key regulators of gene expression <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. The splicing process alone has been shown to increase transcriptional efficiency and the nuclear export of transcripts <abbrgrp><abbr bid="B15">15</abbr><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr></abbrgrp>. Therefore, understanding the molecular mechanisms that create and remove introns provides insight into one of the mechanisms by which eukaryotic organisms develop greater regulation and complexity.</p>
<p>Two previously hypothesized mechanisms of intron loss are <it><ul>R</ul>everse <ul>T</ul>ranscriptase-<ul>M</ul>ediated <ul>I</ul>ntron <ul>L</ul>oss </it>(referred to as <it>RTMIL </it>in this work) <abbrgrp><abbr bid="B22">22</abbr></abbrgrp> and <it>Genomic Deletions</it>. RTMIL occurs when cDNA, either directly or after retroposition into the genome, recombines with an intron-present gene, resulting in the precise deletion of intron(s) <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>. Genomic deletions are general genomic deletion events that, by chance, delete an intron <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>. Therefore, the genomic deletion of introns may occur via various molecular mechanisms and may produce precise or imprecise intron losses. Recently, double strand break repair (DSBR) by non-homologous end joining (NHEJ) has been implicated as a common means for the genomic deletion of introns <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. RTMIL has been demonstrated in yeast <abbrgrp><abbr bid="B26">26</abbr><abbr bid="B27">27</abbr></abbrgrp>, and general genomic deletions are known to occur. However, the prevalence of each proposed mechanism of intron loss is unknown.</p>
<p>Previously hypothesized mechanisms of intron gain include: <it>Intron Transposition </it><abbrgrp><abbr bid="B28">28</abbr></abbrgrp>, in which an intron transposes or "reverse splices" into a previously intronless position in a transcript, and this transcript is then reverse transcribed and recombined with the original gene; <it>Transposon Insertion </it><abbrgrp><abbr bid="B29">29</abbr></abbrgrp>, in which a transposon inserts into a gene and forms a spliceable intron; <it>Tandem Genomic Duplications </it><abbrgrp><abbr bid="B30">30</abbr></abbrgrp>, in which the tandem duplication of a gene segment creates a spliceable intron; <it>Intron Transfer </it><abbrgrp><abbr bid="B31">31</abbr></abbrgrp>, in which a paralog transfers an intron via gene conversion to an intron-absent position; <it>Insertion of a Group II Intron </it><abbrgrp><abbr bid="B28">28</abbr></abbrgrp>, in which a group II intron (a type of intron known to reverse splice or retrohome in some organelle genomes) inserts into a nuclear gene and creates a spliceosomal intron; <it>Intron Gain During Double Strand Break Repair </it><abbrgrp><abbr bid="B4">4</abbr></abbrgrp>, in which a DNA segment that may function as a spliceable intron is inserted during DSBR; and <it>Intronization </it><abbrgrp><abbr bid="B32">32</abbr><abbr bid="B33">33</abbr></abbrgrp>, in which mutations in exonic sequence produce functional splice signals, forming a new intron with previously exonic sequence.</p>
<p>Unlike most mechanisms of intron gain and loss which involve the insertion or deletion of DNA segments, <it>Intron Sliding </it><abbrgrp><abbr bid="B34">34</abbr><abbr bid="B35">35</abbr><abbr bid="B36">36</abbr></abbrgrp> has been hypothesized to present the appearance of concurrent intron loss and gain without removing or inserting DNA. This may occur when orthologous introns "slide" through a gene, while leaving the coding sequence largely unaffected. If the intron slides far enough from its original position, it may appear as if a gene has both lost and gained an intron. Evidence of intron sliding in Drosophila exists <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>; however, there is debate over the viability of this mechanism <abbrgrp><abbr bid="B35">35</abbr><abbr bid="B36">36</abbr><abbr bid="B37">37</abbr><abbr bid="B38">38</abbr></abbrgrp>.</p>
<p>Out of all the proposed mechanisms of intron gain and loss, only RTMIL has been shown to occur <it>in vivo </it><abbrgrp><abbr bid="B26">26</abbr><abbr bid="B27">27</abbr></abbrgrp>. Therefore, in order to find support for the occurrence of other proposed mechanisms of intron gain or loss, researchers have attempted to identify intron gains or losses that appear to have occurred via a specific mechanism. Evidence has been found to support the occurrence of: intron loss due to genomic deletions in Drosophila and Pufferfish <abbrgrp><abbr bid="B39">39</abbr><abbr bid="B40">40</abbr></abbrgrp>; intron gain by intron transposition in <it>Oikopleura </it><abbrgrp><abbr bid="B5">5</abbr></abbrgrp>; intron gain by transposon insertion in maize, rice and <it>Oikopleura </it><abbrgrp><abbr bid="B5">5</abbr><abbr bid="B41">41</abbr><abbr bid="B42">42</abbr></abbrgrp>; intron gain by intron transfer in <it>Chironomus thummi </it>and <it>Aspergillus </it>fungi <abbrgrp><abbr bid="B11">11</abbr><abbr bid="B31">31</abbr></abbrgrp>; intron gain by tandem genomic duplications in a multitude of eukaryotes <abbrgrp><abbr bid="B34">34</abbr><abbr bid="B43">43</abbr><abbr bid="B44">44</abbr></abbrgrp>; intron gain during DSBR in <it>Daphnia pulex </it>and <it>Aspergillus </it>fungi <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B11">11</abbr></abbrgrp>; intron gain by intronization in <it>Cryptococcus </it>and <it>Caenorhabditis </it><abbrgrp><abbr bid="B33">33</abbr><abbr bid="B37">37</abbr></abbrgrp>; and intron sliding in Drosophila <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>. However, these findings are insufficient to prove the existence of any proposed mechanism. In order to determine if these proposed mechanisms of intron gain or loss are universal mechanisms operating in all eukaryotes, as opposed to either singular events or mechanisms that only occur in a few species, multiple unambiguous instances of each mechanism must be located in all eukaryotic kingdoms.</p>
<p>Only a few of the proposed mechanisms of intron gain or loss have been shown to occur in Drosophila <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B34">34</abbr><abbr bid="B39">39</abbr></abbrgrp>. Therefore, we chose to investigate the ability of all proposed mechanisms to operate in Drosophila. To this end, we first identified high confidence cases of intron gains and losses among 11 Drosophila species (<it>D. melanogaster</it>, <it>D. pseudoobscura</it>, <it>D. virilis</it>, <it>D. sechellia</it>, <it>D. yakuba</it>, <it>D. erecta</it>, <it>D. ananassae</it>, <it>D. persimilis</it>, <it>D. willistoni</it>, <it>D. mojavensis</it>, and <it>D. grimshawi</it>). We then analyzed these events extensively for signatures of previously proposed mechanisms of intron gain and loss. These 11 well-sequenced and well-annotated Drosophila species enabled us to identify intron gains and losses that have occurred relatively recently (2-40 million years ago) <abbrgrp><abbr bid="B45">45</abbr></abbrgrp>. This fine time scale allowed us to analyze these events before extensive sequence divergence may have occurred, which has the potential to disguise the mechanism(s) underlying these events.</p>
</sec>
<sec><st><p>Results</p></st>
<p>Within the final dataset of 353 orthologs, we identified 189 intron gains and 287 intron losses with 112 gains and 94 losses located at ancestral nodes (Figure <figr fid="F1">1</figr>) and 77 gains and 193 losses located within a single species (Table <tblr tid="T1">1</tblr>). Using a different dataset, we support previous findings of widespread heterogeneity in the rates of intron gain and loss among Drosophila species <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. Overall, in comparison to introns from all 11 Drosophila species (59% AT content, average size 1015 bp), gained introns were of similar composition but shorter in length (64% AT content, average size 398 bp). Additionally, in accordance with previous research in Drosophila <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B6">6</abbr></abbrgrp>, these gained introns were biased towards the 5' end of genes (Kolmogorov-Smirnov test, p = 0.0421, Figure <figr fid="F2">2</figr>).</p>
<fig id="F1"><title><p>Figure 1</p></title><caption><p>Drosophila phylogenetic tree illustrating the numbers of intron gains and losses</p></caption><text>
   <p><b>Drosophila phylogenetic tree illustrating the numbers of intron gains and losses</b>. Pluses indicate the number of gained introns; minuses indicate the number of lost introns. Numbers at far right of the tree represent events identified in one species. Numbers at nodes represent events assumed to have occurred in ancestors. Branch lengths are drawn roughly to scale and do not indicate precise evolutionary distances. A larger phylogenetic tree drawn to scale (with the number of intron gains and losses mapped onto the tree) can be found in Additional file <supplr sid="S1">1</supplr>, Figure S1.</p>
</text><graphic file="1471-2148-11-364-1" hint_layout="double"/></fig>
<tbl id="T1"><title><p>Table 1</p></title><caption><p>Information about each species and the number of intron gains and losses found within each species</p></caption><tblbdy cols="8">
      <r>
         <c ca="left">
            <p>
               <b>Species Name</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>Assembled Genome Size</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>Protein Coding Genes</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>Number of Introns</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>Average Intron Size (bp)</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>Introns Analyzed</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>Gained Introns</b>
            </p>
         </c>
         <c ca="left">
            <p>
               <b>Lost Introns</b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="8">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>D. melanogaster</it>
            </p>
         </c>
         <c ca="left">
            <p>118 Mb</p>
         </c>
         <c ca="left">
            <p>13,919</p>
         </c>
         <c ca="left">
            <p>53,459</p>
         </c>
         <c ca="left">
            <p>1,482</p>
         </c>
         <c ca="left">
            <p>1,401</p>
         </c>
         <c ca="left">
            <p>0</p>
         </c>
         <c ca="left">
            <p>4</p>
         </c>
      </r>
      <r>
         <c cspan="8">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>D. sechellia</it>
            </p>
         </c>
         <c ca="left">
            <p>115 Mb</p>
         </c>
         <c ca="left">
            <p>16,467</p>
         </c>
         <c ca="left">
            <p>41,655</p>
         </c>
         <c ca="left">
            <p>799</p>
         </c>
         <c ca="left">
            <p>1,391</p>
         </c>
         <c ca="left">
            <p>6</p>
         </c>
         <c ca="left">
            <p>13</p>
         </c>
      </r>
      <r>
         <c cspan="8">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>D. yakuba</it>
            </p>
         </c>
         <c ca="left">
            <p>127 Mb</p>
         </c>
         <c ca="left">
            <p>16,077</p>
         </c>
         <c ca="left">
            <p>42,642</p>
         </c>
         <c ca="left">
            <p>824</p>
         </c>
         <c ca="left">
            <p>1,392</p>
         </c>
         <c ca="left">
            <p>0</p>
         </c>
         <c ca="left">
            <p>3</p>
         </c>
      </r>
      <r>
         <c cspan="8">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>D. erecta</it>
            </p>
         </c>
         <c ca="left">
            <p>134 Mb</p>
         </c>
         <c ca="left">
            <p>15,044</p>
         </c>
         <c ca="left">
            <p>40,986</p>
         </c>
         <c ca="left">
            <p>835</p>
         </c>
         <c ca="left">
            <p>1,397</p>
         </c>
         <c ca="left">
            <p>0</p>
         </c>
         <c ca="left">
            <p>2</p>
         </c>
      </r>
      <r>
         <c cspan="8">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>D. ananassae</it>
            </p>
         </c>
         <c ca="left">
            <p>176 Mb</p>
         </c>
         <c ca="left">
            <p>15,069</p>
         </c>
         <c ca="left">
            <p>41,345</p>
         </c>
         <c ca="left">
            <p>1,026</p>
         </c>
         <c ca="left">
            <p>1,391</p>
         </c>
         <c ca="left">
            <p>11</p>
         </c>
         <c ca="left">
            <p>21</p>
         </c>
      </r>
      <r>
         <c cspan="8">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>D. pseudoobscura</it>
            </p>
         </c>
         <c ca="left">
            <p>127 Mb</p>
         </c>
         <c ca="left">
            <p>16,062</p>
         </c>
         <c ca="left">
            <p>41,804</p>
         </c>
         <c ca="left">
            <p>823</p>
         </c>
         <c ca="left">
            <p>1,372</p>
         </c>
         <c ca="left">
            <p>2</p>
         </c>
         <c ca="left">
            <p>2</p>
         </c>
      </r>
      <r>
         <c cspan="8">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>D. persimilis</it>
            </p>
         </c>
         <c ca="left">
            <p>138 Mb</p>
         </c>
         <c ca="left">
            <p>16,874</p>
         </c>
         <c ca="left">
            <p>41,743</p>
         </c>
         <c ca="left">
            <p>949</p>
         </c>
         <c ca="left">
            <p>1,370</p>
         </c>
         <c ca="left">
            <p>5</p>
         </c>
         <c ca="left">
            <p>3</p>
         </c>
      </r>
      <r>
         <c cspan="8">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>D. willistoni</it>
            </p>
         </c>
         <c ca="left">
            <p>187 Mb</p>
         </c>
         <c ca="left">
            <p>15,512</p>
         </c>
         <c ca="left">
            <p>40,896</p>
         </c>
         <c ca="left">
            <p>1,203</p>
         </c>
         <c ca="left">
            <p>1,338</p>
         </c>
         <c ca="left">
            <p>50</p>
         </c>
         <c ca="left">
            <p>114</p>
         </c>
      </r>
      <r>
         <c cspan="8">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>D. mojavensis</it>
            </p>
         </c>
         <c ca="left">
            <p>161 Mb</p>
         </c>
         <c ca="left">
            <p>14,594</p>
         </c>
         <c ca="left">
            <p>40,199</p>
         </c>
         <c ca="left">
            <p>1,075</p>
         </c>
         <c ca="left">
            <p>1,391</p>
         </c>
         <c ca="left">
            <p>1</p>
         </c>
         <c ca="left">
            <p>13</p>
         </c>
      </r>
      <r>
         <c cspan="8">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>D. virilis</it>
            </p>
         </c>
         <c ca="left">
            <p>172 Mb</p>
         </c>
         <c ca="left">
            <p>14,491</p>
         </c>
         <c ca="left">
            <p>40,386</p>
         </c>
         <c ca="left">
            <p>1,071</p>
         </c>
         <c ca="left">
            <p>1,421</p>
         </c>
         <c ca="left">
            <p>0</p>
         </c>
         <c ca="left">
            <p>3</p>
         </c>
      </r>
      <r>
         <c cspan="8">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>D. grimshawi</it>
            </p>
         </c>
         <c ca="left">
            <p>138 Mb</p>
         </c>
         <c ca="left">
            <p>15,585</p>
         </c>
         <c ca="left">
            <p>41,370</p>
         </c>
         <c ca="left">
            <p>965</p>
         </c>
         <c ca="left">
            <p>1,396</p>
         </c>
         <c ca="left">
            <p>2</p>
         </c>
         <c ca="left">
            <p>15</p>
         </c>
      </r>
   </tblbdy></tbl>
<fig id="F2"><title><p>Figure 2</p></title><caption><p>Histogram of positions of gained introns</p></caption><text>
   <p><b>Histogram of positions of gained introns</b>. Histogram displaying the relative position of intron gains in the gene (i.e. on a scale of 0 to 1).</p>
</text><graphic file="1471-2148-11-364-2" hint_layout="single"/></fig>
<sec><st><p>Mechanisms of Intron Loss</p></st>
<sec><st><p>Reverse Transcriptase-Mediated Intron Loss</p></st>
<p>Because RTMIL leaves behind no distinct mechanistic signatures, it is only possible to determine its prevalence by analyzing intron deletion biases. These biases arise due to the involvement of reverse transcriptase during RTMIL. Reverse transcriptase has been proposed to be primed on the poly(A) tail of mRNA <abbrgrp><abbr bid="B22">22</abbr></abbrgrp> and transcribe from the 3' end to the 5' end of mRNA. However, reverse transcriptase may not always reach the 5' end of mRNA <abbrgrp><abbr bid="B46">46</abbr></abbrgrp>. Therefore, if intron deletions have commonly occurred via RTMIL, intron deletions are expected to be biased towards the 3' end of genes. Some researchers have identified this bias <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr><abbr bid="B24">24</abbr><abbr bid="B44">44</abbr></abbrgrp>, but others have not <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B9">9</abbr><abbr bid="B40">40</abbr><abbr bid="B42">42</abbr></abbrgrp>. Previous reports on the distribution of intron loss positions in Drosophila have been conflicting <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B6">6</abbr></abbrgrp>. We found lost intron positions to be uniformly distributed throughout the length of genes that experienced intron loss(es) (Kolmogorov-Smirnov test, p = 0.2112, Figure <figr fid="F3">3</figr>). Other hypothesized mechanistic pathways of RTMIL, whereby RTMIL may delete internal or 5' introns without deleting 3' introns <abbrgrp><abbr bid="B44">44</abbr><abbr bid="B47">47</abbr></abbrgrp>, may explain this distribution. Alternatively, RTMIL may not have deleted the majority of lost introns in our dataset.</p>
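<p>The positional test described above can be illustrated with a minimal sketch of a one-sample Kolmogorov-Smirnov test against a uniform distribution. This is not the procedure used in this study: the function name ks_uniform, the example positions, and the asymptotic p-value approximation are all assumptions introduced for illustration, and relative positions are assumed to run from 0 (5' end) to 1 (3' end).</p>

```python
import math


def ks_uniform(positions):
    """One-sample Kolmogorov-Smirnov test of relative positions against Uniform(0, 1).

    Returns (D, approximate p-value). Hypothetical helper, for illustration only.
    """
    xs = sorted(positions)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        # The uniform CDF is F(x) = x; compare it with the empirical CDF
        # just after each observation, (i + 1) / n, and just before it, i / n.
        d = max(d, (i + 1) / n - x, x - i / n)
    # Asymptotic Kolmogorov distribution with a common small-sample correction.
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if 0.2 > lam:
        # The series below converges slowly here; the p-value is effectively 1.
        return d, 1.0
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)


# Hypothetical relative positions of lost introns (not data from this study).
evenly_spread = [(i + 0.5) / 20 for i in range(20)]
three_prime_biased = [0.90 + 0.005 * i for i in range(20)]

for label, data in (("evenly spread", evenly_spread),
                    ("3' biased", three_prime_biased)):
    d, p = ks_uniform(data)
    print(f"{label}: D = {d:.3f}, p = {p:.3g}")
```

<p>Under these assumptions, a large p-value (as for the evenly spread example) gives no reason to reject uniformity, whereas positions clustered near the 3' end yield a very small p-value.</p>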
<fig id="F3"><title><p>Figure 3</p></title><caption><p>Histogram of positions of lost introns</p></caption><text>
   <p><b>Histogram of positions of lost introns</b>. Histogram displaying the relative position of intron losses in the gene (i.e. on a scale of 0 to 1).</p>
</text><graphic file="1471-2148-11-364-3" hint_layout="single"/></fig>
<p>Because RTMIL is transcript-mediated, if RTMIL was a frequent mechanism of intron loss, genes that have lost introns should commonly be germline expressed <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>. To test this assumption, we extracted the <it>D. melanogaster </it>ortholog of each gene that experienced an intron loss from our dataset. We then checked these orthologs for moderate germline expression using data downloaded from Flybase <abbrgrp><abbr bid="B45">45</abbr></abbrgrp>, the modENCODE project <abbrgrp><abbr bid="B48">48</abbr></abbrgrp>, and FlyAtlas <abbrgrp><abbr bid="B49">49</abbr></abbrgrp>. Using this dataset, 187 out of the 287 genes that experienced intron loss were shown to have moderate germline expression. In comparison to the frequency with which we found genes to be germline expressed in <it>D. melanogaster </it>(7,212 out of 13,752), we found a significant bias for genes that experienced intron loss to be germline expressed (Pearson chi-square test, p &lt; 0.05).</p>
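<p>The comparison above can be sketched as a Pearson chi-square goodness-of-fit test with one degree of freedom, using the counts reported in the text. The exact formulation used in this study is not specified, so treating the genome-wide frequency as the expected proportion is an assumption, as is the function name chi_square_gof.</p>

```python
import math


def chi_square_gof(observed_hits, observed_total, background_hits, background_total):
    """Pearson chi-square goodness-of-fit test with 1 degree of freedom.

    Assumes the background frequency is the expected proportion
    (an illustrative formulation, not necessarily the authors' exact test).
    """
    p_background = background_hits / background_total
    expected_hits = observed_total * p_background
    expected_misses = observed_total - expected_hits
    observed_misses = observed_total - observed_hits
    chi2 = ((observed_hits - expected_hits) ** 2 / expected_hits
            + (observed_misses - expected_misses) ** 2 / expected_misses)
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts from the text: 187 of 287 intron-loss genes versus 7,212 of 13,752 genes.
chi2, p_value = chi_square_gof(187, 287, 7212, 13752)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2e}")
```

<p>With these counts the statistic is approximately 18.6, well above the 3.84 critical value for one degree of freedom at the 0.05 level, which is consistent with the reported significance.</p>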
<p>Another deletion bias expected if RTMIL has commonly deleted introns is the frequent loss of adjacent introns. Previous investigations have found adjacent introns to be lost more commonly than would be expected purely by chance <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B11">11</abbr><abbr bid="B24">24</abbr><abbr bid="B50">50</abbr></abbrgrp>. Our dataset contained a total of 9 adjacent intron losses that appear to have occurred simultaneously in the genes <it>Dwil\GK21739</it>, <it>Dsec\GM16466</it>, and <it>Dwil\GK24430</it>. We would have expected 2.7 adjacent intron losses to have occurred purely by chance <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. Therefore, our dataset shows a significant bias for adjacent introns to be lost (Pearson chi-square test, p &lt; 0.05).</p>
<p>In one gene that experienced adjacent intron losses, <it>Dwil\GK24430</it>, the first and last introns were conserved while two internal introns were lost. Because these losses were adjacent and appear to have occurred simultaneously, we assume these introns were deleted by RTMIL. The exact mechanism by which RTMIL may remove internal or 5' intron(s) but conserve 3' intron(s) has proved elusive but has received considerable attention <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B11">11</abbr><abbr bid="B13">13</abbr><abbr bid="B44">44</abbr><abbr bid="B51">51</abbr><abbr bid="B52">52</abbr></abbrgrp>. The most commonly proposed mechanism to account for internal or 5' intron loss(es) by RTMIL is the formation of a double stranded mRNA secondary structure upstream from the 3' conserved intron position(s). This secondary structure then "self-primes" reverse transcription during RTMIL, excluding the conserved intron position(s) from reverse transcription and subsequent recombination (i.e. intron loss) <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B44">44</abbr><abbr bid="B51">51</abbr><abbr bid="B52">52</abbr></abbrgrp>. Because the ortholog of <it>Dwil\GK24430 </it>in <it>D. melanogaster</it>, <it>elgi</it>, was shown to have high expression levels in the ovaries of adult flies <abbrgrp><abbr bid="B48">48</abbr></abbrgrp> and orthologs of <it>Dwil\GK24430 </it>have highly similar sequences (which suggests that the coding sequence has been conserved), <it>Dwil\GK24430 </it>was investigated for the ability to have self-primed reverse transcription during RTMIL. We determined the 5' and 3' untranslated regions (UTRs) of <it>Dwil\GK24430 </it>using the Augustus program <abbrgrp><abbr bid="B53">53</abbr></abbrgrp>, determined the polyadenylation site using PolyAPred <abbrgrp><abbr bid="B54">54</abbr></abbrgrp>, appended poly(A) tails of various lengths, and ran these predicted mRNA sequences through the RNA folding program mfold <abbrgrp><abbr bid="B55">55</abbr></abbrgrp>. None of the predicted secondary structures could account for the pattern of intron losses that occurred in <it>Dwil\GK24430</it>. Therefore, it is not likely that the self-priming of reverse transcription during RTMIL accounted for these internal intron losses.</p>
</sec>
<sec><st><p>Genomic Deletions</p></st>
<p>Similar to intron loss via RTMIL, the precise genomic deletion of an intron is difficult to confidently detect after its occurrence. Therefore, we identified imprecise intron losses. To locate imprecise losses, we examined the former intron-exon junctions of all lost introns. If the intron deletion event appeared to have inserted nucleotides into the coding sequence of the gene, these inserted nucleotides were extracted and compared to conserved orthologous introns using the FASTA program <abbrgrp><abbr bid="B56">56</abbr></abbrgrp>. Using this method we identified an imprecise intron loss that may have occurred via NHEJ (Figure <figr fid="F4">4</figr>), a recently hypothesized mechanism of intron loss <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. Direct repeats that likely flanked this intron prior to deletion may have mediated deletion by providing a recessed microhomology for efficient ligation during NHEJ <abbrgrp><abbr bid="B57">57</abbr></abbrgrp>.</p>
<fig id="F4"><title><p>Figure 4</p></title><caption><p>Genomic deletion of an intron by NHEJ</p></caption><text>
   <p><b>Genomic deletion of an intron by NHEJ</b>. Alignment of intron 1 in <it>Dvir\GJ12838 </it>with unaligned nucleotides from the coding sequence of <it>Dgri\GH15541</it>, which experienced an intron loss at this position. Direct repeats (bolded and underlined) may have been used for microhomology directed ligation during NHEJ. The second cytosine in the downstream repeat may have undergone a C&#8594;T transition.</p>
</text><graphic file="1471-2148-11-364-4" hint_layout="double"/></fig>
<p>Because introns flanked by direct repeats have been hypothesized to be preferentially deleted via genomic deletions <abbrgrp><abbr bid="B47">47</abbr></abbrgrp>, it is expected that throughout evolution, introns flanked by direct repeats will be preferentially lost. Therefore, in an attempt to determine the prevalence of intron loss via genomic deletions in our dataset, for each intron loss identified within a single species we searched the intron-exon junctions of the closest (in evolutionary distance) conserved orthologous intron for the presence of direct repeats &#8805; 5 bp in length. In our dataset, 27% of these introns were flanked by direct repeats, nearly identical to the percentage of 100 randomly selected conserved introns that were flanked by direct repeats (26%). Because lost introns were no more likely than conserved introns to be flanked by direct repeats, this suggests that genomic deletions were not the predominant mechanism and that RTMIL may have deleted the majority of introns in our dataset. However, it is possible that sequence divergence throughout evolution may have eliminated many direct repeats that originally flanked these conserved orthologous introns.</p>
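<p>The direct repeat search described above can be sketched as follows. The procedure used in this study is not specified in detail, so this example assumes exact matches of at least 5 bp between a window spanning the 5' intron-exon junction and a window spanning the 3' junction. The function name shared_direct_repeats, the window contents, and the sequences are hypothetical.</p>

```python
def shared_direct_repeats(upstream, downstream, min_len=5):
    """Return substrings of length min_len found in both junction windows.

    Both windows are read on the same strand, so shared substrings are
    direct (not inverted) repeats. Hypothetical helper, for illustration only.
    """
    upstream, downstream = upstream.upper(), downstream.upper()
    kmers = {upstream[i:i + min_len]
             for i in range(len(upstream) - min_len + 1)}
    return sorted({downstream[j:j + min_len]
                   for j in range(len(downstream) - min_len + 1)
                   if downstream[j:j + min_len] in kmers})


# Hypothetical sequences spanning the 5' and 3' intron-exon junctions.
five_prime_window = "ACGTTGCAAGGT"
three_prime_window = "TTTGCAAGCC"

repeats = shared_direct_repeats(five_prime_window, three_prime_window)
print(repeats)
print("flanked by direct repeats:", bool(repeats))
```

<p>Overlapping 5 bp matches, as in this example, correspond to a single longer repeat. A production analysis would likely merge them and might also tolerate mismatches introduced by sequence divergence.</p>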
</sec>
</sec>
<sec><st><p>Mechanisms of Intron Gain</p></st>
<sec><st><p>Transposon Insertion</p></st>
<p>To identify intron gains that occurred via transposon insertion, all gained intronic sequences were compared to the canonical transposon sequences from Flybase <abbrgrp><abbr bid="B45">45</abbr></abbrgrp> using the FASTA program <abbrgrp><abbr bid="B56">56</abbr></abbrgrp>. A hit between the third intron in <it>Dsec\GM26034 </it>and the retrotransposon <it>Doc1053 </it>occurred with 98.4% similarity and 99% coverage. Target site duplications (TSDs) are located at the 5' end and 15 nucleotides downstream from the end of this intron, indicating that the insertion of <it>Doc1053 </it>alone resulted in intron gain (Figure <figr fid="F5">5</figr>).</p>
<fig id="F5"><title><p>Figure 5</p></title><caption><p>Intron gain via transposon insertion</p></caption><text>
   <p><b>Intron gain via transposon insertion</b>. The solid black bar indicates intron-exon junctions. Red nucleotides indicate matched nucleotides between <it>Doc1053 </it>and intron 3 in <it>Dsec\GM26034</it>. Bolded and underlined nucleotides represent TSDs caused by insertion of the transposon. The first nucleotide in the downstream TSD likely underwent a G&#8594;C transversion. The insertion of <it>Doc1053 </it>did not change the reading frame of <it>Dsec\GM26034 </it>but did insert five amino acids (Thr, Met, Ser, Thr, and Glu).</p>
</text><graphic file="1471-2148-11-364-5" hint_layout="double"/></fig>
</sec>
<sec><st><p>Double Strand Break Repair</p></st>
<p>DSBR has recently been proposed to result in intron gain if NHEJ inserts filler DNA that may function as a spliceable intron <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B58">58</abbr></abbrgrp>. It has been shown that this filler DNA may be preferentially of mitochondrial origin <abbrgrp><abbr bid="B59">59</abbr><abbr bid="B60">60</abbr></abbrgrp>. To identify cases of intron gain that occurred via the repair of double strand breaks, we compared all gained intronic sequences to their respective nuclear and mitochondrial genomes. Eight gained introns (<it>Dwil\GK13533 </it>intron 2, <it>Dana\GF12884 </it>intron 3, <it>Dper\GL20060 </it>intron 2, <it>Dmel\CG9297 </it>intron 4, <it>Dmoj\GI21017 </it>intron 5, <it>Dmoj\GI21017 </it>intron 6, <it>Dvir\GJ24248 </it>intron 8, and <it>Dwil\GK24841 </it>intron 1, average length = 113 bp) matched to their respective mitochondrial genomes with &#8805; 90% query sequence coverage and e-value &#8804; 0.1, suggesting that these introns may have been inserted via NHEJ. One of these introns, <it>Dwil\GK24465 </it>intron 5, displayed significant similarity to a mitochondrial sequence (Figure <figr fid="F6">6</figr>). Many other hits were found with lower coverage levels (60-70%) but better e-values (e &#8804; 10<sup>-4</sup>).</p>
<fig id="F6"><title><p>Figure 6</p></title><caption><p>Intron gain by DSBR</p></caption><text>
   <p><b>Intron gain by DSBR</b>. Alignment between the reverse complement of a gained intron in <it>Dwil\GK24465 </it>and a segment of the <it>D. wil </it>mitochondrial genome (e-value = 0.032, coverage = 99%). The best score produced by randomly shuffling and realigning these sequences 1,000 times was significantly lower (Pearson chi-square test, p &lt; 0.05) than the score between the original sequences.</p>
</text><graphic file="1471-2148-11-364-6" hint_layout="double"/></fig>
<p>Because direct repeats frequently flank filler DNA inserted via NHEJ <abbrgrp><abbr bid="B61">61</abbr></abbrgrp>, to determine the prevalence of intron gain via NHEJ in our dataset we searched the intron-exon junctions of all gained introns for direct repeats of length &#8805; 5 bp. We identified direct repeats flanking 19 out of 77 gained introns; however, in comparison to a random set of 100 conserved introns (26 of which were flanked by direct repeats), this level did not reach statistical significance. This suggests two possibilities. One is that direct repeats may not commonly flank DNA inserted by NHEJ in Drosophila, as the frequency and size of direct repeats inserted by NHEJ when using filler DNA has been shown to vary in different organisms and cell types <abbrgrp><abbr bid="B61">61</abbr><abbr bid="B62">62</abbr><abbr bid="B63">63</abbr></abbrgrp>. Alternatively, NHEJ may not be a common mechanism of intron gain in Drosophila.</p>
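<p>The comparison of 19 of 77 gained introns with 26 of 100 conserved introns can be reproduced with a short calculation. The text does not state which test was applied to these counts; the sketch below assumes a Pearson chi-square test on a 2 x 2 table without continuity correction, the test named elsewhere in this article, and the function name is hypothetical.</p>
```python
import math


def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]].

    Returns the statistic and the p-value for 1 degree of freedom.
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Rows: gained introns, conserved introns.
# Columns: flanked by direct repeats, not flanked.
chi2, p = pearson_chi_square_2x2(19, 77 - 19, 26, 100 - 26)
print(f"chi-square = {chi2:.3f}, p = {p:.2f}")
```
<p>Under these assumptions the statistic is roughly 0.04 and the p-value is far above 0.05, in agreement with the statement that the difference did not reach statistical significance.</p>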
</sec>
<sec><st><p>Transcription-Mediated Intron Gain?</p></st>
<p>We did not identify any intron gains in our dataset that occurred via intron transposition, the only proposed transcript-mediated mechanism of intron gain. However, genes that have experienced intron gains are highly overrepresented in our germline expression dataset (135 out of 189, Pearson chi-square test, p &lt; 0.01), similar to findings in <it>Caenorhabditis </it><abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. This overrepresentation of germline expression in genes that have experienced intron gain suggests that intron gain may be enabled by or due to transcription.</p>
</sec>
</sec>
<sec><st><p>Intron Sliding</p></st>
<p>Intron sliding, the relocation of orthologous introns, has been proposed to be a rare event that may move introns very small distances <abbrgrp><abbr bid="B35">35</abbr><abbr bid="B37">37</abbr><abbr bid="B38">38</abbr></abbrgrp>. We identified 4 introns that appear to have slid more than 10 bp while leaving the coding sequence largely unaffected. To ensure that these were <it>bona fide </it>cases of intron sliding, as opposed to concurrent intron losses and gains, we compared the sequence of introns that appeared to have slid to the sequence of their closest (in evolutionary distance) suspected orthologous introns. Three cases of intron sliding displayed moderate similarity between these introns (e-value &#8804; 0.1), while one, the fourth intron in <it>Dwil\GK22863</it>, displayed significant similarity to its suspected orthologous intron, intron four in <it>Dper\GL17458 </it>(Additional file <supplr sid="S1">1</supplr>, Figure S2), indicating that this intron experienced intron sliding.</p>
<suppl id="S1">
<title><p>Additional file 1</p></title>
<text><p><b>Supplementary figures</b>. Figures used to provide further information about the alignments and various cases of intron gain/loss events.</p></text>
<file name="1471-2148-11-364-S1.PDF">
   <p>Click here for file</p>
</file>
</suppl>
</sec>
</sec>
<sec><st><p>Discussion</p></st>
<p>Prior investigations into intron gain and loss in Drosophila <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B6">6</abbr></abbrgrp> have yielded different results from the ones presented here. Our results differ greatly from those of Coulombe-Huntington and Majewski <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>, who reported intron loss to be much more prominent than intron gain in Drosophila. This difference can be attributed to different methodology and datasets. Coulombe-Huntington and Majewski mapped splice site junctions from <it>D. melanogaster </it>onto the other 10 Drosophila species used in this study, whereas we used high quality, full genome annotations produced by the Drosophila research community <abbrgrp><abbr bid="B45">45</abbr></abbrgrp> for the 11 species. As Coulombe-Huntington and Majewski noted, their methodology did not detect events that had occurred in the other 10 Drosophila species, and was therefore unable to detect intron gain events that had occurred in other species. Our results are also slightly different from those of Farlow et al. <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. This is likely due to different methods of gene annotation in Drosophila species other than <it>D. melanogaster</it>. Farlow et al.'s annotations primarily relied upon GeneWise <abbrgrp><abbr bid="B64">64</abbr></abbrgrp>, whereas the annotations employed here were produced using a compilation of various <it>ab initio </it>and extrinsic methods <abbrgrp><abbr bid="B45">45</abbr></abbrgrp>. This produced markedly different ortholog datasets; only 734 of our initial 1,611 orthologs overlap between these two studies. Other differences include our use of a distant outlier, <it>A. gambiae</it>, which greatly increased the power of Dollo parsimony at peripheral branches, and our inclusion of <it>D. sechellia </it>and <it>D. persimilis</it>.
Finally, it should be noted that the stringent criteria employed here were designed specifically to eliminate as many false-positive intron gain and loss events as possible, rather than to identify the precise number of intron gain and loss events among the Drosophila species. Therefore, the number of intron gains and losses reported here may not necessarily reflect the rate of intron turnover in Drosophila.</p>
<p>Our analyses suggest that intron loss frequently occurs via RTMIL in Drosophila. Adjacent introns were lost more frequently than would be expected purely by chance, and genes experiencing intron loss were commonly germline expressed. However, intron deletions were not biased towards the 3' end of genes (Figure <figr fid="F3">3</figr>), as would be expected if RTMIL deleted the majority of introns. Nonetheless, we did not find evidence suggesting that introns were frequently lost via the precise genomic deletion of introns. There are a number of proposed mechanisms that may explain 5' or internal intron loss by RTMIL without the loss of 3' intron(s). Our data suggest that the most commonly proposed mechanism, the self-priming of reverse transcription during RTMIL <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B44">44</abbr><abbr bid="B51">51</abbr><abbr bid="B52">52</abbr></abbrgrp>, may not frequently produce internal intron losses via RTMIL in Drosophila. An alternative explanation for 5' or internal intron loss by RTMIL without the loss of 3' intron(s) was proposed by Sharpton et al. in <it>C. elegans</it>. These authors demonstrated that genes experiencing two or more 3' intron losses (presumably by RTMIL) are preferentially recombined during meiosis at their 3' ends with alleles that have not experienced intron loss <abbrgrp><abbr bid="B44">44</abbr></abbrgrp>. This may account for the uniform distribution of intron losses found in this study in Drosophila (Figure <figr fid="F3">3</figr>).</p>
<p>A recent study suggested that NHEJ may play a prominent role in both intron gain and loss <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>, and our investigation in Drosophila supports this idea. Similar to previous research <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B11">11</abbr></abbrgrp>, we identified intron gains that likely occurred via NHEJ using mitochondrial DNA (an example is shown in Figure <figr fid="F6">6</figr>). We also identified the first case of an intron loss that may have occurred via NHEJ (Figure <figr fid="F4">4</figr>). The ability of NHEJ to both create and remove introns suggests an interesting scenario in intron evolution: introns gained by NHEJ may commonly be flanked by direct repeats <abbrgrp><abbr bid="B61">61</abbr></abbrgrp>, and introns flanked by direct repeats may be preferentially deleted by NHEJ <abbrgrp><abbr bid="B47">47</abbr><abbr bid="B57">57</abbr></abbrgrp>. This may be a mechanism by which new introns are "screened" for selective advantages. Under selection pressure, new introns that provide an advantage to the species may be conserved, whereas those that do not may be lost.</p>
<p>For mechanisms of intron gain, we identified an intron gain that unambiguously occurred via the insertion of a transposable element (Figure <figr fid="F5">5</figr>). In combination with previous findings of intron gain via transposon insertion in maize, rice, and <it>Oikopleura </it><abbrgrp><abbr bid="B5">5</abbr><abbr bid="B41">41</abbr><abbr bid="B42">42</abbr></abbrgrp>, this strongly suggests that transposons may create novel introns in all eukaryotes that harbor active transposons.</p>
<p>In our dataset, 187 gained introns do not appear to have been definitively created by any of the proposed mechanisms of intron gain. It is possible that sequence divergence has obscured the source of some of these introns. However, this finding is perplexing, especially for the 7 gained introns found between <it>D. per </it>and <it>D. pse</it>, which likely radiated only 2 million years ago <abbrgrp><abbr bid="B65">65</abbr></abbrgrp>. We identified a significant bias for genes that have experienced intron gain to be germline expressed, which suggests that transcription may play a prominent role in intron gain. Nonetheless, we find no evidence of intron gain via intron transposition, the only proposed transcript-mediated mechanism of intron gain. Furthermore, intron gains in Drosophila are biased towards the 5' end of genes (Figure <figr fid="F2">2</figr>) <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B6">6</abbr></abbrgrp>, indicating that reverse transcription may not play a significant role in intron gain. This is further supported by a recent investigation into the role of reverse transcriptase in intron gain and loss <abbrgrp><abbr bid="B66">66</abbr></abbrgrp>. Together, these findings suggest that the act of transcription itself may promote or cause intron gain. We speculate that this may be due to transcription-associated recombination (TAR). TAR generally uses homologous recombination <abbrgrp><abbr bid="B67">67</abbr></abbrgrp>; however, TAR has been shown to occasionally use non-homologous recombination <abbrgrp><abbr bid="B68">68</abbr><abbr bid="B69">69</abbr></abbrgrp> and is functionally different from homology-directed DSBR <abbrgrp><abbr bid="B70">70</abbr></abbrgrp>. It is therefore possible that TAR may occasionally insert DNA segments that function as introns. However, a deeper understanding of TAR, which is still poorly characterized, is necessary to fully explore this possibility. 
Alternatively, uncharacterized errors by or interactions with the transcriptional machinery may facilitate or result in intron gain.</p>
<p>Finally, we identified one unambiguous case of intron sliding in Drosophila. A previous investigation that located near intron pairs also found evidence of intron sliding in Drosophila <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>. This report, in combination with our findings, strongly suggests that intron sliding occurs in Drosophila. However, we note that intron sliding does not appear to occur in all organisms <abbrgrp><abbr bid="B35">35</abbr><abbr bid="B37">37</abbr></abbrgrp>. Therefore, further research is needed to determine whether this mechanism operates in other species.</p>
</sec>
<sec><st><p>Conclusion</p></st>
<p>The use of 11 well-annotated Drosophila species and an annotated outlier, <it>A. gam</it>, as well as the strict criteria used to identify intron gains and losses, likely produced a low false-positive rate. Publicly available data for Drosophila - such as mitochondrial genome sequences, extensive expression data, and a well-characterized transposon set - provided us with excellent tools to determine if intron gains or losses occurred via any previously proposed mechanisms. Combined, these data enabled us to identify intron gains that occurred via transposon insertion and double strand break repair. Furthermore, our data suggest that transcription may promote or occasionally cause intron gain. We speculate that this may occur via TAR or uncharacterized errors by or interactions with the transcriptional machinery. However, the definitive mechanism by which this may occur eludes us and awaits further investigation.</p>
<p>As research progresses, the exact molecular mechanisms of intron loss are becoming clearer. Our data suggest that RTMIL was responsible for the majority of intron losses identified in this study. However, we also found evidence suggesting that the self-priming of reverse transcription during RTMIL may not occur. It is likely that a different hypothesis may account for internal or 5' intron losses via RTMIL <abbrgrp><abbr bid="B44">44</abbr></abbrgrp>. We also identified the first case of intron loss that may have occurred via NHEJ (Figure <figr fid="F4">4</figr>) and speculate that the ability of NHEJ to both generate and delete introns may act as a "screening" mechanism for new introns. Finally, we identified one unambiguous case of the controversial mechanism of intron sliding.</p>
<p>To identify and fully understand the molecular mechanisms of intron gain and loss, further research into the ability of proposed mechanisms to operate in other species is necessary. It is likely that different mechanisms operate with varying intensities in different species. Consequently, the use of various species increases the chances of detecting these events. Also, demonstration of these mechanisms in multiple eukaryotic kingdoms is necessary to determine whether these are common mechanisms of intron gain or loss, singular events, or mechanisms that occur in only one species. Investigations at the population level may prove particularly fruitful as they will likely identify events before sequence divergence obscures their mechanistic origin. Furthermore, it would be even better if <it>in vitro </it>or <it>in vivo </it>experiments could be designed and conducted to verify these mechanisms. For example, a recent <it>in vivo </it>study found that the insertion of a group II intron into a nuclear gene abolishes gene expression <abbrgrp><abbr bid="B71">71</abbr></abbrgrp>, strongly suggesting that group II introns no longer create spliceosomal introns. An interesting assay for future research would be to test the ability of NHEJ to delete or insert introns by continuously inducing a double strand break under certain conditions.</p>
</sec>
<sec><st><p>Methods</p></st>
<sec><st><p>Obtaining Orthologs</p></st>
<p>Most data files (transposons, chromosomes, gene regions, coding regions, intron sequences and annotation files) for the 11 Drosophila species investigated were downloaded from Flybase (release FB2011_01) <abbrgrp><abbr bid="B45">45</abbr></abbrgrp>. Mitochondrial genomes were obtained from GenBank [GenBank: <ext-link ext-link-id="NC_005780" ext-link-type="gen">NC_005780</ext-link>, <ext-link ext-link-id="NC_001322" ext-link-type="gen">NC_001322</ext-link>, <ext-link ext-link-id="NC_001709" ext-link-type="gen">NC_001709</ext-link>, <ext-link ext-link-id="BK006335" ext-link-type="gen">BK006335</ext-link>-<ext-link ext-link-id="BK006341" ext-link-type="gen">BK006341</ext-link>]. To ascertain orthologous genes, an all-against-all comparison among coding sequences of all 11 species was performed using the FASTA program <abbrgrp><abbr bid="B56">56</abbr></abbrgrp>. Only reciprocal best hits with e-value &#8804; 10<sup>-30</sup>, similarity &#8805; 70% and query sequence coverage &#8805; 80% were selected and used to construct an orthologous gene matrix. Considerable debate exists as to the best method of ortholog detection; however, we chose to identify orthologs using reciprocal best hits as this has been shown to produce very low false-positive rates <abbrgrp><abbr bid="B72">72</abbr></abbrgrp>. This process yielded 1,611 orthologs. Orthologs lacking introns in all 11 species were discarded, yielding a matrix of 1,405 orthologs. The orthologs in this matrix are 97% identical to Flybase's ortholog dataset. The 9 genes that did not match to Flybase's ortholog dataset did not experience any intron gain or loss events and therefore did not affect our final results.</p>
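<p>The reciprocal-best-hit selection for one pair of species can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used here: the <monospace>Hit</monospace> record, the function names, and the gene names in the toy data are hypothetical, FASTA output is assumed to have been parsed already, and the thresholds are assumed to be applied before the best hit is chosen (the text does not specify the order). Only the three threshold values come from the text.</p>
```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One parsed alignment record (hypothetical structure)."""
    query: str
    subject: str
    evalue: float
    similarity: float  # percent
    coverage: float    # percent of the query sequence covered


MAX_EVALUE = 1e-30
MIN_SIMILARITY = 70.0
MIN_COVERAGE = 80.0


def passes_thresholds(hit):
    return (MAX_EVALUE >= hit.evalue
            and hit.similarity >= MIN_SIMILARITY
            and hit.coverage >= MIN_COVERAGE)


def best_hits(hits):
    """Map each query to the subject of its lowest e-value passing hit."""
    best = {}
    for hit in hits:
        if not passes_thresholds(hit):
            continue
        current = best.get(hit.query)
        if current is None or current.evalue > hit.evalue:
            best[hit.query] = hit
    return {query: hit.subject for query, hit in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Toy data with made-up gene names.
a_vs_b = [
    Hit("melA", "secA", 1e-50, 95.0, 99.0),
    Hit("melA", "secB", 1e-35, 80.0, 90.0),
    Hit("melB", "secB", 1e-40, 75.0, 85.0),
    Hit("melC", "secC", 1e-10, 90.0, 90.0),  # fails the e-value cutoff
]
b_vs_a = [
    Hit("secA", "melA", 1e-48, 95.0, 98.0),
    Hit("secB", "melA", 1e-33, 80.0, 90.0),
    Hit("secC", "melC", 1e-60, 90.0, 90.0),
]
print(reciprocal_best_hits(a_vs_b, b_vs_a))
```
<p>In the toy data only the <monospace>melA</monospace>/<monospace>secA</monospace> pair is retained, because the best hit of <monospace>secB</monospace> is <monospace>melA</monospace> rather than <monospace>melB</monospace>. Extending the sketch to 11 species would require repeating the comparison for each species pair and keeping only genes linked consistently across all of them.</p>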
</sec>
<sec><st><p>Generating Alignments</p></st>
<p>Artificial introns composed of 30 X's were inserted into intronic positions in each coding sequence and each group of orthologs was globally aligned using the <monospace>ClustalW </monospace>program <abbrgrp><abbr bid="B73">73</abbr></abbrgrp> with gap open penalty 80, gap extension penalty 0, gap separation penalty 10 and transition weight 1. An example of a global alignment using artificial introns is shown in Additional file <supplr sid="S1">1</supplr>, Figure S3. Homogeneous artificial introns of length 30 were used for two reasons: they assign consistent weight to each intron position during alignment and produce alignments that are easily readable for further analyses. An <it>ad hoc </it>program was then created to locate orthologous introns and convert each alignment into an intron absence/presence (0/1) matrix. All alignments were manually inspected for sequence identity flanking intron positions. If the alignment flanking an intron had a low similarity level, the corresponding 0/1 column in the matrix was deleted, removing these intron(s) from further analyses (an example of an excluded intron is shown in Additional file <supplr sid="S1">1</supplr>, Figure S4). This criterion eliminated 1,006 multiple sequence alignments, leaving 399 alignments for further analyses.</p>
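<p>The marking of intron positions and the conversion of an alignment into a 0/1 matrix can be illustrated with the sketch below. It is not the <it>ad hoc </it>program referred to above. The function names and the 0-based intron offsets are hypothetical, and the sketch assumes that orthologous introns begin at exactly the same alignment column and that no gap interrupts a run of X's.</p>
```python
import re


def insert_artificial_introns(cds, intron_offsets, length=30, symbol="X"):
    """Insert a run of `symbol` at each 0-based intron offset of a CDS."""
    pieces = []
    previous = 0
    for offset in sorted(intron_offsets):
        pieces.append(cds[previous:offset])
        pieces.append(symbol * length)
        previous = offset
    pieces.append(cds[previous:])
    return "".join(pieces)


def intron_columns(aligned_row, symbol="X"):
    """Alignment columns at which a run of artificial-intron symbols starts."""
    pattern = re.escape(symbol) + "+"
    return {match.start() for match in re.finditer(pattern, aligned_row)}


def presence_matrix(aligned_rows):
    """Return (sorted intron columns, one 0/1 row per aligned sequence)."""
    per_row = [intron_columns(row) for row in aligned_rows]
    columns = sorted(set().union(*per_row))
    matrix = [[1 if column in found else 0 for column in columns]
              for found in per_row]
    return columns, matrix


# Toy example with artificial introns of length 4 for readability.
marked = insert_artificial_introns("ATGGCC", [3], length=4)
columns, matrix = presence_matrix([marked, "ATG----GCC"])
print(marked, columns, matrix)
```
<p>In the toy alignment the first sequence carries an intron at column 3 and the second does not, so the single matrix column reads 1 and 0 for the two rows.</p>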
</sec>
<sec><st><p>Identifying Intron Gains and Losses</p></st>
<p>All multiple sequence alignments were then categorized into 2 groups: those that had discordant intron presences/absences nested within the 11 Drosophila species (Group A, 252 alignments) and those that did not (Group B, 147 alignments). For Group B, if possible an ortholog in <it>Anopheles gambiae </it>(<it>A. gam</it>) was located to be used as an outlier. <it>A. gam</it>'s genome was downloaded from the UCSC genome browser <abbrgrp><abbr bid="B74">74</abbr></abbrgrp> and mRNA sequences were downloaded from the RefSeq database [GenBank:<ext-link ext-link-id="PRJNA163" ext-link-type="gen">PRJNA163</ext-link>] <abbrgrp><abbr bid="B75">75</abbr></abbrgrp>. The annotation of <it>A. gam </it>was generated by mapping mRNA sequences back onto <it>A. gam</it>'s genome using the program <monospace>ESTMapper </monospace><abbrgrp><abbr bid="B76">76</abbr></abbrgrp>. Orthologs were identified using the FASTA program and extracting reciprocal best hits with e-value &#8804; 10<sup>-30</sup>, similarity &#8805; 60% and query sequence coverage &#8805; 60%. If an ortholog was found in <it>A. gam</it>, alignments in Group B were regenerated and reexamined. For alignments in Group B, if no ortholog could be located in <it>A. gam</it>, the alignment was excluded. This criterion removed 46 alignments, resulting in the final dataset of 353 multiple sequence alignments (see Additional file <supplr sid="S2">2</supplr> for all orthologs used in final analyses). Intron absence/presence matrices for both Group A and B were then processed separately through the program Malin <abbrgrp><abbr bid="B77">77</abbr></abbrgrp> to identify intron gains and losses using Dollo parsimony. Example alignments of intron gains, losses, and alignments that required the outlier <it>A. gam </it>can be found in Additional file <supplr sid="S1">1</supplr>, Figures S5-S9.</p>
<suppl id="S2">
<title><p>Additional file 2</p></title>
<text><p><b>Ortholog dataset</b>. A matrix of all orthologs used in final analyses. Each ortholog group is listed on one line.</p></text>
<file name="1471-2148-11-364-S2.PDF">
   <p>Click here for file</p>
</file>
</suppl>
</sec>
<sec><st><p>Intron Quality Controls</p></st>
<p>The ability to accurately identify intron gains and losses relies upon accurate gene annotation. The multitude of comparative and <it>ab initio </it>gene finding programs that were used to annotate genes in the 11 Drosophila genomes and the use of well-annotated <it>D. melanogaster </it>genes during the annotation of the other 10 Drosophila genomes greatly increased the reliability of these annotations <abbrgrp><abbr bid="B78">78</abbr></abbrgrp>. However, since some annotations in the Drosophila species other than <it>D. melanogaster </it>may lack experimental validation, annotation errors may exist. Therefore, we applied quality controls to each intron identified as an intron gain in a single species. First, we excluded all intron gains located within a single species that were of length 3n (where "n" is an integer) and did not contain a premature termination codon (PTC) (i.e. DNA segments that, if included in the predicted transcript, would not be expected to elicit nonsense-mediated decay). This criterion was based on a recent study in Drosophila that also used computationally annotated introns to identify intron gains and losses. In that study, 86% of predicted intron gains that were located in a single species and were of length 3n without PTCs were annotation errors as opposed to novel introns <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. Second, we removed all intron gains located in a single species with noncanonical splice sites. Ancestral intron gains (intron gains found in more than one species) and intron losses were not subject to increased scrutiny as the detection of these events is relatively straightforward.</p>
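<p>The two quality controls can be expressed as a short filter. The sketch below is illustrative only: the function names are hypothetical, the coding sequence upstream of the intron is assumed to begin at the start codon so that the reading frame is known, and "canonical" splice sites are assumed to mean a GT...AG intron, which the text does not define explicitly.</p>
```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def retained_intron_has_ptc(upstream_cds, intron, downstream_cds):
    """True if an in-frame stop codon overlaps the intron when it is retained.

    upstream_cds is assumed to start at the start codon, so codon
    boundaries fall at multiples of 3 in the concatenated transcript.
    """
    transcript = (upstream_cds + intron + downstream_cds).upper()
    first_codon = (len(upstream_cds) // 3) * 3  # first codon touching the intron
    intron_end = len(upstream_cds) + len(intron)
    return any(transcript[i:i + 3] in STOP_CODONS
               for i in range(first_codon, intron_end, 3))


def has_canonical_splice_sites(intron):
    """Assumed definition: intron begins with GT and ends with AG."""
    sequence = intron.upper()
    return sequence.startswith("GT") and sequence.endswith("AG")


def keep_single_species_gain(upstream_cds, intron, downstream_cds):
    """Apply both quality controls to a gain found in a single species."""
    looks_like_coding = (
        len(intron) % 3 == 0
        and not retained_intron_has_ptc(upstream_cds, intron, downstream_cds)
    )
    return has_canonical_splice_sites(intron) and not looks_like_coding


# Toy examples: a 3n intron without a PTC is discarded; one with TAA is kept.
print(keep_single_species_gain("ATGGCC", "GTAAGTCAG", "TGGTAA"))
print(keep_single_species_gain("ATGGCC", "GTATAACAG", "TGGTAA"))
```
<p>The frame check is only evaluated for introns of length 3n, because retention of any other length would shift the downstream reading frame and is not covered by the first criterion.</p>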
</sec>
</sec>
<sec><st><p>Authors' contributions</p></st>
<p>PY participated in the design of the study, performed and conceived analyses, and drafted the manuscript. BK wrote computer programs for data analysis. LZ conceived of the study, participated in its design, wrote computer programs for data analysis and helped to draft the manuscript. All authors read and approved of this version of the manuscript.</p>
</sec>
</bdy>
<bm>
<ack><sec><st><p>Acknowledgements</p></st>
<p>This work was supported by a grant from the National Science Foundation (IIS-0938393). We are grateful to Liliana Florea, Helmet Karim, and two anonymous reviewers for helpful comments on the manuscript.</p>
</sec>
</ack>
<refgrp><bibl id="B1"><title><p>Intron loss and gain in Drosophila</p></title><aug><au><snm>Coulombe-Huntington</snm><fnm>J</fnm></au><au><snm>Majewski</snm><fnm>J</fnm></au></aug><source>Mol Biol Evol</source><pubdate>2007</pubdate><volume>24</volume><fpage>2842</fpage><lpage>2850</lpage><xrefbib><pubid idtype="pmpid" link="fulltext">17965454</pubid></xrefbib></bibl><bibl id="B2"><title><p>Characterization of intron loss events in mammals</p></title><aug><au><snm>Coulombe-Huntington</snm><fnm>J</fnm></au><au><snm>Majewski</snm><fnm>J</fnm></au></aug><source>Genome Res</source><pubdate>2007</pubdate><volume>17</volume><fpage>23</fpage><lpage>32</lpage><xrefbib><pubidlist><pubid idtype="pmcid">1716263</pubid><pubid idtype="pmpid" link="fulltext">17108319</pubid></pubidlist></xrefbib></bibl><bibl id="B3"><title><p>The ecoresponsive genome of Daphnia pulex</p></title><aug><au><snm>Colbourne</snm><fnm>JK</fnm></au><au><snm>Pfrender</snm><fnm>ME</fnm></au><au><snm>Gilbert</snm><fnm>D</fnm></au><au><snm>Thomas</snm><fnm>WK</fnm></au><au><snm>Tucker</snm><fnm>A</fnm></au><au><snm>Oakley</snm><fnm>TH</fnm></au><au><snm>Tokishita</snm><fnm>S</fnm></au><au><snm>Aerts</snm><fnm>A</fnm></au><au><snm>Arnold</snm><fnm>GJ</fnm></au><au><snm>Basu</snm><fnm>MK</fnm></au><au><snm>Bauer</snm><fnm>DJ</fnm></au><au><snm>Caceres</snm><fnm>CE</fnm></au><au><snm>Carmel</snm><fnm>L</fnm></au><au><snm>Casola</snm><fnm>C</fnm></au><au><snm>Choi</snm><fnm>JH</fnm></au><au><snm>Detter</snm><fnm>JC</fnm></au><au><snm>Dong</snm><fnm>Q</fnm></au><au><snm>Dusheyko</snm><fnm>S</fnm></au><au><snm>Eads</snm><fnm>BD</fnm></au><au><snm>Frohlich</snm><fnm>T</fnm></au><au><snm>Geiler-Samerotte</snm><fnm>KA</fnm></au><au><snm>Gerlach</snm><fnm>D</fnm></au><au><snm>Hatcher</snm><fnm>P</fnm></au><au><snm>Jogdeo</snm><fnm>S</fnm></au><au><snm>Krijgsveld</snm><fnm>J</fnm></au><au><snm>Kriventseva</snm><fnm>EV</fnm></au><au><snm>Kultz</snm><fnm>D</fnm></au><au><snm>Laforsch</snm><fnm>C</fnm></au><au><snm>Lindqu
ist</snm><fnm>E</fnm></au><au><snm>Lopez</snm><fnm>J</fnm></au><au><snm>Manak</snm><fnm>JR</fnm></au><au><snm>Muller</snm><fnm>J</fnm></au><au><snm>Pangilinan</snm><fnm>J</fnm></au><au><snm>Patwardhan</snm><fnm>RP</fnm></au><au><snm>Pitluck</snm><fnm>S</fnm></au><au><snm>Pritham</snm><fnm>EJ</fnm></au><au><snm>Rechtsteiner</snm><fnm>A</fnm></au><au><snm>Rho</snm><fnm>M</fnm></au><au><snm>Rogozin</snm><fnm>IB</fnm></au><au><snm>Sakarya</snm><fnm>O</fnm></au><au><snm>Salamov</snm><fnm>A</fnm></au><au><snm>Schaack</snm><fnm>S</fnm></au><au><snm>Shapiro</snm><fnm>H</fnm></au><au><snm>Shiga</snm><fnm>Y</fnm></au><au><snm>Skalitzky</snm><fnm>C</fnm></au><au><snm>Smith</snm><fnm>Z</fnm></au><au><snm>Souvorov</snm><fnm>A</fnm></au><au><snm>Sung</snm><fnm>W</fnm></au><au><snm>Tang</snm><fnm>Z</fnm></au><au><snm>Tsuchiya</snm><fnm>D</fnm></au><au><snm>Tu</snm><fnm>H</fnm></au><au><snm>Vos</snm><fnm>H</fnm></au><au><snm>Wang</snm><fnm>M</fnm></au><au><snm>Wolf</snm><fnm>YI</fnm></au><au><snm>Yamagata</snm><fnm>H</fnm></au><au><snm>Yamada</snm><fnm>T</fnm></au><au><snm>Ye</snm><fnm>Y</fnm></au><au><snm>Shaw</snm><fnm>JR</fnm></au><au><snm>Andrews</snm><fnm>J</fnm></au><au><snm>Crease</snm><fnm>TJ</fnm></au><au><snm>Tang</snm><fnm>H</fnm></au><au><snm>Lucas</snm><fnm>SM</fnm></au><au><snm>Robertson</snm><fnm>HM</fnm></au><au><snm>Bork</snm><fnm>P</fnm></au><au><snm>Koonin</snm><fnm>EV</fnm></au><au><snm>Zdobnov</snm><fnm>EM</fnm></au><au><snm>Grigoriev</snm><fnm>IV</fnm></au><au><snm>Lynch</snm><fnm>M</fnm></au><au><snm>Boore</snm><fnm>JL</fnm></au></aug><source>Science</source><pubdate>2011</pubdate><volume>331</volume><fpage>555</fpage><lpage>561</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1197761</pubid><pubid idtype="pmpid" link="fulltext">21292972</pubid></pubidlist></xrefbib></bibl><bibl id="B4"><title><p>Extensive, recent intron gains in Daphnia 
populations</p></title><aug><au><snm>Li</snm><fnm>W</fnm></au><au><snm>Tucker</snm><fnm>AE</fnm></au><au><snm>Sung</snm><fnm>W</fnm></au><au><snm>Thomas</snm><fnm>WK</fnm></au><au><snm>Lynch</snm><fnm>M</fnm></au></aug><source>Science</source><pubdate>2009</pubdate><volume>326</volume><fpage>1260</fpage><lpage>1262</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1179302</pubid><pubid idtype="pmpid" link="fulltext">19965475</pubid></pubidlist></xrefbib></bibl><bibl id="B5"><title><p>Plasticity of animal genome architecture unmasked by rapid evolution of a pelagic tunicate</p></title><aug><au><snm>Denoeud</snm><fnm>F</fnm></au><au><snm>Henriet</snm><fnm>S</fnm></au><au><snm>Mungpakdee</snm><fnm>S</fnm></au><au><snm>Aury</snm><fnm>JM</fnm></au><au><snm>Da</snm><fnm>SC</fnm></au><au><snm>Brinkmann</snm><fnm>H</fnm></au><au><snm>Mikhaleva</snm><fnm>J</fnm></au><au><snm>Olsen</snm><fnm>LC</fnm></au><au><snm>Jubin</snm><fnm>C</fnm></au><au><snm>Canestro</snm><fnm>C</fnm></au><au><snm>Bouquet</snm><fnm>JM</fnm></au><au><snm>Danks</snm><fnm>G</fnm></au><au><snm>Poulain</snm><fnm>J</fnm></au><au><snm>Campsteijn</snm><fnm>C</fnm></au><au><snm>Adamski</snm><fnm>M</fnm></au><au><snm>Cross</snm><fnm>I</fnm></au><au><snm>Yadetie</snm><fnm>F</fnm></au><au><snm>Muffato</snm><fnm>M</fnm></au><au><snm>Louis</snm><fnm>A</fnm></au><au><snm>Butcher</snm><fnm>S</fnm></au><au><snm>Tsagkogeorga</snm><fnm>G</fnm></au><au><snm>Konrad</snm><fnm>A</fnm></au><au><snm>Singh</snm><fnm>S</fnm></au><au><snm>Jensen</snm><fnm>MF</fnm></au><au><snm>Cong</snm><fnm>EH</fnm></au><au><snm>Eikeseth-Otteraa</snm><fnm>H</fnm></au><au><snm>Noel</snm><fnm>B</fnm></au><au><snm>Anthouard</snm><fnm>V</fnm></au><au><snm>Porcel</snm><fnm>BM</fnm></au><au><snm>Kachouri-Lafond</snm><fnm>R</fnm></au><au><snm>Nishino</snm><fnm>A</fnm></au><au><snm>Ugolini</snm><fnm>M</fnm></au><au><snm>Chourrout</snm><fnm>P</fnm></au><au><snm>Nishida</snm><fnm>H</fnm></au><au><snm>Aasland</snm><fnm>R</fnm></au><au><snm>Hu
zurbazar</snm><fnm>S</fnm></au><au><snm>Westhof</snm><fnm>E</fnm></au><au><snm>Delsuc</snm><fnm>F</fnm></au><au><snm>Lehrach</snm><fnm>H</fnm></au><au><snm>Reinhardt</snm><fnm>R</fnm></au><au><snm>Weissenbach</snm><fnm>J</fnm></au><au><snm>Roy</snm><fnm>SW</fnm></au><au><snm>Artiguenave</snm><fnm>F</fnm></au><au><snm>Postlethwait</snm><fnm>JH</fnm></au><au><snm>Manak</snm><fnm>JR</fnm></au><au><snm>Thompson</snm><fnm>EM</fnm></au><au><snm>Jaillon</snm><fnm>O</fnm></au><au><snm>Du</snm><fnm>PL</fnm></au><au><snm>Boudinot</snm><fnm>P</fnm></au><au><snm>Liberles</snm><fnm>DA</fnm></au><au><snm>Volff</snm><fnm>JN</fnm></au><au><snm>Philippe</snm><fnm>H</fnm></au><au><snm>Lenhard</snm><fnm>B</fnm></au><au><snm>Roest</snm><fnm>CH</fnm></au><au><snm>Wincker</snm><fnm>P</fnm></au><au><snm>Chourrout</snm><fnm>D</fnm></au></aug><source>Science</source><pubdate>2010</pubdate><volume>330</volume><fpage>1381</fpage><lpage>1385</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1194167</pubid><pubid idtype="pmpid" link="fulltext">21097902</pubid></pubidlist></xrefbib></bibl><bibl id="B6"><title><p>Nonsense-mediated decay enables intron gain in Drosophila</p></title><aug><au><snm>Farlow</snm><fnm>A</fnm></au><au><snm>Meduri</snm><fnm>E</fnm></au><au><snm>Dolezal</snm><fnm>M</fnm></au><au><snm>Hua</snm><fnm>L</fnm></au><au><snm>Schlotterer</snm><fnm>C</fnm></au></aug><source>PLoS Genet</source><pubdate>2010</pubdate><volume>6</volume><fpage>e1000819</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1371/journal.pgen.1000819</pubid><pubid idtype="pmcid">2809761</pubid><pubid idtype="pmpid" link="fulltext">20107520</pubid></pubidlist></xrefbib></bibl><bibl id="B7"><title><p>Very little intron loss/gain in Plasmodium: intron loss/gain mutation rates and intron number</p></title><aug><au><snm>Roy</snm><fnm>SW</fnm></au><au><snm>Hartl</snm><fnm>DL</fnm></au></aug><source>Genome 
Res</source><pubdate>2006</pubdate><volume>16</volume><fpage>750</fpage><lpage>756</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gr.4845406</pubid><pubid idtype="pmcid">1473185</pubid><pubid idtype="pmpid" link="fulltext">16702411</pubid></pubidlist></xrefbib></bibl><bibl id="B8"><title><p>Remarkable interkingdom conservation of intron positions and massive, lineage-specific intron loss and gain in eukaryotic evolution</p></title><aug><au><snm>Rogozin</snm><fnm>IB</fnm></au><au><snm>Wolf</snm><fnm>YI</fnm></au><au><snm>Sorokin</snm><fnm>AV</fnm></au><au><snm>Mirkin</snm><fnm>BG</fnm></au><au><snm>Koonin</snm><fnm>EV</fnm></au></aug><source>Curr Biol</source><pubdate>2003</pubdate><volume>13</volume><fpage>1512</fpage><lpage>1517</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/S0960-9822(03)00558-X</pubid><pubid idtype="pmpid" link="fulltext">12956953</pubid></pubidlist></xrefbib></bibl><bibl id="B9"><title><p>Patterns of intron gain and loss in fungi</p></title><aug><au><snm>Nielsen</snm><fnm>CB</fnm></au><au><snm>Friedman</snm><fnm>B</fnm></au><au><snm>Birren</snm><fnm>B</fnm></au><au><snm>Burge</snm><fnm>CB</fnm></au><au><snm>Galagan</snm><fnm>JE</fnm></au></aug><source>PLoS Biol</source><pubdate>2004</pubdate><volume>2</volume><fpage>e422</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1371/journal.pbio.0020422</pubid><pubid idtype="pmcid">532390</pubid><pubid idtype="pmpid" link="fulltext">15562318</pubid></pubidlist></xrefbib></bibl><bibl id="B10"><title><p>Origins of recently gained introns in Caenorhabditis</p></title><aug><au><snm>Coghlan</snm><fnm>A</fnm></au><au><snm>Wolfe</snm><fnm>KH</fnm></au></aug><source>Proc Natl Acad Sci USA</source><pubdate>2004</pubdate><volume>101</volume><fpage>11362</fpage><lpage>11367</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1073/pnas.0308192101</pubid><pubid idtype="pmcid">509176</pubid><pubid idtype="pmpid" link="fulltext">15243155</pubid></pubidlist></xrefbib></bibl><bibl 
id="B11"><title><p>Evaluation of models of the mechanisms underlying intron loss and gain in Aspergillus fungi</p></title><aug><au><snm>Zhang</snm><fnm>LY</fnm></au><au><snm>Yang</snm><fnm>YF</fnm></au><au><snm>Niu</snm><fnm>DK</fnm></au></aug><source>J Mol Evol</source><pubdate>2010</pubdate><volume>71</volume><fpage>364</fpage><lpage>373</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1007/s00239-010-9391-6</pubid><pubid idtype="pmpid" link="fulltext">20862581</pubid></pubidlist></xrefbib></bibl><bibl id="B12"><title><p>Mystery of intron gain</p></title><aug><au><snm>Fedorov</snm><fnm>A</fnm></au><au><snm>Roy</snm><fnm>S</fnm></au><au><snm>Fedorova</snm><fnm>L</fnm></au><au><snm>Gilbert</snm><fnm>W</fnm></au></aug><source>Genome Res</source><pubdate>2003</pubdate><volume>13</volume><fpage>2236</fpage><lpage>2241</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gr.1029803</pubid><pubid idtype="pmcid">403686</pubid><pubid idtype="pmpid" link="fulltext">12975308</pubid></pubidlist></xrefbib></bibl><bibl id="B13"><title><p>Intron exclusion and the mystery of intron loss</p></title><aug><au><snm>Hu</snm><fnm>K</fnm></au></aug><source>FEBS Lett</source><pubdate>2006</pubdate><volume>580</volume><fpage>6361</fpage><lpage>6365</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.febslet.2006.10.048</pubid><pubid idtype="pmpid" link="fulltext">17092501</pubid></pubidlist></xrefbib></bibl><bibl id="B14"><title><p>Alternative splicing: increasing diversity in the proteomic world</p></title><aug><au><snm>Graveley</snm><fnm>BR</fnm></au></aug><source>Trends Genet</source><pubdate>2001</pubdate><volume>17</volume><fpage>100</fpage><lpage>107</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/S0168-9525(00)02176-4</pubid><pubid idtype="pmpid" link="fulltext">11173120</pubid></pubidlist></xrefbib></bibl><bibl id="B15"><title><p>How introns influence and enhance eukaryotic gene 
expression</p></title><aug><au><snm>Le</snm><fnm>HH</fnm></au><au><snm>Nott</snm><fnm>A</fnm></au><au><snm>Moore</snm><fnm>MJ</fnm></au></aug><source>Trends Biochem Sci</source><pubdate>2003</pubdate><volume>28</volume><fpage>215</fpage><lpage>220</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/S0968-0004(03)00052-5</pubid><pubid idtype="pmpid" link="fulltext">12713906</pubid></pubidlist></xrefbib></bibl><bibl id="B16"><title><p>Non-coding RNA</p></title><aug><au><snm>Mattick</snm><fnm>JS</fnm></au><au><snm>Makunin</snm><fnm>IV</fnm></au></aug><source>Hum Mol Genet</source><pubdate>2006</pubdate><volume>15</volume><issue>Spec No 1</issue><fpage>R17</fpage><lpage>R29</lpage><xrefbib><pubid idtype="pmpid" link="fulltext">16651366</pubid></xrefbib></bibl><bibl id="B17"><title><p>Critical association of ncRNA with introns</p></title><aug><au><snm>Rearick</snm><fnm>D</fnm></au><au><snm>Prakash</snm><fnm>A</fnm></au><au><snm>McSweeny</snm><fnm>A</fnm></au><au><snm>Shepard</snm><fnm>SS</fnm></au><au><snm>Fedorova</snm><fnm>L</fnm></au><au><snm>Fedorov</snm><fnm>A</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2011</pubdate><volume>39</volume><fpage>2357</fpage><lpage>2366</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gkq1080</pubid><pubid idtype="pmcid">3064772</pubid><pubid idtype="pmpid" link="fulltext">21071396</pubid></pubidlist></xrefbib></bibl><bibl id="B18"><title><p>Splicing promotes rapid and efficient mRNA export in mammalian cells</p></title><aug><au><snm>Valencia</snm><fnm>P</fnm></au><au><snm>Dias</snm><fnm>AP</fnm></au><au><snm>Reed</snm><fnm>R</fnm></au></aug><source>Proc Natl Acad Sci USA</source><pubdate>2008</pubdate><volume>105</volume><fpage>3386</fpage><lpage>3391</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1073/pnas.0800250105</pubid><pubid idtype="pmcid">2265164</pubid><pubid idtype="pmpid" link="fulltext">18287003</pubid></pubidlist></xrefbib></bibl><bibl id="B19"><title><p>A generic intron increases gene 
expression in transgenic mice</p></title><aug><au><snm>Choi</snm><fnm>T</fnm></au><au><snm>Huang</snm><fnm>M</fnm></au><au><snm>Gorman</snm><fnm>C</fnm></au><au><snm>Jaenisch</snm><fnm>R</fnm></au></aug><source>Mol Cell Biol</source><pubdate>1991</pubdate><volume>11</volume><fpage>3070</fpage><lpage>3074</lpage><xrefbib><pubidlist><pubid idtype="pmcid">360146</pubid><pubid idtype="pmpid" link="fulltext">2038318</pubid></pubidlist></xrefbib></bibl><bibl id="B20"><title><p>Introns boost transgene expression in Drosophila melanogaster</p></title><aug><au><snm>Duncker</snm><fnm>BP</fnm></au><au><snm>Davies</snm><fnm>PL</fnm></au><au><snm>Walker</snm><fnm>VK</fnm></au></aug><source>Mol Gen Genet</source><pubdate>1997</pubdate><volume>254</volume><fpage>291</fpage><lpage>296</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1007/s004380050418</pubid><pubid idtype="pmpid">9150263</pubid></pubidlist></xrefbib></bibl><bibl id="B21"><title><p>Heterologous introns can enhance expression of transgenes in mice</p></title><aug><au><snm>Palmiter</snm><fnm>RD</fnm></au><au><snm>Sandgren</snm><fnm>EP</fnm></au><au><snm>Avarbock</snm><fnm>MR</fnm></au><au><snm>Allen</snm><fnm>DD</fnm></au><au><snm>Brinster</snm><fnm>RL</fnm></au></aug><source>Proc Natl Acad Sci USA</source><pubdate>1991</pubdate><volume>88</volume><fpage>478</fpage><lpage>482</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1073/pnas.88.2.478</pubid><pubid idtype="pmcid">50834</pubid><pubid idtype="pmpid" link="fulltext">1988947</pubid></pubidlist></xrefbib></bibl><bibl id="B22"><title><p>Pseudogenes in yeast?</p></title><aug><au><snm>Fink</snm><fnm>GR</fnm></au></aug><source>Cell</source><pubdate>1987</pubdate><volume>49</volume><fpage>5</fpage><lpage>6</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/0092-8674(87)90746-X</pubid><pubid idtype="pmpid" link="fulltext">3549000</pubid></pubidlist></xrefbib></bibl><bibl id="B23"><title><p>The evolution of spliceosomal introns: patterns, puzzles and 
progress</p></title><aug><au><snm>Roy</snm><fnm>SW</fnm></au><au><snm>Gilbert</snm><fnm>W</fnm></au></aug><source>Nat Rev Genet</source><pubdate>2006</pubdate><volume>7</volume><fpage>211</fpage><lpage>221</lpage><xrefbib><pubid idtype="pmpid" link="fulltext">16485020</pubid></xrefbib></bibl><bibl id="B24"><title><p>The pattern of intron loss</p></title><aug><au><snm>Roy</snm><fnm>SW</fnm></au><au><snm>Gilbert</snm><fnm>W</fnm></au></aug><source>Proc Natl Acad Sci USA</source><pubdate>2005</pubdate><volume>102</volume><fpage>713</fpage><lpage>718</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1073/pnas.0408274102</pubid><pubid idtype="pmcid">545554</pubid><pubid idtype="pmpid" link="fulltext">15642949</pubid></pubidlist></xrefbib></bibl><bibl id="B25"><title><p>DNA double-strand break repair and the evolution of intron density</p></title><aug><au><snm>Farlow</snm><fnm>A</fnm></au><au><snm>Meduri</snm><fnm>E</fnm></au><au><snm>Schlotterer</snm><fnm>C</fnm></au></aug><source>Trends Genet</source><pubdate>2010</pubdate><volume>27</volume><fpage>1</fpage><lpage>6</lpage><xrefbib><pubidlist><pubid idtype="pmcid">3020277</pubid><pubid idtype="pmpid" link="fulltext">21106271</pubid></pubidlist></xrefbib></bibl><bibl id="B26"><title><p>A role for reverse transcripts in gene conversion</p></title><aug><au><snm>Derr</snm><fnm>LK</fnm></au><au><snm>Strathern</snm><fnm>JN</fnm></au></aug><source>Nature</source><pubdate>1993</pubdate><volume>361</volume><fpage>170</fpage><lpage>173</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/361170a0</pubid><pubid idtype="pmpid" link="fulltext">8380627</pubid></pubidlist></xrefbib></bibl><bibl id="B27"><title><p>Involvement of cDNA in homologous recombination between Ty elements in Saccharomyces cerevisiae</p></title><aug><au><snm>Melamed</snm><fnm>C</fnm></au><au><snm>Nevo</snm><fnm>Y</fnm></au><au><snm>Kupiec</snm><fnm>M</fnm></au></aug><source>Mol Cell 
Biol</source><pubdate>1992</pubdate><volume>12</volume><fpage>1613</fpage><lpage>1620</lpage><xrefbib><pubidlist><pubid idtype="pmcid">369604</pubid><pubid idtype="pmpid" link="fulltext">1372387</pubid></pubidlist></xrefbib></bibl><bibl id="B28"><title><p>On the origin of RNA splicing and introns</p></title><aug><au><snm>Sharp</snm><fnm>PA</fnm></au></aug><source>Cell</source><pubdate>1985</pubdate><volume>42</volume><fpage>397</fpage><lpage>400</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/0092-8674(85)90092-3</pubid><pubid idtype="pmpid" link="fulltext">2411416</pubid></pubidlist></xrefbib></bibl><bibl id="B29"><title><p>Split genes and RNA splicing</p></title><aug><au><snm>Crick</snm><fnm>F</fnm></au></aug><source>Science</source><pubdate>1979</pubdate><volume>204</volume><fpage>264</fpage><lpage>271</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.373120</pubid><pubid idtype="pmpid" link="fulltext">373120</pubid></pubidlist></xrefbib></bibl><bibl id="B30"><title><p>How were introns inserted into nuclear genes?</p></title><aug><au><snm>Rogers</snm><fnm>JH</fnm></au></aug><source>Trends Genet</source><pubdate>1989</pubdate><volume>5</volume><fpage>213</fpage><lpage>216</lpage><xrefbib><pubid idtype="pmpid">2551082</pubid></xrefbib></bibl><bibl id="B31"><title><p>A variable intron distribution in globin genes of Chironomus: evidence for recent intron gain</p></title><aug><au><snm>Hankeln</snm><fnm>T</fnm></au><au><snm>Friedl</snm><fnm>H</fnm></au><au><snm>Ebersberger</snm><fnm>I</fnm></au><au><snm>Martin</snm><fnm>J</fnm></au><au><snm>Schmidt</snm><fnm>ER</fnm></au></aug><source>Gene</source><pubdate>1997</pubdate><volume>205</volume><fpage>151</fpage><lpage>160</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/S0378-1119(97)00518-0</pubid><pubid idtype="pmpid">9461389</pubid></pubidlist></xrefbib></bibl><bibl id="B32"><title><p>Where do introns come 
from?</p></title><aug><au><snm>Catania</snm><fnm>F</fnm></au><au><snm>Lynch</snm><fnm>M</fnm></au></aug><source>PLoS Biol</source><pubdate>2008</pubdate><volume>6</volume><fpage>e283</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1371/journal.pbio.0060283</pubid><pubid idtype="pmcid">2586383</pubid><pubid idtype="pmpid" link="fulltext">19067485</pubid></pubidlist></xrefbib></bibl><bibl id="B33"><title><p>Origin of introns by 'intronization' of exonic sequences</p></title><aug><au><snm>Irimia</snm><fnm>M</fnm></au><au><snm>Rukov</snm><fnm>JL</fnm></au><au><snm>Penny</snm><fnm>D</fnm></au><au><snm>Vinther</snm><fnm>J</fnm></au><au><snm>Garcia-Fernandez</snm><fnm>J</fnm></au><au><snm>Roy</snm><fnm>SW</fnm></au></aug><source>Trends Genet</source><pubdate>2008</pubdate><volume>24</volume><fpage>378</fpage><lpage>381</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.tig.2008.05.007</pubid><pubid idtype="pmpid" link="fulltext">18597887</pubid></pubidlist></xrefbib></bibl><bibl id="B34"><title><p>Some novel intron positions in conserved Drosophila genes are caused by intron sliding or tandem duplication</p></title><aug><au><snm>Lehmann</snm><fnm>J</fnm></au><au><snm>Eisenhardt</snm><fnm>C</fnm></au><au><snm>Stadler</snm><fnm>PF</fnm></au><au><snm>Krauss</snm><fnm>V</fnm></au></aug><source>BMC Evol Biol</source><pubdate>2010</pubdate><volume>10</volume><fpage>156</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2148-10-156</pubid><pubid idtype="pmcid">2891723</pubid><pubid idtype="pmpid" link="fulltext">20500887</pubid></pubidlist></xrefbib></bibl><bibl id="B35"><title><p>Intron "sliding" and the diversity of intron positions</p></title><aug><au><snm>Stoltzfus</snm><fnm>A</fnm></au><au><snm>Logsdon</snm><fnm>JM</fnm><suf>Jr</suf></au><au><snm>Palmer</snm><fnm>JD</fnm></au><au><snm>Doolittle</snm><fnm>WF</fnm></au></aug><source>Proc Natl Acad Sci 
USA</source><pubdate>1997</pubdate><volume>94</volume><fpage>10739</fpage><lpage>10744</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1073/pnas.94.20.10739</pubid><pubid idtype="pmcid">23469</pubid><pubid idtype="pmpid" link="fulltext">9380704</pubid></pubidlist></xrefbib></bibl><bibl id="B36"><title><p>Alternative splicing: a missing piece in the puzzle of intron gain</p></title><aug><au><snm>Tarrio</snm><fnm>R</fnm></au><au><snm>Ayala</snm><fnm>FJ</fnm></au><au><snm>Rodriguez-Trelles</snm><fnm>F</fnm></au></aug><source>Proc Natl Acad Sci USA</source><pubdate>2008</pubdate><volume>105</volume><fpage>7223</fpage><lpage>7228</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1073/pnas.0802941105</pubid><pubid idtype="pmcid">2438231</pubid><pubid idtype="pmpid" link="fulltext">18463286</pubid></pubidlist></xrefbib></bibl><bibl id="B37"><title><p>Intronization, de-intronization and intron sliding are rare in Cryptococcus</p></title><aug><au><snm>Roy</snm><fnm>SW</fnm></au></aug><source>BMC Evol Biol</source><pubdate>2009</pubdate><volume>9</volume><fpage>192</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2148-9-192</pubid><pubid idtype="pmcid">2740785</pubid><pubid idtype="pmpid" link="fulltext">19664208</pubid></pubidlist></xrefbib></bibl><bibl id="B38"><title><p>Intron sliding in conserved gene families</p></title><aug><au><snm>Rogozin</snm><fnm>IB</fnm></au><au><snm>Lyons-Weiler</snm><fnm>J</fnm></au><au><snm>Koonin</snm><fnm>EV</fnm></au></aug><source>Trends Genet</source><pubdate>2000</pubdate><volume>16</volume><fpage>430</fpage><lpage>432</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/S0168-9525(00)02096-5</pubid><pubid idtype="pmpid" link="fulltext">11050324</pubid></pubidlist></xrefbib></bibl><bibl id="B39"><title><p>Intron presence-absence polymorphism in Drosophila driven by positive Darwinian 
selection</p></title><aug><au><snm>Llopart</snm><fnm>A</fnm></au><au><snm>Comeron</snm><fnm>JM</fnm></au><au><snm>Brunet</snm><fnm>FG</fnm></au><au><snm>Lachaise</snm><fnm>D</fnm></au><au><snm>Long</snm><fnm>M</fnm></au></aug><source>Proc Natl Acad Sci USA</source><pubdate>2002</pubdate><volume>99</volume><fpage>8121</fpage><lpage>8126</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1073/pnas.122570299</pubid><pubid idtype="pmcid">123031</pubid><pubid idtype="pmpid" link="fulltext">12060758</pubid></pubidlist></xrefbib></bibl><bibl id="B40"><title><p>Investigation of loss and gain of introns in the compact genomes of pufferfishes (Fugu and Tetraodon)</p></title><aug><au><snm>Loh</snm><fnm>YH</fnm></au><au><snm>Brenner</snm><fnm>S</fnm></au><au><snm>Venkatesh</snm><fnm>B</fnm></au></aug><source>Mol Biol Evol</source><pubdate>2008</pubdate><volume>25</volume><fpage>526</fpage><lpage>535</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/molbev/msm278</pubid><pubid idtype="pmpid" link="fulltext">18089580</pubid></pubidlist></xrefbib></bibl><bibl id="B41"><title><p>Evolutionary relationship of plant catalase genes inferred from exon-intron structures: isozyme divergence after the separation of monocots and dicots</p></title><aug><au><snm>Iwamoto</snm><fnm>M</fnm></au><au><snm>Maekawa</snm><fnm>M</fnm></au><au><snm>Saito</snm><fnm>A</fnm></au><au><snm>Higo</snm><fnm>H</fnm></au><au><snm>Higo</snm><fnm>K</fnm></au></aug><source>TAG Theoretical and Applied Genetics</source><pubdate>1998</pubdate><volume>97</volume><fpage>9</fpage><lpage>19</lpage><xrefbib><pubid idtype="doi">10.1007/s001220050861</pubid></xrefbib></bibl><bibl id="B42"><title><p>Intron gain and loss in segmentally duplicated genes in rice</p></title><aug><au><snm>Lin</snm><fnm>H</fnm></au><au><snm>Zhu</snm><fnm>W</fnm></au><au><snm>Silva</snm><fnm>JC</fnm></au><au><snm>Gu</snm><fnm>X</fnm></au><au><snm>Buell</snm><fnm>CR</fnm></au></aug><source>Genome 
Biol</source><pubdate>2006</pubdate><volume>7</volume><fpage>R41</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/gb-2006-7-5-r41</pubid><pubid idtype="pmcid">1779517</pubid><pubid idtype="pmpid" link="fulltext">16719932</pubid></pubidlist></xrefbib></bibl><bibl id="B43"><title><p>Ubiquitous internal gene duplication and intron creation in eukaryotes</p></title><aug><au><snm>Gao</snm><fnm>X</fnm></au><au><snm>Lynch</snm><fnm>M</fnm></au></aug><source>Proc Natl Acad Sci USA</source><pubdate>2009</pubdate><volume>106</volume><fpage>20818</fpage><lpage>20823</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1073/pnas.0911093106</pubid><pubid idtype="pmcid">2791625</pubid><pubid idtype="pmpid" link="fulltext">19926850</pubid></pubidlist></xrefbib></bibl><bibl id="B44"><title><p>Mechanisms of intron gain and loss in Cryptococcus</p></title><aug><au><snm>Sharpton</snm><fnm>TJ</fnm></au><au><snm>Neafsey</snm><fnm>DE</fnm></au><au><snm>Galagan</snm><fnm>JE</fnm></au><au><snm>Taylor</snm><fnm>JW</fnm></au></aug><source>Genome Biol</source><pubdate>2008</pubdate><volume>9</volume><fpage>R24</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/gb-2008-9-1-r24</pubid><pubid idtype="pmcid">2395259</pubid><pubid idtype="pmpid" link="fulltext">18234113</pubid></pubidlist></xrefbib></bibl><bibl id="B45"><title><p>FlyBase: enhancing Drosophila Gene Ontology annotations</p></title><aug><au><snm>Tweedie</snm><fnm>S</fnm></au><au><snm>Ashburner</snm><fnm>M</fnm></au><au><snm>Falls</snm><fnm>K</fnm></au><au><snm>Leyland</snm><fnm>P</fnm></au><au><snm>McQuilton</snm><fnm>P</fnm></au><au><snm>Marygold</snm><fnm>S</fnm></au><au><snm>Millburn</snm><fnm>G</fnm></au><au><snm>Osumi-Sutherland</snm><fnm>D</fnm></au><au><snm>Schroeder</snm><fnm>A</fnm></au><au><snm>Seal</snm><fnm>R</fnm></au><au><snm>Zhang</snm><fnm>H</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2009</pubdate><volume>37</volume><fpage>D555</fpage><lpage>D559</lpage><xrefbib><pubidlist><pubid 
idtype="doi">10.1093/nar/gkn788</pubid><pubid idtype="pmcid">2686450</pubid><pubid idtype="pmpid" link="fulltext">18948289</pubid></pubidlist></xrefbib></bibl><bibl id="B46"><title><p>Molecular archeology of L1 insertions in the human genome</p></title><aug><au><snm>Szak</snm><fnm>ST</fnm></au><au><snm>Pickeral</snm><fnm>OK</fnm></au><au><snm>Makalowski</snm><fnm>W</fnm></au><au><snm>Boguski</snm><fnm>MS</fnm></au><au><snm>Landsman</snm><fnm>D</fnm></au><au><snm>Boeke</snm><fnm>JD</fnm></au></aug><source>Genome Biol</source><pubdate>2002</pubdate><volume>3</volume><fpage>1</fpage><lpage>18</lpage></bibl><bibl id="B47"><title><p>Two large families of chemoreceptor genes in the nematodes Caenorhabditis elegans and Caenorhabditis briggsae reveal extensive gene duplication, diversification, movement, and intron loss</p></title><aug><au><snm>Robertson</snm><fnm>HM</fnm></au></aug><source>Genome Res</source><pubdate>1998</pubdate><volume>8</volume><fpage>449</fpage><lpage>463</lpage><xrefbib><pubid idtype="pmpid" link="fulltext">9582190</pubid></xrefbib></bibl><bibl id="B48"><title><p>FlyMine: an integrated database for Drosophila and Anopheles genomics</p></title><aug><au><snm>Lyne</snm><fnm>R</fnm></au><au><snm>Smith</snm><fnm>R</fnm></au><au><snm>Rutherford</snm><fnm>K</fnm></au><au><snm>Wakeling</snm><fnm>M</fnm></au><au><snm>Varley</snm><fnm>A</fnm></au><au><snm>Guillier</snm><fnm>F</fnm></au><au><snm>Janssens</snm><fnm>H</fnm></au><au><snm>Ji</snm><fnm>W</fnm></au><au><snm>Mclaren</snm><fnm>P</fnm></au><au><snm>North</snm><fnm>P</fnm></au><au><snm>Rana</snm><fnm>D</fnm></au><au><snm>Riley</snm><fnm>T</fnm></au><au><snm>Sullivan</snm><fnm>J</fnm></au><au><snm>Watkins</snm><fnm>X</fnm></au><au><snm>Woodbridge</snm><fnm>M</fnm></au><au><snm>Lilley</snm><fnm>K</fnm></au><au><snm>Russell</snm><fnm>S</fnm></au><au><snm>Ashburner</snm><fnm>M</fnm></au><au><snm>Mizuguchi</snm><fnm>K</fnm></au><au><snm>Micklem</snm><fnm>G</fnm></au></aug><source>Genome 
Biol</source><pubdate>2007</pubdate><volume>8</volume><fpage>R129</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/gb-2007-8-7-r129</pubid><pubid idtype="pmcid">2323218</pubid><pubid idtype="pmpid" link="fulltext">17615057</pubid></pubidlist></xrefbib></bibl><bibl id="B49"><title><p>Using FlyAtlas to identify better Drosophila melanogaster models of human disease</p></title><aug><au><snm>Chintapalli</snm><fnm>VR</fnm></au><au><snm>Wang</snm><fnm>J</fnm></au><au><snm>Dow</snm><fnm>JA</fnm></au></aug><source>Nat Genet</source><pubdate>2007</pubdate><volume>39</volume><fpage>715</fpage><lpage>720</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/ng2049</pubid><pubid idtype="pmpid" link="fulltext">17534367</pubid></pubidlist></xrefbib></bibl><bibl id="B50"><title><p>Widespread intron loss suggests retrotransposon activity in ancient apicomplexans</p></title><aug><au><snm>Roy</snm><fnm>SW</fnm></au><au><snm>Penny</snm><fnm>D</fnm></au></aug><source>Mol Biol Evol</source><pubdate>2007</pubdate><volume>24</volume><fpage>1926</fpage><lpage>1933</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/molbev/msm102</pubid><pubid idtype="pmpid" link="fulltext">17522085</pubid></pubidlist></xrefbib></bibl><bibl id="B51"><title><p>mRNA-mediated intron losses: evidence from extraordinarily large exons</p></title><aug><au><snm>Niu</snm><fnm>DK</fnm></au><au><snm>Hou</snm><fnm>WR</fnm></au><au><snm>Li</snm><fnm>SW</fnm></au></aug><source>Mol Biol Evol</source><pubdate>2005</pubdate><volume>22</volume><fpage>1475</fpage><lpage>1481</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/molbev/msi138</pubid><pubid idtype="pmpid" link="fulltext">15788745</pubid></pubidlist></xrefbib></bibl><bibl id="B52"><title><p>The evolution of single-copy Drosophila nuclear 4f-rnp genes: spliceosomal intron losses create polymorphic 
alleles</p></title><aug><au><snm>Feiber</snm><fnm>AL</fnm></au><au><snm>Rangarajan</snm><fnm>J</fnm></au><au><snm>Vaughn</snm><fnm>JC</fnm></au></aug><source>J Mol Evol</source><pubdate>2002</pubdate><volume>55</volume><fpage>401</fpage><lpage>413</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1007/s00239-002-2336-y</pubid><pubid idtype="pmpid" link="fulltext">12355261</pubid></pubidlist></xrefbib></bibl><bibl id="B53"><title><p>AUGUSTUS: a web server for gene prediction in eukaryotes that allows user-defined constraints</p></title><aug><au><snm>Stanke</snm><fnm>M</fnm></au><au><snm>Morgenstern</snm><fnm>B</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2005</pubdate><volume>33</volume><fpage>W465</fpage><lpage>W467</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gki458</pubid><pubid idtype="pmcid">1160219</pubid><pubid idtype="pmpid" link="fulltext">15980513</pubid></pubidlist></xrefbib></bibl><bibl id="B54"><title><p>Prediction of polyadenylation signals in human DNA sequences using nucleotide frequencies</p></title><aug><au><snm>Ahmed</snm><fnm>F</fnm></au><au><snm>Kumar</snm><fnm>M</fnm></au><au><snm>Raghava</snm><fnm>GP</fnm></au></aug><source>In Silico Biol</source><pubdate>2009</pubdate><volume>9</volume><fpage>135</fpage><lpage>148</lpage><xrefbib><pubid idtype="pmpid">19795571</pubid></xrefbib></bibl><bibl id="B55"><title><p>Mfold web server for nucleic acid folding and hybridization prediction</p></title><aug><au><snm>Zuker</snm><fnm>M</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2003</pubdate><volume>31</volume><fpage>3406</fpage><lpage>3415</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gkg595</pubid><pubid idtype="pmcid">169194</pubid><pubid idtype="pmpid" link="fulltext">12824337</pubid></pubidlist></xrefbib></bibl><bibl id="B56"><title><p>Improved tools for biological sequence comparison</p></title><aug><au><snm>Pearson</snm><fnm>WR</fnm></au><au><snm>Lipman</snm><fnm>DJ</fnm></au></aug><source>Proc 
Natl Acad Sci USA</source><pubdate>1988</pubdate><volume>85</volume><fpage>2444</fpage><lpage>2448</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1073/pnas.85.8.2444</pubid><pubid idtype="pmcid">280013</pubid><pubid idtype="pmpid" link="fulltext">3162770</pubid></pubidlist></xrefbib></bibl><bibl id="B57"><title><p>The mechanism of double-strand DNA break repair by the nonhomologous DNA end-joining pathway</p></title><aug><au><snm>Lieber</snm><fnm>MR</fnm></au></aug><source>Annu Rev Biochem</source><pubdate>2010</pubdate><volume>79</volume><fpage>181</fpage><lpage>211</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1146/annurev.biochem.052308.093131</pubid><pubid idtype="pmcid">3079308</pubid><pubid idtype="pmpid" link="fulltext">20192759</pubid></pubidlist></xrefbib></bibl><bibl id="B58"><title><p>Intron creation and DNA repair</p></title><aug><au><snm>Ragg</snm><fnm>H</fnm></au></aug><source>Cell Mol Life Sci</source><pubdate>2010</pubdate><volume>68</volume><fpage>235</fpage><lpage>242</lpage><xrefbib><pubid idtype="pmpid" link="fulltext">20853128</pubid></xrefbib></bibl><bibl id="B59"><title><p>Multiple pathways of recombination induced by double-strand breaks in Saccharomyces cerevisiae</p></title><aug><au><snm>Paques</snm><fnm>F</fnm></au><au><snm>Haber</snm><fnm>JE</fnm></au></aug><source>Microbiol Mol Biol Rev</source><pubdate>1999</pubdate><volume>63</volume><fpage>349</fpage><lpage>404</lpage><xrefbib><pubidlist><pubid idtype="pmcid">98970</pubid><pubid idtype="pmpid" link="fulltext">10357855</pubid></pubidlist></xrefbib></bibl><bibl id="B60"><title><p>Mitochondrial DNA repairs double-strand breaks in yeast chromosomes</p></title><aug><au><snm>Ricchetti</snm><fnm>M</fnm></au><au><snm>Fairhead</snm><fnm>C</fnm></au><au><snm>Dujon</snm><fnm>B</fnm></au></aug><source>Nature</source><pubdate>1999</pubdate><volume>402</volume><fpage>96</fpage><lpage>100</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/47076</pubid><pubid idtype="pmpid" 
link="fulltext">10573425</pubid></pubidlist></xrefbib></bibl><bibl id="B61"><title><p>Chromosomal aberrations induced by double strand DNA breaks</p></title><aug><au><snm>Varga</snm><fnm>T</fnm></au><au><snm>Aplan</snm><fnm>PD</fnm></au></aug><source>DNA Repair</source><pubdate>2005</pubdate><volume>4</volume><fpage>1038</fpage><lpage>1046</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.dnarep.2005.05.004</pubid><pubid idtype="pmcid">1237002</pubid><pubid idtype="pmpid" link="fulltext">15935739</pubid></pubidlist></xrefbib></bibl><bibl id="B62"><title><p>Comparison of filler DNA at immune, nonimmune, and oncogenic rearrangements suggests multiple mechanisms of formation</p></title><aug><au><snm>Roth</snm><fnm>DB</fnm></au><au><snm>Chang</snm><fnm>XB</fnm></au><au><snm>Wilson</snm><fnm>JH</fnm></au></aug><source>Mol Cell Biol</source><pubdate>1989</pubdate><volume>9</volume><fpage>3049</fpage><lpage>3057</lpage><xrefbib><pubidlist><pubid idtype="pmcid">362774</pubid><pubid idtype="pmpid" link="fulltext">2550794</pubid></pubidlist></xrefbib></bibl><bibl id="B63"><title><p>Non-homologous DNA end joining in plant cells is associated with deletions and filler DNA insertions</p></title><aug><au><snm>Gorbunova</snm><fnm>V</fnm></au><au><snm>Levy</snm><fnm>AA</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>1997</pubdate><volume>25</volume><fpage>4650</fpage><lpage>4657</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/25.22.4650</pubid><pubid idtype="pmcid">147090</pubid><pubid idtype="pmpid" link="fulltext">9358178</pubid></pubidlist></xrefbib></bibl><bibl id="B64"><title><p>GeneWise and Genomewise</p></title><aug><au><snm>Birney</snm><fnm>E</fnm></au><au><snm>Clamp</snm><fnm>M</fnm></au><au><snm>Durbin</snm><fnm>R</fnm></au></aug><source>Genome Res</source><pubdate>2004</pubdate><volume>14</volume><fpage>988</fpage><lpage>995</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gr.1865504</pubid><pubid idtype="pmcid">479130</pubid><pubid 
idtype="pmpid" link="fulltext">15123596</pubid></pubidlist></xrefbib></bibl><bibl id="B65"><title><p>TimeTree: a public knowledge-base of divergence times among organisms</p></title><aug><au><snm>Hedges</snm><fnm>SB</fnm></au><au><snm>Dudley</snm><fnm>J</fnm></au><au><snm>Kumar</snm><fnm>S</fnm></au></aug><source>Bioinformatics</source><pubdate>2006</pubdate><volume>22</volume><fpage>2971</fpage><lpage>2972</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/bioinformatics/btl505</pubid><pubid idtype="pmpid" link="fulltext">17021158</pubid></pubidlist></xrefbib></bibl><bibl id="B66"><title><p>The role of reverse-transcriptase in intron gain and loss mechanisms</p></title><aug><au><snm>Cohen</snm><fnm>NE</fnm></au><au><snm>Shen</snm><fnm>R</fnm></au><au><snm>Carmel</snm><fnm>L</fnm></au></aug><source>Mol Biol Evol</source><pubdate>2011</pubdate><inpress/></bibl><bibl id="B67"><title><p>Transcription-associated recombination in eukaryotes: link between transcription, replication and recombination</p></title><aug><au><snm>Gottipati</snm><fnm>P</fnm></au><au><snm>Helleday</snm><fnm>T</fnm></au></aug><source>Mutagenesis</source><pubdate>2009</pubdate><volume>24</volume><fpage>203</fpage><lpage>210</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/mutage/gen072</pubid><pubid idtype="pmpid" link="fulltext">19139058</pubid></pubidlist></xrefbib></bibl><bibl id="B68"><title><p>Transcription-induced deletions in Escherichia coli plasmids</p></title><aug><au><snm>Vilette</snm><fnm>D</fnm></au><au><snm>Ehrlich</snm><fnm>SD</fnm></au><au><snm>Michel</snm><fnm>B</fnm></au></aug><source>Mol Microbiol</source><pubdate>1995</pubdate><volume>17</volume><fpage>493</fpage><lpage>504</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1111/j.1365-2958.1995.mmi_17030493.x</pubid><pubid idtype="pmpid" link="fulltext">8559068</pubid></pubidlist></xrefbib></bibl><bibl id="B69"><title><p>The connection between transcription and genomic 
instability</p></title><aug><au><snm>Aguilera</snm><fnm>A</fnm></au></aug><source>EMBO J</source><pubdate>2002</pubdate><volume>21</volume><fpage>195</fpage><lpage>201</lpage><xrefbib><pubidlist><pubid idtype="pmcid">125829</pubid><pubid idtype="pmpid" link="fulltext">11823412</pubid></pubidlist></xrefbib></bibl><bibl id="B70"><title><p>Transcription-associated recombination is independent of XRCC2 and mechanistically separate from homology-directed DNA double-strand break repair</p></title><aug><au><snm>Savolainen</snm><fnm>L</fnm></au><au><snm>Helleday</snm><fnm>T</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2009</pubdate><volume>37</volume><fpage>405</fpage><lpage>412</lpage><xrefbib><pubidlist><pubid idtype="pmcid">2632912</pubid><pubid idtype="pmpid" link="fulltext">19043071</pubid></pubidlist></xrefbib></bibl><bibl id="B71"><title><p>Nuclear expression of a group II intron is consistent with spliceosomal intron ancestry</p></title><aug><au><snm>Chalamcharla</snm><fnm>VR</fnm></au><au><snm>Curcio</snm><fnm>MJ</fnm></au><au><snm>Belfort</snm><fnm>M</fnm></au></aug><source>Genes Dev</source><pubdate>2010</pubdate><volume>24</volume><fpage>827</fpage><lpage>836</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gad.1905010</pubid><pubid idtype="pmcid">2854396</pubid><pubid idtype="pmpid" link="fulltext">20351053</pubid></pubidlist></xrefbib></bibl><bibl id="B72"><title><p>Assessing performance of orthology detection strategies applied to eukaryotic genomes</p></title><aug><au><snm>Chen</snm><fnm>F</fnm></au><au><snm>Mackey</snm><fnm>AJ</fnm></au><au><snm>Vermunt</snm><fnm>JK</fnm></au><au><snm>Roos</snm><fnm>DS</fnm></au></aug><source>PLoS One</source><pubdate>2007</pubdate><volume>2</volume><fpage>e383</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1371/journal.pone.0000383</pubid><pubid idtype="pmcid">1849888</pubid><pubid idtype="pmpid" link="fulltext">17440619</pubid></pubidlist></xrefbib></bibl><bibl id="B73"><title><p>Clustal W and 
Clustal X version 2.0</p></title><aug><au><snm>Larkin</snm><fnm>MA</fnm></au><au><snm>Blackshields</snm><fnm>G</fnm></au><au><snm>Brown</snm><fnm>NP</fnm></au><au><snm>Chenna</snm><fnm>R</fnm></au><au><snm>McGettigan</snm><fnm>PA</fnm></au><au><snm>McWilliam</snm><fnm>H</fnm></au><au><snm>Valentin</snm><fnm>F</fnm></au><au><snm>Wallace</snm><fnm>IM</fnm></au><au><snm>Wilm</snm><fnm>A</fnm></au><au><snm>Lopez</snm><fnm>R</fnm></au><au><snm>Thompson</snm><fnm>JD</fnm></au><au><snm>Gibson</snm><fnm>TJ</fnm></au><au><snm>Higgins</snm><fnm>DG</fnm></au></aug><source>Bioinformatics</source><pubdate>2007</pubdate><volume>23</volume><fpage>2947</fpage><lpage>2948</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/bioinformatics/btm404</pubid><pubid idtype="pmpid" link="fulltext">17846036</pubid></pubidlist></xrefbib></bibl><bibl id="B74"><title><p>The UCSC Genome Browser database: update 2011</p></title><aug><au><snm>Fujita</snm><fnm>PA</fnm></au><au><snm>Rhead</snm><fnm>B</fnm></au><au><snm>Zweig</snm><fnm>AS</fnm></au><au><snm>Hinrichs</snm><fnm>AS</fnm></au><au><snm>Karolchik</snm><fnm>D</fnm></au><au><snm>Cline</snm><fnm>MS</fnm></au><au><snm>Goldman</snm><fnm>M</fnm></au><au><snm>Barber</snm><fnm>GP</fnm></au><au><snm>Clawson</snm><fnm>H</fnm></au><au><snm>Coelho</snm><fnm>A</fnm></au><au><snm>Diekhans</snm><fnm>M</fnm></au><au><snm>Dreszer</snm><fnm>TR</fnm></au><au><snm>Giardine</snm><fnm>BM</fnm></au><au><snm>Harte</snm><fnm>RA</fnm></au><au><snm>Hillman-Jackson</snm><fnm>J</fnm></au><au><snm>Hsu</snm><fnm>F</fnm></au><au><snm>Kirkup</snm><fnm>V</fnm></au><au><snm>Kuhn</snm><fnm>RM</fnm></au><au><snm>Learned</snm><fnm>K</fnm></au><au><snm>Li</snm><fnm>CH</fnm></au><au><snm>Meyer</snm><fnm>LR</fnm></au><au><snm>Pohl</snm><fnm>A</fnm></au><au><snm>Raney</snm><fnm>BJ</fnm></au><au><snm>Rosenbloom</snm><fnm>KR</fnm></au><au><snm>Smith</snm><fnm>KE</fnm></au><au><snm>Haussler</snm><fnm>D</fnm></au><au><snm>Kent</snm><fnm>WJ</fnm></au></aug><source>Nucleic Acids 
Res</source><pubdate>2011</pubdate><volume>39</volume><fpage>D876</fpage><lpage>D882</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gkq963</pubid><pubid idtype="pmcid">3242726</pubid><pubid idtype="pmpid" link="fulltext">20959295</pubid></pubidlist></xrefbib></bibl><bibl id="B75"><title><p>NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins</p></title><aug><au><snm>Pruitt</snm><fnm>KD</fnm></au><au><snm>Tatusova</snm><fnm>T</fnm></au><au><snm>Maglott</snm><fnm>DR</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2007</pubdate><volume>35</volume><fpage>D61</fpage><lpage>D65</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gkl842</pubid><pubid idtype="pmcid">1716718</pubid><pubid idtype="pmpid" link="fulltext">17130148</pubid></pubidlist></xrefbib></bibl><bibl id="B76"><title><p>Gene and alternative splicing annotation with AIR</p></title><aug><au><snm>Florea</snm><fnm>L</fnm></au><au><snm>Di</snm><fnm>F</fnm></au><au><snm>Miller</snm><fnm>J</fnm></au><au><snm>Turner</snm><fnm>R</fnm></au><au><snm>Yao</snm><fnm>A</fnm></au><au><snm>Harris</snm><fnm>M</fnm></au><au><snm>Walenz</snm><fnm>B</fnm></au><au><snm>Mobarry</snm><fnm>C</fnm></au><au><snm>Merkulov</snm><fnm>GV</fnm></au><au><snm>Charlab</snm><fnm>R</fnm></au><au><snm>Dew</snm><fnm>I</fnm></au><au><snm>Deng</snm><fnm>Z</fnm></au><au><snm>Istrail</snm><fnm>S</fnm></au><au><snm>Li</snm><fnm>P</fnm></au><au><snm>Sutton</snm><fnm>G</fnm></au></aug><source>Genome Res</source><pubdate>2005</pubdate><volume>15</volume><fpage>54</fpage><lpage>66</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gr.2889405</pubid><pubid idtype="pmcid">540277</pubid><pubid idtype="pmpid" link="fulltext">15632090</pubid></pubidlist></xrefbib></bibl><bibl id="B77"><title><p>Malin: maximum likelihood analysis of intron evolution in 
eukaryotes</p></title><aug><au><snm>Csuros</snm><fnm>M</fnm></au></aug><source>Bioinformatics</source><pubdate>2008</pubdate><volume>24</volume><fpage>1538</fpage><lpage>1539</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/bioinformatics/btn226</pubid><pubid idtype="pmcid">2718671</pubid><pubid idtype="pmpid" link="fulltext">18474506</pubid></pubidlist></xrefbib></bibl><bibl id="B78"><title><p>Evolution of genes and genomes on the Drosophila phylogeny</p></title><aug><au><snm>Clark</snm><fnm>AG</fnm></au><au><snm>Eisen</snm><fnm>MB</fnm></au><au><snm>Smith</snm><fnm>DR</fnm></au><au><snm>Bergman</snm><fnm>CM</fnm></au><au><snm>Oliver</snm><fnm>B</fnm></au><au><snm>Markow</snm><fnm>TA</fnm></au><au><snm>Kaufman</snm><fnm>TC</fnm></au><au><snm>Kellis</snm><fnm>M</fnm></au><au><snm>Gelbart</snm><fnm>W</fnm></au><au><snm>Iyer</snm><fnm>VN</fnm></au><au><snm>Pollard</snm><fnm>DA</fnm></au><au><snm>Sackton</snm><fnm>TB</fnm></au><au><snm>Larracuente</snm><fnm>AM</fnm></au><au><snm>Singh</snm><fnm>ND</fnm></au><au><snm>Abad</snm><fnm>JP</fnm></au><au><snm>Abt</snm><fnm>DN</fnm></au><au><snm>Adryan</snm><fnm>B</fnm></au><au><snm>Aguade</snm><fnm>M</fnm></au><au><snm>Akashi</snm><fnm>H</fnm></au><au><snm>Anderson</snm><fnm>WW</fnm></au><au><snm>Aquadro</snm><fnm>CF</fnm></au><au><snm>Ardell</snm><fnm>DH</fnm></au><au><snm>Arguello</snm><fnm>R</fnm></au><au><snm>Artieri</snm><fnm>CG</fnm></au><au><snm>Barbash</snm><fnm>DA</fnm></au><au><snm>Barker</snm><fnm>D</fnm></au><au><snm>Barsanti</snm><fnm>P</fnm></au><au><snm>Batterham</snm><fnm>P</fnm></au><au><snm>Batzoglou</snm><fnm>S</fnm></au><au><snm>Begun</snm><fnm>D</fnm></au><au><snm>Bhutkar</snm><fnm>A</fnm></au><au><snm>Blanco</snm><fnm>E</fnm></au><au><snm>Bosak</snm><fnm>SA</fnm></au><au><snm>Bradley</snm><fnm>RK</fnm></au><au><snm>Brand</snm><fnm>AD</fnm></au><au><snm>Brent</snm><fnm>MR</fnm></au><au><snm>Brooks</snm><fnm>AN</fnm></au><au><snm>Brown</snm><fnm>RH</fnm></au><au><snm>Butlin</snm><fnm>RK</fnm></au><au
><snm>Caggese</snm><fnm>C</fnm></au><au><snm>Calvi</snm><fnm>BR</fnm></au><au><snm>Bernardo de</snm><fnm>CA</fnm></au><au><snm>Caspi</snm><fnm>A</fnm></au><au><snm>Castrezana</snm><fnm>S</fnm></au><au><snm>Celniker</snm><fnm>SE</fnm></au><au><snm>Chang</snm><fnm>JL</fnm></au><au><snm>Chapple</snm><fnm>C</fnm></au><au><snm>Chatterji</snm><fnm>S</fnm></au><au><snm>Chinwalla</snm><fnm>A</fnm></au><au><snm>Civetta</snm><fnm>A</fnm></au><au><snm>Clifton</snm><fnm>SW</fnm></au><au><snm>Comeron</snm><fnm>JM</fnm></au><au><snm>Costello</snm><fnm>JC</fnm></au><au><snm>Coyne</snm><fnm>JA</fnm></au><au><snm>Daub</snm><fnm>J</fnm></au><au><snm>David</snm><fnm>RG</fnm></au><au><snm>Delcher</snm><fnm>AL</fnm></au><au><snm>Delehaunty</snm><fnm>K</fnm></au><au><snm>Do</snm><fnm>CB</fnm></au><au><snm>Ebling</snm><fnm>H</fnm></au><au><snm>Edwards</snm><fnm>K</fnm></au><au><snm>Eickbush</snm><fnm>T</fnm></au><au><snm>Evans</snm><fnm>JD</fnm></au><au><snm>Filipski</snm><fnm>A</fnm></au><au><snm>Findeiss</snm><fnm>S</fnm></au><au><snm>Freyhult</snm><fnm>E</fnm></au><au><snm>Fulton</snm><fnm>L</fnm></au><au><snm>Fulton</snm><fnm>R</fnm></au><au><snm>Garcia</snm><fnm>AC</fnm></au><au><snm>Gardiner</snm><fnm>A</fnm></au><au><snm>Garfield</snm><fnm>DA</fnm></au><au><snm>Garvin</snm><fnm>BE</fnm></au><au><snm>Gibson</snm><fnm>G</fnm></au><au><snm>Gilbert</snm><fnm>D</fnm></au><au><snm>Gnerre</snm><fnm>S</fnm></au><au><snm>Godfrey</snm><fnm>J</fnm></au><au><snm>Good</snm><fnm>R</fnm></au><au><snm>Gotea</snm><fnm>V</fnm></au><au><snm>Gravely</snm><fnm>B</fnm></au><au><snm>Greenberg</snm><fnm>AJ</fnm></au><au><snm>Griffiths-Jones</snm><fnm>S</fnm></au><au><snm>Gross</snm><fnm>S</fnm></au><au><snm>Guigo</snm><fnm>R</fnm></au><au><snm>Gustafson</snm><fnm>EA</fnm></au><au><snm>Haerty</snm><fnm>W</fnm></au><au><snm>Hahn</snm><fnm>MW</fnm></au><au><snm>Halligan</snm><fnm>DL</fnm></au><au><snm>Halpern</snm><fnm>AL</fnm></au><au><snm>Halter</snm><fnm>GM</fnm></au><au><snm>Han</snm><fnm>MV</fnm></au>
<au><snm>Heger</snm><fnm>A</fnm></au><au><snm>Hillier</snm><fnm>L</fnm></au><au><snm>Hinrichs</snm><fnm>AS</fnm></au><au><snm>Holmes</snm><fnm>I</fnm></au><au><snm>Hoskins</snm><fnm>RA</fnm></au><au><snm>Hubisz</snm><fnm>MJ</fnm></au><au><snm>Hultmark</snm><fnm>D</fnm></au><au><snm>Huntley</snm><fnm>MA</fnm></au><au><snm>Jaffe</snm><fnm>DB</fnm></au><au><snm>Jagadeeshan</snm><fnm>S</fnm></au><au><snm>Jeck</snm><fnm>WR</fnm></au><au><snm>Johnson</snm><fnm>J</fnm></au><au><snm>Jones</snm><fnm>CD</fnm></au><au><snm>Jordan</snm><fnm>WC</fnm></au><au><snm>Karpen</snm><fnm>GH</fnm></au><au><snm>Kataoka</snm><fnm>E</fnm></au><au><snm>Keightley</snm><fnm>PD</fnm></au><au><snm>Kheradpour</snm><fnm>P</fnm></au><au><snm>Kirkness</snm><fnm>EF</fnm></au><au><snm>Koerich</snm><fnm>LB</fnm></au><au><snm>Kristiansen</snm><fnm>K</fnm></au><au><snm>Kudrna</snm><fnm>D</fnm></au><au><snm>Kulathinal</snm><fnm>RJ</fnm></au><au><snm>Kumar</snm><fnm>S</fnm></au><au><snm>Kwok</snm><fnm>R</fnm></au><au><snm>Lander</snm><fnm>E</fnm></au><au><snm>Langley</snm><fnm>CH</fnm></au><au><snm>Lapoint</snm><fnm>R</fnm></au><au><snm>Lazzaro</snm><fnm>BP</fnm></au><au><snm>Lee</snm><fnm>SJ</fnm></au><au><snm>Levesque</snm><fnm>L</fnm></au><au><snm>Li</snm><fnm>R</fnm></au><au><snm>Lin</snm><fnm>CF</fnm></au><au><snm>Lin</snm><fnm>MF</fnm></au><au><snm>Lindblad-Toh</snm><fnm>K</fnm></au><au><snm>Llopart</snm><fnm>A</fnm></au><au><snm>Long</snm><fnm>M</fnm></au><au><snm>Low</snm><fnm>L</fnm></au><au><snm>Lozovsky</snm><fnm>E</fnm></au><au><snm>Lu</snm><fnm>J</fnm></au><au><snm>Luo</snm><fnm>M</fnm></au><au><snm>Machado</snm><fnm>CA</fnm></au><au><snm>Makalowski</snm><fnm>W</fnm></au><au><snm>Marzo</snm><fnm>M</fnm></au><au><snm>Matsuda</snm><fnm>M</fnm></au><au><snm>Matzkin</snm><fnm>L</fnm></au><au><snm>McAllister</snm><fnm>B</fnm></au><au><snm>McBride</snm><fnm>CS</fnm></au><au><snm>McKernan</snm><fnm>B</fnm></au><au><snm>McKernan</snm><fnm>K</fnm></au><au><snm>Mendez-Lago</snm><fnm>M</fnm></au>
<au><snm>Minx</snm><fnm>P</fnm></au><au><snm>Mollenhauer</snm><fnm>MU</fnm></au><au><snm>Montooth</snm><fnm>K</fnm></au><au><snm>Mount</snm><fnm>SM</fnm></au><au><snm>Mu</snm><fnm>X</fnm></au><au><snm>Myers</snm><fnm>E</fnm></au><au><snm>Negre</snm><fnm>B</fnm></au><au><snm>Newfeld</snm><fnm>S</fnm></au><au><snm>Nielsen</snm><fnm>R</fnm></au><au><snm>Noor</snm><fnm>MA</fnm></au><au><snm>O&apos;Grady</snm><fnm>P</fnm></au><au><snm>Pachter</snm><fnm>L</fnm></au><au><snm>Papaceit</snm><fnm>M</fnm></au><au><snm>Parisi</snm><fnm>MJ</fnm></au><au><snm>Parisi</snm><fnm>M</fnm></au><au><snm>Parts</snm><fnm>L</fnm></au><au><snm>Pedersen</snm><fnm>JS</fnm></au><au><snm>Pesole</snm><fnm>G</fnm></au><au><snm>Phillippy</snm><fnm>AM</fnm></au><au><snm>Ponting</snm><fnm>CP</fnm></au><au><snm>Pop</snm><fnm>M</fnm></au><au><snm>Porcelli</snm><fnm>D</fnm></au><au><snm>Powell</snm><fnm>JR</fnm></au><au><snm>Prohaska</snm><fnm>S</fnm></au><au><snm>Pruitt</snm><fnm>K</fnm></au><au><snm>Puig</snm><fnm>M</fnm></au><au><snm>Quesneville</snm><fnm>H</fnm></au><au><snm>Ram</snm><fnm>KR</fnm></au><au><snm>Rand</snm><fnm>D</fnm></au><au><snm>Rasmussen</snm><fnm>MD</fnm></au><au><snm>Reed</snm><fnm>LK</fnm></au><au><snm>Reenan</snm><fnm>R</fnm></au><au><snm>Reily</snm><fnm>A</fnm></au><au><snm>Remington</snm><fnm>KA</fnm></au><au><snm>Rieger</snm><fnm>TT</fnm></au><au><snm>Ritchie</snm><fnm>MG</fnm></au><au><snm>Robin</snm><fnm>C</fnm></au><au><snm>Rogers</snm><fnm>YH</fnm></au><au><snm>Rohde</snm><fnm>C</fnm></au><au><snm>Rozas</snm><fnm>J</fnm></au><au><snm>Rubenfield</snm><fnm>MJ</fnm></au><au><snm>Ruiz</snm><fnm>A</fnm></au><au><snm>Russo</snm><fnm>S</fnm></au><au><snm>Salzberg</snm><fnm>SL</fnm></au><au><snm>Sanchez-Gracia</snm><fnm>A</fnm></au><au><snm>Saranga</snm><fnm>DJ</fnm></au><au><snm>Sato</snm><fnm>H</fnm></au><au><snm>Schaeffer</snm><fnm>SW</fnm></au><au><snm>Schatz</snm><fnm>MC</fnm></au><au><snm>Schlenke</snm><fnm>T</fnm></au><au><snm>Schwartz</snm><fnm>R</fnm></au>
<au><snm>Segarra</snm><fnm>C</fnm></au><au><snm>Singh</snm><fnm>RS</fnm></au><au><snm>Sirot</snm><fnm>L</fnm></au><au><snm>Sirota</snm><fnm>M</fnm></au><au><snm>Sisneros</snm><fnm>NB</fnm></au><au><snm>Smith</snm><fnm>CD</fnm></au><au><snm>Smith</snm><fnm>TF</fnm></au><au><snm>Spieth</snm><fnm>J</fnm></au><au><snm>Stage</snm><fnm>DE</fnm></au><au><snm>Stark</snm><fnm>A</fnm></au><au><snm>Stephan</snm><fnm>W</fnm></au><au><snm>Strausberg</snm><fnm>RL</fnm></au><au><snm>Strempel</snm><fnm>S</fnm></au><au><snm>Sturgill</snm><fnm>D</fnm></au><au><snm>Sutton</snm><fnm>G</fnm></au><au><snm>Sutton</snm><fnm>GG</fnm></au><au><snm>Tao</snm><fnm>W</fnm></au><au><snm>Teichmann</snm><fnm>S</fnm></au><au><snm>Tobari</snm><fnm>YN</fnm></au><au><snm>Tomimura</snm><fnm>Y</fnm></au><au><snm>Tsolas</snm><fnm>JM</fnm></au><au><snm>Valente</snm><fnm>VL</fnm></au><au><snm>Venter</snm><fnm>E</fnm></au><au><snm>Venter</snm><fnm>JC</fnm></au><au><snm>Vicario</snm><fnm>S</fnm></au><au><snm>Vieira</snm><fnm>FG</fnm></au><au><snm>Vilella</snm><fnm>AJ</fnm></au><au><snm>Villasante</snm><fnm>A</fnm></au><au><snm>Walenz</snm><fnm>B</fnm></au><au><snm>Wang</snm><fnm>J</fnm></au><au><snm>Wasserman</snm><fnm>M</fnm></au><au><snm>Watts</snm><fnm>T</fnm></au><au><snm>Wilson</snm><fnm>D</fnm></au><au><snm>Wilson</snm><fnm>RK</fnm></au><au><snm>Wing</snm><fnm>RA</fnm></au><au><snm>Wolfner</snm><fnm>MF</fnm></au><au><snm>Wong</snm><fnm>A</fnm></au><au><snm>Wong</snm><fnm>GK</fnm></au><au><snm>Wu</snm><fnm>CI</fnm></au><au><snm>Wu</snm><fnm>G</fnm></au><au><snm>Yamamoto</snm><fnm>D</fnm></au><au><snm>Yang</snm><fnm>HP</fnm></au><au><snm>Yang</snm><fnm>SP</fnm></au><au><snm>Yorke</snm><fnm>JA</fnm></au><au><snm>Yoshida</snm><fnm>K</fnm></au><au><snm>Zdobnov</snm><fnm>E</fnm></au><au><snm>Zhang</snm><fnm>P</fnm></au><au><snm>Zhang</snm><fnm>Y</fnm></au><au><snm>Zimin</snm><fnm>AV</fnm></au><au><snm>Baldwin</snm><fnm>J</fnm></au><au><snm>Abdouelleil</snm><fnm>A</fnm></au><au><snm>Abdulkadir</snm><fnm>J</fnm></au>
<au><snm>Abebe</snm><fnm>A</fnm></au><au><snm>Abera</snm><fnm>B</fnm></au><au><snm>Abreu</snm><fnm>J</fnm></au><au><snm>Acer</snm><fnm>SC</fnm></au><au><snm>Aftuck</snm><fnm>L</fnm></au><au><snm>Alexander</snm><fnm>A</fnm></au><au><snm>An</snm><fnm>P</fnm></au><au><snm>Anderson</snm><fnm>E</fnm></au><au><snm>Anderson</snm><fnm>S</fnm></au><au><snm>Arachi</snm><fnm>H</fnm></au><au><snm>Azer</snm><fnm>M</fnm></au></aug><source>Nature</source><pubdate>2007</pubdate><volume>450</volume><fpage>203</fpage><lpage>218</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nature06341</pubid><pubid idtype="pmpid" link="fulltext">17994087</pubid></pubidlist></xrefbib></bibl></refgrp>
</bm>
</art>