<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2164-9-13</ui>
   <ji>1471-2164</ji>
   <fm>
      <dochead>Research article</dochead>
      <bibl>
         <title>
            <p>Computational analysis of splicing errors and mutations in human transcripts</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Kurmangaliyev</snm>
               <mi>Z</mi>
               <fnm>Yerbol</fnm>
               <insr iid="I1"/>
               <email>kurmangali@mail.ru</email>
            </au>
            <au id="A2" ca="yes">
               <snm>Gelfand</snm>
               <mi>S</mi>
               <fnm>Mikhail</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
               <email>gelfand@iitp.ru</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Institute for Information Transmission Problems (the Kharkevich Institute) RAS, Bolshoi Karetny pereulok 19, Moscow, 127994, Russia</p>
            </ins>
            <ins id="I2">
               <p>Faculty of Bioengineering and Bioinformatics, Moscow State University, Vorobievy Gory 1-73, Moscow 119992, Russia</p>
            </ins>
         </insg>
         <source>BMC Genomics</source>
         <issn>1471-2164</issn>
         <pubdate>2008</pubdate>
         <volume>9</volume>
         <issue>1</issue>
         <fpage>13</fpage>
         <url>http://www.biomedcentral.com/1471-2164/9/13</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">18194514</pubid>
               <pubid idtype="doi">10.1186/1471-2164-9-13</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>20</day>
               <month>4</month>
               <year>2007</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>14</day>
               <month>1</month>
               <year>2008</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>14</day>
               <month>1</month>
               <year>2008</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2008</year>
         <collab>Kurmangaliyev and Gelfand; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Most retained introns found in human cDNAs generated by high-throughput sequencing projects seem to result from underspliced transcripts, and thus they capture intermediate steps of pre-mRNA splicing. On the other hand, mutations in splice sites cause skipping of the respective exon or activation of pre-existing cryptic sites. Both types of events reflect properties of the splicing mechanism.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>The retained introns were significantly shorter than constitutive ones, and skipped exons were shorter than exons with cryptic sites. Both donor and acceptor splice sites of retained introns were weaker than splice sites of constitutive introns. The authentic acceptor sites affected by mutations were significantly weaker in exons with activated cryptic sites than in skipped exons. The distance from a mutated splice site to the nearest equivalent site was significantly shorter in cases of activated cryptic sites compared to exon skipping events. The prevalence of retained introns within genes monotonically increased in the 5'-to-3' direction (more retained introns close to the 3'-end), consistent with the model of co-transcriptional splicing. The density of exonic splicing enhancers was higher, and the density of exonic splicing silencers lower, in retained introns compared to constitutive ones and in exons with cryptic sites compared to skipped exons.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>Thus the analysis of retained introns in human cDNA, exons skipped due to mutations in splice sites and exons with cryptic sites produced results consistent with the intron definition mechanism of splicing of short introns, co-transcriptional splicing, dependence of splicing efficiency on the splice site strength and the density of candidate exonic splicing enhancers and silencers. These results are consistent with other, recently published analyses.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Vertebrate genes consist of relatively short exons separated by considerably larger introns. The introns of lower eukaryotes, invertebrates and plants are much shorter. This difference may be explained by the preference for one of two possible mechanisms for recognition of the exon-intron boundaries by the splicing machinery. In the case of long introns, the exon definition mechanism initially recognizes pairs of splice sites corresponding to one exon. Conversely, short introns are recognized by the intron definition mechanism, which pairs splice sites across introns <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. Historically, the intron definition mechanism seems to be the ancestral one, whereas exon definition is likely a relatively recent innovation that, in particular, created the possibility of regulated alternative splicing <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>.</p>
         <p>These models predict different consequences of mutations that destroy splice sites. Errors in exon definition should lead to exon skipping or, if there are strong cryptic sites, the use of the latter, whereas errors in intron definition should cause intron retention. Indeed, exactly this behavior was observed in in vivo and in vitro experiments (reviewed by <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>), and in early analyses of disease-causing mutations of human genes <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp>. These predictions also agree with the distribution of alternative splicing types in different organisms. In vertebrates, where long introns are frequent, the prevalent type of alternative splicing is exon skipping <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr></abbrgrp>, while in plants, where the majority of introns are short, the most frequent type is intron retention <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B7">7</abbr></abbrgrp>.</p>
         <p>Intron retention is the least studied type of alternative and aberrant splicing. In contrast with other types of alternative splicing, which involve the choice between different splice sites, intron retention represents complete absence of splicing. Some specific features of retained introns have become clear in recent studies of human <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr></abbrgrp> and plant transcriptomes <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. Retained introns were found to differ from other introns in GC content, which was lower than in exons but higher than in constitutively spliced out introns. Retained introns were shown to be shorter on average than constitutively spliced out ones and exhibited a tendency to occur in 5'- and 3'-untranslated regions <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr></abbrgrp>; they also have weaker splice sites <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>.</p>
         <p>In several cases intron retention clearly has a function. A considerable fraction of retained introns encodes identifiable protein domains or parts thereof <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B11">11</abbr></abbrgrp>. In some cases intron retention produces different functional isoforms (EBNA-3 family antigens of the Epstein-Barr virus <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>); isoforms with aberrant function (cancer-specific form of the cholecystokinin 2 receptor <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>); truncated proteins that may be involved in regulation (cold-dependent lipid metabolism in plants <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>, nuclear transport of retroviruses <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>, autoregulation of splicing <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>); non-functional proteins (P-element of Drosophila <abbrgrp><abbr bid="B17">17</abbr></abbrgrp> or rat cytochrome P450 <it>CYP2C11 </it>in stressed liver <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>); proteins with unknown function (serine protease kallikrein <abbrgrp><abbr bid="B19">19</abbr><abbr bid="B20">20</abbr></abbrgrp>); or, finally, isoforms with no known functional differences between the variants (hormone urocortin 1 prepropeptide <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>, cyclooxygenase <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>, D1 dopamine receptor (DR1) interacting protein calcyon <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>, mouse homeodomain transcription factor <it>Tgif2 </it><abbrgrp><abbr bid="B24">24</abbr></abbrgrp>). Moreover, intron retention may be conserved in vertebrates, e.g. intron 3 of the splicing regulator of the SR family <it>9G8 </it><abbrgrp><abbr bid="B16">16</abbr></abbrgrp>, or species-specific, e.g. intron 2 of <it>Tgif2</it>, present in the mouse gene, but not in its human ortholog <it>Tgif2 </it><abbrgrp><abbr bid="B24">24</abbr></abbrgrp>.</p>
         <p>However, it is likely that many cases of observed intron retention were caused by errors of the splicing machinery. Retained introns are the least conserved type of elementary alternatives <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. Moreover, large-scale projects that aim at sequencing of full-length cDNA use normalization procedures to enrich low-copy transcripts, and these procedures seem to increase the fraction of underspliced transcripts that retain one or several introns <abbrgrp><abbr bid="B26">26</abbr><abbr bid="B27">27</abbr></abbrgrp>. Traditionally such artifacts in cDNA databases were treated as a nuisance and filtered out in attempts to create "clean" sets of alternative isoforms. We tried to look at introns retained in human cDNA data from another angle, assuming that they capture intermediate states of the splicing process and thus provide a glimpse of the splicing mechanisms.</p>
         <p>Another way to look at this mechanism is to analyze consequences of mutations in splice sites. This has also been the subject of several very recent studies. Such mutations have two major possible outcomes: exon skipping and activation of cryptic sites, whereas intron retention is relatively rare <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B28">28</abbr><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr></abbrgrp>. One of the important determinants of the cryptic donor splice site phenotype is the presence of a strong candidate donor splice site in the vicinity of the mutated site <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B31">31</abbr></abbrgrp>. Cryptic acceptor splice sites are more frequent in exons than in introns, likely due to depletion of AG dinucleotides upstream of the original acceptor sites <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. There are differences in the distribution of candidate exonic enhancers and silencers between skipped exons and exons with activated cryptic sites <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>.</p>
         <p>Here we systematically studied aberrant and mutated splicing. Specifically, we compared lengths of affected and adjacent introns and exons, as well as strengths of splice sites and distribution of predicted splicing enhancers and silencers in these and adjacent exons and introns. While confirming many earlier predictions, our study also provides a number of new observations that are largely consistent with existing models of the splicing mechanisms.</p>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <sec>
            <st>
               <p>Comparison of retained and constitutive introns</p>
            </st>
            <p>Sets of retained (Fig. <figr fid="F1">1</figr>) and constitutive (constitutively spliced out) introns were constructed as described in Data and Methods and compared with the aim of identifying possible determinants of intron retention. We considered the distribution of intron lengths and of lengths of the flanking exons, scores of intron splice sites and the distal sites in the flanking exons (the acceptor site of the upstream exon and the donor site of the downstream exon), densities of exonic cis-acting elements, and intron positions within the gene. The results are summarized in Table <tblr tid="T1">1</tblr>.</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Definition of scored intron retention events</p>
               </caption>
               <text>
                  <p><b>Definition of scored intron retention events</b>. Gray rectangles represent exons of the RefSeq gene and mRNA. Exon/intron boundaries are marked by dotted lines.</p>
               </text>
               <graphic file="1471-2164-9-13-1"/>
            </fig>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Properties of retained and constitutive introns. For all intron parameters the medians are reported. The last two columns report the statistical significance of the differences of the distributions by the Kolmogorov-Smirnov test (KS) and Student's t-test (ST); n/s &#8211; non significant.</p>
               </caption>
               <tblbdy cols="5">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="2" ca="center">
                        <p>
                           <b>introns</b>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="right">
                        <p>
                           <b>Retained</b>
                        </p>
                     </c>
                     <c ca="right">
                        <p>
                           <b>Constitutively spliced</b>
                        </p>
                     </c>
                     <c ca="right">
                        <p>
                           <b>KS</b>
                        </p>
                     </c>
                     <c ca="right">
                        <p>
                           <b>ST</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>Set size</b>
                        </p>
                     </c>
                     <c ca="right">
                        <p>1197</p>
                     </c>
                     <c ca="right">
                        <p>137580</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>Intron length (nucleotides)</b>
                        </p>
                     </c>
                     <c ca="right">
                        <p>337</p>
                     </c>
                     <c ca="right">
                        <p>1481</p>
                     </c>
                     <c ca="right">
                        <p>&lt;10<sup>-15</sup></p>
                     </c>
                     <c ca="right">
                        <p>&lt;10<sup>-15</sup></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5" ca="left">
                        <p>
                           <b>Splice site scores</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Acceptor site of the 5'-exon</p>
                     </c>
                     <c ca="right">
                        <p>18.60</p>
                     </c>
                     <c ca="right">
                        <p>19.09</p>
                     </c>
                     <c ca="right">
                        <p>&lt;10<sup>-15</sup></p>
                     </c>
                     <c ca="right">
                        <p>&lt;10<sup>-11</sup></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Donor site</p>
                     </c>
                     <c ca="right">
                        <p>18.17</p>
                     </c>
                     <c ca="right">
                        <p>18.80</p>
                     </c>
                     <c ca="right">
                        <p>&lt;10<sup>-15</sup></p>
                     </c>
                     <c ca="right">
                        <p>&lt;10<sup>-15</sup></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Acceptor site</p>
                     </c>
                     <c ca="right">
                        <p>18.03</p>
                     </c>
                     <c ca="right">
                        <p>19.06</p>
                     </c>
                     <c ca="right">
                        <p>&lt;10<sup>-15</sup></p>
                     </c>
                     <c ca="right">
                        <p>&lt;10<sup>-15</sup></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Donor site of 3'-exon</p>
                     </c>
                     <c ca="right">
                        <p>18.74</p>
                     </c>
                     <c ca="right">
                        <p>18.79</p>
                     </c>
                     <c ca="right">
                        <p>n/s</p>
                     </c>
                     <c ca="right">
                        <p>n/s</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5" ca="left">
                        <p>
                           <b>Cis-acting elements (candidate sites per nucleotide)</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ESEfinder: SC35</p>
                     </c>
                     <c ca="right">
                        <p>0.046</p>
                     </c>
                     <c ca="right">
                        <p>0.034</p>
                     </c>
                     <c ca="right">
                        <p>&lt;10<sup>-15</sup></p>
                     </c>
                     <c ca="right">
                        <p>&lt;10<sup>-15</sup></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ESEfinder: SF2/ASF</p>
                     </c>
                     <c ca="right">
                        <p>0.040</p>
                     </c>
                     <c ca="right">
                        <p>0.028</p>
                     </c>
                     <c ca="right">
                        <p>&lt;10<sup>-15</sup></p>
                     </c>
                     <c ca="right">
                        <p>&lt;10<sup>-15</sup></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ESEfinder: SRp40</p>
                     </c>
                     <c ca="right">
                        <p>0.041</p>
                     </c>
                     <c ca="right">
                        <p>0.038</p>
                     </c>
                     <c ca="right">
                        <p>&lt;10<sup>-15</sup></p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>ESEfinder: SRp55</p>
                     </c>
                     <c ca="right">
                        <p>0.022</p>
                     </c>
                     <c ca="right">
                        <p>0.022</p>
                     </c>
                     <c ca="right">
                        <p>&lt;10<sup>-15</sup></p>
                     </c>
                     <c ca="right">
                        <p>n/s</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>RESCUE-ESE</p>
                     </c>
                     <c ca="right">
                        <p>0.050</p>
                     </c>
                     <c ca="right">
                        <p>0.068</p>
                     </c>
                     <c ca="right">
                        <p>&lt;10<sup>-15</sup></p>
                     </c>
                     <c ca="right">
                        <p>&lt;10<sup>-15</sup></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>PESE</p>
                     </c>
                     <c ca="right">
                        <p>0.043</p>
                     </c>
                     <c ca="right">
                        <p>0.035</p>
                     </c>
                     <c ca="right">
                        <p>&lt;10<sup>-15</sup></p>
                     </c>
                     <c ca="right">
                        <p>&lt;10<sup>-15</sup></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>PESS</p>
                     </c>
                     <c ca="right">
                        <p>0.013</p>
                     </c>
                     <c ca="right">
                        <p>0.048</p>
                     </c>
                     <c ca="right">
                        <p>&lt;10<sup>-15</sup></p>
                     </c>
                     <c ca="right">
                        <p>&lt;10<sup>-15</sup></p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5" ca="left">
                        <p>
                           <b>Relative position</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>by ordinal number</p>
                     </c>
                     <c ca="right">
                        <p>0.6</p>
                     </c>
                     <c ca="right">
                        <p>0.5</p>
                     </c>
                     <c ca="right">
                        <p>&lt;10<sup>-15</sup>*</p>
                     </c>
                     <c ca="right">
                        <p>&lt;10<sup>-15</sup></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>by gene</p>
                     </c>
                     <c ca="right">
                        <p>0.671</p>
                     </c>
                     <c ca="right">
                        <p>0.597</p>
                     </c>
                     <c ca="right">
                        <p>&lt;10<sup>-9</sup></p>
                     </c>
                     <c ca="right">
                        <p>&lt;10<sup>-15</sup></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>by mRNA</p>
                     </c>
                     <c ca="right">
                        <p>0.446</p>
                     </c>
                     <c ca="right">
                        <p>0.354</p>
                     </c>
                     <c ca="right">
                        <p>&lt;10<sup>-15</sup></p>
                     </c>
                     <c ca="right">
                        <p>&lt;10<sup>-15</sup></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>by mRNA w/o last exon</p>
                     </c>
                     <c ca="right">
                        <p>0.688</p>
                     </c>
                     <c ca="right">
                        <p>0.575</p>
                     </c>
                     <c ca="right">
                        <p>&lt;10<sup>-15</sup></p>
                     </c>
                     <c ca="right">
                        <p>&lt;10<sup>-15</sup></p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>* Chi-square test</p>
               </tblfn>
            </tbl>
            <p>The distributions of the intron lengths of retained and constitutive introns were significantly different (Fig. <figr fid="F2">2</figr>, two-sample Kolmogorov-Smirnov test P &lt; 10<sup>-15</sup>). The retained introns tended to be shorter than constitutively spliced out ones: 84% of the retained introns were shorter than 1000 nucleotides, compared to only 40% of the constitutive introns. The median size of the retained introns was 337 nucleotides, whereas the median size of the constitutive introns was 1481 nucleotides. No significant differences between the distributions of flanking exon lengths were observed (data not shown).</p>
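            <p>The two-sample Kolmogorov-Smirnov comparison used here can be illustrated with a minimal Python sketch. It is not the authors' code: the function name <it>ks_two_sample</it> and the intron lengths are hypothetical, and the P-value uses the asymptotic Kolmogorov series, which is only a rough approximation for small samples. In practice a library routine such as scipy.stats.ks_2samp would normally be preferred.</p>

```python
import bisect
import math
from typing import Sequence, Tuple


def ks_two_sample(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return (D, asymptotic p-value) for the two-sample Kolmogorov-Smirnov test."""
    xs, ys = sorted(a), sorted(b)
    n, m = len(xs), len(ys)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")

    # D is the largest absolute gap between the two empirical CDFs,
    # evaluated at every observed value.
    d = 0.0
    for value in xs + ys:
        cdf_a = bisect.bisect_right(xs, value) / n
        cdf_b = bisect.bisect_right(ys, value) / m
        d = max(d, abs(cdf_a - cdf_b))
    if d == 0.0:
        return 0.0, 1.0

    # Asymptotic Kolmogorov distribution (assumption: large-sample approximation).
    lam = d * math.sqrt(n * m / (n + m))
    series = sum(
        (-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
        for j in range(1, 101)
    )
    p_value = min(1.0, max(0.0, 2.0 * series))
    return d, p_value


# Hypothetical intron lengths in nucleotides, for illustration only.
retained = [120, 250, 337, 410, 900]
constitutive = [800, 1481, 2300, 5200, 12000]
d_stat, p = ks_two_sample(retained, constitutive)
print(f"D = {d_stat:.2f}, p = {p:.3f}")
```

            <p>The statistic depends only on the ordering of the pooled values, so the test is insensitive to the heavy right tail of the intron length distribution.</p>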
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Histograms of intron lengths</p>
               </caption>
               <text>
                  <p><b>Histograms of intron lengths</b>. Red: retained introns; blue: constitutive introns.</p>
               </text>
               <graphic file="1471-2164-9-13-2"/>
            </fig>
            <p>Scores of the intron splice sites and splice sites of the flanking exons for retained and constitutively spliced introns were calculated using a positional weight matrix as described in Data and Methods. Splice sites of retained introns were weaker: the distributions of the splice site scores for the retained and constitutive introns were significantly different for both acceptor and donor sites (two-sample Kolmogorov-Smirnov test P &lt; 10<sup>-15</sup>). The median scores for the donor sites of the retained and constitutive introns were 18.2 and 18.8, respectively, whereas for the acceptor sites they were 18.03 and 19.06, respectively.</p>
            <p>The donor site scores of the 3'-flanking (downstream) exons were similar for the retained and constitutive introns, whereas the acceptor sites of the 5'-flanking (upstream) exons were considerably weaker for the retained introns compared to the constitutive ones, with medians 18.6 and 19.1, respectively (two-sample Kolmogorov-Smirnov test P &lt; 10<sup>-10</sup>).</p>
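            <p>Scoring a candidate splice site with a positional weight matrix can be sketched as follows. This is an illustration under stated assumptions, not the matrix or scoring scale used in this study (those are defined in Data and Methods): the helper names <it>build_pwm</it> and <it>score_site</it>, the four toy donor sites, the uniform background and the pseudocount are all hypothetical.</p>

```python
import math
from typing import Dict, List


def build_pwm(sites: List[str], background: float = 0.25,
              pseudocount: float = 1.0) -> List[Dict[str, float]]:
    """Build log2-odds weights per position from aligned, equal-length sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all training sites must have the same length")
    pwm = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        weights = {}
        for base in "ACGT":
            # Pseudocounts keep unseen bases from producing log(0).
            freq = (column.count(base) + pseudocount) / (len(sites) + 4 * pseudocount)
            weights[base] = math.log2(freq / background)
        pwm.append(weights)
    return pwm


def score_site(pwm: List[Dict[str, float]], site: str) -> float:
    """Sum the position-specific weights of the bases in a candidate site."""
    if len(site) != len(pwm):
        raise ValueError("site length must match the matrix length")
    return sum(weights[base] for weights, base in zip(pwm, site))


# Toy donor sites (3 exonic + 6 intronic positions), for illustration only.
training = ["CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA", "TAGGTAAGT"]
pwm = build_pwm(training)
strong = score_site(pwm, "CAGGTAAGT")   # matches the toy consensus
weak = score_site(pwm, "CAGCTAAGT")     # disrupted GT dinucleotide
print(f"strong = {strong:.2f}, weak = {weak:.2f}")
```

            <p>Under this scheme a higher score means a closer match to the consensus, which is the sense in which the splice sites of retained introns are described as weaker.</p>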
            <p>Densities of cis-acting elements of both types of introns were calculated using three available programs, ESEfinder <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>, RESCUE-ESE <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>, and PESX <abbrgrp><abbr bid="B36">36</abbr><abbr bid="B37">37</abbr></abbrgrp>, as described in Data and Methods. The results are described in Table <tblr tid="T1">1</tblr>. The densities of most types of predicted exonic splicing enhancers (ESEs) were higher in the retained introns, whereas the density of exonic splicing silencers (ESSs) was higher in the constitutive introns (Fig. <figr fid="F3">3</figr>, <figr fid="F4">4</figr>).</p>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Histograms of ESE densities predicted by ESEfinder</p>
               </caption>
               <text>
                  <p><b>Histograms of ESE densities predicted by ESEfinder</b>. Red: retained introns; blue: constitutive introns.</p>
               </text>
               <graphic file="1471-2164-9-13-3"/>
            </fig>
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>Histograms of ESE densities predicted by RESCUE-ESE and PESX/PESE and ESS densities predicted by PESX/PESS</p>
               </caption>
               <text>
                  <p><b>Histograms of ESE densities predicted by RESCUE-ESE and PESX/PESE and ESS densities predicted by PESX/PESS</b>. Red: retained introns; blue: constitutive introns.</p>
               </text>
               <graphic file="1471-2164-9-13-4"/>
            </fig>
            <p>Specifically, the average densities of all four ESEfinder motifs were higher in the retained introns (Fig. <figr fid="F3">3</figr>). The maximal difference between the median densities was observed for the SF2/ASF sites (median densities 0.040 and 0.028 for the retained and constitutive introns, respectively), whereas the lowest difference was observed for the SRp55 sites (median densities 0.0217 and 0.0215, non-significant by Student's t-test, Table <tblr tid="T1">1</tblr>). The density of PESE octamers (enhancers) was also higher in the retained introns (Fig. <figr fid="F4">4</figr>), whereas the density of PESS octamers (silencers) was higher in the constitutive introns (Fig. <figr fid="F4">4</figr>). In contrast, the density of ESE hexamers predicted by RESCUE-ESE was significantly higher in the constitutively spliced introns than in the retained ones (Fig. <figr fid="F4">4</figr>). All these differences were statistically significant (two-sample Kolmogorov-Smirnov test P &lt; 10<sup>-15</sup>).</p>
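            <p>The density measure (candidate sites per nucleotide) can be illustrated with a short Python sketch. The function name <it>motif_density</it>, the example sequence and the single hexamer are hypothetical, and counting overlapping occurrences is an assumption; the actual motif sets and thresholds are those of the cited programs.</p>

```python
from typing import Set


def motif_density(sequence: str, motifs: Set[str]) -> float:
    """Number of motif occurrences (overlaps allowed) divided by sequence length."""
    seq = sequence.upper()
    if not seq or not motifs:
        return 0.0
    hits = 0
    # Scan once per distinct motif length (e.g. 6 for hexamers, 8 for octamers).
    for k in {len(motif) for motif in motifs}:
        for start in range(len(seq) - k + 1):
            if seq[start:start + k] in motifs:
                hits += 1
    return hits / len(seq)


# Hypothetical hexamer set and sequence, for illustration only.
toy_hexamers = {"GAAGAA"}
density = motif_density("GAAGAAGAAG", toy_hexamers)
print(f"density = {density:.2f} sites per nucleotide")
```

            <p>Normalizing by sequence length makes the values comparable between the short retained introns and the much longer constitutive ones.</p>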
            <p>The relative position of an intron in a gene was defined as the ratio RP = D/L, where D was the distance from the gene 5'-end to the intron 5'-end (the donor site), and L was the gene length (the distance between the 5'- and 3'-ends, as listed in RefSeq). Since terminal exons and introns may have considerably different lengths (<abbrgrp><abbr bid="B38">38</abbr></abbrgrp>, and data not shown), the distances were calculated in several different settings. Firstly, we used unspliced genes, as annotated in RefSeq; in this case the distances were calculated using the genomic sequence. Secondly, we considered spliced genes: all introns were removed, the studied intron was reduced to a single point (the "intron shadow"), and the distances were calculated using the mRNA sequence. Thirdly, we considered spliced genes with the last exon removed as well. Finally, we defined the relative position of an intron as its ordinal number divided by the total number of introns in the gene.</p>
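            <p>The four definitions above can be illustrated with a minimal Python sketch. It is not the authors' code: the function name, the exon representation (sorted, half-open genomic intervals on the plus strand) and the dictionary keys are assumptions made for illustration only.</p>

```python
def relative_positions(exons, intron_index):
    """Return RP = D/L for one intron under the four settings in the text.

    exons: list of (start, end) half-open genomic intervals, sorted 5'-to-3'
           on the plus strand (a hypothetical representation).
    intron_index: 0-based index; intron i lies between exons[i] and exons[i + 1].
    """
    n_introns = len(exons) - 1
    if intron_index not in range(n_introns):
        raise ValueError("intron_index is outside the gene")

    lengths = [end - start for start, end in exons]
    gene_start, gene_end = exons[0][0], exons[-1][1]

    # Donor site = 3'-end of the exon immediately upstream of the intron.
    donor = exons[intron_index][1]
    # Position of the "intron shadow" on the mRNA: total upstream exon length.
    shadow = sum(lengths[: intron_index + 1])

    return {
        # Ordinal intron number divided by the total number of introns.
        "ordinal": (intron_index + 1) / n_introns,
        # Distances measured on the genomic sequence.
        "unspliced": (donor - gene_start) / (gene_end - gene_start),
        # Distances measured on the mRNA sequence.
        "spliced": shadow / sum(lengths),
        # As above, but the last exon does not count towards the gene length.
        "spliced_no_last_exon": shadow / sum(lengths[:-1]),
    }


if __name__ == "__main__":
    toy_gene = [(0, 100), (300, 400), (1000, 1200)]
    print(relative_positions(toy_gene, 0))
```

            <p>For the toy gene in this sketch, the first intron has an ordinal position of 0.5, a spliced position of 0.25, and a position of 0.5 once the last exon is excluded, which shows how strongly a long terminal exon can shift the spliced-gene value.</p>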
            <p>The constitutive introns (blue bars in Fig. <figr fid="F5">5</figr>) are shifted towards the 3'-end in the unspliced gene calculations (Fig. <figr fid="F5">5b</figr>), and towards the 5'-end in the spliced gene calculations (Fig. <figr fid="F5">5c</figr>). This is consistent with decreasing intron density and increasing exon length in the 5'-to-3' direction <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>. Indeed, when the last (3'-terminal) exon is removed, the distribution becomes almost uniform (Fig. <figr fid="F5">5d</figr>).</p>
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>Histograms of the relative intron positions</p>
               </caption>
               <text>
                  <p><b>Histograms of the relative intron positions</b>. A: the relative (ordinal) intron number; B: unspliced genes; C: spliced genes; D: spliced genes with the last exon removed (see the text for the detailed explanation). Left axis: the fraction of introns in each position bin is given for retained (red) and constitutive (blue) introns separately. Points 0 and 1 on the horizontal axis correspond to the 5'- and 3'-ends of the gene, respectively. Right vertical axis and the orange triangle curve: the fraction of retained introns among all introns in the bin.</p>
               </text>
               <graphic file="1471-2164-9-13-5"/>
            </fig>
            <p>The situation with retained introns is dramatically different (two-sample Kolmogorov-Smirnov test, P &lt; 10<sup>-15</sup> for relative intron positions in spliced genes and in spliced genes with the last exon removed, and P &lt; 10<sup>-9</sup> for unspliced genes; &#967;<sup>2</sup>-test, P &lt; 10<sup>-15</sup> for the ordinal intron number). The distribution of the retained introns (red bars in Fig. <figr fid="F5">5</figr>) is considerably shifted towards the 3'-end in all settings, as compared to the constitutive introns. Accordingly, the fraction of retained introns increases in the 5'-to-3' direction, leveling off at about the middle of the gene (the orange curve in Fig. <figr fid="F5">5</figr>).</p>
         </sec>
         <sec>
            <st>
               <p>Comparison of skipped and cryptic-site exons</p>
            </st>
            <p>The sets of splice-site inactivating mutations were collected as described in Data and Methods. Only mutations directly in the donor and acceptor sites were considered. The exons affected by the mutations were divided into skipped exons (S-exons) and exons utilizing cryptic sites (C-exons). The donor and acceptor site mutations were considered both separately and jointly, to increase the statistical power of the observations. The results are summarized in Table <tblr tid="T2">2</tblr>.</p>
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Properties of skipped exons (S-exons) and exons with cryptic sites (C-exons). For all exon parameters the medians are reported. The last column reports parameters of all internal exons in our dataset of RefSeq genes. MW: the statistical significance of the differences between the S- and C-exons by the Mann-Whitney test; n/s &#8211; not significant.</p>
               </caption>
               <tblbdy cols="5">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="right">
                        <p>
                           <b>S-exons</b>
                        </p>
                     </c>
                     <c ca="right">
                        <p>
                           <b>C-exons</b>
                        </p>
                     </c>
                     <c ca="right">
                        <p>
                           <b>MW</b>
                        </p>
                     </c>
                     <c ca="right">
                        <p>
                           <b>Internal exons</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c cspan="5" ca="left">
                        <p>
                           <b>Set size</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mutated donor sites</p>
                     </c>
                     <c ca="right">
                        <p>67</p>
                     </c>
                     <c ca="right">
                        <p>42</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mutated acceptor sites</p>
                     </c>
                     <c ca="right">
                        <p>42</p>
                     </c>
                     <c ca="right">
                        <p>72</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>All</p>
                     </c>
                     <c ca="right">
                        <p>109</p>
                     </c>
                     <c ca="right">
                        <p>114</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="right">
                        <p>154846</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5" ca="left">
                        <p>
                           <b>Exon length (nucleotides)</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mutated donor sites</p>
                     </c>
                     <c ca="right">
                        <p>114</p>
                     </c>
                     <c ca="right">
                        <p>147</p>
                     </c>
                     <c ca="right">
                        <p>0.024</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mutated acceptor sites</p>
                     </c>
                     <c ca="right">
                        <p>112.5</p>
                     </c>
                     <c ca="right">
                        <p>130</p>
                     </c>
                     <c ca="right">
                        <p>n/s</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>All</p>
                     </c>
                     <c ca="right">
                        <p>114</p>
                     </c>
                     <c ca="right">
                        <p>136</p>
                     </c>
                     <c ca="right">
                        <p>0.020</p>
                     </c>
                     <c ca="right">
                        <p>123</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5" ca="left">
                        <p>
                           <b>Densities of cis-acting elements (candidate sites per nucleotide)</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>ESEfinder: SC35</it>
                           </b>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mutated donor sites</p>
                     </c>
                     <c ca="right">
                        <p>0.043</p>
                     </c>
                     <c ca="right">
                        <p>0.042</p>
                     </c>
                     <c ca="right">
                        <p>n/s</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mutated acceptor sites</p>
                     </c>
                     <c ca="right">
                        <p>0.038</p>
                     </c>
                     <c ca="right">
                        <p>0.045</p>
                     </c>
                     <c ca="right">
                        <p>n/s</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>All</p>
                     </c>
                     <c ca="right">
                        <p>0.042</p>
                     </c>
                     <c ca="right">
                        <p>0.043</p>
                     </c>
                     <c ca="right">
                        <p>n/s</p>
                     </c>
                     <c ca="right">
                        <p>0.038</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5" ca="left">
                        <p>
                           <b>
                              <it>ESEfinder: SF2/ASF</it>
                           </b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mutated donor sites</p>
                     </c>
                     <c ca="right">
                        <p>0.025</p>
                     </c>
                     <c ca="right">
                        <p>0.037</p>
                     </c>
                     <c ca="right">
                        <p>0.048</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mutated acceptor sites</p>
                     </c>
                     <c ca="right">
                        <p>0.036</p>
                     </c>
                     <c ca="right">
                        <p>0.041</p>
                     </c>
                     <c ca="right">
                        <p>n/s</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>All</p>
                     </c>
                     <c ca="right">
                        <p>0.028</p>
                     </c>
                     <c ca="right">
                        <p>0.040</p>
                     </c>
                     <c ca="right">
                        <p>0.005</p>
                     </c>
                     <c ca="right">
                        <p>0.036</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5" ca="left">
                        <p>
                           <b>
                              <it>ESEfinder: SRp40</it>
                           </b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mutated donor sites</p>
                     </c>
                     <c ca="right">
                        <p>0.034</p>
                     </c>
                     <c ca="right">
                        <p>0.043</p>
                     </c>
                     <c ca="right">
                        <p>0.006</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mutated acceptor sites</p>
                     </c>
                     <c ca="right">
                        <p>0.040</p>
                     </c>
                     <c ca="right">
                        <p>0.043</p>
                     </c>
                     <c ca="right">
                        <p>n/s</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>All</p>
                     </c>
                     <c ca="right">
                        <p>0.035</p>
                     </c>
                     <c ca="right">
                        <p>0.043</p>
                     </c>
                     <c ca="right">
                        <p>0.004</p>
                     </c>
                     <c ca="right">
                        <p>0.040</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5" ca="left">
                        <p>
                           <b>
                              <it>ESEfinder: SRp55</it>
                           </b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mutated donor sites</p>
                     </c>
                     <c ca="right">
                        <p>0.028</p>
                     </c>
                     <c ca="right">
                        <p>0.024</p>
                     </c>
                     <c ca="right">
                        <p>n/s</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mutated acceptor sites</p>
                     </c>
                     <c ca="right">
                        <p>0.022</p>
                     </c>
                     <c ca="right">
                        <p>0.023</p>
                     </c>
                     <c ca="right">
                        <p>n/s</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>All</p>
                     </c>
                     <c ca="right">
                        <p>0.025</p>
                     </c>
                     <c ca="right">
                        <p>0.023</p>
                     </c>
                     <c ca="right">
                        <p>n/s</p>
                     </c>
                     <c ca="right">
                        <p>0.023</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5" ca="left">
                        <p>
                           <b>
                              <it>RESCUE-ESE</it>
                           </b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mutated donor sites</p>
                     </c>
                     <c ca="right">
                        <p>0.090</p>
                     </c>
                     <c ca="right">
                        <p>0.108</p>
                     </c>
                     <c ca="right">
                        <p>n/s</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mutated acceptor sites</p>
                     </c>
                     <c ca="right">
                        <p>0.100</p>
                     </c>
                     <c ca="right">
                        <p>0.080</p>
                     </c>
                     <c ca="right">
                        <p>n/s</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>All</p>
                     </c>
                     <c ca="right">
                        <p>0.091</p>
                     </c>
                     <c ca="right">
                        <p>0.094</p>
                     </c>
                     <c ca="right">
                        <p>n/s</p>
                     </c>
                     <c ca="right">
                        <p>0.099</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5" ca="left">
                        <p>
                           <b>
                              <it>PESE</it>
                           </b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mutated donor sites</p>
                     </c>
                     <c ca="right">
                        <p>0.048</p>
                     </c>
                     <c ca="right">
                        <p>0.082</p>
                     </c>
                     <c ca="right">
                        <p>0.007</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mutated acceptor sites</p>
                     </c>
                     <c ca="right">
                        <p>0.057</p>
                     </c>
                     <c ca="right">
                        <p>0.055</p>
                     </c>
                     <c ca="right">
                        <p>n/s</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>All</p>
                     </c>
                     <c ca="right">
                        <p>0.055</p>
                     </c>
                     <c ca="right">
                        <p>0.064</p>
                     </c>
                     <c ca="right">
                        <p>0.023</p>
                     </c>
                     <c ca="right">
                        <p>0.064</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5" ca="left">
                        <p>
                           <b>
                              <it>PESS</it>
                           </b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mutated donor sites</p>
                     </c>
                     <c ca="right">
                        <p>0.012</p>
                     </c>
                     <c ca="right">
                        <p>0.008</p>
                     </c>
                     <c ca="right">
                        <p>n/s</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mutated acceptor sites</p>
                     </c>
                     <c ca="right">
                        <p>0.009</p>
                     </c>
                     <c ca="right">
                        <p>0.007</p>
                     </c>
                     <c ca="right">
                        <p>n/s</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>All</p>
                     </c>
                     <c ca="right">
                        <p>0.011</p>
                     </c>
                     <c ca="right">
                        <p>0.007</p>
                     </c>
                     <c ca="right">
                        <p>n/s</p>
                     </c>
                     <c ca="right">
                        <p>0.007</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5" ca="left">
                        <p>
                           <b>Splice site scores</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>Mutated donor sites</it>
                           </b>
                        </p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Authentic donor sites</p>
                     </c>
                     <c ca="right">
                        <p>18.52</p>
                     </c>
                     <c ca="right">
                        <p>18.49</p>
                     </c>
                     <c ca="right">
                        <p>n/s</p>
                     </c>
                     <c ca="right">
                        <p>18.82</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Acceptor sites of the (upstream) exon</p>
                     </c>
                     <c ca="right">
                        <p>18.70</p>
                     </c>
                     <c ca="right">
                        <p>19.67</p>
                     </c>
                     <c ca="right">
                        <p>n/s</p>
                     </c>
                     <c ca="right">
                        <p>19.08</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Acceptor sites of the (downstream) intron</p>
                     </c>
                     <c ca="right">
                        <p>19.37</p>
                     </c>
                     <c ca="right">
                        <p>18.98</p>
                     </c>
                     <c ca="right">
                        <p>n/s</p>
                     </c>
                     <c ca="right">
                        <p>19.09</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5" ca="left">
                        <p>
                           <b>
                              <it>Mutated acceptor sites</it>
                           </b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Authentic acceptor sites</p>
                     </c>
                     <c ca="right">
                        <p>19.59</p>
                     </c>
                     <c ca="right">
                        <p>18.72</p>
                     </c>
                     <c ca="right">
                        <p>0.05</p>
                     </c>
                     <c ca="right">
                        <p>19.08</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Donor sites of the (downstream) exon</p>
                     </c>
                     <c ca="right">
                        <p>18.44</p>
                     </c>
                     <c ca="right">
                        <p>18.56</p>
                     </c>
                     <c ca="right">
                        <p>n/s</p>
                     </c>
                     <c ca="right">
                        <p>18.82</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Donor sites of the (upstream) intron</p>
                     </c>
                     <c ca="right">
                        <p>18.48</p>
                     </c>
                     <c ca="right">
                        <p>18.51</p>
                     </c>
                     <c ca="right">
                        <p>n/s</p>
                     </c>
                     <c ca="right">
                        <p>18.79</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5" ca="left">
                        <p>
                           <b>Distance to the closest candidate site (nucleotides)</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mutated donor sites</p>
                     </c>
                     <c ca="right">
                        <p>220.5</p>
                     </c>
                     <c ca="right">
                        <p>75</p>
                     </c>
                     <c ca="right">
                        <p>0.067</p>
                     </c>
                     <c ca="right">
                        <p>289</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mutated acceptor sites</p>
                     </c>
                     <c ca="right">
                        <p>185</p>
                     </c>
                     <c ca="right">
                        <p>66</p>
                     </c>
                     <c ca="right">
                        <p>0.024</p>
                     </c>
                     <c ca="right">
                        <p>81</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <p>The S-exons were found to be significantly shorter than the C-exons (median lengths 114 and 136 nucleotides, respectively; Mann-Whitney test P = 0.020 for the combined set). No significant differences were observed in the lengths of flanking introns (data not shown).</p>
            <p>Scores of authentic splice sites and all splice sites in the adjacent exons and introns for the S- and C-exons were calculated as described in Data and Methods. Unexpectedly, the authentic acceptor sites affected by mutations were significantly weaker in the C-exons than in the S-exons, with median scores of 18.72 and 19.59, respectively (Mann-Whitney test, P = 0.05). No significant differences were observed in the distribution of authentic site scores in the S- and C-exons with mutated donor sites, nor in the distributions of scores of any other considered sites.</p>
            <p>The relative enrichment of potential cryptic sites near the mutated sites was estimated by calculating the distance to the closest equivalent splice site. Equivalent sites were defined as candidate splice sites of the same type as the authentic site that had the same or a higher splice site score. The search for equivalent splice sites was limited to the adjacent intron and exon, and cases in which no such site was found were excluded from the calculations. For both donor and acceptor site mutations, the S- and C-exons differed markedly: the equivalent sites were located much closer to the authentic splice sites of the C-exons than to those of the S-exons (median distances 75 and 220.5 nucleotides for mutated donor sites, and 66 and 185 nucleotides for mutated acceptor sites, respectively; Table <tblr tid="T2">2</tblr>).</p>
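            <p>The distance calculation described above might be sketched as follows. This is an illustrative Python fragment, not the procedure actually used in the study: the function names and the (position, score) tuple layout are assumptions, and the candidate list is presumed to be already restricted to sites of the same type within the adjacent intron and exon.</p>

```python
from statistics import median


def closest_equivalent_distance(authentic_pos, authentic_score, candidates):
    """Distance to the nearest candidate site scoring at least as high as the authentic site.

    candidates: iterable of (position, score) pairs (hypothetical layout).
    Returns None when no equivalent site exists, so the exon can be skipped.
    """
    distances = [
        abs(pos - authentic_pos)
        for pos, score in candidates
        if pos != authentic_pos and score >= authentic_score
    ]
    return min(distances) if distances else None


def median_distance(distances):
    """Median over exons, ignoring those without any equivalent site."""
    usable = [d for d in distances if d is not None]
    return median(usable) if usable else None


if __name__ == "__main__":
    # Toy values only; they are not taken from the paper's data.
    sites = [(60, 19.0), (130, 18.5), (110, 17.0)]
    print(closest_equivalent_distance(100, 18.5, sites))
```

            <p>Returning None for exons without an equivalent site, and dropping those values before taking the median, mirrors the exclusion rule stated in the text.</p>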
            <p>Among the exons with mutated donor sites, the densities of ESEfinder SF2/ASF and SRp40 motifs, as well as of PESE octamers, were significantly higher in the C-exons than in the S-exons. The same tendency, although not significant, was observed for most other types of ESEs and in exons with mutated acceptor sites. The densities of PESS in exons with mutated splice sites of both types were higher in the S-exons, but the difference was not significant even for the combined sets (Kolmogorov-Smirnov test, P = 0.09).</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <p>The overall results of this study seem to agree with the existing biological models. The fact that retained introns are relatively short is consistent with the possibility that such introns are spliced out by the intron definition mechanism, as in this case splicing aberrations should lead to intron retention. After this study was completed, similar observations were also reported in <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>.</p>
         <p>The relative weakness of splice sites in retained introns and the fact that exons skipped due to mutations of splice sites do not have strong cryptic sites in the immediate vicinity show that the site scores are a reasonable approximation to site strength and may determine their functionality <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B31">31</abbr><abbr bid="B32">32</abbr><abbr bid="B33">33</abbr><abbr bid="B39">39</abbr><abbr bid="B40">40</abbr></abbrgrp>. Unlike <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>, the relative dearth of cryptic candidate sites in the vicinity of the S-exons was not confined exclusively to the exons with mutated donor splice sites. On the other hand, we could not confirm the observation that strong acceptor sites are a characteristic of the C-exons with mutated donor sites <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>.</p>
         <p>In contrast to previous studies that were primarily interested in functional (e.g. conserved) alternative splicing of retained introns <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B10">10</abbr></abbrgrp>, we did not enforce possible functionality. One of consequences of that is that the majority of retained introns studied here are unlikely to encode functional proteins, as only 3.3% of them are frame-preserving (this number is close to 4.6% in-frame retained introns observed in Arabidopsis <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>). This does not preclude the possible role of such introns in regulation, either on the protein level (e.g. leading to the synthesis of shortened proteins with regulatory function) or on the mRNA level (leading to NMD-inducing isoforms in some specific conditions); some examples of such regulatory mechanisms have been mentioned in the Introduction. However, both the procedure and the obtained results seem to indicate that the majority of retained introns in our study come from underspliced transcripts.</p>
         <p>In line with this reasoning, the weakness of sites in retained introns may have two explanations. The retained introns might come from underspliced transcripts (weaker sites imply lower splicing efficiency) or be instances of regulated alternative splicing. Indeed, functional alternative splice sites are weaker than constitutive splice sites <abbrgrp><abbr bid="B41">41</abbr><abbr bid="B42">42</abbr></abbrgrp>. Further, longer introns in general tend to have stronger splice sites; however, the latter trend becomes observable only for bona fide introns longer than 1500 nt <abbrgrp><abbr bid="B43">43</abbr></abbrgrp>, and thus should not influence the majority of retained introns studies here.</p>
         <p>It has been demonstrated that both human and plant retained introns are more prevalent in the 5'- and especially 3'-untranslated regions, compared to the protein-coding regions of the mRNAs mechanism <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B10">10</abbr></abbrgrp>. This has been ascribed to elimination of abnormally spliced mRNAs by the NMD mechanism <abbrgrp><abbr bid="B44">44</abbr></abbrgrp>. However, this would not explain the observed prevalence of NMD-inducing retained introns in the 5'-regions. Our results demonstrate monotonic increase in the fraction of mostly retained introns in the 5'-to-3' direction. This is consistent with some degree of co-transciptional splicing (as opposed to simple commitment to splicing with the actual process starting simultaneously for all intron) observed in experiment <abbrgrp><abbr bid="B45">45</abbr></abbrgrp>. However, this correlation is not straightforward. Indeed, since we considered only introns bounded on both sides by internal exons, and required that the boundaries of the exon containing the unspliced intron coincided exactly with the boundaries of the corresponding exon-intron-exon chain in the RefSeq mRNA isoform (see Methods), all retained introns considered here are followed by spliced out introns. This means that the observed tendency may not be a simple consequence of completely unspliced 3'-termini.</p>
         <p>The observed differences in the density of exonic splicing enhancers in the retained and constitutive introns as well as in the C-exons and S-exons also seem to have a natural biological interpretation. Indeed, a high density of ESE-like sites in an (relatively short) intron may lead to misrecognition of this intron as a part of an exon together with the flanking exons. Similarly, a high density of ESEs in an exon with a mutated site may force the splicing machinery to retain this exon and use a cryptic site, whereas ESSs might provoke skipping the exon. A puzzling observation that candidate enhancers predicted by RESCUE-ESE were more abundant in the constitutively splice introns than in retained ones may be explained by the fact that this method, unlike PESX, is based on the comparison of oligonucleotide frequencies in constitutive and alternative exons and does not control for the distribution of these oligonucleotides in introns <abbrgrp><abbr bid="B35">35</abbr><abbr bid="B36">36</abbr><abbr bid="B37">37</abbr></abbrgrp>. A similar observation was recently made in <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>. Another coincidence between our study and <abbrgrp><abbr bid="B33">33</abbr></abbrgrp> is that not all SELEX-based ESEFinder candidate exonic splicing enhancers have different densities in the S-exons and C-exons: in <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>, the most pronounced effect was observed for SF2/ASF, whereas in our study a more statistically significant difference was seen for SRp40. In retained introns, the most prevalent candidate splicing enhancers were those for SF2/ASF and SC35, trailed by those for SRp40 and, marginally significant, for SRp55.</p>
         <p>Unfortunately, at present it seems impossible to repeat these analyses with intronic splicing enhancers and silencers, since no programs for their recognition are available. A more convoluted, but still plausible explanation may be found for the observed significant difference in the strength of authentic acceptor sites of the C-exons and S-exons: an exon with a weak splice site already contains more splicing enhancers than an exon with strong sites <abbrgrp><abbr bid="B35">35</abbr><abbr bid="B46">46</abbr><abbr bid="B47">47</abbr></abbrgrp>, and thus it is more likely to become a C-exon if the site is disrupted by a mutation.</p>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>Thus, the analysis of retained introns in human cDNA, exons skipped due to mutations in splice sites and exons with cryptic sites produced results consistent with the intron definition mechanism of splicing of short introns and the model of co-transcriptional splicing. Retained introns tend to be short and contain a higher density of splicing enhancers. Skipped exons contain fewer candidate splicing enhancers and more silencers than exons with activated cryptic sites. Skipped exons also do not have strong candidate splice sites in the vicinity of mutated ones.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>Set of RefSeq scaffolds</p>
            </st>
            <p>The human genome (version 18, March 2006) and alignments of RefSeq genes (21.02.07) and high-throughput cDNAs (16.06.07) were downloaded from the UCSC genome browser <abbrgrp><abbr bid="B48">48</abbr></abbrgrp>; the EST data were not used. Initially, the dataset contained 25388 RefSeq mRNAs. Isoforms of alternatively spliced genes were clustered by the RefSeq gene name. To avoid redundancy in the structures of alternatively spliced genes, only the longest isoform of each such gene was retained and used as the scaffold in all further calculations. Isoform lengths were calculated for spliced mRNAs. The final set of RefSeq genes consisted of 18458 genes containing 154846 internal exons and 138777 introns between such exons. All measurements and comparisons of internal exons and introns were made according to the accepted scaffold gene structures and, in the case of mutated exons, for authentic sequences.</p>
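            <p>The scaffold selection can be sketched as follows, assuming each isoform is given as a hypothetical (gene name, accession, exon list) record with half-open exon coordinates. The record layout, the names and the tie-breaking rule (the first isoform encountered wins) are assumptions made for illustration only.</p>

```python
def pick_scaffolds(isoforms):
    """Keep the longest spliced isoform for every gene name.

    isoforms: iterable of (gene, accession, exons), where exons is a
    list of (start, end) pairs with half-open coordinates.
    The spliced length is the sum of exon lengths, introns excluded.
    """
    best = {}
    for gene, accession, exons in isoforms:
        spliced_length = sum(end - start for start, end in exons)
        # Strictly longer isoforms replace the current scaffold;
        # ties keep the isoform seen first (an arbitrary choice).
        if gene not in best or spliced_length > best[gene][1]:
            best[gene] = (accession, spliced_length)
    return {gene: accession for gene, (accession, _) in best.items()}


# Hypothetical records, not real RefSeq entries.
records = [
    ("GENE1", "iso1", [(0, 100), (200, 300)]),
    ("GENE1", "iso2", [(0, 100), (200, 300), (400, 450)]),
    ("GENE2", "iso3", [(10, 60)]),
]
print(pick_scaffolds(records))
```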
         </sec>
         <sec>
            <st>
               <p>Sets of mutated exons</p>
            </st>
            <p>Sets of mutated exons included only internal exons affected by single-nucleotide substitutions in splice sites (from -3 to +6 for donor sites and from -15 to +2 for acceptor sites) leading to exon skipping (S-exons) or cryptic site activation (C-exons). The set of C-exons was also restricted to cryptic sites located in exons and introns adjacent to the mutated site. The set of C-exons with mutations in donor splice sites was obtained from <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>, and contained 42 exons. The set of C-exons with mutations in acceptor sites was obtained from the DBASS3 database <abbrgrp><abbr bid="B39">39</abbr></abbrgrp> and contained 72 exons. The set of S-exons was collected by searching for published examples of exon skipping in OMIM <abbrgrp><abbr bid="B49">49</abbr></abbrgrp> and PubMed. The collected S-exons were identified in the set of RefSeq scaffolds. The final set contained, respectively, 67 and 42 S-exons with mutations in donor and acceptor sites. The sets of donor and acceptor S-exons are available as Additional files <supplr sid="S1">1</supplr> and <supplr sid="S2">2</supplr>, respectively.</p>
            <suppl id="S1">
               <title>
                  <p>Additional file 1</p>
               </title>
               <text>
                  <p><b>List of skipped exons (S-exons) with mutations in donor sites</b>. List of skipped exons (S-exons) with mutated donor sites: gene name, ordinal number of the skipped exon in the gene, exon sequence.</p>
               </text>
               <file name="1471-2164-9-13-S1.txt">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S2">
               <title>
                  <p>Additional file 2</p>
               </title>
               <text>
                  <p><b>List of skipped exons (S-exons) with mutations in acceptor sites</b>. List of skipped exons (S-exons) with mutated acceptor sites: gene name, ordinal number of the skipped exon in the gene, exon sequence.</p>
               </text>
               <file name="1471-2164-9-13-S2.txt">
                  <p>Click here for file</p>
               </file>
            </suppl>
         </sec>
         <sec>
            <st>
               <p>Sets of retained and constitutive (constitutively spliced out) introns</p>
            </st>
            <p>An intron retention event was scored if the high-throughput cDNA sequencing data contained an exon that exactly covered an exon-intron-exon chain in a RefSeq gene (Fig. <figr fid="F1">1</figr>). Such an intron was called a retained intron. All other introns were considered to be constitutive introns. Since parameters of flanking exons were analyzed, only introns between internal exons from the RefSeq scaffolds were considered. The final set consisted of 1197 retained and 137580 constitutive introns.</p>
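            <p>The retention criterion can be illustrated by a short sketch. It assumes that the scaffold exons and the aligned cDNA blocks are available as sorted lists of half-open (start, end) coordinates on the same strand. This data layout and the function name are hypothetical.</p>

```python
def retained_introns(refseq_exons, cdna_exons):
    """Return introns of the scaffold that are retained in a cDNA.

    An intron is scored as retained when a single cDNA exon starts at
    the start of the upstream exon and ends at the end of the
    downstream exon, exactly covering the exon-intron-exon chain.
    Only introns flanked by two internal exons are examined.
    """
    cdna_blocks = set(cdna_exons)
    retained = []
    # Internal exons have indices 1 .. len - 2, so the upstream exon
    # of an eligible intron has index 1 .. len - 3.
    for i in range(1, len(refseq_exons) - 2):
        upstream, downstream = refseq_exons[i], refseq_exons[i + 1]
        if (upstream[0], downstream[1]) in cdna_blocks:
            retained.append((upstream[1], downstream[0]))
    return retained


# Hypothetical five-exon scaffold and one cDNA alignment.
scaffold = [(0, 100), (200, 300), (400, 500), (600, 700), (800, 900)]
cdna = [(0, 100), (200, 500), (600, 700), (800, 900)]
print(retained_introns(scaffold, cdna))
```

            <p>Requiring exact boundary matches means that cDNAs in which the covering exon extends beyond either flanking exon are not counted in this sketch.</p>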
         </sec>
         <sec>
            <st>
               <p>Splice site scores</p>
            </st>
            <p>Scores of the donor and acceptor splice sites were calculated using positional weight matrices covering positions from -3 to +6 (for donor sites) and from -15 to +2 (for acceptor sites). The positional nucleotide weights were calculated as in <abbrgrp><abbr bid="B50">50</abbr></abbrgrp>: W(b,m) = log [N(b,m)+0.5]-0.25&#183;&#931;<sub>i=A,C,G,T </sub>log [N(i,m)+0.5] where N(b,m) is the count of nucleotide b in position m in the training sample. The training sample was obtained from the EDAS database <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>, and contained 4179 constitutive internal exons confirmed by at least 50 ESTs. The score of a donor site (b<sub>-3</sub>,...,b<sub>6</sub>), where b<sub>j </sub>are nucleotides, was then calculated as a sum of positional weights: S(b<sub>-3</sub>,...,b<sub>6</sub>) = W(b<sub>-3</sub>,-3)+...+W(b<sub>6</sub>,6), and similarly for scores of acceptor sites.</p>
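            <p>The weight and score formulas above can be written out as a brief sketch. The training sites below are a toy, hypothetical sample of nine-nucleotide donor sites (positions -3 to +6), not the EDAS training set, and the natural logarithm is assumed because the base of the logarithm is not stated.</p>

```python
import math

BASES = "ACGT"


def positional_weights(training_sites):
    """W(b, m) = log[N(b,m) + 0.5] - 0.25 * sum_i log[N(i,m) + 0.5]."""
    weights = []
    for m in range(len(training_sites[0])):
        counts = {b: 0 for b in BASES}
        for site in training_sites:
            counts[site[m]] += 1
        logs = {b: math.log(counts[b] + 0.5) for b in BASES}
        mean_log = 0.25 * sum(logs.values())
        weights.append({b: logs[b] - mean_log for b in BASES})
    return weights


def site_score(site, weights):
    """Score of a site: the sum of positional weights of its bases."""
    return sum(weights[m][b] for m, b in enumerate(site))


# Toy training sample of aligned donor sites (hypothetical).
training = ["CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA", "TGGGTAAGT"]
pwm = positional_weights(training)
print(site_score("CAGGTAAGT", pwm), site_score("TTTTTTTTT", pwm))
```

            <p>By construction, the four weights at each position sum to zero, so a score reflects how much a site deviates from an uninformative sequence. Acceptor sites would be scored the same way with a matrix of 17 positions.</p>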
         </sec>
         <sec>
            <st>
               <p>Densities of cis-acting elements</p>
            </st>
            <p>Putative cis-regulatory elements were identified in all internal exons and introns by several published methods. In particular, we searched for ESE motifs initially identified by SELEX (SF2/ASF, SC35, SRp40, SRp55) using ESEfinder <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>; 238 ESE hexamers predicted by RESCUE-ESE <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>; and 2060 ESE and 1018 ESS octamers predicted by PESX <abbrgrp><abbr bid="B36">36</abbr><abbr bid="B37">37</abbr></abbrgrp>. The densities of predicted regulatory elements were defined as the number of candidate ESEs and ESSs per base pair.</p>
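            <p>For oligonucleotide-based predictions such as the RESCUE-ESE hexamers and PESX octamers, the density can be sketched as below. The motif set in the example is a placeholder rather than one of the published lists, and counting overlapping occurrences separately is an assumption. Matrix-based ESEfinder motifs, which are scored against thresholds, are not covered by this sketch.</p>

```python
def motif_density(sequence, motifs):
    """Number of motif occurrences per base pair of the sequence.

    Every start position at which a listed oligonucleotide occurs is
    counted, so overlapping occurrences are counted separately.
    """
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    hits = 0
    for k in sorted({len(m) for m in motifs}):
        kmers = {m.upper() for m in motifs if len(m) == k}
        for start in range(len(sequence) - k + 1):
            if sequence[start:start + k] in kmers:
                hits += 1
    return hits / len(sequence)


# Placeholder hexamer set and a toy exon sequence.
print(motif_density("GAAGAAGAAG", {"GAAGAA"}))
```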
         </sec>
         <sec>
            <st>
               <p>Statistical analysis</p>
            </st>
            <p>The statistical significance of differences between distributions of all intron parameters was measured by the two-sample Kolmogorov-Smirnov test and Student's t-test. The only exception was the distribution of the intron ordinal number, where we used the &#967;<sup>2 </sup>test instead of the Kolmogorov-Smirnov test. Because of the small size of the data sets, the significance of differences between mutated exon parameters was measured by the Mann-Whitney test. All these tests were performed using the R package <abbrgrp><abbr bid="B51">51</abbr></abbrgrp>.</p>
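            <p>To make the quantity being tested concrete, the two-sample Kolmogorov-Smirnov statistic can be computed with a few lines of code. This is an illustrative sketch only: it returns the statistic D without a P-value and is not the R implementation used in the study.</p>

```python
import bisect


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D.

    D is the largest absolute difference between the two empirical
    cumulative distribution functions; it is attained at one of the
    observed values, so only those points are evaluated.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Toy intron lengths (hypothetical values).
print(ks_statistic([90, 110, 150], [400, 900, 2500]))
```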
         </sec>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>MSG conceived the project. EZK collected and analyzed the data. MSG and EZK wrote the manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We are grateful to Ramil Nurtdinov and Andrei Mironov for useful discussions. This study was partially supported by grants from the Howard Hughes Medical Institute (55001056), INTAS (05-8028), Russian Academy of Sciences (program "Cellular and Molecular Biology"), and the Russian Foundation of Basic Research (07-04-00343).</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Exon recognition in vertebrate splicing</p>
            </title>
            <aug>
               <au>
                  <snm>Berget</snm>
                  <fnm>SM</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>1995</pubdate>
            <volume>270</volume>
            <fpage>2411</fpage>
            <lpage>2414</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">7852296</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>How did alternative splicing evolve?</p>
            </title>
            <aug>
               <au>
                  <snm>Ast</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Nat Rev Genet</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <fpage>773</fpage>
            <lpage>782</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nrg1451</pubid>
                  <pubid idtype="pmpid" link="fulltext">15510168</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Single base-pair substitutions in exon-intron junctions of human genes: nature, distribution, and consequences for mRNA splicing</p>
            </title>
            <aug>
               <au>
                  <snm>Krawczak</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Thomas</snm>
                  <fnm>NS</fnm>
               </au>
               <au>
                  <snm>Hundrieser</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Mort</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Wittig</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hampe</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Cooper</snm>
                  <fnm>DN</fnm>
               </au>
            </aug>
            <source>Hum Mutat</source>
            <pubdate>2007</pubdate>
            <volume>28</volume>
            <fpage>150</fpage>
            <lpage>158</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/humu.20400</pubid>
                  <pubid idtype="pmpid" link="fulltext">17001642</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Construction of a novel database containing aberrant splicing mutations of mammalian genes</p>
            </title>
            <aug>
               <au>
                  <snm>Nakai</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Sakamoto</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Gene</source>
            <pubdate>1994</pubdate>
            <volume>141</volume>
            <fpage>171</fpage>
            <lpage>177</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0378-1119(94)90567-3</pubid>
                  <pubid idtype="pmpid">8163185</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Different levels of alternative splicing among eukaryotes</p>
            </title>
            <aug>
               <au>
                  <snm>Kim</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Magen</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Ast</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2007</pubdate>
            <volume>35</volume>
            <fpage>125</fpage>
            <lpage>131</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1802581</pubid>
                  <pubid idtype="pmpid" link="fulltext">17158149</pubid>
                  <pubid idtype="doi">10.1093/nar/gkl924</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>EDAS, databases of alternatively spliced human genes</p>
            </title>
            <aug>
               <au>
                  <snm>Nurtdinov</snm>
                  <fnm>RN</fnm>
               </au>
               <au>
                  <snm>Neverov</snm>
                  <fnm>AD</fnm>
               </au>
               <au>
                  <snm>Mal'ko</snm>
                  <fnm>DB</fnm>
               </au>
               <au>
                  <snm>Kosmodem'ianskii</snm>
                  <fnm>IA</fnm>
               </au>
               <au>
                  <snm>Ermakova</snm>
                  <fnm>EO</fnm>
               </au>
               <au>
                  <snm>Ramenskii</snm>
                  <fnm>VE</fnm>
               </au>
               <au>
                  <snm>Mironov</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Gel'fand</snm>
                  <fnm>MS</fnm>
               </au>
            </aug>
            <source>Biofizika</source>
            <pubdate>2006</pubdate>
            <volume>51</volume>
            <fpage>589</fpage>
            <lpage>592</lpage>
            <xrefbib>
               <pubid idtype="pmpid">16909834</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Genome-wide comparative analysis of alternative splicing in plants</p>
            </title>
            <aug>
               <au>
                  <snm>Wang</snm>
                  <fnm>BB</fnm>
               </au>
               <au>
                  <snm>Brendel</snm>
                  <fnm>V</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2006</pubdate>
            <volume>103</volume>
            <fpage>7175</fpage>
            <lpage>7180</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1459036</pubid>
                  <pubid idtype="pmpid" link="fulltext">16632598</pubid>
                  <pubid idtype="doi">10.1073/pnas.0602039103</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Detection and evaluation of intron retention events in the human transcriptome</p>
            </title>
            <aug>
               <au>
                  <snm>Galante</snm>
                  <fnm>PA</fnm>
               </au>
               <au>
                  <snm>Sakabe</snm>
                  <fnm>NJ</fnm>
               </au>
               <au>
                  <snm>Kirschbaum-Slager</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>de Souza</snm>
                  <fnm>SJ</fnm>
               </au>
            </aug>
            <source>RNA</source>
            <pubdate>2004</pubdate>
            <volume>10</volume>
            <fpage>757</fpage>
            <lpage>765</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1370565</pubid>
                  <pubid idtype="pmpid" link="fulltext">15100430</pubid>
                  <pubid idtype="doi">10.1261/rna.5123504</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Sequence features responsible for intron retention in human</p>
            </title>
            <aug>
               <au>
                  <snm>Sakabe</snm>
                  <fnm>NJ</fnm>
               </au>
               <au>
                  <snm>de Souza</snm>
                  <fnm>SJ</fnm>
               </au>
            </aug>
            <source>BMC Genomics</source>
            <pubdate>2007</pubdate>
            <volume>8</volume>
            <fpage>59</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1831480</pubid>
                  <pubid idtype="pmpid" link="fulltext">17324281</pubid>
                  <pubid idtype="doi">10.1186/1471-2164-8-59</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Whole-genome microarray in Arabidopsis facilitates global analysis of retained introns</p>
            </title>
            <aug>
               <au>
                  <snm>Ner-Gaon</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Fluhr</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>DNA Res</source>
            <pubdate>2006</pubdate>
            <volume>13</volume>
            <fpage>111</fpage>
            <lpage>121</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/dnares/dsl003</pubid>
                  <pubid idtype="pmpid" link="fulltext">16980712</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Non-EST based prediction of exon skipping and intron retention events using Pfam information</p>
            </title>
            <aug>
               <au>
                  <snm>Hiller</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Huse</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Platzer</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Backofen</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <fpage>5611</fpage>
            <lpage>5621</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1243800</pubid>
                  <pubid idtype="pmpid" link="fulltext">16204458</pubid>
                  <pubid idtype="doi">10.1093/nar/gki870</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Intron retention may regulate expression of Epstein-Barr virus nuclear antigen 3 family genes</p>
            </title>
            <aug>
               <au>
                  <snm>Kienzle</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Young</snm>
                  <fnm>DB</fnm>
               </au>
               <au>
                  <snm>Liaskou</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Buck</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Greco</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Sculley</snm>
                  <fnm>TB</fnm>
               </au>
            </aug>
            <source>J Virol</source>
            <pubdate>1999</pubdate>
            <volume>73</volume>
            <fpage>1195</fpage>
            <lpage>1204</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">103940</pubid>
                  <pubid idtype="pmpid" link="fulltext">9882321</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>SRC regulates constitutive internalization and rapid resensitization of a cholecystokinin 2 receptor splice variant</p>
            </title>
            <aug>
               <au>
                  <snm>Chao</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Ives</snm>
                  <fnm>KL</fnm>
               </au>
               <au>
                  <snm>Goluszko</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Kolokoltsov</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Davey</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Townsend</snm>
                  <fnm>CM</fnm>
                  <suf>Jr</suf>
               </au>
               <au>
                  <snm>Hellmich</snm>
                  <fnm>MR</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2005</pubdate>
            <volume>280</volume>
            <fpage>33368</fpage>
            <lpage>33373</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.M506337200</pubid>
                  <pubid idtype="pmpid" link="fulltext">16079138</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Regulation of the beta-hydroxyacyl ACP dehydratase gene of Picea mariana by alternative splicing</p>
            </title>
            <aug>
               <au>
                  <snm>Tai</snm>
                  <fnm>HH</fnm>
               </au>
               <au>
                  <snm>Williams</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Iyengar</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Yeates</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Beardmore</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Plant Cell Rep</source>
            <pubdate>2007</pubdate>
            <volume>26</volume>
            <fpage>105</fpage>
            <lpage>113</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s00299-006-0213-7</pubid>
                  <pubid idtype="pmpid" link="fulltext">17021849</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>An intron with a constitutive transport element is retained in a Tap messenger</p>
            </title>
            <aug>
               <au>
                  <snm>Li</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Bor</snm>
                  <fnm>YC</fnm>
               </au>
               <au>
                  <snm>Misawa</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Xue</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Rekosh</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Hammarskjold</snm>
                  <fnm>ML</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2006</pubdate>
            <volume>443</volume>
            <fpage>234</fpage>
            <lpage>237</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature05107</pubid>
                  <pubid idtype="pmpid" link="fulltext">16971948</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Alternative splicing of intron 3 of the serine/arginine-rich protein 9G8 gene. Identification of flanking exonic splicing enhancers and involvement of 9G8 as a trans-acting factor</p>
            </title>
            <aug>
               <au>
                  <snm>Lejeune</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Cavaloc</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Stevenin</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2001</pubdate>
            <volume>276</volume>
            <fpage>7850</fpage>
            <lpage>7858</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.M009510200</pubid>
                  <pubid idtype="pmpid" link="fulltext">11096110</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Regulation of Drosophila P-element transposition</p>
            </title>
            <aug>
               <au>
                  <snm>Rio</snm>
                  <fnm>DC</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>1991</pubdate>
            <volume>7</volume>
            <fpage>282</fpage>
            <lpage>287</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">1662417</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Nominal growth hormone pulses in otherwise normal masculine plasma profiles induce intron retention of overexpressed hepatic CYP2C11 with associated nuclear splicing deficiency</p>
            </title>
            <aug>
               <au>
                  <snm>Pampori</snm>
                  <fnm>NA</fnm>
               </au>
               <au>
                  <snm>Shapiro</snm>
                  <fnm>BH</fnm>
               </au>
            </aug>
            <source>Endocrinology</source>
            <pubdate>2000</pubdate>
            <volume>141</volume>
            <fpage>4100</fpage>
            <lpage>4106</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1210/en.141.11.4100</pubid>
                  <pubid idtype="pmpid" link="fulltext">11089541</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Kallikrein-related peptidase (KLK) family mRNA variants and protein isoforms in hormone-related cancers: do they have a function?</p>
            </title>
            <aug>
               <au>
                  <snm>Tan</snm>
                  <fnm>OL</fnm>
               </au>
               <au>
                  <snm>Whitbread</snm>
                  <fnm>AK</fnm>
               </au>
               <au>
                  <snm>Clements</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Dong</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Biol Chem</source>
            <pubdate>2006</pubdate>
            <volume>387</volume>
            <fpage>697</fpage>
            <lpage>705</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1515/BC.2006.088</pubid>
                  <pubid idtype="pmpid" link="fulltext">16800730</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Intron retention: a common splicing event within the human kallikrein gene family</p>
            </title>
            <aug>
               <au>
                  <snm>Michael</snm>
                  <fnm>IP</fnm>
               </au>
               <au>
                  <snm>Kurlender</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Memari</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Yousef</snm>
                  <fnm>GM</fnm>
               </au>
               <au>
                  <snm>Du</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Grass</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Stephan</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Jung</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Diamandis</snm>
                  <fnm>EP</fnm>
               </au>
            </aug>
            <source>Clin Chem</source>
            <pubdate>2005</pubdate>
            <volume>51</volume>
            <fpage>506</fpage>
            <lpage>515</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1373/clinchem.2004.042341</pubid>
                  <pubid idtype="pmpid" link="fulltext">15650036</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Intron retention as an alternative splice variant of the rat urocortin 1 gene</p>
            </title>
            <aug>
               <au>
                  <snm>Blanco</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Rojas</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Haeger</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Cuevas</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Perez</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Munita</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Quiroz</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Andres</snm>
                  <fnm>ME</fnm>
               </au>
               <au>
                  <snm>Forray</snm>
                  <fnm>MI</fnm>
               </au>
               <au>
                  <snm>Gysling</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>Neuroscience</source>
            <pubdate>2006</pubdate>
            <volume>140</volume>
            <fpage>1245</fpage>
            <lpage>1252</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.neuroscience.2006.03.031</pubid>
                  <pubid idtype="pmpid" link="fulltext">16650605</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>COX-3: a splice variant of cyclooxygenase-1 in mouse neural tissue and cells</p>
            </title>
            <aug>
               <au>
                  <snm>Shaftel</snm>
                  <fnm>SS</fnm>
               </au>
               <au>
                  <snm>Olschowka</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Hurley</snm>
                  <fnm>SD</fnm>
               </au>
               <au>
                  <snm>Moore</snm>
                  <fnm>AH</fnm>
               </au>
               <au>
                  <snm>O'Banion</snm>
                  <fnm>MK</fnm>
               </au>
            </aug>
            <source>Brain Res Mol Brain Res</source>
            <pubdate>2003</pubdate>
            <volume>119</volume>
            <fpage>213</fpage>
            <lpage>215</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.molbrainres.2003.09.006</pubid>
                  <pubid idtype="pmpid" link="fulltext">14625089</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Structure and expression of the murine calcyon gene</p>
            </title>
            <aug>
               <au>
                  <snm>Dai</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Bergson</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Gene</source>
            <pubdate>2003</pubdate>
            <volume>311</volume>
            <fpage>111</fpage>
            <lpage>117</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0378-1119(03)00564-X</pubid>
                  <pubid idtype="pmpid" link="fulltext">12853145</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>The Tgif2 gene contains a retained intron within the coding sequence</p>
            </title>
            <aug>
               <au>
                  <snm>Melhuish</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Wotton</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>BMC Mol Biol</source>
            <pubdate>2006</pubdate>
            <volume>7</volume>
            <fpage>2</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1402312</pubid>
                  <pubid idtype="pmpid" link="fulltext">16436215</pubid>
                  <pubid idtype="doi">10.1186/1471-2199-7-2</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Evolution of exon-intron structure and alternative splicing in fruit flies and malarial mosquito genomes</p>
            </title>
            <aug>
               <au>
                  <snm>Malko</snm>
                  <fnm>DB</fnm>
               </au>
               <au>
                  <snm>Makeev</snm>
                  <fnm>VJ</fnm>
               </au>
               <au>
                  <snm>Mironov</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Gelfand</snm>
                  <fnm>MS</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2006</pubdate>
            <volume>16</volume>
            <fpage>505</fpage>
            <lpage>509</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1457027</pubid>
                  <pubid idtype="pmpid" link="fulltext">16520458</pubid>
                  <pubid idtype="doi">10.1101/gr.4236606</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs</p>
            </title>
            <aug>
               <au>
                  <snm>Okazaki</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Furuno</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kasukawa</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Adachi</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Bono</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Kondo</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Nikaido</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Osato</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Saito</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Suzuki</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Yamanaka</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Kiyosawa</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Yagi</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Tomaru</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Hasegawa</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Nogami</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Sch&#246;nbach</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Gojobori</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Baldarelli</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Hill</snm>
                  <fnm>DP</fnm>
               </au>
               <au>
                  <snm>Bult</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Hume</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Quackenbush</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Schriml</snm>
                  <fnm>LM</fnm>
               </au>
               <au>
                  <snm>Kanapin</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Matsuda</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Batalov</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Beisel</snm>
                  <fnm>KW</fnm>
               </au>
               <au>
                  <snm>Blake</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Bradt</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <cnm>FANTOM consortium; RIKEN Genome Exploration Research Group Phase I &amp; II Team</cnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2002</pubdate>
            <volume>420</volume>
            <fpage>563</fpage>
            <lpage>573</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature01266</pubid>
                  <pubid idtype="pmpid" link="fulltext">12466851</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>ORESTES are enriched in rare exon usage variants affecting the encoded proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Sakabe</snm>
                  <fnm>NJ</fnm>
               </au>
               <au>
                  <snm>de Souza</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Galante</snm>
                  <fnm>PA</fnm>
               </au>
               <au>
                  <snm>de Oliveira</snm>
                  <fnm>PS</fnm>
               </au>
               <au>
                  <snm>Passetti</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Brentani</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Osorio</snm>
                  <fnm>EC</fnm>
               </au>
               <au>
                  <snm>Zaiats</snm>
                  <fnm>AC</fnm>
               </au>
               <au>
                  <snm>Leerkes</snm>
                  <fnm>MR</fnm>
               </au>
               <au>
                  <snm>Kitajima</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Brentani</snm>
                  <fnm>RR</fnm>
               </au>
               <au>
                  <snm>Strausberg</snm>
                  <fnm>RL</fnm>
               </au>
               <au>
                  <snm>Simpson</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>de Souza</snm>
                  <fnm>SJ</fnm>
               </au>
            </aug>
            <source>C R Biol</source>
            <pubdate>2003</pubdate>
            <volume>326</volume>
            <fpage>979</fpage>
            <lpage>985</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.crvi.2003.09.027</pubid>
                  <pubid idtype="pmpid">14744104</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Splicing mutants and their second-site suppressors at the dihydrofolate reductase locus in Chinese hamster ovary cells</p>
            </title>
            <aug>
               <au>
                  <snm>Carothers</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Urlaub</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Grunberger</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Chasin</snm>
                  <fnm>LA</fnm>
               </au>
            </aug>
            <source>Mol Cell Biol</source>
            <pubdate>1993</pubdate>
            <volume>13</volume>
            <fpage>5085</fpage>
            <lpage>5098</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">360161</pubid>
                  <pubid idtype="pmpid" link="fulltext">8336736</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Mutations that alter RNA splicing of the human HPRT gene: a review of the spectrum</p>
            </title>
            <aug>
               <au>
                  <snm>O'Neill</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Rogan</snm>
                  <fnm>PK</fnm>
               </au>
               <au>
                  <snm>Cariello</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Nicklas</snm>
                  <fnm>JA</fnm>
               </au>
            </aug>
            <source>Mutat Res</source>
            <pubdate>1998</pubdate>
            <volume>411</volume>
            <fpage>179</fpage>
            <lpage>214</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S1383-5742(98)00013-1</pubid>
                  <pubid idtype="pmpid">9804951</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>A mechanism for unsplicing and exon skipping in human alpha- and beta-globin mutant pre-mRNA splicing</p>
            </title>
            <aug>
               <au>
                  <snm>Iida</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Symp Ser</source>
            <pubdate>1997</pubdate>
            <volume>37</volume>
            <fpage>183</fpage>
            <lpage>184</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid">9586060</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Extensive in silico analysis of NF1 splicing defects uncovers determinants for splicing outcome upon 5' splice-site disruption</p>
            </title>
            <aug>
               <au>
                  <snm>Wimmer</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Roca</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Beiglbock</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Callens</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Etzler</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Rao</snm>
                  <fnm>AR</fnm>
               </au>
               <au>
                  <snm>Krainer</snm>
                  <fnm>AR</fnm>
               </au>
               <au>
                  <snm>Fonatsch</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Messiaen</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Hum Mutat</source>
            <pubdate>2007</pubdate>
            <volume>28</volume>
            <fpage>599</fpage>
            <lpage>612</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/humu.20493</pubid>
                  <pubid idtype="pmpid" link="fulltext">17311297</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Biased exon/intron distribution of cryptic and de novo 3' splice sites</p>
            </title>
            <aug>
               <au>
                  <snm>Kralovicov&#225;</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Christensen</snm>
                  <fnm>MB</fnm>
               </au>
               <au>
                  <snm>Vo&#345;echovsk&#253;</snm>
                  <fnm>I</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <fpage>4882</fpage>
            <lpage>4898</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1197134</pubid>
                  <pubid idtype="pmpid" link="fulltext">16141195</pubid>
                  <pubid idtype="doi">10.1093/nar/gki811</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Global control of aberrant splice-site activation by auxiliary splicing sequences: evidence for a gradient in exon and intron definition</p>
            </title>
            <aug>
               <au>
                  <snm>Kralovicov&#225;</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Vo&#345;echovsk&#253;</snm>
                  <fnm>I</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <note>Advance access publication September 18, 2007</note>
         </bibl>
         <bibl id="B34">
            <title>
               <p>ESEfinder: A web resource to identify exonic splicing enhancers</p>
            </title>
            <aug>
               <au>
                  <snm>Cartegni</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Zhu</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>MQ</fnm>
               </au>
               <au>
                  <snm>Krainer</snm>
                  <fnm>AR</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <fpage>3568</fpage>
            <lpage>3571</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">169022</pubid>
                  <pubid idtype="pmpid" link="fulltext">12824367</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg616</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Predictive identification of exonic splicing enhancers in human genes</p>
            </title>
            <aug>
               <au>
                  <snm>Fairbrother</snm>
                  <fnm>WG</fnm>
               </au>
               <au>
                  <snm>Yeh</snm>
                  <fnm>RF</fnm>
               </au>
               <au>
                  <snm>Sharp</snm>
                  <fnm>PA</fnm>
               </au>
               <au>
                  <snm>Burge</snm>
                  <fnm>CB</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2002</pubdate>
            <volume>297</volume>
            <fpage>1007</fpage>
            <lpage>1013</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1073774</pubid>
                  <pubid idtype="pmpid" link="fulltext">12114529</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>Computational definition of sequence motifs governing constitutive exon splicing</p>
            </title>
            <aug>
               <au>
                  <snm>Zhang</snm>
                  <fnm>XH</fnm>
               </au>
               <au>
                  <snm>Chasin</snm>
                  <fnm>LA</fnm>
               </au>
            </aug>
            <source>Genes Dev</source>
            <pubdate>2004</pubdate>
            <volume>18</volume>
            <fpage>1241</fpage>
            <lpage>1250</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">420350</pubid>
                  <pubid idtype="pmpid" link="fulltext">15145827</pubid>
                  <pubid idtype="doi">10.1101/gad.1195304</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>Exon inclusion is dependent on predictable exonic splicing enhancers</p>
            </title>
            <aug>
               <au>
                  <snm>Zhang</snm>
                  <fnm>XH</fnm>
               </au>
               <au>
                  <snm>Kangsamaksin</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Chao</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Banerjee</snm>
                  <fnm>JK</fnm>
               </au>
               <au>
                  <snm>Chasin</snm>
                  <fnm>LA</fnm>
               </au>
            </aug>
            <source>Mol Cell Biol</source>
            <pubdate>2005</pubdate>
            <volume>25</volume>
            <fpage>7323</fpage>
            <lpage>7332</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1190244</pubid>
                  <pubid idtype="pmpid" link="fulltext">16055740</pubid>
                  <pubid idtype="doi">10.1128/MCB.25.16.7323-7332.2005</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>A survey on intron and exon lengths</p>
            </title>
            <aug>
               <au>
                  <snm>Hawkins</snm>
                  <fnm>JD</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1988</pubdate>
            <volume>16</volume>
            <fpage>9893</fpage>
            <lpage>9908</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">338825</pubid>
                  <pubid idtype="pmpid" link="fulltext">3057449</pubid>
                  <pubid idtype="doi">10.1093/nar/16.21.9893</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>Aberrant 3' splice sites in human disease genes: mutation pattern, nucleotide structure and comparison of computational tools that predict their utilization</p>
            </title>
            <aug>
               <au>
                  <snm>Vo&#345;echovsk&#253;</snm>
                  <fnm>I</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <volume>34</volume>
            <fpage>4630</fpage>
            <lpage>4641</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1636351</pubid>
                  <pubid idtype="pmpid" link="fulltext">16963498</pubid>
                  <pubid idtype="doi">10.1093/nar/gkl535</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>Intrinsic differences between authentic and cryptic 5' splice sites</p>
            </title>
            <aug>
               <au>
                  <snm>Roca</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Sachidanandam</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Krainer</snm>
                  <fnm>AR</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <fpage>6321</fpage>
            <lpage>6333</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">275472</pubid>
                  <pubid idtype="pmpid" link="fulltext">14576320</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg830</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B41">
            <title>
               <p>Identification and analysis of alternative splicing events conserved in human and mouse</p>
            </title>
            <aug>
               <au>
                  <snm>Yeo</snm>
                  <fnm>GW</fnm>
               </au>
               <au>
                  <snm>Van Nostrand</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Holste</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Poggio</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Burge</snm>
                  <fnm>CB</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2005</pubdate>
            <volume>102</volume>
            <fpage>2850</fpage>
            <lpage>2855</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">548664</pubid>
                  <pubid idtype="pmpid" link="fulltext">15708978</pubid>
                  <pubid idtype="doi">10.1073/pnas.0409742102</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B42">
            <title>
               <p>Categorization and characterization of transcript confirmed constitutively and alternatively spliced introns and exons from human</p>
            </title>
            <aug>
               <au>
                  <snm>Clark</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Thanaraj</snm>
                  <fnm>TA</fnm>
               </au>
            </aug>
            <source>Hum Mol Genet</source>
            <pubdate>2002</pubdate>
            <volume>11</volume>
            <fpage>451</fpage>
            <lpage>464</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/hmg/11.4.451</pubid>
                  <pubid idtype="pmpid" link="fulltext">11854178</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B43">
            <title>
               <p>Compensatory relationship between splice sites and exonic splicing signals depending on the length of vertebrate introns</p>
            </title>
            <aug>
               <au>
                  <snm>Dewey</snm>
                  <fnm>CN</fnm>
               </au>
               <au>
                  <snm>Rogozin</snm>
                  <fnm>IB</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>BMC Genomics</source>
            <pubdate>2006</pubdate>
            <volume>7</volume>
            <fpage>311</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1713244</pubid>
                  <pubid idtype="pmpid" link="fulltext">17156453</pubid>
                  <pubid idtype="doi">10.1186/1471-2164-7-311</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B44">
            <title>
               <p>Evidence for the widespread coupling of alternative splicing and nonsense-mediated mRNA decay in humans</p>
            </title>
            <aug>
               <au>
                  <snm>Lewis</snm>
                  <fnm>BP</fnm>
               </au>
               <au>
                  <snm>Green</snm>
                  <fnm>RE</fnm>
               </au>
               <au>
                  <snm>Brenner</snm>
                  <fnm>SE</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2003</pubdate>
            <volume>100</volume>
            <fpage>189</fpage>
            <lpage>192</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">140922</pubid>
                  <pubid idtype="pmpid" link="fulltext">12502788</pubid>
                  <pubid idtype="doi">10.1073/pnas.0136770100</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B45">
            <title>
               <p>Multiple links between transcription and splicing</p>
            </title>
            <aug>
               <au>
                  <snm>Kornblihtt</snm>
                  <fnm>AR</fnm>
               </au>
               <au>
                  <snm>de la Mata</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Fededa</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Munoz</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Nogues</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>RNA</source>
            <pubdate>2004</pubdate>
            <volume>10</volume>
            <fpage>1489</fpage>
            <lpage>1498</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1370635</pubid>
                  <pubid idtype="pmpid" link="fulltext">15383674</pubid>
                  <pubid idtype="doi">10.1261/rna.7100104</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B46">
            <title>
               <p>Systematic identification and analysis of exonic splicing silencers</p>
            </title>
            <aug>
               <au>
                  <snm>Wang</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Rolish</snm>
                  <fnm>ME</fnm>
               </au>
               <au>
                  <snm>Yeo</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Tung</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Mawson</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Burge</snm>
                  <fnm>CB</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>2004</pubdate>
            <volume>119</volume>
            <fpage>831</fpage>
            <lpage>845</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.cell.2004.11.010</pubid>
                  <pubid idtype="pmpid" link="fulltext">15607979</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B47">
            <title>
               <p>General and specific functions of exonic splicing silencers in splicing control</p>
            </title>
            <aug>
               <au>
                  <snm>Wang</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Xiao</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Van Nostrand</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Burge</snm>
                  <fnm>CB</fnm>
               </au>
            </aug>
            <source>Mol Cell</source>
            <pubdate>2006</pubdate>
            <volume>23</volume>
            <fpage>61</fpage>
            <lpage>70</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1839040</pubid>
                  <pubid idtype="pmpid" link="fulltext">16797197</pubid>
                  <pubid idtype="doi">10.1016/j.molcel.2006.05.018</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B48">
            <title>
               <p>UCSC Genome Browser</p>
            </title>
            <url>http://genome.ucsc.edu</url>
         </bibl>
         <bibl id="B49">
            <title>
               <p>Online Mendelian Inheritance in Man, OMIM</p>
            </title>
            <url>http://www.ncbi.nlm.nih.gov/omim/</url>
         </bibl>
         <bibl id="B50">
            <title>
               <p>Prediction of transcription regulatory sites in Archaea by a comparative genomic approach</p>
            </title>
            <aug>
               <au>
                  <snm>Gelfand</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
               <au>
                  <snm>Mironov</snm>
                  <fnm>AA</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2000</pubdate>
            <volume>28</volume>
            <fpage>695</fpage>
            <lpage>705</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">102549</pubid>
                  <pubid idtype="pmpid" link="fulltext">10637320</pubid>
                  <pubid idtype="doi">10.1093/nar/28.3.695</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B51">
            <title>
               <p>The R Project for Statistical Computing</p>
            </title>
            <url>http://www.r-project.org/</url>
         </bibl>
      </refgrp>
   </bm>
</art>

