<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>gb-2007-8-8-r154</ui>
   <ji>GBJ</ji>
   <fm>
      <dochead>Research</dochead>
      <bibl>
         <title>
            <p>Violating the splicing rules: TG dinucleotides function as alternative 3' splice sites in U2-dependent introns</p>
         </title>
         <aug>
            <au id="A1" ce="yes" ca="yes">
               <snm>Szafranski</snm>
               <fnm>Karol</fnm>
               <insr iid="I1"/>
               <email>szafrans@fli-leibniz.de</email>
            </au>
            <au id="A2" ce="yes">
               <snm>Schindler</snm>
               <fnm>Stefanie</fnm>
               <insr iid="I1"/>
               <email>sschindl@fli-leibniz.de</email>
            </au>
            <au id="A3">
               <snm>Taudien</snm>
               <fnm>Stefan</fnm>
               <insr iid="I1"/>
               <email>stau@fli-leibniz.de</email>
            </au>
            <au id="A4">
               <snm>Hiller</snm>
               <fnm>Michael</fnm>
               <insr iid="I2"/>
               <email>hiller@informatik.uni-freiburg.de</email>
            </au>
            <au id="A5">
               <snm>Huse</snm>
               <fnm>Klaus</fnm>
               <insr iid="I1"/>
               <email>khuse@fli-leibniz.de</email>
            </au>
            <au id="A6">
               <snm>Jahn</snm>
               <fnm>Niels</fnm>
               <insr iid="I1"/>
               <email>nielsj@fli-leibniz.de</email>
            </au>
            <au id="A7">
               <snm>Schreiber</snm>
               <fnm>Stefan</fnm>
               <insr iid="I3"/>
               <email>s.schreiber@mucosa.de</email>
            </au>
            <au id="A8">
               <snm>Backofen</snm>
               <fnm>Rolf</fnm>
               <insr iid="I2"/>
               <email>backofen@informatik.uni-freiburg.de</email>
            </au>
            <au id="A9">
               <snm>Platzer</snm>
               <fnm>Matthias</fnm>
               <insr iid="I1"/>
               <email>mplatzer@fli-leibniz.de</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Genome Analysis, Leibniz Institute for Age Research - Fritz Lipmann Institute, Beutenbergstr., 07745 Jena, Germany</p>
            </ins>
            <ins id="I2">
               <p>Institute of Computer Science, Bioinformatics Group, Albert-Ludwigs-University Freiburg, Georges-Koehler-Allee, 79110 Freiburg, Germany</p>
            </ins>
            <ins id="I3">
               <p>Institute of Clinical Molecular Biology, Christian Albrechts University Kiel, Schittenhelmstr., 24105 Kiel, Germany</p>
            </ins>
         </insg>
         <source>Genome Biology</source>
         <issn>1465-6906</issn>
         <pubdate>2007</pubdate>
         <volume>8</volume>
         <issue>8</issue>
         <fpage>R154</fpage>
         <url>http://genomebiology.com/2007/8/8/R154</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">17672918</pubid>
               <pubid idtype="doi">10.1186/gb-2007-8-8-r154</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>8</day>
               <month>3</month>
               <year>2007</year>
            </date>
         </rec>
         <revrec>
            <date>
               <day>14</day>
               <month>6</month>
               <year>2007</year>
            </date>
         </revrec>
         <acc>
            <date>
               <day>1</day>
               <month>8</month>
               <year>2007</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>01</day>
               <month>08</month>
               <year>2007</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2007</year>
         <collab>Szafranski et al.; licensee BioMed Central Ltd.</collab>
         <note>This is an open access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <shorttitle>
         <p>TG 3' alternative splice sites</p>
      </shorttitle>
      <shortabs>
         <p>TG dinucleotides functioning as alternative 3' splice sites were identified and experimentally verified in 36 human genes.</p>
      </shortabs>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Despite some degeneracy of sequence signals that govern splicing of eukaryotic pre-mRNAs, it is an accepted rule that U2-dependent introns exhibit the 3' terminal dinucleotide AG. Intrigued by anecdotal evidence for functional non-AG 3' splice sites, we carried out a human genome-wide screen.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>We identified TG dinucleotides functioning as alternative 3' splice sites in 36 human genes. The TG-derived splice variants were experimentally validated with a success rate of 92%. Interestingly, ratios of alternative splice variants are tissue-specific for several introns. TG splice sites and their flanking intron sequences are substantially conserved between orthologous vertebrate genes, even between human and frog, indicating functional relevance. Remarkably, TG splice sites are exclusively found as alternative 3' splice sites, never as the sole 3' splice site for an intron, and we observed a distance constraint for TG-AG splice site tandems.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>Since TG splice sites are exclusively found as alternative 3' splice sites, the U2 spliceosome apparently accomplishes perfect specificity for 3' AGs at an early splicing step, but may choose 3' TGs during later steps. Given the tiny fraction of TG 3' splice sites compared to the vast number of non-viable TGs, <it>cis</it>-acting sequence signals must significantly contribute to splice site definition. Thus, we consider TG-AG 3' splice site tandems as promising subjects for studies on the mechanisms of 3' splice site selection.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="BMC" subtype="man_spc_id" id="30010016">Molecular biology</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010010">Genome studies</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010009">Genetics</classification>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Intervening sequences (introns), primary transcript regions that are removed during mRNA maturation, are an outstanding feature of eukaryotic gene structure. Introns are excised through two transesterification reactions involving the collaboration of five different small nuclear ribonucleoprotein particles and additional proteins that associate to form the spliceosome. Rearrangements of the spliceosome and, consequently, splicing catalysis are driven by the sequential action of ATP-dependent helicases <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr></abbrgrp>. The assembly of the early spliceosomal complex relies on sequence-specific contacts between the intron terminal regions and the spliceosome subunits U1, U2, and U2AF <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp>. From accumulating intron sequence data, it was noted that invariant dinucleotides must represent important signals for the definition of intron termini, the so-called GT-AG rule (for simplicity, we use the nucleotide symbol T to denote thymidine in DNA as well as uridine in RNA sequences). With respect to the role of intron termini in the transesterification reactions, the 5' GT site and the 3' AG site were named donor and acceptor splice sites, respectively.</p>
         <p>Early work on unusual splice signals revealed introns with the terminal dinucleotides AT-AC <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>, and these were later shown to be processed by an independent splicing pathway, the U12 spliceosome. The U12 spliceosome recognizes highly specific donor site and branchpoint motifs <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> while recognition of 3' splice sites is rather unspecific. As a result, there are several variants of intron termini besides the prominent combinations GT-AG and AT-AC <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr></abbrgrp>.</p>
         <p>Among U2-dependent introns, the most frequent exceptions to the GT-AG rule are GC-AG intron termini, which comprise 0.7-0.9% of vertebrate introns <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr></abbrgrp>. Other rare exceptions are GA-AG intron termini in the FGFR gene family <abbrgrp><abbr bid="B10">10</abbr></abbrgrp> and AT-AC termini, mostly found in introns of the SCN gene family <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr></abbrgrp>. While the latter cases are the only reported non-AG 3' splice sites, results from <it>in silico </it>studies have repeatedly suggested that other unusual 3' splice sites occur in U2-type introns <abbrgrp><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr></abbrgrp>. An in-depth, systematic screening effort could not reveal significant evidence for additional unusual intron 3' termini above the noise level brought by annotation errors <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. However, it was noted that a few exceptional U2-spliceosomal introns exist that involve unusual 3' splice sites in scenarios of alternative splice site choice. For example, intron 3 of the human guanine nucleotide binding protein gene <it>GNAS </it>is spliced at either TG or AG in the 3' intron sequence CTGCAG <abbrgrp><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr></abbrgrp>. Remarkably, the homologous <it>Drosophila </it>gene shows the same unusual splicing pattern for another intron <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. Moreover, unusual TG splice acceptors appear to be involved in alternative splicing of the human gene for postsynaptic density protein 95, <it>DLG4 </it><abbrgrp><abbr bid="B18">18</abbr></abbrgrp>, and the human dopamine D2 receptor gene, <it>DRD2 </it><abbrgrp><abbr bid="B19">19</abbr></abbrgrp>.</p>
         <p>We have previously reported a widespread type of alternative splicing mediated by the tandem splice acceptor motif NAGNAG <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. From the analysis of single-nucleotide polymorphisms (SNPs) we concluded that a NAGNAG motif is necessary and sufficient to explain three-nucleotide variant splicing at intron-exon boundaries <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. In contrast, alternative splicing of an intron 3' terminus in the <it>GNAS </it>gene appears to occur independently of a NAGNAG motif. Furthermore, it has been suggested that unusual splice sites could be selectively involved in alternative splicing <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B9">9</abbr></abbrgrp>, although this was never examined in detail. Here, we report a systematic screening of the human transcriptome that identified 36 introns with <it>bona fide </it>TG 3' splice sites. These TG splice sites are exclusively found as alternative 3' splice sites, each associated with a canonical AG 3' splice site. The evolutionary conservation of these introns and their alternative splicing patterns indicate physiological relevance and point to the requirement for <it>cis</it>-regulatory sequence elements to promote usage of TG 3' splice sites.</p>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <sec>
            <st>
               <p>Prior considerations</p>
            </st>
            <p>We used an <it>in silico </it>approach based on expressed sequence tags (ESTs) to identify unusual 3' splice sites that are found in pairs of 3' splice variants. ESTs, as first-pass results from high-throughput cDNA sequencing projects, are clearly prone to errors. Therefore, we assumed that single ESTs are insufficient to indicate genuine subtle splice variants, since technical artifacts contribute to false positives. We considered a variant sufficiently supported if at least two independent ESTs indicated it, and we expected the EST variant ratio to serve as an approximation for the natural ratio of splice variants. An additional threshold was applied to the relative abundance of splice variants, since our experimental approach, that is, sequencing of 100 individual RT-PCR clones, had a detection limit. Using a binomial distribution to model the occurrence of splice variants in the RT-PCR clones, we calculated a diagnostic power of 95% (&#946; error 5%) if a splice variant occurs with at least 3% frequency. Note that we draw no inference about the cases that failed the threshold criteria. They may actually represent natural splice variants; however, the evidence for such cases is weak and the experimental approach did not provide sufficient sensitivity for validation.</p>
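            <p>The power figure quoted above can be reproduced with a short calculation. The sketch below is illustrative only: it assumes that "detection" means observing the variant in at least one of the 100 sequenced clones, which the text does not state explicitly, and the function name <it>detection_power</it> is hypothetical.</p>

```python
def detection_power(variant_frequency, n_clones=100):
    """Probability of seeing a variant in at least one of n_clones clones.

    Assumption (not stated in the text): each clone independently carries
    the variant with probability variant_frequency, and a single positive
    clone counts as detection.
    """
    # Beta error: the chance that every one of the clones misses the variant.
    beta_error = (1.0 - variant_frequency) ** n_clones
    return 1.0 - beta_error


# A variant present at 3% frequency, assayed with 100 clones.
power = detection_power(0.03, n_clones=100)
print(f"power = {power:.3f}, beta error = {1.0 - power:.3f}")
```

            <p>Under these assumptions the power for a 3% variant is about 0.95, in line with the 95% stated above; rarer variants fall below that level, which is why they were not pursued experimentally.</p>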
         </sec>
         <sec>
            <st>
               <p>TG dinucleotides function as non-canonical alternative 3' splice sites</p>
            </st>
            <p>We initially aimed to identify unusual 3' splice sites that are found in pairs of 3' splice variants that differ by 3 nucleotides (nt), such as in the <it>GNAS </it>intron 3 splice site tandem <abbrgrp><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr></abbrgrp>. Identification of 3 nt splice variant pairs (&#916;3SVPs) was based on 3' splice sites as indicated by spliced alignments of human ESTs <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>. After reducing false positives through a series of filtering steps (Figure <figr fid="F1">1</figr>), we identified 65 'unusual' &#916;3SVPs that were supported by high-quality local EST alignments. Of these, 20 meet the requirements that the minor splice variant is supported by at least two ESTs and by at least 3% of the matching ESTs (see Prior considerations above). However, after close inspection and re-sequencing, we identified 6 of the 20 unusual &#916;3SVPs as false positives (Additional data file 1), explained by: 3 nt deletion variants due to sequencing errors; mouse ESTs erroneously attributed to human; or alignment artifacts. Another six &#916;3SVPs can be explained by SNPs, where the SNP allele corresponding to a NAGNAG splice site motif is not displayed by the human reference genome sequence <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. Strikingly, all the remaining eight &#916;3SVPs suggest that TG dinucleotides function as alternative 3' splice sites (Table <tblr tid="T1">1</tblr>).</p>
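            <p>The two support criteria applied to each candidate (at least two ESTs for the minor variant, and at least 3% of the matching ESTs) amount to a simple filter. The following is a minimal sketch, not the pipeline used in this study; the function name <it>passes_thresholds</it> and the EST counts in the example are hypothetical.</p>

```python
def passes_thresholds(minor_ests, total_ests, min_ests=2, min_fraction=0.03):
    """Return True if the minor splice variant meets both support criteria.

    minor_ests: number of ESTs supporting the minor variant
    total_ests: number of ESTs matching the splice site (both variants)
    """
    if total_ests == 0:
        return False
    enough_ests = minor_ests >= min_ests
    enough_fraction = minor_ests / total_ests >= min_fraction
    return enough_ests and enough_fraction


# Hypothetical (minor, total) EST counts, made up for illustration.
candidates = {"site_a": (2, 50), "site_b": (1, 10), "site_c": (2, 200)}
kept = [name for name, (minor, total) in candidates.items()
        if passes_thresholds(minor, total)]
print(kept)
```

            <p>In this made-up example only the first candidate is kept: the second has a single supporting EST, and the third falls below the 3% fraction.</p>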
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Unusual TG splice acceptors identified in the human transcriptome</p>
               </caption>
               <tblbdy cols="7">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="2" ca="center">
                        <p>Intron</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>3' Splice site pair</p>
                     </c>
                     <c cspan="2" ca="center">
                        <p>ESTs for unusual 3' splice sites</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                     <c cspan="2">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Gene</p>
                     </c>
                     <c ca="center">
                        <p>No.</p>
                     </c>
                     <c ca="center">
                        <p>Length</p>
                     </c>
                     <c ca="center">
                        <p>Distance</p>
                     </c>
                     <c ca="left">
                        <p>Motif</p>
                     </c>
                     <c ca="center">
                        <p>Fraction</p>
                     </c>
                     <c ca="center">
                        <p>No.</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>GNAS</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>7843</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="left">
                        <p>CTG,CAG|</p>
                     </c>
                     <c ca="center">
                        <p>0.15-0.62*<sup>&#8224;</sup></p>
                     </c>
                     <c ca="center">
                        <p>282</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>PCGF2</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>224</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="left">
                        <p>AAG|ATG,</p>
                     </c>
                     <c ca="center">
                        <p>0.50</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>CNBP</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>168</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="left">
                        <p>TTGTTG,AAG|</p>
                     </c>
                     <c ca="center">
                        <p>0.25</p>
                     </c>
                     <c ca="center">
                        <p>257</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>168</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>TTG,TTGAAG|</p>
                     </c>
                     <c ca="center">
                        <p>0.01</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>FBXO17</it>
                           <sup>&#8225;</sup>
                        </p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>1999</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="left">
                        <p>CAG|ATG,</p>
                     </c>
                     <c ca="center">
                        <p>0.14</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>C21orf63</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>9975</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="left">
                        <p>CAG|ATG,</p>
                     </c>
                     <c ca="center">
                        <p>0.09</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>BRUNOL4</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>1147</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="left">
                        <p>CAG|CTG,</p>
                     </c>
                     <c ca="center">
                        <p>0.07</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>PCID2</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>2162</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="left">
                        <p>CAG|ATG,</p>
                     </c>
                     <c ca="center">
                        <p>0.04</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>TNNT2</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>4354</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="left">
                        <p>TTG,GAG|</p>
                     </c>
                     <c ca="center">
                        <p>0.04</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>CACNA1A</it>
                           </b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>2532</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="left">
                        <p>TTGTTG,GAG|</p>
                     </c>
                     <c ca="center">
                        <p>?<sup>&#8224;</sup></p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>2532</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>TTG,TTGGAG|</p>
                     </c>
                     <c ca="center">
                        <p>0.17<sup>&#8224;</sup></p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>GPBP1</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>1377</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="left">
                        <p>CAG|GATG,</p>
                     </c>
                     <c ca="center">
                        <p>0.03</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>KIAA0494</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1459</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="left">
                        <p>TTG,AGCAG|</p>
                     </c>
                     <c ca="center">
                        <p>0.09</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>OSBPL8</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>36530</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>CTG,TTGTAG|</p>
                     </c>
                     <c ca="center">
                        <p>0.11</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>SAP30</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>1892</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>CTG,TTTCAG|</p>
                     </c>
                     <c ca="center">
                        <p>0.04</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>DRD2</it>
                           </b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>1485</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="left">
                        <p>CTG,GTGCAG|</p>
                     </c>
                     <c ca="center">
                        <p>0.02<sup>&#8224;</sup></p>
                     </c>
                     <c ca="center">
                        <p>5<sup>&#8224;</sup></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>SUV420H2</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>134</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="left">
                        <p>CTG,GCTCCAG|</p>
                     </c>
                     <c ca="center">
                        <p>0.20</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>SSRP1</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>489</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="left">
                        <p>TTG,AATTCAG|</p>
                     </c>
                     <c ca="center">
                        <p>0.20</p>
                     </c>
                     <c ca="center">
                        <p>16</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>FREQ</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>16849</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="left">
                        <p>CTG,CCTCCAG|</p>
                     </c>
                     <c ca="center">
                        <p>0.04</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>IL21</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>2753</p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                     <c ca="left">
                        <p>TTG,ATTTCTAG|</p>
                     </c>
                     <c ca="center">
                        <p>0.13</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>RYK</it>
                           </b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>3107</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="left">
                        <p>TTG,GCTCCTTAG|</p>
                     </c>
                     <c ca="center">
                        <p>0.77</p>
                     </c>
                     <c ca="center">
                        <p>27</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>DLG4</it>
                           </b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>131</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="left">
                        <p>CTG,GAGTTGCAG|</p>
                     </c>
                     <c ca="center">
                        <p>0.62</p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>SMARCA4</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>29</p>
                     </c>
                     <c ca="center">
                        <p>6174</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="left">
                        <p>TTG,ACCCTGAAG|</p>
                     </c>
                     <c ca="center">
                        <p>0.41</p>
                     </c>
                     <c ca="center">
                        <p>34</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>FBXL10</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>15</p>
                     </c>
                     <c ca="center">
                        <p>177</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="left">
                        <p>TTG,GCCTACAAG|</p>
                     </c>
                     <c ca="center">
                        <p>0.21</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>HNRPR</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>2839</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="left">
                        <p>TTG,GTTTAACAG|</p>
                     </c>
                     <c ca="center">
                        <p>0.13</p>
                     </c>
                     <c ca="center">
                        <p>15</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>RRAD</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>214</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="left">
                        <p>CTG,ATCCCCTAG|</p>
                     </c>
                     <c ca="center">
                        <p>0.06</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>TGM1</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>454</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="left">
                        <p>CTG,TCCTGGGCAG|</p>
                     </c>
                     <c ca="center">
                        <p>0.13</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>ALAS1</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>11</p>
                     </c>
                     <c ca="center">
                        <p>1599</p>
                     </c>
                     <c ca="center">
                        <p>11</p>
                     </c>
                     <c ca="left">
                        <p>CTG,TTTCTCCTCAG|</p>
                     </c>
                     <c ca="center">
                        <p>0.04</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>ARS2</it>
                           </b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>18</p>
                     </c>
                     <c ca="center">
                        <p>182</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="left">
                        <p>TTG,TACTCCCCCCAG|</p>
                     </c>
                     <c ca="center">
                        <p>0.74</p>
                     </c>
                     <c ca="center">
                        <p>75</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>PCBP2</it>
                           </b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>1337</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="left">
                        <p>CTG,ACTCTCTCCCAG|</p>
                     </c>
                     <c ca="center">
                        <p>0.43</p>
                     </c>
                     <c ca="center">
                        <p>169</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>PTPN11</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>4269</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="left">
                        <p>TTG,GCTCTACTCCAG|</p>
                     </c>
                     <c ca="center">
                        <p>0.33</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>MSH5</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>164</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="left">
                        <p>CTG,ATCCCCTCCCAG|</p>
                     </c>
                     <c ca="center">
                        <p>0.25</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>SYTL2</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>1259</p>
                     </c>
                     <c ca="center">
                        <p>13</p>
                     </c>
                     <c ca="left">
                        <p>TTG,CCCTCCTGAGTAG|</p>
                     </c>
                     <c ca="center">
                        <p>0.09</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>TOMM40</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>95</p>
                     </c>
                     <c ca="center">
                        <p>16</p>
                     </c>
                     <c ca="left">
                        <p>CTG,ACCTCTCCCCTAGCAG|</p>
                     </c>
                     <c ca="center">
                        <p>0.07</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>MARK3</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>20478</p>
                     </c>
                     <c ca="center">
                        <p>17</p>
                     </c>
                     <c ca="left">
                        <p>TTG,TTTGTTTTTTTTTTTAG|</p>
                     </c>
                     <c ca="center">
                        <p>0.07</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>BAT3</it>
                           </b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>832</p>
                     </c>
                     <c ca="center">
                        <p>18</p>
                     </c>
                     <c ca="left">
                        <p>CTG,ACTCTCCCCTACCTTCAG|</p>
                     </c>
                     <c ca="center">
                        <p>0.01</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>SH3D19</it>
                           </b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>838</p>
                     </c>
                     <c ca="center">
                        <p>21</p>
                     </c>
                     <c ca="left">
                        <p>TTG,GTTTTGTTTTGGTCTCGTCAG|</p>
                     </c>
                     <c ca="center">
                        <p>0.07</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <b>
                              <it>LOC346653</it>
                           </b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>3097</p>
                     </c>
                     <c ca="center">
                        <p>27</p>
                     </c>
                     <c ca="left">
                        <p>CTG,ACCCATGTACCTGAGGCTGATTTCCAG|</p>
                     </c>
                     <c ca="center">
                        <p>0.60</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>ACAD9</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>253</p>
                     </c>
                     <c ca="center">
                        <p>28</p>
                     </c>
                     <c ca="left">
                        <p>TTG,TTTCTTGTGTTTTTTCTGAACACTCCAG|</p>
                     </c>
                     <c ca="center">
                        <p>0.09</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>Entries in bold have RefSeq transcripts supporting the unusual TG acceptor site. Each TG splice variant is supported by at least two ESTs and at least 3% of all covering ESTs, except for some RefSeq-supported cases, <it>CACNA1A </it>[24,35], <it>DRD2 </it>[19] and <it>BAT3</it>. In the 'Motif' column, a vertical line (|) indicates a canonical splice site, and a comma (,) marks the TG splice site. Splice ratios are given as absolute EST counts (No.) as well as the fraction of TG splice variants. A question mark indicates that an explicit fraction is not given in the referenced article, although the authors performed quantitative experiments. *EST ratio depends on the exon junction; the upstream exon 3 may be skipped. <sup>&#8224;</sup>Splice variants were previously quantified by others: <it>GNAS </it>[16,26], <it>CACNA1A </it>(splice ratio cited from [24,35]), <it>DRD2 </it>(splice ratio cited from [19]). <sup>&#8225;</sup>Alternative splicing at <it>FBXO17 </it>intron 3 was not experimentally reproducible in this study.</p>
               </tblfn>
            </tbl>
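            <p>The 'Motif' notation in Table 1 can be checked mechanically. The following is a minimal sketch, not part of the original analysis: it assumes that the numeric column preceding 'Motif' is the distance in nucleotides between the TG splice site (comma) and the canonical splice site (vertical line). The function name tg_distance is hypothetical, and the three example motifs are copied from the table rows for DRD2, SUV420H2 and TOMM40.</p>

```python
def tg_distance(motif):
    """Count the nucleotides between the TG site (',') and the canonical site ('|').

    Assumption: the motif has the form 'NTG,XXXX|' as used in Table 1,
    with exactly one comma marking the TG acceptor.
    """
    upstream, sep, rest = motif.partition(",")
    if not sep or not upstream.endswith("TG") or not rest.endswith("|"):
        raise ValueError(f"unexpected motif format: {motif!r}")
    # 'rest' holds the intervening nucleotides plus the trailing '|'
    return len(rest) - 1


# Motifs and distances as listed in Table 1
examples = {
    "CTG,GTGCAG|": 6,             # DRD2
    "CTG,GCTCCAG|": 7,            # SUV420H2
    "CTG,ACCTCTCCCCTAGCAG|": 16,  # TOMM40
}
for motif, listed in examples.items():
    print(motif, tg_distance(motif), listed)
```

            <p>For these rows the computed distance agrees with the listed value, which is consistent with the assumed reading of the column.</p>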
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Screening procedure for unusual 3' splice sites found in pairs of 3' splice variants that differ by 3 nt (&#916;3SVPs)</p>
               </caption>
               <text>
                  <p>Screening procedure for unusual 3' splice sites found in pairs of 3' splice variants that differ by 3 nt (&#916;3SVPs). Processing of AG-AG tandem cases ('NAGNAG', parallel branch on the right) was performed as a comparison to unusual 3' splice site tandems.</p>
               </text>
               <graphic file="gb-2007-8-8-r154-1"/>
            </fig>
            <p>Since all the unusual alternative &#916;3-nt splice acceptors identified display TG dinucleotides, we investigated their occurrence in a wider scope. Analogously to the screen for &#916;3SVPs (Figure <figr fid="F1">1</figr>), we performed a search for alternative TG splice acceptors at larger distances, up to 36 nt from the canonical splice site. The same filter procedures were applied, and close inspection did not reveal obvious artifacts or explanatory SNP alleles. We identified 26 additional EST-supported splice variant pairs that suggest alternative TG splice acceptors functioning in U2-dependent introns (Table <tblr tid="T1">1</tblr>).</p>
            <p>We sought to screen for unusual splice acceptors using an independent approach in order to cross-validate our initial findings and to make a link to previous studies that were primarily based on curated transcript data <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B9">9</abbr></abbrgrp>. An analysis of RefSeq-to-genome alignments identified 122 putative introns with terminal TG dinucleotides (120 unique genomic sites) out of 228,925 total introns (171,605 unique genomic sites). Of these, 39 introns have a canonical GT donor dinucleotide (Additional data file 1). A previous study, which used a similar screening approach for unusual splice sites based on curated transcript data, showed high enrichment of annotation artifacts <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. Therefore, we checked the identified TG acceptor cases thoroughly. Indeed, cases failed this quality check for several reasons: known SNPs masking existing canonical AG splice sites; RefSeqs lacking transcript evidence; and misleading RefSeq-to-genome alignments. Since the overall false positive rate seemed very high, we additionally required independent transcript entries (mRNA or EST) to support the unusual splice site. In summary, 9 of the 39 RefSeqs showed robust support for unusual TG acceptor sites (Table <tblr tid="T1">1</tblr>, entries in bold); 6 of these overlap with cases obtained from the EST screening approach, while the other 3 were identified exclusively by the RefSeq-based approach (<it>SH3D19</it>, <it>BAT3</it>, <it>CACNA1A</it>). Intriguingly, these three EST-independent RefSeq-supported cases all comprise 'alternative' splice sites, although this was not a screening criterion. Taking into account that about 1% of all introns have alternative 3' termini <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>, this strongly indicates that TG splice acceptors are functionally linked to nearby AG splice sites and cannot function in a constitutive manner (<it>P </it>= 0.000001, binomial test). Altogether, the two screens identified 37 introns with 39 alternative TG splice acceptors (Table <tblr tid="T1">1</tblr>).</p>
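            <p>The reported binomial P value can be reproduced under one plausible reading of the test, which is an assumption here rather than a description of the authors' exact procedure: all three EST-independent RefSeq-supported cases are alternative (3 successes in 3 trials), with a background probability of 0.01 that an intron has alternative 3' termini. The helper name binomial_tail is hypothetical.</p>

```python
from math import comb


def binomial_tail(k, n, p):
    """Probability of at least k successes in n independent trials with success rate p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Assumed inputs: 3 of 3 cases are alternative, background rate about 1%
p_value = binomial_tail(3, 3, 0.01)
print(f"{p_value:.6f}")
```

            <p>With these inputs the tail probability reduces to 0.01 cubed, which matches the value quoted in the text.</p>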
         </sec>
         <sec>
            <st>
               <p>Negation of genome sequence errors and polymorphisms</p>
            </st>
            <p>Since six putatively unusual 3' splice sites can be explained by SNP-affected NAGNAG acceptors (which were filtered; Figure <figr fid="F1">1</figr>), we asked whether undiscovered SNPs, or even inaccuracies in the available human genome sequence, may explain some of the remaining candidates. The genomic sequence of the splice site regions of <it>GNAS </it>and <it>CACNA1A </it>had been experimentally verified by others <abbrgrp><abbr bid="B16">16</abbr><abbr bid="B24">24</abbr></abbrgrp>. For 10 other genes (listed in Table <tblr tid="T2">2</tblr>), we analyzed PCR products obtained from genomic DNA, pooled from 100 individuals, for sequence variations. The re-sequenced genomic regions were in perfect agreement with the available genome sequence, ruling out the possibility that the unusual splice sites are trivial sequencing errors (data not shown). Moreover, we identified no SNP alleles that would create explanatory AG splice sites for any of the observed unusual splice variants, demonstrating that the TG splice sites are real and genetically invariant.</p>
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Validation and quantification of alternative splice variants</p>
               </caption>
               <tblbdy cols="5">
                  <r>
                     <c ca="left">
                        <p>Splice junction</p>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Tissue</p>
                     </c>
                     <c ca="center">
                        <p>Fraction of TG splice</p>
                     </c>
                     <c ca="center">
                        <p>Method</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>GNAS</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Intron 3, exon junction 3-4</p>
                     </c>
                     <c ca="left">
                        <p>Leukocytes</p>
                     </c>
                     <c ca="center">
                        <p>0.14</p>
                     </c>
                     <c ca="center">
                        <p>n = 115</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>PCGF2</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Intron 1</p>
                     </c>
                     <c ca="left">
                        <p>Placenta</p>
                     </c>
                     <c ca="center">
                        <p>0.99</p>
                     </c>
                     <c ca="center">
                        <p>n = 89</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>CNBP</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Intron 3, indel AAG</p>
                     </c>
                     <c ca="left">
                        <p>Leukocytes</p>
                     </c>
                     <c ca="center">
                        <p>0.52</p>
                     </c>
                     <c ca="center">
                        <p>n = 69</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Brain</p>
                     </c>
                     <c ca="center">
                        <p>0.38</p>
                     </c>
                     <c ca="center">
                        <p>n = 58</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Placenta</p>
                     </c>
                     <c ca="center">
                        <p>0.33</p>
                     </c>
                     <c ca="center">
                        <p>n = 70</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Intron 3, indel TTGAAG</p>
                     </c>
                     <c ca="left">
                        <p>Leukocytes</p>
                     </c>
                     <c ca="center">
                        <p>0.01</p>
                     </c>
                     <c ca="center">
                        <p>n = 69</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>FBXO17</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Intron 3</p>
                     </c>
                     <c ca="left">
                        <p>Leukocytes</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>n = 96</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Liver</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>n = 151</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>C21orf63</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Intron 3</p>
                     </c>
                     <c ca="left">
                        <p>Leukocytes</p>
                     </c>
                     <c ca="center">
                        <p>0.09</p>
                     </c>
                     <c ca="center">
                        <p>n = 110</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Brain</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>Direct sequencing</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>BRUNOL4</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Intron 6</p>
                     </c>
                     <c ca="left">
                        <p>Lung</p>
                     </c>
                     <c ca="center">
                        <p>-</p>
                     </c>
                     <c ca="center">
                        <p>n = 90</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Brain</p>
                     </c>
                     <c ca="center">
                        <p>0.19</p>
                     </c>
                     <c ca="center">
                        <p>n = 92</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>PCID2</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Intron 2</p>
                     </c>
                     <c ca="left">
                        <p>Leukocytes</p>
                     </c>
                     <c ca="center">
                        <p>0.02</p>
                     </c>
                     <c ca="center">
                        <p>n = 142</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>TNNT2</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Intron 1</p>
                     </c>
                     <c ca="left">
                        <p>Heart</p>
                     </c>
                     <c ca="center">
                        <p>0.03</p>
                     </c>
                     <c ca="center">
                        <p>n = 91</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>CACNA1A</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Intron 9, indel GAG</p>
                     </c>
                     <c ca="left">
                        <p>Brain</p>
                     </c>
                     <c ca="center">
                        <p>0.85</p>
                     </c>
                     <c ca="center">
                        <p>n = 90</p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>Intron 9, indel TTGGAG</p>
                     </c>
                     <c ca="left">
                        <p>Brain</p>
                     </c>
                     <c ca="center">
                        <p>0.03</p>
                     </c>
                     <c ca="center">
                        <p>n = 90</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>ARS2</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Intron 18</p>
                     </c>
                     <c ca="left">
                        <p>Leukocytes</p>
                     </c>
                     <c ca="center">
                        <p>0.55</p>
                     </c>
                     <c ca="center">
                        <p>n = 37</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>PCBP2</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Intron 7</p>
                     </c>
                     <c ca="left">
                        <p>Leukocytes</p>
                     </c>
                     <c ca="center">
                        <p>0.54</p>
                     </c>
                     <c ca="center">
                        <p>n = 47</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>LOC346653</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>Intron 1</p>
                     </c>
                     <c ca="left">
                        <p>Testis</p>
                     </c>
                     <c ca="center">
                        <p>0.50</p>
                     </c>
                     <c ca="center">
                        <p>Direct sequencing</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>In the 'Methods' column, n represents the number of subclones sequenced.</p>
               </tblfn>
            </tbl>
         </sec>
         <sec>
            <st>
               <p>Validation and quantification of splice variants</p>
            </st>
            <p>To verify the existence of TG-derived splice variants, we performed RT-PCR experiments designed to yield amplicons that cover the exon-exon junctions under consideration. Cloning of the PCR products and sequencing allowed us to detect splice variants. Subclassification and counting of clones gave measures of splice variant ratios. In this way, the alternative splicing pattern was reproduced and quantified for 11 out of 12 analyzed cases (Table <tblr tid="T2">2</tblr>). Generally, the splice ratios obtained from clone counting agree well with the EST data. The observed deviations can be explained by significant fluctuations depending on the analyzed tissue (<it>C21orf63</it>, <it>BRUNOL4</it>, and <it>CNBP </it>in Table <tblr tid="T2">2</tblr>). The splice variant validation failed for <it>FBXO17</it>, a gene for which 4 out of 29 ESTs had suggested a TG-derived splice variant. All supporting ESTs originated from the same EST library, NIH_MGC_100, derived from a hepatocellular carcinoma. A peculiarity of the source material, either the NIH_MGC_100 cell line or the single-individual liver sample used for our RT-PCR experiments, may be the reason for the inconsistent results concerning this putative splice variant. This example illustrates that at least two ESTs from independent sources are required to indicate a natural splice variant with high reliability. Overall, the success rate of the validation experiments was high (11 of 12, 92%); extrapolating the failure rate to the 25 non-tested cases, about two false positives are expected.</p>
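            <p>The extrapolation to the non-tested cases can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the counts are taken from the text (12 tested introns, 11 confirmed, 37 candidate introns in total), the variable names are hypothetical, and it assumes that the failure rate among the tested cases carries over unchanged to the untested ones.</p>

```python
# Counts taken from the text: 12 introns tested by RT-PCR, 11 confirmed,
# 37 candidate introns in total (so 25 were not tested).
tested = 12
confirmed = 11
total_candidates = 37

untested = total_candidates - tested
success_rate = confirmed / tested
failure_rate = 1 - success_rate

# Assumption: the untested cases fail at the same rate as the tested ones.
expected_false_positives = failure_rate * untested

print(f"success rate: {success_rate:.0%}")
print(f"untested cases: {untested}")
print(f"expected false positives among untested: {expected_false_positives:.1f}")
```

            <p>With these counts the expected number of false positives among the 25 untested introns is close to two, in line with the estimate given above.</p>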
         </sec>
         <sec>
            <st>
               <p>Tissue-specificity of splice ratios</p>
            </st>
            <p>According to the results of the PCR cloning approach, <it>BRUNOL4 </it>displayed remarkable tissue-specific splice ratios. The TG-derived splice variant was not detected in lung cDNA, whereas it constituted about 19% of brain <it>BRUNOL4 </it>transcripts (Table <tblr tid="T2">2</tblr>, Figure <figr fid="F2">2a</figr>). We therefore asked whether splice ratios of TG-derived and AG-derived variants generally show tissue-specific differences. We analyzed the splice variant ratios more extensively in other genes using pyrosequencing, a method that allows accurate and cost-effective quantification of mixtures of polymorphic DNA populations <abbrgrp><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr></abbrgrp>. <it>ARS2 </it>and <it>CNBP</it>, both having a ubiquitous expression profile, show tissue-dependent fluctuations in splice ratios (55-65% and 20-40%, respectively; Figure <figr fid="F2">2</figr>). While these differences are statistically significant in each of these genes (&#945; = 0.01, ANOVA), their biological relevance is debatable. We conclude that splice ratios of TG-AG tandems are tissue-specific for particular introns but are rather stable for others.</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Tissue-specific fractions of TG-derived splice variants</p>
               </caption>
               <text>
                  <p>Tissue-specific fractions of TG-derived splice variants. <b>(a) </b><it>BRUNOL4 </it>(values are as shown in Table 2); <b>(b) </b><it>CNBP</it>; and <b>(c) </b><it>ARS2</it>. Pyrosequencing assays for (b,c) were performed two to four times for each sample. Error bars depict the standard deviation of individual measurements.</p>
               </text>
               <graphic file="gb-2007-8-8-r154-2"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Evolutionary conservation of introns with TG 3' splice sites</p>
            </st>
            <p>Since splicing at TG sites occurs in a very small number of introns, one might argue that these represent 'accidental' events attributable to spliceosome dysfunction. To address this question, we first analyzed the conservation of splice sites in homologous introns as an indication of alternative splice variants being under purifying selection. Out of 36 introns with 3' TG splice sites (37 minus the false-positive <it>FBXO17 </it>intron), 26 (72%) are conserved between the human and mouse genomes. In 14 of these cases (39% of the total), mouse ESTs indicate homologous TG-derived splice variants. For comparison, this rate is three- to four-fold higher than that of alternative exons found in both human and mouse <abbrgrp><abbr bid="B27">27</abbr><abbr bid="B28">28</abbr><abbr bid="B29">29</abbr></abbrgrp>. In some cases, EST evidence for orthologous TG-derived splice variants even exists for distantly related species, such as chicken (<it>CNBP</it>, <it>BRUNOL4</it>, <it>RYK</it>), and frog or fish (<it>BRUNOL4</it>, <it>RYK</it>, <it>FBXL10</it>). An outstanding example of conserved intron sequence and homologous splice variants is intron 7 of the <it>RYK </it>gene (Figure <figr fid="F3">3</figr>). The ratio of 3' splice variants is remarkably similar between human and chicken, as can be inferred from the available EST data (EST ratios of 24:6 and 5:1 for human and chicken, respectively). In general, it should be noted that homologous splice variants may remain undetected due to the limited depth of EST coverage <abbrgrp><abbr bid="B23">23</abbr><abbr bid="B27">27</abbr></abbrgrp>.</p>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Conservation of the TG splice site found in intron 7 of the <it>RYK </it>gene from human to chicken</p>
               </caption>
               <text>
                  <p>Conservation of the TG splice site found in intron 7 of the <it>RYK </it>gene from human to chicken. <b>(a) </b>Human genomic sequence and derived splice variants. Canonical (filled triangle) and TG 3' splice site (open triangle) are marked. <b>(b) </b>Alignment of orthologous exon-intron boundary regions from several vertebrate genomes, splice sites highlighted as in (a). Numbers on the right display the ratios of species-specific ESTs for the TG and AG splice sites, respectively.</p>
               </text>
               <graphic file="gb-2007-8-8-r154-3"/>
            </fig>
            <p>Independently, we analyzed intron sequence conservation as an indication of the functional relevance of alternative splicing <abbrgrp><abbr bid="B27">27</abbr><abbr bid="B28">28</abbr><abbr bid="B29">29</abbr></abbrgrp>. A data set of human-mouse orthologous intron-exon boundaries was used to determine the degree of conservation within a 50 nt intron sequence upstream of the splice acceptor, or acceptor tandem. Intronic flanks of TG splice sites show an average sequence similarity of 74%, whereas flanks of AG splice sites within canonical (AG-only) introns are 65% similar on average (<it>P </it>&lt;&lt; 0.00001, permutation test). A plot of flanking sequence conservation against the abundance of the TG-derived splice variant (Figure <figr fid="F4">4</figr>) shows that these two measures are positively correlated. Introns that give rise to less than 10% TG-derived splice variants have an average human-mouse intron sequence identity of 64%, indistinguishable from canonical introns. In contrast, introns with a TG-derived splice variant making up more than 10% of the transcripts show an average sequence identity as high as 80%. This parallels a previous finding that the abundance of splice variants correlates with sequence conservation of alternative exons <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. Consistently, high intron sequence conservation is strongly correlated with conservation of the TG splice site (13 of the 14 cases with gene labels in Figure <figr fid="F4">4</figr>). Our results indicate that splicing at TG acceptors may arise from neutral evolution, presumably showing low splicing efficiency. However, efficiently spliced TG 3' splice sites seem to evolve and to be maintained by evolutionary selection.</p>
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>Intron flank conservation of TG-AG splice acceptor tandems</p>
               </caption>
               <text>
                  <p>Intron flank conservation of TG-AG splice acceptor tandems. Orthologous human/mouse intron-exon boundaries involving TG splice sites are displayed in a two-dimensional plot according to two properties: horizontal axis = sequence identity of 50 nt sequence upstream of both splice sites; vertical axis = relative abundance of the TG-derived splice variant, as reflected by the fraction of TG-spliced ESTs (except for <it>CACNA1A</it>, where the data are taken from Table 2). Data points are labeled with the gene symbol if the conservation score and/or the fraction of TG-derived splice variant are significantly high. Conservation properties of canonical introns are indicated by shaded intervals: black line = median; dark gray = 66% percentile; light gray = 90% percentile.</p>
               </text>
               <graphic file="gb-2007-8-8-r154-4"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Structural and sequence characteristics of TG 3' splice sites</p>
            </st>
            <p>We analyzed the context properties of TG 3' splice sites in order to find an explanation for these rare exceptions to the GT-AG rule. With regard to the gene structure context, TG-splicing introns are indistinguishable from canonical introns in several respects: length of the affected as well as the downstream introns, and length of the upstream and downstream exons (data not shown). TG 3' splice sites are found significantly more often in the first intron of a gene: 8 of 36 (22%) TG-AG introns compared to 11% of other introns (<it>P </it>= 0.02, Fisher's exact test). This bias is also found for AG-AG splice site tandems and is certainly due to neutral evolution of introns located in the 5' untranslated transcript region.</p>
            <p>TG 3' splice sites were exclusively found within a context of alternative splice site choice (Table <tblr tid="T1">1</tblr>). This is clearly significant for results from the RefSeq-based screening procedure, which are unbiased with respect to constitutive or alternative transcripts. Taking into account that about 1% of all introns have alternative 3' termini <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>, it strongly indicates that TG 3' splice sites are functionally linked to nearby AG splice sites and cannot function in a constitutive manner. This conclusion is further supported by studies of human AG>TG 3' splice site mutants <abbrgrp><abbr bid="B30">30</abbr><abbr bid="B31">31</abbr><abbr bid="B32">32</abbr></abbrgrp>, which always resulted in activation of neighboring AG splice sites, but not splicing at TG. Furthermore, we observed that TG-AG tandems display a splice site distance restriction with a limit of 28 nt, which is not seen for AG-AG tandems (Table <tblr tid="T1">1</tblr>, Additional data file 1). Thus, splicing at a 3' TG not only depends on an additional AG splice site; this dependency also seems to constrain the TG-AG splice site distance. The observed distance limit corresponds well with the distance between the branch site and the 3' splice site, which is typically 20-40 nt <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>.</p>
            <p>Splice site strength was scored using the maximum entropy method 'maxent' of Yeo and Burge <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>. The 5' splice site scores are indistinguishable between TG-AG introns and canonical introns. On the other hand, the AG 3' splice sites in TG-AG tandems score significantly lower than those of canonical introns (6.1 &#177; 3.5 versus 8.5 &#177; 2.8, respectively; <it>P </it>= 0.00001, Student's <it>t</it>-test). The 3' splice site score remains significantly lower if low-scoring outliers are excluded from the analysis (6.7). Interestingly, the sequence context of TG 3' splice sites is very similar to that of canonical AG splice sites in that it shows a preference for pyrimidines at position -3 and a preference for purines at position +1 (Additional data file 1). However, TG 3' splice sites changed to AG yield an average score of 6.6, again significantly lower than that of canonical AG splice sites (<it>P </it>&lt; 0.00001, Student's <it>t</it>-test). This disfavors the simple explanation that TG and AG splice sites compete for the same recognizing factors and the neighboring nucleotide composition (that is, the feature scored by maxent) alone acts to direct splice site choice towards TGs. Finally, it remains questionable if maxent, trained on canonical AG 3' splice sites, has any predictive power for the functionality of TG splice sites.</p>
            <p>The fraction of TGs functioning as splice acceptors is extremely small, about 0.01% of candidate motifs. Thus, <it>cis</it>-regulatory elements must play a crucial role in the definition of TG splice acceptors. From the ratio of functional/non-functional TG-AG tandems, in comparison with AG-AG tandems, we estimate that at least 6 nt of <it>cis</it>-regulatory sequence information is required to promote splice site usage of 3' TGs (Additional data file 1). This is in agreement with 5-20 unchanged nucleotides in excess over the average intron mutation rate, found in about half of the TG-splicing introns, that is, those considered to be subject to purifying selection for splicing of 3' TGs. However, we failed to identify specific regulatory motifs (data not shown). This may be due to the dispersed arrangement of <it>cis</it>-regulatory elements, or the contextual cooperation of diverse elements. Due to the relatively small sample size for TG 3' splice sites, available methods for motif discovery have limited detection power.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <p>Previous studies provided incidental evidence for unusual 3' terminal dinucleotides in U2-dependent introns, particularly TG dinucleotides that are used as alternative 3' splice sites. Few directed efforts have been made so far to verify such instances and to elucidate underlying mechanisms and consequences <abbrgrp><abbr bid="B16">16</abbr><abbr bid="B35">35</abbr></abbrgrp>. Here, we report 36 human U2-spliceosomal introns with TG dinucleotides functioning as 3' splice sites, identified by thoroughly filtered EST-to-genome alignments. The high accuracy of the EST-based screening approach was validated by RT-PCR with a success rate of 92%. Though it might seem paradoxical, the analysis of EST data gave superior results compared to an analysis of curated data, that is, RefSeq transcripts. We found that the abundance of EST data allows the application of statistical methods for obtaining valid results whereas curated data sets, which are typically devoid of redundancy, may contain errors that are rarely captured by filtering criteria, consistent with the findings of others <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>. In practice, we found that two independent ESTs are strong evidence for a natural splice variant. Given this rather permissive threshold <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B37">37</abbr></abbrgrp>, we expect that the established screening protocol achieves high sensitivity.</p>
         <p>Since our screening procedure is EST-based, certainly more unusual 3' splice sites remain undiscovered in transcript regions that lack sufficient EST coverage. Moreover, there are indications that even other unusual dinucleotides, apart from TG, may function as alternative 3' splice sites. For example, others reported an AT 3' splice site in the mammalian <it>DGCR2 </it>gene <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>, a CG 3' splice site in the <it>Drosophila per </it>gene <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>, and we found that a TG splice acceptor in human <it>CNBP </it>intron 3 is replaced by a viable GG in the chicken ortholog (results not shown). The occurrence of a TG splice acceptor in the <it>Drosophila gnas </it>gene suggests that they occur throughout metazoan organisms.</p>
         <p>Other studies have questioned the extent to which alternative splicing is functionally relevant <abbrgrp><abbr bid="B27">27</abbr><abbr bid="B28">28</abbr><abbr bid="B29">29</abbr></abbrgrp>. Since TG splice acceptors are extremely rare compared to AG acceptors, one might think that these cases reflect a fuzziness of the splicing reaction. However, multiple findings support the idea that TG splice sites are activated by directed mechanisms and that the resulting splice variants fulfill functional roles: first, several TG splice acceptors are used with a high frequency or can even be the preferred splice site, which excludes splicing errors as a plausible explanation (Table <tblr tid="T1">1</tblr>, Figure <figr fid="F4">4</figr>); second, TG splice acceptors and their adjacent intron sequence are remarkably conserved between orthologous mammalian genes (Figure <figr fid="F4">4</figr>); third, tissue-specific splice patterns are observed for <it>GNAS </it><abbrgrp><abbr bid="B16">16</abbr><abbr bid="B26">26</abbr></abbrgrp> as well as <it>BRUNOL4 </it>(this study; Figure <figr fid="F2">2</figr>), suggestive of specific regulatory processes; and fourth, the TG splice site-mediated protein isoform of the mammalian calcium channel subunit &#945;<sub>1A </sub>(<it>CACNA1A</it>) has been shown to result in significant differences in neuronal excitability <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>.</p>
         <p>Thinking of splice site evolution as a process of functional engineering, we might ask about the functional options that distinguish TG-AG splice acceptor tandems from AG-AG tandems. During analysis of orthologs of human TG splice acceptors, we did not identify any case of orthologous AG splice sites, suggesting that TG and AG splice site dinucleotides are functionally non-equivalent. The inserted/deleted nucleotide sequence differs only if TG is positioned downstream of the tandem splice site. Apart from the possible impact on the protein sequence, an NAGATG tandem acceptor allows insertion of a start codon. For example, this seems to be realized in intron 1 of human <it>PCGF2</it>, where the observed splice variants differ by the presence of an upstream open reading frame. Preliminary results indicate that this ATG insertion has an effect on the translation efficiency of the mRNA (results not shown). It is also worth noting that the <it>Drosophila gnas </it>gene has a TG splice acceptor, like the human gene, but it is located in a non-homologous intron <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. Given the overall low frequency of TG 3' splice sites (0.02%), this example of convergent evolution indicates a functional benefit of the unusual splice site, independent of its impact on protein sequence. It is tempting to speculate that splicing of TG splice acceptors, rather than providing a pathway for alternative transcripts or protein isoforms, may play a role as a regulatory bottleneck for maturation of the transcript, as was suggested for U12-type introns <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>.</p>
         <p>Considering functional classes, a significant fraction of TG-spliced genes represent regulators of chromatin structure (<it>PCGF2</it>, <it>GPBP1</it>, <it>SAP30</it>, <it>SUV420H2</it>, <it>SSRP1</it>, <it>SMARCA4</it>) as well as splicing factors and translational modulators (<it>CNBP</it>, <it>BRUNOL4</it>, <it>HNRPR</it>, <it>PCBP2</it>). Interestingly, two of the affected RNA-binding proteins are reported to bind DNA as well <abbrgrp><abbr bid="B40">40</abbr><abbr bid="B41">41</abbr></abbrgrp>. Together, these enrichments suggest a regulatory cross-talk between transcription on the one hand, and splicing, mRNA maintenance, and translation on the other. Together with another subgroup associated with receptor-mediated signal transduction (<it>GNAS</it>, <it>DRD2</it>, <it>FREQ</it>, <it>IL21</it>, <it>RYK</it>, <it>DLG4</it>, <it>RRAD</it>, <it>PTPN11</it>, <it>SYTL2</it>, <it>MARK3</it>, <it>SH3D19</it>), most of the genes' functions may be circumscribed with 'information processing', a term that was introduced to describe the functional characteristics of U12-dependent introns <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. However, as a statistical analysis of Gene Ontology functional classification terms does not reveal any significant over- or under-representation (results not shown), further work is required to determine the relevance of these findings.</p>
         <p>TG-AG splice acceptor tandems illustrate the flexibility as well as the specificity of splice site selection by the U2-type spliceosome. The spliceosome is flexible enough to choose TG dinucleotides as splice acceptors. Despite this flexibility, a TG splice site depends on a neighboring AG splice acceptor, since constitutive TG splice acceptors are not found, and TG-AG acceptor tandems show a distance constraint. We assume that an AG splice acceptor, within the typical context of a branch-point motif and polypyrimidine tract, is essentially required for intron definition to promote splicing step I <it>in vivo</it>. Consistent with this, a recent report showed that the essential splicing factor U2AF<sup>35 </sup>in cooperation with other factors mediates the spliceosome's specificity for AG 3' intron termini during splicing step I <abbrgrp><abbr bid="B42">42</abbr></abbrgrp>. Assuming that splicing step I does not ultimately define the 3' splice site, we hypothesize that definite splice site choice takes place during reaction step II, allowing TG dinucleotides to function as 3' splice sites. Since U2AF dissociates from the spliceosomal complex after step I <abbrgrp><abbr bid="B43">43</abbr><abbr bid="B44">44</abbr></abbrgrp>, other factors may influence splice site choice at a later step. Two different modes of 3' splice site selection after splicing step I have been suggested for AG-AG splice site tandems. First, a second 3' AG may be chosen as the site of exon ligation during splicing step II if it is located a few nucleotides downstream of the first-step AG, defined by U2AF binding <abbrgrp><abbr bid="B45">45</abbr></abbrgrp>. This rather unspecific mechanism is the likely explanation for the high propensity of small-distance AG-AG tandems to result in alternative splicing, and may also be relevant for TG-AG acceptor tandems, which are found overrepresented at a 3-nt distance compared to larger distances (Figure S1 in Additional data file 1). Another mechanism is exemplified by intron 2 of the <it>Drosophila sxl </it>gene <abbrgrp><abbr bid="B46">46</abbr></abbrgrp> as well as intron 1 of the &#946;-globin mutant &#946;<sup>110 </sup><abbrgrp><abbr bid="B47">47</abbr><abbr bid="B48">48</abbr></abbrgrp>. Here, the downstream AG is essential for splicing while the dispensable upstream AG may be chosen in splicing step II, even as the preferred splice site. The splicing factor SPF45 was shown to bind to the upstream AG dinucleotide during splicing step II, promoting splice site choice <abbrgrp><abbr bid="B46">46</abbr></abbrgrp>. It remains to be tested if SPF45 or other factors contribute to TG splice site choice.</p>
         <p>Given the extremely low ratio of viable versus non-viable TG-AG tandems at intron-exon boundaries, contextual sequence signals must contribute to TG splice site definition and influence splice site choice. In agreement, half of the TG splice acceptors are associated with outstandingly high intron sequence conservation. Notably, the alternative TG splice acceptor of <it>GNAS </it>intron 3 has been shown to be flanked by three putative exonic splice enhancer motifs (specific for SF2/ASF, SC35, and SRp40), and TG splice site choice has been experimentally shown to be modulated by the ratios of SF2/ASF and hnRNPA1 <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. We could not identify specific sequence motifs associated with TG splice sites (results not shown). Due to the relatively small sample size for TG 3' splice sites, available methods for motif discovery have limited detection power, especially if <it>cis</it>-regulatory elements are highly dispersed, or if diverse elements cooperate in a contextual manner. Presumably, each individual TG-AG tandem recruits a characteristic ensemble of splice regulators to facilitate unusual splice site choice. Thus, the compilation of TG splice sites could serve as a rich source of splicing-relevant contextual sequence signals to be examined in future experimental studies.</p>
      </sec>
      <sec>
         <st>
            <p>Materials and methods</p>
         </st>
         <sec>
            <st>
               <p>Screening for non-canonical 3' splice sites</p>
            </st>
            <p>From the UCSC Genome Browser site <abbrgrp><abbr bid="B22">22</abbr></abbrgrp> we obtained spliced alignments of human ESTs (file all_est.txt, released 2005-07-14) and of human RefSeq transcripts (refGene.txt, 2005-07-23) <abbrgrp><abbr bid="B49">49</abbr></abbrgrp>, as well as a compilation of all human EST sequences (est.fa, 2005-11-26). First, we sampled EST-supported 3' splice sites to identify 3-nt splice variant pairs (&#916;3SVPs). In parallel, we identified ESTs that were mapped to multiple genome locations, indicative of paralogous gene loci including pseudogenes. We discarded those &#916;3SVPs whose EST support for the minor splice variant did not exceed the number of these ambiguously mapped ESTs. Furthermore, we retained only those &#916;3SVPs that have at least one splice site corresponding to a RefSeq transcript, according to the RefSeq-to-genome alignment. Then, we separated cases that involve the dinucleotide AG at both 3' splice sites, that is, NAGNAG tandem splice acceptors, as well as U12-dependent introns, identified by their characteristic donor site and branch-point motifs <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. The remaining &#916;3SVPs were considered 'unusual' since they comprise at least one non-AG splice acceptor in a U2-spliceosomal intron. The splice variants of these &#916;3SVPs were validated and quantified by a WU-BLASTN search of 60-nt sequence windows around the resulting exon-exon junctions against all human ESTs, using parameters W = 13, N = -8, nogap S = 180, hspmax = 1. BLAST matches were considered valid if perfect sequence identity was found in a 12-nt window around the exon-exon junctions <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>. Finally, &#916;3SVPs were considered highly reliable if the minor 3' splice site was found in at least two ESTs and was used in at least 3% of the covering ESTs. 
A screen for splice variant pairs for distances of 4-36 nt was performed analogously, restricting the search to tandems of TG-AG splice sites.</p>
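            <p>The final reliability criteria described above can be sketched in a few lines. This is a minimal illustration and not the code used in this study: the function and parameter names are hypothetical, the fraction is assumed to be computed over all ESTs covering the junction (major and minor variants together), and centering the 12-nt window on the exon-exon junction (6 nt on each side) is likewise an assumption.</p>

```python
def is_highly_reliable(minor_est_count, covering_est_count,
                       min_ests=2, min_fraction=0.03):
    """Apply the two thresholds stated in the text to one splice variant pair.

    Assumption: covering_est_count includes ESTs of both splice variants.
    """
    if covering_est_count == 0:
        return False
    fraction = minor_est_count / covering_est_count
    return minor_est_count >= min_ests and fraction >= min_fraction


def junction_window_identical(query, est_aligned, junction, window=12):
    """Check for perfect identity in a window centred on the exon-exon junction.

    query and est_aligned are ungapped, equal-length aligned strings;
    junction is the index of the first base after the junction.
    Centring the window (window // 2 bases per side) is an assumption.
    """
    half = window // 2
    start, end = junction - half, junction + half
    if 0 > start or end > len(query) or len(query) != len(est_aligned):
        return False
    return query[start:end].upper() == est_aligned[start:end].upper()


# Example: 4 of 29 ESTs support the minor variant (passes both thresholds),
# whereas a single supporting EST never passes.
print(is_highly_reliable(4, 29))
print(is_highly_reliable(1, 10))
```

            <p>Note that passing these thresholds indicates EST support only; as the <it>FBXO17 </it>case shows, supporting ESTs from a single library may still fail experimental validation.</p>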
         </sec>
         <sec>
            <st>
               <p>PCR and RT-PCR</p>
            </st>
            <p>For validation of splice variants, nested PCR was performed using 1 ng of cDNA template from the Human Multiple Tissue cDNA Panels I and II (Clontech, Mountain View, CA, USA). For a given gene, suitable tissues were determined from expression data obtained from the Stanford SOURCE database <abbrgrp><abbr bid="B50">50</abbr></abbrgrp>. However, pooled leukocyte cDNA from 200 individuals was preferred in order to obtain comparable results. Verification of the genomic sequence and an analysis of potential polymorphisms were done by nested PCR using 200 ng of pooled genomic DNA from 100 Caucasian individuals (Roche, Mannheim, Germany) as template. Primers were obtained from Metabion (Martinsried, Germany) (Additional data file 1). Reactions were set up with PuReTaq Ready-To-Go PCR beads (GE Healthcare, Munich, Germany) and 10 pmol primer in 25 &#956;l total volume, according to the manufacturer's instructions. A typical thermocycle protocol was 3 minutes initial denaturation at 94&#176;C, followed by 25 cycles of 1 minute denaturation at 94&#176;C, 1 minute annealing at 53-55&#176;C, 1 minute extension at 72&#176;C, and a final 10 minute extension step at 72&#176;C. In the second round of nested PCR, 1 &#956;l of the first-round product was amplified for 30 cycles. For cloning, PCR products were separated on agarose gels, and DNA was extracted with the Millipore (Billerica, MA, USA) Montage Gel Extraction kit, followed by ethanol precipitation. Isolated fragments were cloned into pCR2.1-TOPO (Invitrogen, Karlsruhe, Germany), and cloned DNA was Sanger sequenced using the M13 standard reverse primer (17-mer).</p>
         </sec>
         <sec>
            <st>
               <p>Splice variant quantification by pyrosequencing</p>
            </st>
            <p>Templates for pyrosequencing were generated using universal biotinylated primers <abbrgrp><abbr bid="B51">51</abbr></abbrgrp>. RT-PCR amplicons of the exon-exon junctions were ligated into pCR2.1-TOPO (Invitrogen) according to the supplier's recommendations and subsequently re-amplified with all four possible combinations of 5'-biotinylated M13 standard primers (17-mers) and unlabeled insert-specific primers (Additional data file 1). The latter also served to prime the pyrosequencing reaction. Biotin-labeling of DNA, single strand preparation and sequencing were performed as described <abbrgrp><abbr bid="B51">51</abbr></abbrgrp>.</p>
         </sec>
         <sec>
            <st>
               <p>Orthologous intron-exon boundaries</p>
            </st>
            <p>A data set of orthologous intron-exon boundaries was constructed automatically to obtain sufficient data (especially reference data) to test for evolutionary constraints on intron flanking sequence. Sets of human (data as described for the splice site screen) and mouse transcript annotations (UCSC genome assembly mm7, RefSeq-to-genome alignment 2006-05-21) were processed as described earlier (supplementary methods in <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>). For 97,107 unambiguous orthologous pairs (57% of unique human intron-exon boundaries, including 23 of 36 TG-AG splice site tandems), 100 nt of flanking intron sequence were aligned with CLUSTALW <abbrgrp><abbr bid="B52">52</abbr></abbrgrp>, using the optimized parameter -gapopen = 2.2. The degree of conservation was determined for 50 nt of the human intron sequence upstream of the splice site (tandem), giving a score of 1 for an identical aligned nucleotide in mouse, a score of 0 for a mismatch, and a penalty of -1 for inserted mouse sequence. Since a histogram of sequence conservation in canonical introns showed a non-normal distribution, statistical testing was performed using a permutation test. Intron samples of a given size were simulated by random drawings from the intron data set, and the average sequence identity was calculated, repeating the sampling procedure 10,000 times.</p>
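            <p>The scoring scheme and the resampling test can be illustrated with a short sketch. It is not the original implementation: the function names are hypothetical, a human nucleotide opposite a mouse gap is assumed to score 0, sampling is assumed to be without replacement, and the one-sided empirical p-value is one plausible way to summarize the 10,000 simulated samples.</p>
            <p>
```python
import random


def conservation_score(human_aln, mouse_aln):
    """Score a pairwise human/mouse alignment column by column.

    +1 for an identical aligned nucleotide, 0 for a mismatch,
    -1 for inserted mouse sequence (a gap '-' in the human row).
    A human base opposite a mouse gap scores 0 (assumption).
    """
    score = 0
    for h, m in zip(human_aln, mouse_aln):
        if h == "-":
            score -= 1  # mouse insertion relative to human
        elif h == m:
            score += 1  # identical nucleotide
    return score


def permutation_p_value(reference_scores, observed_mean, sample_size,
                        n_iter=10000, seed=0):
    """Fraction of random reference samples whose mean score is at least
    as high as the observed mean (one-sided, assumption)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_iter):
        # Draw without replacement from the reference introns (assumption).
        sample = rng.sample(reference_scores, sample_size)
        if sum(sample) / sample_size >= observed_mean:
            hits += 1
    return hits / n_iter
```
            </p>
            <p>Here, reference_scores would hold the conservation scores of the canonical introns, sample_size the number of TG-AG tandems being tested, and observed_mean their average score.</p>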
            <p>Where automated processing failed (13 of 36 TG-AG splice site tandems), orthologous intron-exon boundaries were retrieved using the UCSC genome browser. These cases were not used for the statistical analysis because they likely represent a biased subset with regard to sequence conservation, and an appropriately large data set for comparison is not available.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Additional data files</p>
         </st>
         <p>The following additional data are available with the online version of this paper. Additional data file <supplr sid="S1">1</supplr> is a Word file containing a description of the estimation of the amount of <it>cis</it>-regulatory sequence context, three supplementary tables and two supplementary figures. Supplementary Table 1 lists the putative unusual splice sites evident from EST-to-genome alignments that failed the quality checks. Supplementary Table 2 provides data about the comprehensive analysis of putative 3' TG splice sites suggested by spliced alignments of RefSeq transcripts. Supplementary Table 3 contains all primer sequences. Supplementary Figure S1 shows the distance-dependent occurrence of TG-AG and AG-AG splice acceptor tandems. Supplementary Figure S2 shows a LOGO representation of the TG 3' splice site sequence context.</p>
         <suppl id="S1">
            <title>
               <p>Additional data file 1</p>
            </title>
            <caption>
               <p>Description of the estimation of the amount of <it>cis</it>-regulatory sequence context, and supplementary tables and figures.</p>
            </caption>
            <text>
               <p>Supplementary Table 1 lists the putative unusual splice sites evident from EST-to-genome alignments that failed the quality checks. Supplementary Table 2 provides data about the comprehensive analysis of putative 3' TG splice sites suggested by spliced alignments of RefSeq transcripts. Supplementary Table 3 contains all primer sequences. Supplementary Figure S1 shows the distance-dependent occurrence of TG-AG and AG-AG splice acceptor tandems. Supplementary Figure S2 shows a LOGO representation of the TG 3' splice site sequence context.</p>
            </text>
            <file name="gb-2007-8-8-r154-S1.doc">
               <p>Click here for file</p>
            </file>
         </suppl>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We thank M-L Schmidt and I G&#246;rlich for expert technical assistance, F Liu and Z-G Han for providing clone material, members of the RefSeq Division staff of the National Center for Biotechnology Information for helpful discussions, and many members of the FLI and two anonymous referees for critical reading of the manuscript and helpful suggestions. This work was supported by grants from the German Ministry of Education and Research to SS (01GS0426) and MP (01GR0504, 0313652D) as well as from the Deutsche Forschungsgemeinschaft (SFB604-02) to MP.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Splicing precursors to mRNAs by the spliceosomes.</p>
            </title>
            <aug>
               <au>
                  <snm>Burge</snm>
                  <fnm>CB</fnm>
               </au>
               <au>
                  <snm>Tuschl</snm>
                  <fnm>TH</fnm>
               </au>
               <au>
                  <snm>Sharp</snm>
                  <fnm>PA</fnm>
               </au>
            </aug>
            <source>The RNA World</source>
            <publisher>Plainview, NY: Cold Spring Harbor Laboratory Press</publisher>
            <editor>Gesteland RF, Cech T, Atkins JF</editor>
            <edition>2</edition>
            <pubdate>1999</pubdate>
            <fpage>525</fpage>
            <lpage>560</lpage>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Insights into the mechanisms of splicing: more lessons from the ribosome.</p>
            </title>
            <aug>
               <au>
                  <snm>Konarska</snm>
                  <fnm>MM</fnm>
               </au>
               <au>
                  <snm>Query</snm>
                  <fnm>CC</fnm>
               </au>
            </aug>
            <source>Genes Dev</source>
            <pubdate>2005</pubdate>
            <volume>19</volume>
            <fpage>2255</fpage>
            <lpage>2260</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1101/gad.1363105</pubid>
                  <pubid idtype="pmpid" link="fulltext">16204176</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Mechanisms of fidelity in pre-mRNA splicing.</p>
            </title>
            <aug>
               <au>
                  <snm>Reed</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Curr Opin Cell Biol</source>
            <pubdate>2000</pubdate>
            <volume>12</volume>
            <fpage>340</fpage>
            <lpage>345</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0955-0674(00)00097-1</pubid>
                  <pubid idtype="pmpid" link="fulltext">10801464</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Intron recognition comes of AGe.</p>
            </title>
            <aug>
               <au>
                  <snm>Moore</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>Nat Struct Biol</source>
            <pubdate>2000</pubdate>
            <volume>7</volume>
            <fpage>14</fpage>
            <lpage>16</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/71207</pubid>
                  <pubid idtype="pmpid" link="fulltext">10625417</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>A reappraisal of non-consensus mRNA splice sites.</p>
            </title>
            <aug>
               <au>
                  <snm>Jackson</snm>
                  <fnm>IJ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1991</pubdate>
            <volume>19</volume>
            <fpage>3795</fpage>
            <lpage>3798</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">328465</pubid>
                  <pubid idtype="pmpid" link="fulltext">1713664</pubid>
                  <pubid idtype="doi">10.1093/nar/19.14.3795</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Evolutionary fates and origins of U12-type introns.</p>
            </title>
            <aug>
               <au>
                  <snm>Burge</snm>
                  <fnm>CB</fnm>
               </au>
               <au>
                  <snm>Padgett</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Sharp</snm>
                  <fnm>PA</fnm>
               </au>
            </aug>
            <source>Mol Cell</source>
            <pubdate>1998</pubdate>
            <volume>2</volume>
            <fpage>773</fpage>
            <lpage>785</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S1097-2765(00)80292-0</pubid>
                  <pubid idtype="pmpid" link="fulltext">9885565</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>A computational scan for U12-dependent introns in the human genome sequence.</p>
            </title>
            <aug>
               <au>
                  <snm>Levine</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Durbin</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2001</pubdate>
            <volume>29</volume>
            <fpage>4006</fpage>
            <lpage>4013</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">60238</pubid>
                  <pubid idtype="pmpid" link="fulltext">11574683</pubid>
                  <pubid idtype="doi">10.1093/nar/29.19.4006</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Comprehensive splice-site analysis using comparative genomics.</p>
            </title>
            <aug>
               <au>
                  <snm>Sheth</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Roca</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Hastings</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Roeder</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Krainer</snm>
                  <fnm>AR</fnm>
               </au>
               <au>
                  <snm>Sachidanandam</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <volume>34</volume>
            <fpage>3955</fpage>
            <lpage>3967</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1557818</pubid>
                  <pubid idtype="pmpid" link="fulltext">16914448</pubid>
                  <pubid idtype="doi">10.1093/nar/gkl556</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Analysis of canonical and non-canonical splice sites in mammalian genomes.</p>
            </title>
            <aug>
               <au>
                  <snm>Burset</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Seledtsov</snm>
                  <fnm>IA</fnm>
               </au>
               <au>
                  <snm>Solovyev</snm>
                  <fnm>VV</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2000</pubdate>
            <volume>28</volume>
            <fpage>4364</fpage>
            <lpage>4375</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">113136</pubid>
                  <pubid idtype="pmpid" link="fulltext">11058137</pubid>
                  <pubid idtype="doi">10.1093/nar/28.21.4364</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Efficient use of a 'dead-end' GA 5' splice site in the human fibroblast growth factor receptor genes.</p>
            </title>
            <aug>
               <au>
                  <snm>Brackenridge</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Wilkie</snm>
                  <fnm>AOM</fnm>
               </au>
               <au>
                  <snm>Screaton</snm>
                  <fnm>GR</fnm>
               </au>
            </aug>
            <source>EMBO J</source>
            <pubdate>2003</pubdate>
            <volume>22</volume>
            <fpage>1620</fpage>
            <lpage>1631</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">152907</pubid>
                  <pubid idtype="pmpid" link="fulltext">12660168</pubid>
                  <pubid idtype="doi">10.1093/emboj/cdg163</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Splicing of a divergent subclass of AT-AC introns requires the major spliceosomal snRNAs.</p>
            </title>
            <aug>
               <au>
                  <snm>Wu</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Krainer</snm>
                  <fnm>AR</fnm>
               </au>
            </aug>
            <source>RNA</source>
            <pubdate>1997</pubdate>
            <volume>3</volume>
            <fpage>586</fpage>
            <lpage>601</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1369508</pubid>
                  <pubid idtype="pmpid" link="fulltext">9174094</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Terminal intron dinucleotide sequences do not distinguish between U2- and U12-dependent introns.</p>
            </title>
            <aug>
               <au>
                  <snm>Dietrich</snm>
                  <fnm>RC</fnm>
               </au>
               <au>
                  <snm>Incorvaia</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Padgett</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>Mol Cell</source>
            <pubdate>1997</pubdate>
            <volume>1</volume>
            <fpage>151</fpage>
            <lpage>160</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S1097-2765(00)80016-7</pubid>
                  <pubid idtype="pmpid" link="fulltext">9659912</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Information for the Coordinates of Exons (ICE): a human splice sites database.</p>
            </title>
            <aug>
               <au>
                  <snm>Chong</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Bajic</snm>
                  <fnm>VB</fnm>
               </au>
            </aug>
            <source>Genomics</source>
            <pubdate>2004</pubdate>
            <volume>84</volume>
            <fpage>762</fpage>
            <lpage>766</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.ygeno.2004.05.007</pubid>
                  <pubid idtype="pmpid" link="fulltext">15475254</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>SPA: a probabilistic algorithm for spliced alignment.</p>
            </title>
            <aug>
               <au>
                  <snm>van Nimwegen</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Paul</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Sheridan</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Zavolan</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>PLoS Genet</source>
            <pubdate>2006</pubdate>
            <volume>2</volume>
            <fpage>e24</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1449883</pubid>
                  <pubid idtype="pmpid" link="fulltext">16683023</pubid>
                  <pubid idtype="doi">10.1371/journal.pgen.0020024</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Isolation and characterization of the human Gs alpha gene.</p>
            </title>
            <aug>
               <au>
                  <snm>Kozasa</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Itoh</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Tsukamoto</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Kaziro</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>1988</pubdate>
            <volume>85</volume>
            <fpage>2081</fpage>
            <lpage>2085</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">279932</pubid>
                  <pubid idtype="pmpid" link="fulltext">3127824</pubid>
                  <pubid idtype="doi">10.1073/pnas.85.7.2081</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Alternative splicing of the adenylyl cyclase stimulatory G-protein G alpha(s) is regulated by SF2/ASF and heterogeneous nuclear ribonucleoprotein A1 (hnRNPA1) and involves the use of an unusual TG 3'-splice site.</p>
            </title>
            <aug>
               <au>
                  <snm>Pollard</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Krainer</snm>
                  <fnm>AR</fnm>
               </au>
               <au>
                  <snm>Robson</snm>
                  <fnm>SC</fnm>
               </au>
               <au>
                  <snm>Europe-Finner</snm>
                  <fnm>GN</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2002</pubdate>
            <volume>277</volume>
            <fpage>15241</fpage>
            <lpage>15251</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.M109046200</pubid>
                  <pubid idtype="pmpid" link="fulltext">11825891</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Two forms of <it>Drosophila melanogaster </it>Gs alpha are produced by alternate splicing involving an unusual splice site.</p>
            </title>
            <aug>
               <au>
                  <snm>Quan</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Forte</snm>
                  <fnm>MA</fnm>
               </au>
            </aug>
            <source>Mol Cell Biol</source>
            <pubdate>1990</pubdate>
            <volume>10</volume>
            <fpage>910</fpage>
            <lpage>917</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">360930</pubid>
                  <pubid idtype="pmpid" link="fulltext">2106072</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Genomic organization of human DLG4, the gene encoding postsynaptic density 95.</p>
            </title>
            <aug>
               <au>
                  <snm>Stathakis</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Udar</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Sandgren</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Andreasson</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Bryant</snm>
                  <fnm>PJ</fnm>
               </au>
               <au>
                  <snm>Small</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Forsman-Semb</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>J Neurochem</source>
            <pubdate>1999</pubdate>
            <volume>73</volume>
            <fpage>2250</fpage>
            <lpage>2265</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1471-4159.1999.0732250.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">10582582</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>New dopamine receptor, D2(Longer), with unique TG splice site, in human brain.</p>
            </title>
            <aug>
               <au>
                  <snm>Seeman</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Nam</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Ulpian</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>IS</fnm>
               </au>
               <au>
                  <snm>Tallerico</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Brain Res Mol Brain Res</source>
            <pubdate>2000</pubdate>
            <volume>76</volume>
            <fpage>132</fpage>
            <lpage>141</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0169-328X(99)00343-5</pubid>
                  <pubid idtype="pmpid" link="fulltext">10719223</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Widespread occurrence of alternative splicing at NAGNAG acceptors contributes to proteome plasticity.</p>
            </title>
            <aug>
               <au>
                  <snm>Hiller</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Huse</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Szafranski</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Jahn</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Hampe</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Schreiber</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Backofen</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Platzer</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2004</pubdate>
            <volume>36</volume>
            <fpage>1255</fpage>
            <lpage>1257</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng1469</pubid>
                  <pubid idtype="pmpid" link="fulltext">15516930</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Single-nucleotide polymorphisms in NAGNAG acceptors are highly predictive for variations of alternative splicing.</p>
            </title>
            <aug>
               <au>
                  <snm>Hiller</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Huse</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Szafranski</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Jahn</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Hampe</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Schreiber</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Backofen</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Platzer</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Am J Hum Genet</source>
            <pubdate>2006</pubdate>
            <volume>78</volume>
            <fpage>291</fpage>
            <lpage>302</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1380236</pubid>
                  <pubid idtype="pmpid" link="fulltext">16400609</pubid>
                  <pubid idtype="doi">10.1086/500151</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>The UCSC Genome Browser Database: update 2006.</p>
            </title>
            <aug>
               <au>
                  <snm>Hinrichs</snm>
                  <fnm>AS</fnm>
               </au>
               <au>
                  <snm>Karolchik</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Baertsch</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Barber</snm>
                  <fnm>GP</fnm>
               </au>
               <au>
                  <snm>Bejerano</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Clawson</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Diekhans</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Furey</snm>
                  <fnm>TS</fnm>
               </au>
               <au>
                  <snm>Harte</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Hsu</snm>
                  <fnm>F</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <volume>34</volume>
            <fpage>D590</fpage>
            <lpage>D598</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1347506</pubid>
                  <pubid idtype="pmpid" link="fulltext">16381938</pubid>
                  <pubid idtype="doi">10.1093/nar/gkj144</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Transcriptome and genome conservation of alternative splicing events in humans and mice.</p>
            </title>
            <aug>
               <au>
                  <snm>Sugnet</snm>
                  <fnm>CW</fnm>
               </au>
               <au>
                  <snm>Kent</snm>
                  <fnm>WJ</fnm>
               </au>
               <au>
                  <snm>Ares</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Haussler</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Pac Symp Biocomput</source>
            <pubdate>2004</pubdate>
            <fpage>66</fpage>
            <lpage>77</lpage>
            <xrefbib>
               <pubid idtype="pmpid">14992493</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Systematic identification of splice variants in human P/Q-type channel alpha1(2.1) subunits: implications for current density and Ca2+-dependent inactivation.</p>
            </title>
            <aug>
               <au>
                  <snm>Soong</snm>
                  <fnm>TW</fnm>
               </au>
               <au>
                  <snm>DeMaria</snm>
                  <fnm>CD</fnm>
               </au>
               <au>
                  <snm>Alvania</snm>
                  <fnm>RS</fnm>
               </au>
               <au>
                  <snm>Zweifel</snm>
                  <fnm>LS</fnm>
               </au>
               <au>
                  <snm>Liang</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Mittman</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Agnew</snm>
                  <fnm>WS</fnm>
               </au>
               <au>
                  <snm>Yue</snm>
                  <fnm>DT</fnm>
               </au>
            </aug>
            <source>J Neurosci</source>
            <pubdate>2002</pubdate>
            <volume>22</volume>
            <fpage>10142</fpage>
            <lpage>10152</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">12451115</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Rapid SNP allele frequency determination in genomic DNA pools by pyrosequencing.</p>
            </title>
            <aug>
               <au>
                  <snm>Neve</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Froguel</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Corset</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Vaillant</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Vatin</snm>
  