<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art><ui>1471-2164-11-106</ui><ji>1471-2164</ji><fm>
<dochead>Research article</dochead>
<bibl>
<title>
<p>U12 type introns were lost at multiple occasions during evolution</p>
</title>
<aug>
<au id="A1"><snm>Bartschat</snm><fnm>Sebastian</fnm><insr iid="I1"/><insr iid="I2"/><email>sebastian@bioinf.uni-leipzig.de</email></au>
<au ca="yes" id="A2"><snm>Samuelsson</snm><fnm>Tore</fnm><insr iid="I1"/><email>tore.samuelsson@medkem.gu.se</email></au>
</aug>
<insg>
<ins id="I1"><p>Department of Medical Biochemistry and Cell Biology, Institute of Biomedicine, Sahlgrenska Academy at University of Gothenburg, Box 440, SE-405 30 G&#246;teborg, Sweden</p></ins>
<ins id="I2"><p>Bioinformatics Group, Department of Computer Science and Interdisciplinary Center for Bioinformatics, University of Leipzig, Haertelstrasse 16-18, D-04107 Leipzig, Germany</p></ins>
</insg>
<source>BMC Genomics</source>
<issn>1471-2164</issn>
<pubdate>2010</pubdate>
<volume>11</volume>
<issue>1</issue>
<fpage>106</fpage>
<url>http://www.biomedcentral.com/1471-2164/11/106</url>
<xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2164-11-106</pubid><pubid idtype="pmpid">20149226</pubid></pubidlist></xrefbib>
</bibl>
<history><rec><date><day>11</day><month>11</month><year>2009</year></date></rec><acc><date><day>11</day><month>2</month><year>2010</year></date></acc><pub><date><day>11</day><month>2</month><year>2010</year></date></pub></history>
<cpyrt><year>2010</year><collab>Bartschat and Samuelsson; licensee BioMed Central Ltd.</collab><note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note></cpyrt>
<abs>
<sec>
<st>
<p>Abstract</p>
</st>
<sec>
<st>
<p>Background</p>
</st>
<p>Two categories of introns are known, a common U2 type and a rare U12 type. These two types of introns are removed by distinct spliceosomes. The phylogenetic distribution of spliceosomal RNAs that are characteristic of the U12 spliceosome, i.e. the U11, U12, U4atac and U6atac RNAs, suggests that U12 spliceosomes were lost in many phylogenetic groups. We have now examined the distribution of U2 and U12 introns in many of these groups.</p>
</sec>
<sec>
<st>
<p>Results</p>
</st>
<p>U2 and U12 introns were predicted by making use of available EST and genomic sequences. The results show that in species or branches where U12 spliceosomal components are missing, U12-type introns are also lacking. Examples are the choanoflagellate <it>Monosiga brevicollis</it>, <it>Entamoeba histolytica</it>, green algae, diatoms, and the fungal lineage Basidiomycota. Furthermore, whereas U12 splicing does not occur in <it>Caenorhabditis elegans</it>, U12 introns as well as U12 snRNAs are present in <it>Trichinella spiralis</it>, which is deeply branching in the nematode tree. A comparison of homologous genes in <it>T. spiralis </it>and <it>C. elegans </it>revealed different mechanisms whereby U12 introns were lost.</p>
</sec>
<sec>
<st>
<p>Conclusions</p>
</st>
<p>The phylogenetic distribution of U12 introns and spliceosomal RNAs gives further support to an early origin of U12-dependent splicing. In addition, this distribution identifies a large number of instances during eukaryotic evolution where such splicing was lost.</p>
</sec>
</sec>
</abs>
</fm><meta>
<classifications>
<classification id="endnote" subtype="user_supplied_xml" type="bmc"/>
</classifications>
</meta><bdy>
<sec>
<st>
<p>Background</p>
</st>
<p>In eukaryotes mature RNA is formed by the removal of introns from a primary transcript. Splicing is catalyzed by a multicomponent complex, the spliceosome <abbrgrp>
<abbr bid="B1">1</abbr>
</abbrgrp>. Two intron classes have been identified, a common U2-type and a rare U12-type <abbrgrp>
<abbr bid="B2">2</abbr>
<abbr bid="B3">3</abbr>
<abbr bid="B4">4</abbr>
</abbrgrp>. Splicing of U2-type introns is catalyzed by the U2-dependent (major) spliceosome, which includes the U1, U2, U4, U5 and U6 spliceosomal RNAs as well as multiple protein factors. The U12-dependent (minor) spliceosome, responsible for the excision of the U12-type introns, is structurally similar to the U2-type spliceosome. It contains protein subunits and the U5 RNA as well as the U11, U12, U4atac, and U6atac spliceosomal RNAs that are functionally and structurally related to the U1, U2, U4 and U6 RNAs of the major spliceosome.</p>
<p>U2 introns have characteristic properties at the 5' splice site (AG/GURAGU), 3' splice site (YAG/G) and branch site (CURACU, where the A is the branch point adenosine). There is also a pyrimidine rich region between the branch and 3' splice sites. Much of the specificity in the splicing reaction is accomplished by pairing with snRNAs. Thus, the 5' splice site pairs with U1 RNA and the branch site pairs with U2 RNA.</p>
<p>The U12 introns have consensus sequences that are different from those of U2 introns. The 5' splice site (/RTATCCTTT) as well as the branch site (UCCUUA<ul>A</ul>CU, where the underlined A is the branch point adenosine) are more conserved than their counterparts in U2 introns, whereas the 3' splice site is more variable. In addition, U12 introns lack a pyrimidine rich region. Whereas the vast majority of U2 introns have the dinucleotides GT and AG at their 5' and 3' ends, respectively, some U12 introns have the dinucleotides AT and AC in these positions <abbrgrp>
<abbr bid="B5">5</abbr>
</abbrgrp>. During U12 splicing, the 5' splice site and branch site pair with the U11 and U12 snRNA, respectively.</p>
<p>U2-type introns are ubiquitous in eukaryotes while U12-type introns are lacking in some species, such as <it>Saccharomyces cerevisiae </it>
<abbrgrp>
<abbr bid="B6">6</abbr>
</abbrgrp> and in the nematode <it>Caenorhabditis elegans </it>
<abbrgrp>
<abbr bid="B5">5</abbr>
</abbrgrp>. U12 introns were first reported only in vertebrates, insects, cnidarians and plants <abbrgrp>
<abbr bid="B5">5</abbr>
</abbrgrp>. However, they were later discovered in <it>Rhizopus oryzae</it>, <it>Phytophthora </it>and <it>Acanthamoeba castellanii</it>, demonstrating an early evolutionary origin for the U12 spliceosome <abbrgrp>
<abbr bid="B7">7</abbr>
</abbrgrp>.</p>
<p>We have recently presented an inventory of spliceosomal RNAs based on computational prediction from genomic sequences <abbrgrp>
<abbr bid="B8">8</abbr>
</abbrgrp>. We found additional support for U12 splicing in <it>Acanthamoeba castellanii </it>as we identified the U12-type spliceosomal U11 and U6atac RNAs, in addition to the previously identified U12 RNA <abbrgrp>
<abbr bid="B7">7</abbr>
</abbrgrp>. Furthermore, RNAs specific to the U12 spliceosome were identified in a number of phylogenetic groups where previously such RNAs were not observed, including the nematode <it>Trichinella spiralis</it>, the slime mold <it>Physarum polycephalum </it>and the fungal lineages Zygomycota and Chytridiomycota. The detailed map of the distribution of the U12-type RNA genes supports an early origin of the minor spliceosome and points to a number of occasions during evolution where it was lost.</p>
<p>We have now addressed the question of whether the distribution of U12-type RNAs is correlated with the distribution of U12 introns. If there is such a correlation we also wanted to examine mechanisms of U12 intron evolution. Possible events regarding the fate of U12 introns as discussed by Burge et al <abbrgrp>
<abbr bid="B5">5</abbr>
</abbrgrp> include U12 intron loss as well as conversion of introns from the U12 to the U2 category by mutational changes. The database of orthologous U12 introns, U12DB<abbrgrp>
<abbr bid="B9">9</abbr>
</abbrgrp>, lists examples of changes in the latter category.</p>
<p>A number of methods have been developed for the prediction of U12 type introns <abbrgrp>
<abbr bid="B5">5</abbr>
<abbr bid="B10">10</abbr>
<abbr bid="B11">11</abbr>
</abbrgrp>. Most of them make use of weight matrices based on known exon-intron boundary regions and branch sites <abbrgrp>
<abbr bid="B5">5</abbr>
<abbr bid="B11">11</abbr>
</abbrgrp>. In addition, AT-AC-type introns with classic consensus features may be identified with a simple pattern-based approach <abbrgrp>
<abbr bid="B7">7</abbr>
</abbrgrp>.</p>
<p>For this study we have used the methods of Burge et al <abbrgrp>
<abbr bid="B5">5</abbr>
</abbrgrp> as well as that of Sheth et al <abbrgrp>
<abbr bid="B11">11</abbr>
</abbrgrp> to predict U2 and U12 introns in a number of different species that represent a broad phylogenetic range. The results show that the distribution of U12 introns is consistent with the distribution of U12 spliceosomal components.</p>
</sec>
<sec>
<st>
<p>Methods</p>
</st>
<sec>
<st>
<p>Sequence data</p>
</st>
<p>EST sequences were retrieved using NCBI Entrez. Genomic sequences were downloaded from WUSTL (<url>http://genome.wustl.edu</url>, <it>T. spiralis </it>version 1.0, and <it>Physarum polycephalum </it>version 3.1), Wormbase (<it>Caenorhabditis elegans</it>, <url>http://www.wormbase.org/</url>), JGI (<url>http://www.jgi.doe.gov/</url>, <it>Monosiga brevicollis </it>version 1.0, <it>Phycomyces blakesleeanus </it>version 1.0, <it>Chlamydomonas reinhardtii </it>version 3.0, <it>Phytophthora sojae </it>version 1.0, <it>Thalassiosira pseudonana </it>version 3.0, <it>Phaeodactylum tricornutum </it>assembly 1), the Broad Institute (<it>Rhizopus oryzae </it>assembly 3, <it>Phytophthora infestans </it>version 1.0), TIGR (<it>Entamoeba histolytica </it>2004 version) and TraceDB (<it>Phakopsora pachyrhizi </it>and <it>Acanthamoeba castellanii</it>).</p>
</sec>
<sec>
<st>
<p>Identification of U12 introns</p>
</st>
<p>Introns were identified from BLAST searches <abbrgrp>
<abbr bid="B12">12</abbr>
</abbrgrp> where EST sequences were used to query a database of genomic sequences. For instance, in the case of <it>T. spiralis</it>, a total of 25,268 EST sequences were retrieved using NCBI Entrez and used to query a database of <it>T. spiralis </it>genome sequences. The genome sequences contained 15,544 contigs with a total of 115,634,429 nt. Only hits with a sequence identity of at least 98% and an HSP length of at least 35 nt were considered for further analysis.</p>
<p>Whenever an EST matched more than one genomic contig, we selected for further analysis the contig with the most extensive match to the EST sequence. As BLAST often cannot unambiguously identify the exact location of the splice site, we considered all possible sites and identified the most probable one by screening with position weight matrices (PWMs) as described below.</p>
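<p>The hit filtering and contig selection described above can be sketched as follows. This is a minimal illustration, not the code used in the study. It assumes tab-separated BLAST output whose first four columns are query, subject, percent identity and alignment length, and it reads "most extensive match" as the largest total aligned length per contig. The names Hsp, filter_hsps and best_contig_per_est are hypothetical.</p>
<p>
```python
from collections import defaultdict
from typing import Dict, List, NamedTuple

MIN_IDENTITY = 98.0   # percent identity cutoff stated in the text
MIN_HSP_LENGTH = 35   # HSP length cutoff in nt stated in the text


class Hsp(NamedTuple):
    est_id: str      # query (EST) identifier
    contig_id: str   # subject (genomic contig) identifier
    identity: float  # percent identity of the HSP
    length: int      # alignment length of the HSP in nt


def parse_tabular_line(line: str) -> Hsp:
    """Parse one tab-separated line (assumed column order, see lead-in)."""
    fields = line.rstrip("\n").split("\t")
    return Hsp(fields[0], fields[1], float(fields[2]), int(fields[3]))


def filter_hsps(lines: List[str]) -> List[Hsp]:
    """Keep only HSPs that pass both cutoffs."""
    hsps = [parse_tabular_line(line) for line in lines]
    return [h for h in hsps
            if h.identity >= MIN_IDENTITY and h.length >= MIN_HSP_LENGTH]


def best_contig_per_est(hsps: List[Hsp]) -> Dict[str, str]:
    """For each EST, pick the contig with the largest total aligned length
    (an assumed reading of 'most extensive match')."""
    totals: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for h in hsps:
        totals[h.est_id][h.contig_id] += h.length
    return {est: max(contigs, key=contigs.get)
            for est, contigs in totals.items()}
```
</p>
<p>Ties between contigs are resolved here in favour of the contig seen first, a choice made only for this sketch.</p>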
<p>PWMs for 5', 3' and branch sites of the GT-AG U12, GT-AG U2, AT-AC U12 and GC-AG U2 types of introns from five different species were obtained from the SpliceRack database <url>http://katahdin.cshl.edu:9331/SpliceRack/index.cgi?database=spliceNew</url>. For the 5' splice site the PWM covers 13 positions, where the first 3 positions are the 3' end of the exon, and for the 3' splice site the window has 17 positions, where the last 3 positions are in the exon. The branch site PWM has a length of 12 and corresponds to a location falling into the range of (-40,-5) upstream of the 3' splice site of the intron. PWMs were available for <it>C. elegans</it>, <it>D. melanogaster</it>, <it>A. thaliana</it>, <it>H. sapiens </it>and <it>M. musculus </it>but only the first three of these were used for the scoring of <it>T. spiralis </it>sequences as these PWMs are more appropriate to nematodes as well as to the other species examined here <abbrgrp>
<abbr bid="B11">11</abbr>
</abbrgrp>.</p>
<p>We used 5' and 3' matrices from <it>C. elegans</it>, <it>D. melanogaster </it>and <it>A. thaliana </it>to score all possible intron locations as inferred by BLAST. Each possible position of the intron therefore generated three different sets of 5' and 3' scores. We finally selected the intron position where both 5' and 3' scores were the greatest or, in cases where this was not applicable, where the sum of both scores was the greatest.</p>
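<p>The selection rule in the preceding paragraph can be illustrated with a short sketch. It assumes that each candidate intron position already carries a single pair of 5' and 3' scores; how the three species-specific score sets were combined is not stated here, so that step is left out. The name select_intron_position and the tuple layout are hypothetical.</p>
<p>
```python
from typing import List, Tuple

# (candidate intron start offset, 5' splice site score, 3' splice site score)
Candidate = Tuple[int, float, float]


def select_intron_position(candidates: List[Candidate]) -> Candidate:
    """Prefer a candidate that has the top score at both splice sites;
    otherwise fall back to the candidate with the largest summed score."""
    best5 = max(c[1] for c in candidates)
    best3 = max(c[2] for c in candidates)
    for c in candidates:
        if c[1] == best5 and c[2] == best3:
            return c
    return max(candidates, key=lambda c: c[1] + c[2])
```
</p>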
<p>For identification of U12 introns using the method of Sheth et al <abbrgrp>
<abbr bid="B11">11</abbr>
</abbrgrp>, pseudocounts of 0.001 were added to the PWMs available from the SpliceRack database <url>http://katahdin.cshl.edu:9331/SpliceRack/</url> and log-odds matrices were then obtained. In the case of the branch site scoring we used only one PWM, as the PWMs of the five different species were identical. For identifying the most likely branch site every segment of length 12 within the range (-40,-5) relative to the 3' splice site was scored with U12 GT-AG and AT-AC matrices.</p>
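<p>A minimal sketch of the pseudocount, log-odds and branch site scanning steps is given below. Several details are assumptions made for illustration only: a uniform background frequency of 0.25, base-2 logarithms, and the reading that each scored window must lie entirely between 40 and 5 nt upstream of the 3' end of the intron. The function names are hypothetical and the toy matrix in the usage note is not a SpliceRack matrix.</p>
<p>
```python
import math
from typing import Dict, List, Optional, Tuple

BASES = "ACGT"
PSEUDOCOUNT = 0.001   # value stated in the text
BACKGROUND = 0.25     # assumed uniform background frequency

Matrix = List[Dict[str, float]]


def to_log_odds(freqs: Matrix) -> Matrix:
    """Add pseudocounts, renormalize each column, convert to log2-odds."""
    log_odds = []
    for column in freqs:
        total = sum(column[b] + PSEUDOCOUNT for b in BASES)
        log_odds.append({
            b: math.log2((column[b] + PSEUDOCOUNT) / total / BACKGROUND)
            for b in BASES
        })
    return log_odds


def score(matrix: Matrix, segment: str) -> float:
    """Sum per-position scores; the segment must contain only A, C, G, T."""
    return sum(column[base] for column, base in zip(matrix, segment))


def best_branch_site(matrix: Matrix, intron: str, far: int = 40,
                     near: int = 5) -> Optional[Tuple[float, int, str]]:
    """Score every window of len(matrix) nt between far and near nt
    upstream of the intron 3' end; return (score, start, segment),
    or None when the intron is too short to hold a window."""
    width = len(matrix)
    first = max(0, len(intron) - far)
    last = len(intron) - near - width   # last admissible window start
    best = None
    for start in range(first, last + 1):
        segment = intron[start:start + width]
        s = score(matrix, segment)
        if best is None or s > best[0]:
            best = (s, start, segment)
    return best
```
</p>
<p>With a 12-column matrix and the default arguments, the scan covers 24 windows per intron of at least 40 nt.</p>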
<p>For prediction of U12 introns using the Burge et al method <abbrgrp>
<abbr bid="B5">5</abbr>
</abbrgrp> we used for scoring the frequency matrices of U2 GT-AG, U12 GT-AG, and U12 AT-AC dependent introns from SpliceRack. As there is no matrix for branch sites of GT-AG U2 introns we created such a matrix in the following way. For every U2 intron of <it>T. spiralis </it>classified with the method from Sheth et al. we used the branch site which achieved the best score with a GT-AG U12 PWM. All the 14480 branch sites obtained in this way were used to construct a frequency matrix.</p>
<p>The scores for each splice or branch site were then computed as described in Burge et al. <abbrgrp>
<abbr bid="B5">5</abbr>
</abbrgrp>. The 5' splice site probability is defined as <inline-formula>
<graphic file="1471-2164-11-106-i1.gif"/>
</inline-formula> where the probability of base j in position i is <inline-formula>
<graphic file="1471-2164-11-106-i2.gif"/>
</inline-formula>, U is either U12 or U2 and X describes the sequence to be scored. To score the branch site, the values of <inline-formula>
<graphic file="1471-2164-11-106-i3.gif"/>
</inline-formula> and <inline-formula>
<graphic file="1471-2164-11-106-i4.gif"/>
</inline-formula> were calculated for each 13 nt segment in the range (-40,-5) relative to the 3' splice site and the maximum values of both calculations were retained. The complete 5' splice site scores and branch site scores are <inline-formula>
<graphic file="1471-2164-11-106-i5.gif"/>
</inline-formula> and <inline-formula>
<graphic file="1471-2164-11-106-i6.gif"/>
</inline-formula>, respectively. These two values were calculated for every intron found. The corresponding sample mean and standard deviation were determined and these scores were normalized to z scores <it>S</it>
<sub>5'<it>ss </it>
</sub>and <it>S</it>
<sub>
<it>bps </it>
</sub>by subtracting the sample mean and dividing by the standard deviation.</p>
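<p>The normalization to z scores can be sketched as follows. The sketch assumes the sample standard deviation with an n - 1 denominator, since the estimator is not specified above, and the function name z_scores is hypothetical.</p>
<p>
```python
from statistics import mean, stdev
from typing import List


def z_scores(raw: List[float]) -> List[float]:
    """Subtract the sample mean and divide by the sample standard deviation.
    Needs at least two values, and not all values may be identical."""
    mu = mean(raw)
    sigma = stdev(raw)   # n - 1 denominator (assumption, see lead-in)
    return [(x - mu) / sigma for x in raw]
```
</p>
<p>In this scheme the 5' splice site scores and the branch site scores would each be normalized separately over all introns of a species.</p>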
<p>After scoring all introns we tried to separate the putative U12 dependent introns from the U2 dependent ones with respect to their normalized scores. The lower thresholds for U12 type introns were empirically defined with respect to the minimum values of a reference set of minor introns which were used by Burge et al <abbrgrp>
<abbr bid="B5">5</abbr>
</abbrgrp>. The test criterion we used was the same as in Zhu and Brendel <abbrgrp>
<abbr bid="B10">10</abbr>
</abbrgrp>, and as discussed by these authors it is likely to be different from the test statistic <inline-formula>
<graphic file="1471-2164-11-106-i7.gif"/>
</inline-formula> originally used by Burge et al.</p>
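<p>The threshold test can be illustrated with a short sketch. It assumes that every intron, including those in the reference set, is represented by a pair of normalized scores; the names thresholds and is_u12 are hypothetical and the values in the usage note are invented for illustration.</p>
<p>
```python
from typing import List, Tuple

# (z score of the 5' splice site, z score of the branch site)
Scores = Tuple[float, float]


def thresholds(reference: List[Scores]) -> Scores:
    """Lower cutoffs: the minimum of each score over the reference set."""
    return (min(s[0] for s in reference), min(s[1] for s in reference))


def is_u12(intron: Scores, cutoff: Scores) -> bool:
    """An intron qualifies only if both scores reach their cutoff."""
    return intron[0] >= cutoff[0] and intron[1] >= cutoff[1]
```
</p>
<p>For example, with an invented reference set [(3.1, 2.4), (2.8, 3.0)] the cutoffs are (2.8, 2.4), so an intron scoring (2.9, 2.5) would qualify and one scoring (3.5, 1.0) would not.</p>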
<p>We also analyzed all predicted introns with respect to previously known consensus features of U12 introns as referred to by Russell et al <abbrgrp>
<abbr bid="B7">7</abbr>
</abbrgrp>. In addition, we took into consideration that effective splicing is thought to require the 5' splice site sequence RTATCCTT, where one of the Cs in positions +5 and +6 may be replaced by a T (Mikko Frilander, Helsinki, personal communication).</p>
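<p>The 5' splice site rule stated above translates directly into a pattern match. The sketch below is one possible encoding, with R expanded to A or G and the +5/+6 dinucleotide allowed to be CC, TC or CT but not TT. The function name conforms_to_5ss_rule is hypothetical.</p>
<p>
```python
import re

# R = A or G; positions +5 and +6 are CC, or have exactly one C replaced by T.
MINOR_5SS = re.compile(r"[AG]TAT(?:CC|TC|CT)TT")


def conforms_to_5ss_rule(intron_start: str) -> bool:
    """Test the first 8 nt of an intron (DNA or RNA alphabet, any case)."""
    sequence = intron_start.upper().replace("U", "T")
    return MINOR_5SS.match(sequence) is not None
```
</p>
<p>Applied to intron 5' ends listed in Table 1, ATATCCTTTC, GTATCCTTTG and GTATTCTTTC satisfy the pattern, whereas GTATTGTTTT and GTATTATTTT do not.</p>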
</sec>
<sec>
<st>
<p>Analysis of relationship between introns of <it>C. elegans </it>and <it>T. spiralis</it>
</p>
</st>
<p>For identifying introns in <it>C. elegans </it>that are homologous to the U12 introns in <it>T. spiralis </it>the <it>C. elegans </it>genome sequence (sequence number 198, release of January 13, 2009) was retrieved from Wormbase <url>ftp://ftp.wormbase.org/pub/wormbase/genomes/c_elegans/sequences/dna/</url>.</p>
<p>A <it>C. elegans </it>protein database (number 198, release of January 12, 2009) with 23962 proteins was also downloaded from Wormbase <url>ftp://ftp.wormbase.org/pub/wormbase/genomes/c_elegans/sequences/protein/</url>.</p>
<p>BLASTX was first used to identify <it>C. elegans </it>proteins corresponding to the <it>T. spiralis </it>U12 intron genes. Genomic positions of the corresponding gene in <it>C. elegans </it>were then inferred using TBLASTN.</p>
</sec>
</sec>
<sec>
<st>
<p>Results and Discussion</p>
</st>
<p>We have previously examined the phylogenetic distribution of U2 and U12 spliceosomal components <abbrgrp>
<abbr bid="B8">8</abbr>
</abbrgrp>. The results showed that U12 components were lacking in a number of phyla such as Nematoda, Choanoflagellida (Monosiga), Fungi, Mycetozoa, Entamoeba, red and green algae and Heterokonta. We have now examined the occurrence of U12 type introns in these groups.</p>
<sec>
<st>
<p>Identification of introns</p>
</st>
<p>Introns were identified by matching of ESTs to genomic sequences using BLAST <abbrgrp>
<abbr bid="B12">12</abbr>
</abbrgrp>. The size distribution of introns for selected species is shown in additional file <supplr sid="S1">1</supplr>. Mode values vary between 51 and 95 nt for most species examined here. The exception is <it>C. reinhardtii</it>, whose intron length distribution has a mode of approximately 195 nt, consistent with previous observations <abbrgrp>
<abbr bid="B13">13</abbr>
</abbrgrp>.</p>
<suppl id="S1">
<title>
<p>Additional file 1</p>
</title>
<text>
<p>
<b>Intron length statistics</b>. Upper panel: Distribution of lengths in the size range 1-300 nt for all introns (U2 and U12) of <it>T. spiralis</it>, <it>E. histolytica</it>, <it>A. castellanii</it>, <it>P. tricornutum</it>, <it>M. brevicollis</it>, and <it>C. reinhardtii</it>. Lower panel: Mean intron lengths for all species with U12 introns considered in this work.</p>
</text>
<file name="1471-2164-11-106-S1.PDF">
   <p>Click here for file</p>
</file>
</suppl>
<p>In order to discriminate between U2 and U12 introns we used methods described by Burge et al <abbrgrp>
<abbr bid="B5">5</abbr>
</abbrgrp>, Zhu and Brendel <abbrgrp>
<abbr bid="B10">10</abbr>
</abbrgrp> and Sheth et al <abbrgrp>
<abbr bid="B11">11</abbr>
</abbrgrp> as described under Methods. In the Burge et al method weight matrices were used to score 5' splice sites and branch sites. Normalized z scores for these sites were then obtained and used to produce plots like that shown in Fig. <figr fid="F1">1</figr> for <it>P. sojae</it>. In order to discriminate between U12 and U2 type introns we used a cutoff based on a reference set of U12 introns as used in Burge et al <abbrgrp>
<abbr bid="B5">5</abbr>
</abbrgrp>. Thus, for an intron to qualify as a U12 type both 5' splice site and branch site scores need to be at least the minimum values present in the reference U12 set of sequences <abbrgrp>
<abbr bid="B10">10</abbr>
</abbrgrp>. The plot in Fig. <figr fid="F1">1</figr> shows that in the case of <it>P. sojae </it>three different introns, one of the type GT-AG and two of the type AT-AC, fulfilled these criteria.</p>
<fig id="F1"><title><p>Figure 1</p></title><caption><p><it>Plot of splice site scores of introns identified in </it>Phytophthora sojae</p></caption><text>
   <p><b><it>Plot of splice site scores of introns identified in </it>Phytophthora sojae</b>. The scores of 5' splice sites and branch sites are compared to those of a reference set of U12 introns as used by Burge et al <abbrgrp><abbr bid="B5">5</abbr></abbrgrp> (within blue rectangle). Introns in <it>P. sojae </it>predicted to be of the U12 type have both 5' splice site and branch site scores equal to or larger than the minimum values of the reference set. Arrows indicate one GT-AG and two AT-AC introns.</p>
</text><graphic file="1471-2164-11-106-1" hint_layout="single"/></fig>
</sec>
<sec>
<st>
<p>U12 introns in <it>Trichinella spiralis</it>
</p>
</st>
<p>We examined in more detail the prediction of U12-type introns in the nematode <it>T. spiralis</it>. Molecular phylogenetic analysis places Trichinella close to the root in the nematode tree, i.e. more deeply branching than species such as <it>C. elegans </it>of the Rhabditida branch <abbrgrp>
<abbr bid="B14">14</abbr>
<abbr bid="B15">15</abbr>
<abbr bid="B16">16</abbr>
</abbrgrp> which is believed to lack U12 introns.</p>
<p>In <it>T. spiralis </it>we identified a total of 15402 introns. Out of these, 14866 were of type GT-AG, 218 were of type GC-AG and 8 were AT-AC type introns. In addition, 315 introns were identified with non-canonical terminal dinucleotides.</p>
<p>Using the method of Sheth et al <abbrgrp>
<abbr bid="B11">11</abbr>
</abbrgrp>, 13 U12 GT-AG introns and 3 U12 AT-AC introns were identified (Table <tblr tid="T1">1</tblr>). Minor introns are thought to have the 5' splice site sequence RTATCCTT, where one of the Cs in positions +5 and +6 may be converted into a T (Mikko Frilander, Helsinki, personal communication). Five of these 16 introns do not conform to this rule, leaving 11 introns that are stronger predictions.</p>
<tbl id="T1"><title><p>Table 1</p></title><caption><p>U12 type of introns identified in <it>T. spiralis</it>.</p></caption><tblbdy cols="8">
      <r>
         <c ca="left">
            <p>
               <b>EST</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>Genomic sequence</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>Intron length</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>B</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>R</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>Relationship to <it>C. elegans</it></b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b><it>C. elegans </it>orthologue</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>Gene function</b>
            </p>
         </c>
      </r>
      <r>
         <c>
            <p/>
         </c>
         <c ca="center">
            <p>
               <b>ATAC introns</b>
            </p>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
      </r>
      <r>
         <c cspan="8">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>EX501652.1</p>
         </c>
         <c ca="left">
            <p>
               <monospace>AGA|ATATCCTTTC...TTGGGCATTGTTAT<ul>ATTTCCTTAACG</ul>GGTATGGTTTAC|GTT</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>2095</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>+</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>+</monospace>
            </p>
         </c>
         <c ca="center">
            <p>U12->U2</p>
         </c>
         <c ca="center">
            <p>EEED8.7</p>
         </c>
         <c ca="center">
            <p>SR (splicing factor)</p>
         </c>
      </r>
      <r>
         <c>
            <p/>
         </c>
         <c ca="left">
            <p>
               <monospace>&#160;&#160;&#160;TyrIleGl&#160;&#160;&#160;&#160;&#160;&#160;&#160;nPheGlnGl&#160;&#160;&#160;&#160;&#160;&#160;&#160;nLeuLysAspAla</monospace>
            </p>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
      </r>
      <r>
         <c>
            <p/>
         </c>
         <c>
            <p>
               <monospace>Ts&#160;<ul>TACATAGA</ul>AT...AC<ul>GTTCGAAGA&#160;&#160;&#160;&#160;&#160;&#160;&#160;GCTGAAAGATGCT</ul></monospace>
            </p>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
      </r>
      <r>
         <c cspan="8">
            <p/>
         </c>
      </r>
      <r>
         <c>
            <p/>
         </c>
         <c ca="left">
            <p>
               <monospace>&#160;&#160;&#160;&#160;PheValAr&#160;&#160;&#160;&#160;&#160;&#160;gPheTyrGl&#160;&#160;&#160;&#160;&#160;&#160;&#160;nArgArgAspAla</monospace>
            </p>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
      </r>
      <r>
         <c>
            <p/>
         </c>
         <c>
            <p>
               <monospace>Ce&#160;&#160;<ul>TTCGTTAG&#160;&#160;&#160;&#160;&#160;&#160;ATTTTATGA</ul>GT...AG<ul>ACGTCGTGCTGCT</ul></monospace>
            </p>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>EX500683.1</p>
         </c>
         <c ca="left">
            <p>
               <monospace>ATT|ATATCCTTTC...AATTTC<ul>ATTTCCTTAACG</ul>TTAGATTTTTGTTGTTTTAC|TGA</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>94</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>+</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>+</monospace>
            </p>
         </c>
         <c ca="center">
            <p>U12 lost</p>
         </c>
         <c ca="center">
            <p>E04D5.1a</p>
         </c>
         <c ca="center">
            <p>NA</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>ES570647.1</p>
         </c>
         <c ca="left">
            <p>
               <monospace>ATC|ATATCCTTTC...GTATTGTTTGTA<ul>TTTTCCTTAACT</ul>TCATATGTTTTTAC|GTA</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>182</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>+</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>+</monospace>
            </p>
         </c>
         <c ca="center">
            <p>U12 lost</p>
         </c>
         <c ca="center">
            <p>Y37D8A.10</p>
         </c>
         <c ca="center">
            <p>Signal peptidase complex, subunit SPC25</p>
         </c>
      </r>
      <r>
         <c cspan="8">
            <hr/>
         </c>
      </r>
      <r>
         <c>
            <p/>
         </c>
         <c ca="center">
            <p>
               <b>GTAG introns</b>
            </p>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
      </r>
      <r>
         <c cspan="8">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>EX499999.1</p>
         </c>
         <c ca="left">
            <p>
               <monospace>Ts GAG|GTATCCTTTG...TTTTGTTTT<ul>TCTCTCTTTTTACA</ul>ATTATTATACAG|GCC</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>90</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>-</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>+</monospace>
            </p>
         </c>
         <c ca="center">
            <p>U12->U2</p>
         </c>
         <c ca="center">
            <p>F10F2.1</p>
         </c>
         <c ca="center">
            <p>PH, BEACH and WD40 domain-containing protein</p>
         </c>
      </r>
      <r>
         <c>
            <p/>
         </c>
         <c ca="left">
            <p>
               <monospace>Ce GAG|GTTTGAAACA...TTTTAATATTGAACTAAAATTTTTGAATTTTCCAG|GCG</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>64</monospace>
            </p>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>ES570692.1</p>
         </c>
         <c ca="left">
            <p>
               <monospace>Ts TAG|GTATTGTTTT...TGCTACAAGGAATTTTTTT<ul>ATTGCTTTGATT</ul>TTAG|AGT</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>617</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>-</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>-</monospace>
            </p>
         </c>
         <c ca="center">
            <p>U12->U2</p>
         </c>
         <c ca="center">
            <p>F40F8.10</p>
         </c>
         <c ca="center">
            <p>Small ribosomal subunit S9 protein</p>
         </c>
      </r>
      <r>
         <c>
            <p/>
         </c>
         <c ca="left">
            <p>
               <monospace>Ce CCG|GTTTAGTTTT...AAGATTAGTATCGACTTCAAATTCTTCTCTTTCAG|TGT</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>291</monospace>
            </p>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>ES561213.1</p>
         </c>
         <c ca="left">
            <p>
               <monospace>Ts TCG|GTATTATTTT...CATATTAATCGTTT<ul>CATTTCTTAATG</ul>TATTTTTAG|TGG</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>54</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>-</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>-</monospace>
            </p>
         </c>
         <c ca="center">
            <p>U12->U2</p>
         </c>
         <c ca="center">
            <p>ZC395.10</p>
         </c>
         <c ca="center">
            <p>NA</p>
         </c>
      </r>
      <r>
         <c>
            <p/>
         </c>
         <c ca="left">
            <p>
               <monospace>Ce CCA|GTACGTTTCG...ACATAGAATGAGTCGTAATTCGTAAATTTTCAGAG|GAA</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>150</monospace>
            </p>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>EX500486.1</p>
         </c>
         <c ca="left">
            <p>
               <monospace>Ts TCG|GTATTCTTTC...TAATATGTTTTTCT<ul>TTTTTTTCAACT</ul>TATTTTAAG|ATT</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>87</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>-</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>+</monospace>
            </p>
         </c>
         <c ca="center">
            <p>U12->U2</p>
         </c>
         <c ca="center">
            <p>ZC328.3a</p>
         </c>
         <c ca="center">
            <p>NA</p>
         </c>
      </r>
      <r>
         <c>
            <p/>
         </c>
         <c ca="left">
            <p>
               <monospace>Ce CAT|GTGAGTTTCA...TCCTGAATTTATTCAAGTTTCAACCACATTTCCAG|CAT</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>758</monospace>
            </p>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>ES569928.1</p>
         </c>
         <c ca="left">
            <p>
               <monospace>ATG|GTATTCTTTT...<ul>ATTTCCATTACA</ul>AAATTACAACCGCGTTGTTCTTTCAG|TGC</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>107</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>+</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>+</monospace>
            </p>
         </c>
         <c ca="center">
            <p>Not known</p>
         </c>
         <c ca="center">
            <p>Y82E9BR.15</p>
         </c>
         <c ca="center">
            <p>Transcription elongation factor B</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>ES565768.1</p>
         </c>
         <c ca="left">
            <p>
               <monospace>CAG|GTATTCTTTT...CAAATTTTGGAAAAATTCT<ul>TTTTTTTTAATC</ul>CGAACAG|GTA</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>94</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>-</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>+</monospace>
            </p>
         </c>
         <c ca="center">
            <p>Not known</p>
         </c>
         <c ca="center">
            <p>C34D4.4a</p>
         </c>
         <c ca="center">
            <p>NA</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>ES562099.1</p>
         </c>
         <c ca="left">
            <p>
               <monospace>AAT|GTATCCTTAA...TGTATGAGGTTTGGTATTT<ul>CTGATTTTAATC</ul>ATTTTAG|TGT</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>50</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>-</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>+</monospace>
            </p>
         </c>
         <c ca="center">
            <p>Not known</p>
         </c>
         <c ca="center">
            <p>R07E5.14</p>
         </c>
         <c ca="center">
            <p>RRM (RNA binding domain) containing</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>ES563059.1</p>
         </c>
         <c ca="left">
            <p>
               <monospace>GCG|GTATCTTTTC...TATTTATAACTGAATCG<ul>TTTTTATTAATA</ul>ATTTTTTAG|AGT</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>54</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>-</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>+</monospace>
            </p>
         </c>
         <c ca="center">
            <p>Not known</p>
         </c>
         <c ca="center">
            <p>M04F3.4</p>
         </c>
         <c ca="center">
            <p>NA</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>BQ738460.1</p>
         </c>
         <c ca="left">
            <p>
               <monospace>ACG|GTATCGTTCA...TCAATTTTTTTAAAAGTA<ul>ATTTTCTTCATA</ul>TATTTTAG|AAC</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>72</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>-</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>-</monospace>
            </p>
         </c>
         <c ca="center">
            <p>U12 lost/Not known</p>
         </c>
         <c ca="center">
            <p>Y56A3A.36</p>
         </c>
         <c ca="center">
            <p>NA</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>ES566079.1</p>
         </c>
         <c ca="left">
            <p>
               <monospace>TGG|GTATCGTTCG...ATTAACTAACACT<ul>TTGAAGTTGACA</ul>AGTGAATGTTTAG|GAT</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>140</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>-</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>-</monospace>
            </p>
         </c>
         <c ca="center">
            <p>Not known</p>
         </c>
         <c ca="center">
            <p>M02B7.4</p>
         </c>
         <c ca="center">
            <p>beta-1,4-N-acetylglucosaminyl transferase</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>EX500543.1</p>
         </c>
         <c ca="left">
            <p>
               <monospace>TCG|GTATTCTTTG...<ul>TTATTATTAATT</ul>TCTGTTTTTTTTGGTTTTCTAAACAG|AGA</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>86</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>-</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>+</monospace>
            </p>
         </c>
         <c>
            <p/>
         </c>
         <c ca="center">
            <p>None</p>
         </c>
         <c>
            <p/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>ES561535.1</p>
         </c>
         <c ca="left">
            <p>
               <monospace>GGG|GTATTATTTT...<ul>TTTTCTGTGATT</ul>TAATTGCATTTTAATGTTCTATCTAG|TGA</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>71</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>-</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>-</monospace>
            </p>
         </c>
         <c>
            <p/>
         </c>
         <c ca="center">
            <p>None</p>
         </c>
         <c>
            <p/>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>BQ738918.1</p>
         </c>
         <c ca="left">
            <p>
               <monospace>GAA|GTATCTTTTA...TGAATTTTGCTAAA<ul>TTGTACTTAACA</ul>GGTTGTTTTTAG|AAA</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>153</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>-</monospace>
            </p>
         </c>
         <c ca="center">
            <p>
               <monospace>+</monospace>
            </p>
         </c>
         <c>
            <p/>
         </c>
         <c ca="center">
            <p>None</p>
         </c>
         <c>
            <p/>
         </c>
      </r>
   </tblbdy><tblfn>
      <p>The table shows all introns identified by the Sheth et al method <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>. The region with the best match to the branch site PWM is underlined. For one of the AT-AC introns (EST EX501652.1), the shift in 5' splice site observed between <it>T. spiralis </it>and <it>C. elegans </it>is shown. The 5' splice site rule (R) is that the 5' splice site sequence is RTATCCTT, where one of the Cs in positions +5 and +6 may be replaced by a T. The Burge et al method (B) is described in <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. NA = not available. For sequences of the U2 and U12 introns listed in the table, see additional file <supplr sid="S2">2</supplr>.</p>
   </tblfn></tbl>
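<p>The 5' splice site rule given in the footnote of Table 1 can be illustrated with a short sketch. This is not the code used in this work; the function name and the regular expression are assumptions made for illustration. The rule is read here as R = A or G, with positions +5 and +6 being CC, TC or CT (both Cs replaced by T is not accepted).</p>

```python
import re

# Assumed reading of the rule RTATCCTT:
#   R = purine (A or G) at intron position +1,
#   positions +5/+6 are CC, or exactly one C replaced by T (TC or CT).
FIVE_PRIME_RULE = re.compile(r"[AG]TAT(?:CC|TC|CT)TT")


def matches_five_prime_rule(intron_start: str) -> bool:
    """Return True if the first 8 nt of the intron conform to the rule."""
    return FIVE_PRIME_RULE.match(intron_start.upper()) is not None


# Intron 5' ends copied from the sequence column of Table 1.
examples = ["GTATTCTTTT", "GTATCCTTAA", "GTATCTTTTC", "GTATTGTTTT", "GTATCGTTCA"]
for seq in examples:
    print(seq, matches_five_prime_rule(seq))
```

<p>With this reading, the first three example sequences conform to the rule and the last two do not, which agrees with the + and - entries for the rule in the corresponding rows of Table 1.</p>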
<p>To study the fate of <it>T. spiralis </it>U12 introns, we examined the homologous genes in <it>C. elegans</it>, where U12 introns are believed to be absent. Of the 16 EST sequences that we identified using the Sheth et al method as being associated with U12 type introns, 3 had no matches to entries in protein databases and could not be associated with a <it>C. elegans </it>gene. Another 6 ESTs matched only partially to the homologous <it>C. elegans </it>gene, and for these we were not able to examine the fate of the <it>T. spiralis </it>intron. For the remaining ESTs we were able to compare the <it>T. spiralis </it>U12 intron to the corresponding <it>C. elegans </it>intron. Four of these had identical splice sites in the two species, as shown in Table <tblr tid="T1">1</tblr>, and in all these cases the intron has changed from U12 to U2 type in <it>C. elegans</it>.</p>
<p>For two of the <it>T. spiralis </it>AT-AC-type U12 introns, the intron was completely lost in <it>C. elegans </it>(Table <tblr tid="T1">1</tblr>). It is interesting to note that for the third U12 AT-AC intron, the change from U12 to U2 type is accomplished by a shift of the splice site (Table <tblr tid="T1">1</tblr>, EST EX501652.1), such that the intron is moved a distance corresponding to three amino acids in the coding sequence. We therefore observe yet another mechanism whereby a U12 intron may be converted to a U2 type intron.</p>
<p>Finally, we also used the method of Burge et al <abbrgrp>
<abbr bid="B5">5</abbr>
</abbrgrp> to predict U12 type introns in <it>T. spiralis</it>. Fewer U12 type introns were found using this method: three AT-AC introns and one GT-AG U12 intron (Table <tblr tid="T1">1</tblr>). For the other species examined, the Sheth et al method also generated more U12 candidates than the Burge et al method.</p>
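<p>The AT-AC and GT-AG labels used above refer to the terminal dinucleotides of the intron. As a minimal sketch, these can be read from a sequence entry written as in Table 1, assuming the notation "exon|intron...intron|exon" in which each "|" marks a splice site. The function name is hypothetical, and the sketch only reports the intron boundaries; it does not distinguish U12 from U2 type introns.</p>

```python
def boundary_type(entry: str) -> str:
    """Return the intron's terminal dinucleotides, e.g. 'GT-AG'.

    Assumes the entry has the form 'exon|intron...intron|exon',
    i.e. exactly two '|' characters marking the 5' and 3' splice sites.
    """
    parts = entry.split("|")
    if len(parts) != 3:
        raise ValueError("expected exactly two '|' splice site markers")
    intron = parts[1].upper()
    # First two and last two nucleotides of the intron.
    return intron[:2] + "-" + intron[-2:]


# Entries copied from Table 1 (underlining of the branch site omitted).
print(boundary_type("CAG|GTATTCTTTT...CAAATTTTGGAAAAATTCTTTTTTTTTAATCCGAACAG|GTA"))
print(boundary_type("AAT|GTATCCTTAA...TGTATGAGGTTTGGTATTTCTGATTTTAATCATTTTAG|TGT"))
```

<p>Both entries shown here yield GT-AG, since the intron begins with GT and ends with AG.</p>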
</sec>
<sec>
<st>
<p>U12 introns as predicted by requiring effective branch sites as well as 5' splice sites</p>
</st>
<p>In addition to <it>Trichinella spiralis</it>, we examined the introns in the choanoflagellate <it>Monosiga brevicollis</it>, in the Zygomycota <it>Rhizopus oryzae </it>and <it>Phycomyces blakesleeanus</it>, in the Basidiomycota <it>Phakopsora pachyrhizi</it>, in <it>Acanthamoeba castellanii</it>, <it>Entamoeba histolytica, Physarum polycephalum</it>, in the green alga <it>Chlamydomonas reinhardtii </it>and in the heterokonts <it>Phytophthora infestans</it>, <it>Phytophthora sojae</it>, <it>Thalassiosira pseudonana </it>and <it>Phaeodactylum tricornutum </it>(Table <tblr tid="T2">2</tblr> and Fig. <figr fid="F2">2</figr>). Introns were predicted using the same methods as described above for <it>T. spiralis </it>introns.</p>
<tbl id="T2"><title><p>Table 2</p></title><caption><p>Summary of U12 introns identified in a range of eukaryotic species.</p></caption><tblbdy cols="13">
      <r>
         <c>
            <p/>
         </c>
         <c ca="center">
            <p>
               <b>Number of ESTs analyzed</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>U12 spliceosomal RNAs</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>U12 AT-AC introns</b>
            </p>
            <p>
               <b>(5' rule)</b>
            </p>
         </c>
         <c>
            <p/>
         </c>
         <c ca="center">
            <p>
               <b>U12 GT-AG introns</b>
            </p>
            <p>
               <b>(5' rule)</b>
            </p>
         </c>
         <c>
            <p/>
         </c>
         <c ca="center">
            <p>
               <b>U2 GC-AG introns</b>
            </p>
         </c>
         <c>
            <p/>
         </c>
         <c ca="center">
            <p>
               <b>U2 GT-AG introns</b>
            </p>
         </c>
         <c>
            <p/>
         </c>
         <c ca="center">
            <p>
               <b>Other AT-AC introns</b>
            </p>
         </c>
         <c>
            <p/>
         </c>
      </r>
      <r>
         <c cspan="13">
            <hr/>
         </c>
      </r>
      <r>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c>
            <p/>
         </c>
         <c ca="center">
            <p>B</p>
         </c>
         <c ca="center">
            <p>S</p>
         </c>
         <c ca="center">
            <p>B</p>
         </c>
         <c ca="center">
            <p>S</p>
         </c>
         <c ca="center">
            <p>B</p>
         </c>
         <c ca="center">
            <p>S</p>
         </c>
         <c ca="center">
            <p>B</p>
         </c>
         <c ca="center">
            <p>S</p>
         </c>
         <c ca="center">
            <p>B</p>
         </c>
         <c ca="center">
            <p>S</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>Trichinella spiralis</it>
            </p>
         </c>
         <c ca="center">
            <p>25,268</p>
         </c>
         <c ca="center">
            <p>+</p>
         </c>
         <c ca="center">
            <p>3 (3)</p>
         </c>
         <c ca="center">
            <p>3 (3)</p>
         </c>
         <c ca="center">
            <p>1 (1)</p>
         </c>
         <c ca="center">
            <p>13 (8)</p>
         </c>
         <c ca="center">
            <p>217</p>
         </c>
         <c ca="center">
            <p>217</p>
         </c>
         <c ca="center">
            <p>14697</p>
         </c>
         <c ca="center">
            <p>14685</p>
         </c>
         <c ca="center">
            <p>5</p>
         </c>
         <c ca="center">
            <p>5</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>Monosiga brevicollis</it>
            </p>
         </c>
         <c ca="center">
            <p>29,495</p>
         </c>
         <c ca="center">
            <p>-</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>1 (0)</p>
         </c>
         <c ca="center">
            <p>134</p>
         </c>
         <c ca="center">
            <p>134</p>
         </c>
         <c ca="center">
            <p>13327</p>
         </c>
         <c ca="center">
            <p>13326</p>
         </c>
         <c ca="center">
            <p>10</p>
         </c>
         <c ca="center">
            <p>10</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>Rhizopus oryzae</it>
            </p>
         </c>
         <c ca="center">
            <p>13,313</p>
         </c>
         <c ca="center">
            <p>+</p>
         </c>
         <c ca="center">
            <p>1 (1)</p>
         </c>
         <c ca="center">
            <p>1 (1)</p>
         </c>
         <c ca="center">
            <p>2 (2)</p>
         </c>
         <c ca="center">
            <p>3 (2)</p>
         </c>
         <c ca="center">
            <p>71</p>
         </c>
         <c ca="center">
            <p>71</p>
         </c>
         <c ca="center">
            <p>5520</p>
         </c>
         <c ca="center">
            <p>5519</p>
         </c>
         <c ca="center">
            <p>3</p>
         </c>
         <c ca="center">
            <p>3</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>Phycomyces blakesleeanus</it>
            </p>
         </c>
         <c ca="center">
            <p>47,847</p>
         </c>
         <c ca="center">
            <p>+</p>
         </c>
         <c ca="center">
            <p>9 (8)</p>
         </c>
         <c ca="center">
            <p>8 (8)</p>
         </c>
         <c ca="center">
            <p>8 (8)</p>
         </c>
         <c ca="center">
            <p>12 (10)</p>
         </c>
         <c ca="center">
            <p>446</p>
         </c>
         <c ca="center">
            <p>446</p>
         </c>
         <c ca="center">
            <p>13504</p>
         </c>
         <c ca="center">
            <p>13500</p>
         </c>
         <c ca="center">
            <p>31</p>
         </c>
         <c ca="center">
            <p>32</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>Phakopsora pachyrhizi</it>
            </p>
         </c>
         <c ca="center">
            <p>34,394</p>
         </c>
         <c ca="center">
            <p>-?</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>1 (0)</p>
         </c>
         <c ca="center">
            <p>10</p>
         </c>
         <c ca="center">
            <p>10</p>
         </c>
         <c ca="center">
            <p>561</p>
         </c>
         <c ca="center">
            <p>560</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>Acanthamoeba castellanii</it>
            </p>
         </c>
         <c ca="center">
            <p>13,784</p>
         </c>
         <c ca="center">
            <p>+</p>
         </c>
         <c ca="center">
            <p>1 (0)</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>1 (1)</p>
         </c>
         <c ca="center">
            <p>3 (0)</p>
         </c>
         <c ca="center">
            <p>21</p>
         </c>
         <c ca="center">
            <p>21</p>
         </c>
         <c ca="center">
            <p>1232</p>
         </c>
         <c ca="center">
            <p>1230</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
         <c ca="center">
            <p>2</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>Entamoeba histolytica</it>
            </p>
         </c>
         <c ca="center">
            <p>14,388</p>
         </c>
         <c ca="center">
            <p>-</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>160</p>
         </c>
         <c ca="center">
            <p>160</p>
         </c>
         <c ca="center">
            <p>4</p>
         </c>
         <c ca="center">
            <p>4</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>Physarum polycephalum</it>
            </p>
         </c>
         <c ca="center">
            <p>25,393</p>
         </c>
         <c ca="center">
            <p>+</p>
         </c>
         <c ca="center">
            <p>83 (14)</p>
         </c>
         <c ca="center">
            <p>27 (15)</p>
         </c>
         <c ca="center">
            <p>88 (57)</p>
         </c>
         <c ca="center">
            <p>218 (109)</p>
         </c>
         <c ca="center">
            <p>34</p>
         </c>
         <c ca="center">
            <p>34</p>
         </c>
         <c ca="center">
            <p>6452</p>
         </c>
         <c ca="center">
            <p>6326</p>
         </c>
         <c ca="center">
            <p>251</p>
         </c>
         <c ca="center">
            <p>307</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>Chlamydomonas reinhardtii</it>
            </p>
         </c>
         <c ca="center">
            <p>202,044</p>
         </c>
         <c ca="center">
            <p>-</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>532</p>
         </c>
         <c ca="center">
            <p>532</p>
         </c>
         <c ca="center">
            <p>25053</p>
         </c>
         <c ca="center">
            <p>25053</p>
         </c>
         <c ca="center">
            <p>7</p>
         </c>
         <c ca="center">
            <p>7</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>Phytophthora infestans</it>
            </p>
         </c>
         <c ca="center">
            <p>94,091</p>
         </c>
         <c ca="center">
            <p>+</p>
         </c>
         <c ca="center">
            <p>2 (2)</p>
         </c>
         <c ca="center">
            <p>2 (2)</p>
         </c>
         <c ca="center">
            <p>1 (1)</p>
         </c>
         <c ca="center">
            <p>1 (1)</p>
         </c>
         <c ca="center">
            <p>66</p>
         </c>
         <c ca="center">
            <p>66</p>
         </c>
         <c ca="center">
            <p>6601</p>
         </c>
         <c ca="center">
            <p>6601</p>
         </c>
         <c ca="center">
            <p>9</p>
         </c>
         <c ca="center">
            <p>9</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>Phytophthora sojae</it>
            </p>
         </c>
         <c ca="center">
            <p>28,467</p>
         </c>
         <c ca="center">
            <p>+</p>
         </c>
         <c ca="center">
            <p>2 (2)</p>
         </c>
         <c ca="center">
            <p>2 (2)</p>
         </c>
         <c ca="center">
            <p>1 (1)</p>
         </c>
         <c ca="center">
            <p>1 (1)</p>
         </c>
         <c ca="center">
            <p>34</p>
         </c>
         <c ca="center">
            <p>34</p>
         </c>
         <c ca="center">
            <p>3351</p>
         </c>
         <c ca="center">
            <p>3351</p>
         </c>
         <c ca="center">
            <p>5</p>
         </c>
         <c ca="center">
            <p>5</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>Thalassiosira pseudonana</it>
            </p>
         </c>
         <c ca="center">
            <p>61,913</p>
         </c>
         <c ca="center">
            <p>-</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>37</p>
         </c>
         <c ca="center">
            <p>37</p>
         </c>
         <c ca="center">
            <p>5140</p>
         </c>
         <c ca="center">
            <p>5140</p>
         </c>
         <c ca="center">
            <p>13</p>
         </c>
         <c ca="center">
            <p>13</p>
         </c>
      </r>
      <r>
         <c ca="left">
            <p>
               <it>Phaeodactylum tricornutum</it>
            </p>
         </c>
         <c ca="center">
            <p>133,871</p>
         </c>
         <c ca="center">
            <p>-</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>0</p>
         </c>
         <c ca="center">
            <p>23</p>
         </c>
         <c ca="center">
            <p>23</p>
         </c>
         <c ca="center">
            <p>3815</p>
         </c>
         <c ca="center">
            <p>3815</p>
         </c>
         <c ca="center">
            <p>7</p>
         </c>
         <c ca="center">
            <p>7</p>
         </c>
      </r>
   </tblbdy><tblfn>
      <p>The table shows predictions according to the methods of Burge et al (B) and Sheth et al (S) <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>. The "5' rule" is that the 5' splice site sequence is RTATCCTT, where one of the Cs in positions +5 and +6 may be replaced by a T. Numbers in parentheses are the introns that also conform to the 5' rule. The occurrence of U12 snRNAs is from Davila-Lopez et al <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>. For sequences of U12 introns as predicted by the Burge et al method, see additional file <supplr sid="S2">2</supplr>.</p>
   </tblfn></tbl>
<fig id="F2"><title><p>Figure 2</p></title><caption><p>Schematic phylogenetic tree showing instances where U12 introns were lost</p></caption><text>
   <p><b>Schematic phylogenetic tree showing instances where U12 introns were lost</b>. Presence or absence of U12 snRNAs and U12 introns is shown, as well as paths where U12 splicing seems to have been lost (dashed lines). The figure is based on previous information regarding the phylogenetic distribution of U12 snRNAs <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr></abbrgrp> as well as the results on U12 intron distribution described here.</p>
</text><graphic file="1471-2164-11-106-2" hint_layout="single"/></fig>
<p>Previously, U12 introns have been reported in some of these species. Thus, Russell et al <abbrgrp>
<abbr bid="B17">17</abbr>
</abbrgrp> reported three AT-AC and two GT-AG type introns in <it>A. castellanii</it>, one AT-AC intron in <it>R. oryzae </it>and one AT-AC intron in the peptidyl-prolyl isomerase genes of <it>P. sojae </it>and <it>P. ramorum</it>. It should be noted that Russell et al used a pattern-based approach to identify U12 type introns. This method is not expected to be as accurate as the prediction carried out here, which is based on position weight matrices. Two U12 introns were identified by Gl&#246;ckner et al <abbrgrp>
<abbr bid="B18">18</abbr>
</abbrgrp> in <it>P. polycephalum</it>, although it is not clear what method was used to classify the introns in that case.</p>
<p>In <it>A. castellanii </it>we identified three U12 introns (Table <tblr tid="T2">2</tblr>). One of them is the AT-AC U12 intron in the gene COMM7 previously found by Russell et al <abbrgrp>
<abbr bid="B17">17</abbr>
</abbrgrp>. We also identified a GT-AG intron present in the gene for a mitochondrial carnitine acylcarnitine carrier protein. In addition, there is evidence of a GC-AG U12 intron, present in a gene encoding a lipid transfer protein.</p>
<p>A multiple alignment of all available <it>A. castellanii </it>U12 introns, i.e. those identified previously <abbrgrp>
<abbr bid="B17">17</abbr>
</abbrgrp>, together with the two additional introns identified here, shows that there is a C-rich region in all these sequences downstream of the consensus 5' splice site sequence (Fig. <figr fid="F3">3</figr>). We do not know if this sequence conservation is functionally significant, but it seems specific to <it>A. castellanii</it>, as it is not found in other species with U12 introns such as <it>Physarum</it>.</p>
<fig id="F3"><title><p>Figure 3</p></title><caption><p>Conserved sequence elements of U12 introns in A. castellanii</p></caption><text>
   <p><b>Conserved sequence elements of U12 introns in A. castellanii</b>. Position 6 in the alignment corresponds to the 5' terminal position of the intron. The majority of these introns were identified by Russell et al <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>, whereas the introns of the lipid transfer protein and carnitine/acylcarnitine carrier protein genes were identified in this work.</p>
</text><graphic file="1471-2164-11-106-3" hint_layout="single"/></fig>
<p>In <it>P. sojae </it>we identified a previously reported AT-AC intron which is present in a gene encoding peptidyl-prolyl isomerase <abbrgrp>
<abbr bid="B17">17</abbr>
</abbrgrp>. In addition, one other AT-AC intron, as well as a GT-AG U12 intron in the ribosomal protein L31 gene, are present in this species. In <it>P. infestans </it>we identified the introns homologous to the peptidyl-prolyl isomerase and L31 introns in <it>P. sojae</it>, as well as an additional AT-AC intron.</p>
<p>In <it>R. oryzae </it>we found the same AT-AC intron previously reported <abbrgrp>
<abbr bid="B17">17</abbr>
</abbrgrp>, as well as two additional GT-AG introns.</p>
<p>A very large number of U12 introns were predicted in <it>P. polycephalum </it>(Table <tblr tid="T2">2</tblr>). At the same time, this collection does not include any of the introns reported by Glockner et al. <abbrgrp>
<abbr bid="B18">18</abbr>
</abbrgrp>. Perhaps U12 introns are particularly abundant in this species. Alternatively, our prediction method may give rise to an unusually large number of false positives in <it>P. polycephalum</it>. On the other hand, many of the predicted U12 introns may be regarded as strong predictions, as they also conform to the 5' consensus rule. It is therefore highly likely that U12 introns exist in this species.</p>
<p>In summary, the phylogenetic distribution of U12 introns is entirely consistent with the distribution of U12 snRNAs <abbrgrp>
<abbr bid="B8">8</abbr>
</abbrgrp>. This is illustrated in the schematic phylogenetic tree in Fig. <figr fid="F2">2</figr>. There are at least nine different branches that are associated with a loss of U12 splicing. When more genomic and EST sequences become available even more instances of such loss may be observed.</p>
<p>When comparing the distribution of U12 snRNAs and U12 introns, one potential discrepancy is <it>P. pachyrhizi</it>, where U12 introns seem to be missing although we have identified a U4atac snRNA. However, no other U12-type snRNA was found, and it therefore seems likely that this species lacks U12-dependent splicing. The U4atac snRNA observed could be a non-functional remnant of the U12 splicing machinery of an ancestral species.</p>
<p>There is also a weak candidate for a U12 AT-AG intron in <it>T. pseudonana </it>(data not shown). However, as this is the only U12 intron predicted in this species, and because we have failed to identify any U12 snRNAs in it, the evidence of U12 splicing is so far very poor.</p>
</sec>
<sec>
<st>
<p>U12 introns in ribosomal protein genes</p>
</st>
<p>Although U12 introns are very rare, they occur in ribosomal protein genes in five of the species examined here: <it>R. oryzae </it>(S13), <it>P. blakesleeanus </it>(S13), <it>P. sojae </it>(L31), <it>P. infestans </it>(L31) and <it>P. polycephalum </it>(L4).</p>
<p>In the Zygomycota <it>R. oryzae </it>there are at least three non-identical versions of the S13 gene. These genes encode nearly identical proteins (see additional file <supplr sid="S2">2</supplr>). One of the genes has no intron at all, and the other two have U12 GT-AG introns towards the 3' end of the coding sequence. In <it>P. blakesleeanus</it>, another Zygomycota, we have identified one S13 gene. This gene has a U12 intron in the same position as in the <it>R. oryzae </it>genes. More S13 genes might be found in this species once genome sequencing is complete.</p>
<suppl id="S2">
<title>
<p>Additional file 2</p>
</title>
<text>
<p>
<b>Sequences of introns</b>. Sequences of introns referred to in Tables <tblr tid="T1">1</tblr> and <tblr tid="T2">2</tblr>. Sequences of <it>R. oryzae </it>ribosomal protein S13 genes and introns.</p>
</text>
<file name="1471-2164-11-106-S2.PDF">
   <p>Click here for file</p>
</file>
</suppl>
<p>By comparison, the S13 gene in the Basidiomycota <it>P. pachyrhizi </it>has two different introns, and neither of them is of the U12 type or in the same position as the <it>R. oryzae </it>S13 intron.</p>
<p>
<it>P. sojae</it>, <it>P. infestans </it>and <it>P. ramorum </it>of the Oomycetes group all have an L31 gene with a U12 GT-AG intron positioned towards the 5' end of the coding sequence. By comparison, the L31 genes in the diatoms <it>T. pseudonana </it>and <it>P. tricornutum </it>seem to be missing introns.</p>
<p>The presence of U12 introns in ribosomal protein genes may be of significance from a regulatory point of view. Ribosomal protein genes have previously been reported to be subject to U12 splicing. For example, there is a U12 intron in the gene for ribosomal protein L1 in <it>X. laevis </it>
<abbrgrp>
<abbr bid="B19">19</abbr>
<abbr bid="B20">20</abbr>
<abbr bid="B21">21</abbr>
<abbr bid="B22">22</abbr>
<abbr bid="B23">23</abbr>
</abbrgrp>. This intron is spliced with low efficiency, indicating that it might be involved in the regulation of L1 expression. There is evidence that splicing of U12 introns is comparatively inefficient and can be a rate-limiting step in gene expression <abbrgrp>
<abbr bid="B24">24</abbr>
</abbrgrp>.</p>
<p>The presence of U12 introns in ribosomal protein genes may also be relevant to the observation that in the yeast <it>S. cerevisiae</it>, where introns are rare, ribosomal protein genes are a predominant class of intron-containing genes <abbrgrp>
<abbr bid="B25">25</abbr>
<abbr bid="B26">26</abbr>
</abbrgrp>. The splicing of introns in <it>S. cerevisiae </it>ribosomal protein genes is presumably of great regulatory significance <abbrgrp>
<abbr bid="B27">27</abbr>
</abbrgrp>. For instance, there is an autoregulatory mechanism of L30 where the protein inhibits the splicing of its own pre-mRNA <abbrgrp>
<abbr bid="B28">28</abbr>
</abbrgrp>. There is also evidence that yeast ribosomal protein paralogues are different in terms of splicing regulation <abbrgrp>
<abbr bid="B29">29</abbr>
</abbrgrp>.</p>
</sec>
</sec>
<sec>
<st>
<p>Conclusions</p>
</st>
<p>The presence of U2 and U12 introns has been examined in a number of species that we previously screened for U2 and U12 spliceosomal RNAs. In most species where U12 introns are found, such introns are very rare. The phylogenetic distribution of U12 introns is entirely consistent with the distribution of U12 spliceosomal RNAs. The currently available information on U12 introns and U12 spliceosomal components presents strong evidence that U12 splicing is missing in a number of phyla and species: in the Caenorhabditis branch, in Monosiga, in Microsporidia, in Basidiomycota, Ascomycota and Pezizomycotina, in Dictyostelium and Entamoeba, in the red and green algae and in the diatoms <it>T. pseudonana </it>and <it>P. tricornutum</it>. This would correspond to at least nine different occasions during evolution where U12 splicing seems to have been lost (Fig. <figr fid="F2">2</figr>).</p>
<p>We have examined in more detail the occurrence of introns in the nematodes <it>T. spiralis </it>and <it>C. elegans</it>. As these two species are relatively closely related, they offer a unique possibility to monitor the process by which U12 splicing is lost. By comparing <it>T. spiralis </it>U12 introns to their homologues in <it>C. elegans</it>, we noted that U12 introns were eliminated by different mechanisms. In some cases U12 introns were lost completely. Other U12 introns were subject to extensive sequence changes, including changes in the 5', 3' and branch site regions of the introns. In one case a U12 to U2 conversion was achieved by shifting the splice position only a short distance.</p>
</sec>
<sec>
<st>
<p>Authors' contributions</p>
</st>
<p>TS conceived of the study and drafted the manuscript. SB carried out bioinformatics analyses and helped to draft the manuscript. All authors read and approved the final manuscript.</p>
</sec>
</bdy><bm>
<ack>
<sec>
<st>
<p>Acknowledgements</p>
</st>
<p>This work was supported by the Erik Philip-S&#246;rensen foundation.</p>
</sec>
</ack>
<refgrp><bibl id="B1"><title><p>The spliceosome: the most complex macromolecular machine in the cell?</p></title><aug><au><snm>Nilsen</snm><fnm>TW</fnm></au></aug><source>Bioessays</source><pubdate>2003</pubdate><volume>25</volume><issue>12</issue><fpage>1147</fpage><lpage>1149</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/bies.10394</pubid><pubid idtype="pmpid" link="fulltext">14635248</pubid></pubidlist></xrefbib></bibl><bibl id="B2"><title><p>Highly diverged U4 and U6 small nuclear RNAs required for splicing rare AT-AC introns</p></title><aug><au><snm>Tarn</snm><fnm>WY</fnm></au><au><snm>Steitz</snm><fnm>JA</fnm></au></aug><source>Science</source><pubdate>1996</pubdate><volume>273</volume><issue>5283</issue><fpage>1824</fpage><lpage>1832</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.273.5283.1824</pubid><pubid idtype="pmpid" link="fulltext">8791582</pubid></pubidlist></xrefbib></bibl><bibl id="B3"><title><p>A novel spliceosome containing U11, U12, and U5 snRNPs excises a minor class (AT-AC) intron in vitro</p></title><aug><au><snm>Tarn</snm><fnm>WY</fnm></au><au><snm>Steitz</snm><fnm>JA</fnm></au></aug><source>Cell</source><pubdate>1996</pubdate><volume>84</volume><issue>5</issue><fpage>801</fpage><lpage>811</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/S0092-8674(00)81057-0</pubid><pubid idtype="pmpid" link="fulltext">8625417</pubid></pubidlist></xrefbib></bibl><bibl id="B4"><title><p>Splicing of a rare class of introns by the U12-dependent spliceosome</p></title><aug><au><snm>Will</snm><fnm>CL</fnm></au><au><snm>Luhrmann</snm><fnm>R</fnm></au></aug><source>Biol Chem</source><pubdate>2005</pubdate><volume>386</volume><issue>8</issue><fpage>713</fpage><lpage>724</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1515/BC.2005.084</pubid><pubid idtype="pmpid" link="fulltext">16201866</pubid></pubidlist></xrefbib></bibl><bibl id="B5"><title><p>Evolutionary fates and origins of U12-type 
introns</p></title><aug><au><snm>Burge</snm><fnm>CB</fnm></au><au><snm>Padgett</snm><fnm>RA</fnm></au><au><snm>Sharp</snm><fnm>PA</fnm></au></aug><source>Mol Cell</source><pubdate>1998</pubdate><volume>2</volume><issue>6</issue><fpage>773</fpage><lpage>785</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/S1097-2765(00)80292-0</pubid><pubid idtype="pmpid" link="fulltext">9885565</pubid></pubidlist></xrefbib></bibl><bibl id="B6"><title><p>Overview of the yeast genome</p></title><aug><au><snm>Mewes</snm><fnm>HW</fnm></au><au><snm>Albermann</snm><fnm>K</fnm></au><au><snm>Bahr</snm><fnm>M</fnm></au><au><snm>Frishman</snm><fnm>D</fnm></au><au><snm>Gleissner</snm><fnm>A</fnm></au><au><snm>Hani</snm><fnm>J</fnm></au><au><snm>Heumann</snm><fnm>K</fnm></au><au><snm>Kleine</snm><fnm>K</fnm></au><au><snm>Maierl</snm><fnm>A</fnm></au><au><snm>Oliver</snm><fnm>SG</fnm></au><etal/></aug><source>Nature</source><pubdate>1997</pubdate><volume>387</volume><issue>6632 Suppl</issue><fpage>7</fpage><lpage>65</lpage><xrefbib><pubid idtype="pmpid" link="fulltext">9169865</pubid></xrefbib></bibl><bibl id="B7"><title><p>An early evolutionary origin for the minor spliceosome</p></title><aug><au><snm>Russell</snm><fnm>AG</fnm></au><au><snm>Charette</snm><fnm>JM</fnm></au><au><snm>Spencer</snm><fnm>DF</fnm></au><au><snm>Gray</snm><fnm>MW</fnm></au></aug><source>Nature</source><pubdate>2006</pubdate><volume>443</volume><issue>7113</issue><fpage>863</fpage><lpage>866</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nature05228</pubid><pubid idtype="pmpid" link="fulltext">17051219</pubid></pubidlist></xrefbib></bibl><bibl id="B8"><title><p>Computational screen for spliceosomal RNA genes aids in defining the phylogenetic distribution of major and minor spliceosomal components</p></title><aug><au><snm>Davila Lopez</snm><fnm>M</fnm></au><au><snm>Rosenblad</snm><fnm>MA</fnm></au><au><snm>Samuelsson</snm><fnm>T</fnm></au></aug><source>Nucleic Acids 
Res</source><pubdate>2008</pubdate><volume>36</volume><issue>9</issue><fpage>3001</fpage><lpage>3010</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gkn142</pubid><pubid idtype="pmcid">2396436</pubid><pubid idtype="pmpid" link="fulltext">18390578</pubid></pubidlist></xrefbib></bibl><bibl id="B9"><title><p>U12DB: a database of orthologous U12-type spliceosomal introns</p></title><aug><au><snm>Alioto</snm><fnm>TS</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2007</pubdate><issue>35 Database</issue><fpage>D110</fpage><lpage>115</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gkl796</pubid><pubid idtype="pmcid">1635337</pubid><pubid idtype="pmpid" link="fulltext">17082203</pubid></pubidlist></xrefbib></bibl><bibl id="B10"><title><p>Identification, characterization and molecular phylogeny of U12-dependent introns in the Arabidopsis thaliana genome</p></title><aug><au><snm>Zhu</snm><fnm>W</fnm></au><au><snm>Brendel</snm><fnm>V</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2003</pubdate><volume>31</volume><issue>15</issue><fpage>4561</fpage><lpage>4572</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gkg492</pubid><pubid idtype="pmcid">169882</pubid><pubid idtype="pmpid" link="fulltext">12888517</pubid></pubidlist></xrefbib></bibl><bibl id="B11"><title><p>Comprehensive splice-site analysis using comparative genomics</p></title><aug><au><snm>Sheth</snm><fnm>N</fnm></au><au><snm>Roca</snm><fnm>X</fnm></au><au><snm>Hastings</snm><fnm>ML</fnm></au><au><snm>Roeder</snm><fnm>T</fnm></au><au><snm>Krainer</snm><fnm>AR</fnm></au><au><snm>Sachidanandam</snm><fnm>R</fnm></au></aug><source>Nucleic Acids Res</source><pubdate>2006</pubdate><volume>34</volume><issue>14</issue><fpage>3955</fpage><lpage>3967</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/gkl556</pubid><pubid idtype="pmcid">1557818</pubid><pubid idtype="pmpid" link="fulltext">16914448</pubid></pubidlist></xrefbib></bibl><bibl id="B12"><title><p>Basic 
local alignment search tool</p></title><aug><au><snm>Altschul</snm><fnm>SF</fnm></au><au><snm>Gish</snm><fnm>W</fnm></au><au><snm>Miller</snm><fnm>W</fnm></au><au><snm>Myers</snm><fnm>EW</fnm></au><au><snm>Lipman</snm><fnm>DJ</fnm></au></aug><source>J Mol Biol</source><pubdate>1990</pubdate><volume>215</volume><issue>3</issue><fpage>403</fpage><lpage>410</lpage><xrefbib><pubid idtype="pmpid">2231712</pubid></xrefbib></bibl><bibl id="B13"><title><p>The Chlamydomonas genome reveals the evolution of key animal and plant functions</p></title><aug><au><snm>Merchant</snm><fnm>SS</fnm></au><au><snm>Prochnik</snm><fnm>SE</fnm></au><au><snm>Vallon</snm><fnm>O</fnm></au><au><snm>Harris</snm><fnm>EH</fnm></au><au><snm>Karpowicz</snm><fnm>SJ</fnm></au><au><snm>Witman</snm><fnm>GB</fnm></au><au><snm>Terry</snm><fnm>A</fnm></au><au><snm>Salamov</snm><fnm>A</fnm></au><au><snm>Fritz-Laylin</snm><fnm>LK</fnm></au><au><snm>Marechal-Drouard</snm><fnm>L</fnm></au><etal/></aug><source>Science</source><pubdate>2007</pubdate><volume>318</volume><issue>5848</issue><fpage>245</fpage><lpage>250</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1143609</pubid><pubid idtype="pmpid" link="fulltext">17932292</pubid></pubidlist></xrefbib></bibl><bibl id="B14"><title><p>A molecular evolutionary framework for the phylum Nematoda</p></title><aug><au><snm>Blaxter</snm><fnm>ML</fnm></au><au><snm>De Ley</snm><fnm>P</fnm></au><au><snm>Garey</snm><fnm>JR</fnm></au><au><snm>Liu</snm><fnm>LX</fnm></au><au><snm>Scheldeman</snm><fnm>P</fnm></au><au><snm>Vierstraete</snm><fnm>A</fnm></au><au><snm>Vanfleteren</snm><fnm>JR</fnm></au><au><snm>Mackey</snm><fnm>LY</fnm></au><au><snm>Dorris</snm><fnm>M</fnm></au><au><snm>Frisse</snm><fnm>LM</fnm></au><etal/></aug><source>Nature</source><pubdate>1998</pubdate><volume>392</volume><issue>6671</issue><fpage>71</fpage><lpage>75</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/32160</pubid><pubid idtype="pmpid" 
link="fulltext">9510248</pubid></pubidlist></xrefbib></bibl><bibl id="B15"><title><p>A transcriptomic analysis of the phylum Nematoda</p></title><aug><au><snm>Parkinson</snm><fnm>J</fnm></au><au><snm>Mitreva</snm><fnm>M</fnm></au><au><snm>Whitton</snm><fnm>C</fnm></au><au><snm>Thomson</snm><fnm>M</fnm></au><au><snm>Daub</snm><fnm>J</fnm></au><au><snm>Martin</snm><fnm>J</fnm></au><au><snm>Schmid</snm><fnm>R</fnm></au><au><snm>Hall</snm><fnm>N</fnm></au><au><snm>Barrell</snm><fnm>B</fnm></au><au><snm>Waterston</snm><fnm>RH</fnm></au><etal/></aug><source>Nat Genet</source><pubdate>2004</pubdate><volume>36</volume><issue>12</issue><fpage>1259</fpage><lpage>1267</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/ng1472</pubid><pubid idtype="pmpid" link="fulltext">15543149</pubid></pubidlist></xrefbib></bibl><bibl id="B16"><title><p>An improved molecular phylogeny of the Nematoda with special emphasis on marine taxa</p></title><aug><au><snm>Meldal</snm><fnm>BH</fnm></au><au><snm>Debenham</snm><fnm>NJ</fnm></au><au><snm>De Ley</snm><fnm>P</fnm></au><au><snm>De Ley</snm><fnm>IT</fnm></au><au><snm>Vanfleteren</snm><fnm>JR</fnm></au><au><snm>Vierstraete</snm><fnm>AR</fnm></au><au><snm>Bert</snm><fnm>W</fnm></au><au><snm>Borgonie</snm><fnm>G</fnm></au><au><snm>Moens</snm><fnm>T</fnm></au><au><snm>Tyler</snm><fnm>PA</fnm></au><etal/></aug><source>Mol Phylogenet Evol</source><pubdate>2007</pubdate><volume>42</volume><issue>3</issue><fpage>622</fpage><lpage>636</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.ympev.2006.08.025</pubid><pubid idtype="pmpid" link="fulltext">17084644</pubid></pubidlist></xrefbib></bibl><bibl id="B17"><title><p>An ancient spliceosomal intron in the ribosomal protein L7a gene (Rpl7a) of Giardia lamblia</p></title><aug><au><snm>Russell</snm><fnm>AG</fnm></au><au><snm>Shutt</snm><fnm>TE</fnm></au><au><snm>Watkins</snm><fnm>RF</fnm></au><au><snm>Gray</snm><fnm>MW</fnm></au></aug><source>BMC Evol 
Biol</source><pubdate>2005</pubdate><volume>5</volume><fpage>45</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2148-5-45</pubid><pubid idtype="pmcid">1201135</pubid><pubid idtype="pmpid" link="fulltext">16109161</pubid></pubidlist></xrefbib></bibl><bibl id="B18"><title><p>A first glimpse at the transcriptome of Physarum polycephalum</p></title><aug><au><snm>Glockner</snm><fnm>G</fnm></au><au><snm>Golderer</snm><fnm>G</fnm></au><au><snm>Werner-Felmayer</snm><fnm>G</fnm></au><au><snm>Meyer</snm><fnm>S</fnm></au><au><snm>Marwan</snm><fnm>W</fnm></au></aug><source>BMC Genomics</source><pubdate>2008</pubdate><volume>9</volume><fpage>6</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2164-9-6</pubid><pubid idtype="pmcid">2258281</pubid><pubid idtype="pmpid" link="fulltext">18179708</pubid></pubidlist></xrefbib></bibl><bibl id="B19"><title><p>Expression of two Xenopus laevis ribosomal protein genes in injected frog oocytes. A specific splicing block interferes with the L1 RNA maturation</p></title><aug><au><snm>Bozzoni</snm><fnm>I</fnm></au><au><snm>Fragapane</snm><fnm>P</fnm></au><au><snm>Annesi</snm><fnm>F</fnm></au><au><snm>Pierandrei-Amaldi</snm><fnm>P</fnm></au><au><snm>Amaldi</snm><fnm>F</fnm></au><au><snm>Beccari</snm><fnm>E</fnm></au></aug><source>J Mol Biol</source><pubdate>1984</pubdate><volume>180</volume><issue>4</issue><fpage>987</fpage><lpage>1005</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/0022-2836(84)90267-5</pubid><pubid idtype="pmpid" link="fulltext">6084725</pubid></pubidlist></xrefbib></bibl><bibl id="B20"><title><p>The accumulation of mature RNA for the Xenopus laevis ribosomal protein L1 is controlled at the level of splicing and turnover of the precursor RNA</p></title><aug><au><snm>Caffarelli</snm><fnm>E</fnm></au><au><snm>Fragapane</snm><fnm>P</fnm></au><au><snm>Gehring</snm><fnm>C</fnm></au><au><snm>Bozzoni</snm><fnm>I</fnm></au></aug><source>EMBO 
J</source><pubdate>1987</pubdate><volume>6</volume><issue>11</issue><fpage>3493</fpage><lpage>3498</lpage><xrefbib><pubidlist><pubid idtype="pmcid">553808</pubid><pubid idtype="pmpid">2448138</pubid></pubidlist></xrefbib></bibl><bibl id="B21"><title><p>Inefficient in vitro splicing of the regulatory intron of the L1 ribosomal protein gene of X.laevis depends on suboptimal splice site sequences</p></title><aug><au><snm>Caffarelli</snm><fnm>E</fnm></au><au><snm>Fragapane</snm><fnm>P</fnm></au><au><snm>Bozzoni</snm><fnm>I</fnm></au></aug><source>Biochem Biophys Res Commun</source><pubdate>1992</pubdate><volume>183</volume><issue>2</issue><fpage>680</fpage><lpage>687</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/0006-291X(92)90536-T</pubid><pubid idtype="pmpid" link="fulltext">1550574</pubid></pubidlist></xrefbib></bibl><bibl id="B22"><title><p>Expression of the gene for ribosomal protein L1 in Xenopus embryos: alteration of gene dosage by microinjection</p></title><aug><au><snm>Pierandrei-Amaldi</snm><fnm>P</fnm></au><au><snm>Bozzoni</snm><fnm>I</fnm></au><au><snm>Cardinali</snm><fnm>B</fnm></au></aug><source>Genes Dev</source><pubdate>1988</pubdate><volume>2</volume><issue>1</issue><fpage>23</fpage><lpage>31</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gad.2.1.23</pubid><pubid idtype="pmpid" link="fulltext">3356338</pubid></pubidlist></xrefbib></bibl><bibl id="B23"><title><p>Identification of the sequences responsible for the splicing phenotype of the regulatory intron of the L1 ribosomal protein gene of Xenopus laevis</p></title><aug><au><snm>Fragapane</snm><fnm>P</fnm></au><au><snm>Caffarelli</snm><fnm>E</fnm></au><au><snm>Lener</snm><fnm>M</fnm></au><au><snm>Prislei</snm><fnm>S</fnm></au><au><snm>Santoro</snm><fnm>B</fnm></au><au><snm>Bozzoni</snm><fnm>I</fnm></au></aug><source>Mol Cell Biol</source><pubdate>1992</pubdate><volume>12</volume><issue>3</issue><fpage>1117</fpage><lpage>1125</lpage><xrefbib><pubidlist><pubid 
idtype="pmcid">369543</pubid><pubid idtype="pmpid" link="fulltext">1545793</pubid></pubidlist></xrefbib></bibl><bibl id="B24"><title><p>The splicing of U12-type introns can be a rate-limiting step in gene expression</p></title><aug><au><snm>Patel</snm><fnm>AA</fnm></au><au><snm>McCarthy</snm><fnm>M</fnm></au><au><snm>Steitz</snm><fnm>JA</fnm></au></aug><source>EMBO J</source><pubdate>2002</pubdate><volume>21</volume><issue>14</issue><fpage>3804</fpage><lpage>3815</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/emboj/cdf297</pubid><pubid idtype="pmcid">126102</pubid><pubid idtype="pmpid" link="fulltext">12110592</pubid></pubidlist></xrefbib></bibl><bibl id="B25"><title><p>Test of intron predictions reveals novel splice sites, alternatively spliced mRNAs and new introns in meiotically regulated genes of yeast</p></title><aug><au><snm>Davis</snm><fnm>CA</fnm></au><au><snm>Grate</snm><fnm>L</fnm></au><au><snm>Spingola</snm><fnm>M</fnm></au><au><snm>Ares</snm><fnm>M</fnm><suf>Jr</suf></au></aug><source>Nucleic Acids Res</source><pubdate>2000</pubdate><volume>28</volume><issue>8</issue><fpage>1700</fpage><lpage>1706</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/nar/28.8.1700</pubid><pubid idtype="pmcid">102823</pubid><pubid idtype="pmpid" link="fulltext">10734188</pubid></pubidlist></xrefbib></bibl><bibl id="B26"><title><p>A handful of intron-containing genes produces the lion's share of yeast mRNA</p></title><aug><au><snm>Ares</snm><fnm>M</fnm><suf>Jr</suf></au><au><snm>Grate</snm><fnm>L</fnm></au><au><snm>Pauling</snm><fnm>MH</fnm></au></aug><source>RNA</source><pubdate>1999</pubdate><volume>5</volume><issue>9</issue><fpage>1138</fpage><lpage>1139</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1017/S1355838299991379</pubid><pubid idtype="pmcid">1369836</pubid><pubid idtype="pmpid" link="fulltext">10496214</pubid></pubidlist></xrefbib></bibl><bibl id="B27"><title><p>The quest for a message: budding yeast, a model organism to study the control of 
pre-mRNA splicing</p></title><aug><au><snm>Meyer</snm><fnm>M</fnm></au><au><snm>Vilardell</snm><fnm>J</fnm></au></aug><source>Brief Funct Genomic Proteomic</source><pubdate>2009</pubdate><volume>8</volume><issue>1</issue><fpage>60</fpage><lpage>67</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/bfgp/elp002</pubid><pubid idtype="pmpid" link="fulltext">19279072</pubid></pubidlist></xrefbib></bibl><bibl id="B28"><title><p>An RNA structure involved in feedback regulation of splicing and of translation is critical for biological fitness</p></title><aug><au><snm>Li</snm><fnm>B</fnm></au><au><snm>Vilardell</snm><fnm>J</fnm></au><au><snm>Warner</snm><fnm>JR</fnm></au></aug><source>Proc Natl Acad Sci USA</source><pubdate>1996</pubdate><volume>93</volume><issue>4</issue><fpage>1596</fpage><lpage>1600</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1073/pnas.93.4.1596</pubid><pubid idtype="pmcid">39987</pubid><pubid idtype="pmpid">8643676</pubid></pubidlist></xrefbib></bibl><bibl id="B29"><title><p>Transcript specificity in yeast pre-mRNA splicing revealed by mutations in core spliceosomal components</p></title><aug><au><snm>Pleiss</snm><fnm>JA</fnm></au><au><snm>Whitworth</snm><fnm>GB</fnm></au><au><snm>Bergkessel</snm><fnm>M</fnm></au><au><snm>Guthrie</snm><fnm>C</fnm></au></aug><source>PLoS Biol</source><pubdate>2007</pubdate><volume>5</volume><fpage>e90</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1371/journal.pbio.0050090</pubid><pubid idtype="pmcid">1831718,1831718</pubid><pubid idtype="pmpid" link="fulltext">17388687</pubid></pubidlist></xrefbib></bibl></refgrp>
</bm></art>