<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2229-8-17</ui>
   <ji>1471-2229</ji>
   <fm>
		<dochead>Research article</dochead>
		<bibl>
			<title>
				<p>Cross-species EST alignments reveal novel and conserved alternative splicing events in legumes</p>
			</title>
			<aug>
				<au id="A1">
					<snm>Wang</snm>
					<fnm>Bing-Bing</fnm>
					<insr iid="I1"/>
					<insr iid="I3"/>
					<email>wangx741@umn.edu</email>
				</au>
				<au id="A2">
					<snm>O'Toole</snm>
					<fnm>Mike</fnm>
					<insr iid="I1"/>
					<email>mike.otoole@gmail.com</email>
				</au>
				<au id="A3">
					<snm>Brendel</snm>
					<fnm>Volker</fnm>
					<insr iid="I2"/>
					<email>vbrendel@iastate.edu</email>
				</au>
				<au id="A4" ca="yes">
					<snm>Young</snm>
					<mi>D</mi>
					<fnm>Nevin</fnm>
					<insr iid="I1"/>
					<email>neviny@umn.edu</email>
				</au>
			</aug>
			<insg>
				<ins id="I1">
					<p>Department of Plant Pathology, University of Minnesota, St. Paul, MN 55108, USA</p>
				</ins>
				<ins id="I2">
					<p>Department of Genetics, Development and Cell Biology and Department of Statistics, Iowa State University, Ames, IA 50011, USA</p>
				</ins>
				<ins id="I3">
					<p>Pioneer Hi-Bred International, Inc., a DuPont company, 7200 N.W. 62nd Avenue, Johnston, IA 50131, USA</p>
				</ins>
			</insg>
			<source>BMC Plant Biology</source>
			<issn>1471-2229</issn>
			<pubdate>2008</pubdate>
			<volume>8</volume>
			<issue>1</issue>
			<fpage>17</fpage>
			<url>http://www.biomedcentral.com/1471-2229/8/17</url>
			<xrefbib>
				<pubidlist>
					<pubid idtype="pmpid">18282305</pubid>
					<pubid idtype="doi">10.1186/1471-2229-8-17</pubid>
				</pubidlist>
			</xrefbib>
		</bibl>
		<history>
			<rec>
				<date>
					<day>16</day>
					<month>10</month>
					<year>2007</year>
				</date>
			</rec>
			<acc>
				<date>
					<day>19</day>
					<month>2</month>
					<year>2008</year>
				</date>
			</acc>
			<pub>
				<date>
					<day>19</day>
					<month>2</month>
					<year>2008</year>
				</date>
			</pub>
		</history>
		<cpyrt>
			<year>2008</year>
			<collab>Wang et al; licensee BioMed Central Ltd.</collab>
			<note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
		</cpyrt>
		<abs>
			<sec>
				<st>
					<p>Abstract</p>
				</st>
				<sec>
					<st>
						<p>Background</p>
					</st>
					<p>Although originally thought to be less frequent in plants than in animals, alternative splicing (AS) is now known to be widespread in plants. Here we report the characteristics of AS in legumes, one of the largest and most important plant families, based on EST alignments to the genome sequences of <it>Medicago truncatula </it>(<it>Mt</it>) and <it>Lotus japonicus </it>(<it>Lj</it>).</p>
				</sec>
				<sec>
					<st>
						<p>Results</p>
					</st>
					<p>Based on cognate EST alignments alone, the observed frequency of alternatively spliced genes is lower in <it>Mt </it>(~10%, 1,107 genes) and <it>Lj </it>(~3%, 92 genes) than in <it>Arabidopsis </it>and rice (both around 20%). However, AS frequencies are comparable in all four species if EST levels are normalized. Intron retention is the most common form of AS in all four plant species (~50%), with slightly lower frequency in legumes compared to <it>Arabidopsis </it>and rice. This differs notably from vertebrates, where exon skipping is most common. To uncover additional AS events, we aligned ESTs from other legume species against the <it>Mt </it>genome sequence. In this way, 248 additional <it>Mt </it>genes were predicted to be alternatively spliced. We also identified 22 AS events completely conserved in two or more plant species.</p>
				</sec>
				<sec>
					<st>
						<p>Conclusion</p>
					</st>
					<p>This study extends the range of plant taxa shown to have high levels of AS, confirms the importance of intron retention in plants, and demonstrates the utility of using ESTs from related species in order to identify novel and conserved AS events. The results also indicate that the frequency of AS in plants is comparable to that observed in mammals. Finally, our results highlight the importance of normalizing EST levels when estimating the frequency of alternative splicing.</p>
				</sec>
			</sec>
		</abs>
	</fm>
   <bdy>
		<sec>
			<st>
				<p>Background</p>
			</st>
			<p>Alternative splicing (AS) is an important cellular process that leads to multiple mRNA isoforms from a single pre-mRNA in eukaryotic organisms. Plant AS events used to be regarded as rare. However, a growing number of computational studies have now demonstrated that the frequency of alternatively spliced genes in plants is higher than previously estimated <abbrgrp>
					<abbr bid="B1">1</abbr>
					<abbr bid="B2">2</abbr>
				</abbrgrp>. Large-scale EST-genome alignments show that 20&#8211;30% of expressed genes are alternatively spliced in <it>Arabidopsis thaliana </it>(<it>At</it>) and rice (<it>Oryza sativa, Os</it>) <abbrgrp>
					<abbr bid="B1">1</abbr>
					<abbr bid="B2">2</abbr>
				</abbrgrp>. A recent study using gapped alignments of EST pairs (EST-EST) surveyed 11 plant species and suggested that overall AS frequencies vary greatly among plant species, with some rates comparable to those observed in animals <abbrgrp>
					<abbr bid="B3">3</abbr>
				</abbrgrp>. In mammals, exon skipping (ExonS) is the most common type of AS <abbrgrp>
					<abbr bid="B4">4</abbr>
					<abbr bid="B5">5</abbr>
				</abbrgrp>, but in <it>At </it>and <it>Os</it>, intron retention (IntronR) is most abundant <abbrgrp>
					<abbr bid="B1">1</abbr>
				</abbrgrp>. Alternative acceptor site (AltA) and alternative donor site (AltD) are also common in these two model plants <abbrgrp>
					<abbr bid="B1">1</abbr>
					<abbr bid="B2">2</abbr>
				</abbrgrp>. A rare type of AS event is alternative position (AltP), where an alternative intron differs from its constitutive form in both donor and acceptor sites <abbrgrp>
					<abbr bid="B1">1</abbr>
				</abbrgrp>. Examples of all five types of AS events are shown in Additional file <supplr sid="S1">1</supplr> (Supplementary Figure S1). Recently, a novel approach involving whole-genome microarray data revealed that IntronR can be detected in ~8% of <it>At </it>genes <abbrgrp>
					<abbr bid="B6">6</abbr>
				</abbrgrp>. The prevalent IntronR events suggest that an intron recognition mechanism is predominant in <it>At </it>and <it>Os </it>
				<abbrgrp>
					<abbr bid="B1">1</abbr>
				</abbrgrp>. A small fraction of conserved AS events have also been discovered and confirmed between <it>At </it>and <it>Os</it>, strongly indicating the functional importance of AS in plants <abbrgrp>
					<abbr bid="B1">1</abbr>
				</abbrgrp>.</p>
			<suppl id="S1">
				<title>
					<p>Additional file 1</p>
				</title>
				<text>
					<p>
						<b>Supplementary figures and tables</b>. This pdf document contains supplementary figures and tables for the main manuscript.</p>
				</text>
				<file name="1471-2229-8-17-S1.pdf">
					<p>Click here for file</p>
				</file>
			</suppl>
			<p>Most computational studies on AS in mammals and plants use transcript sequences from the same species as their genome sequences. For species with relatively small EST/cDNA collections, transcript sequences from closely related species can be a valuable resource for identification of additional AS events. Even for species with large EST collections, including human and mouse, cross-species EST alignments have been used to reveal novel AS events. As many as 42% of human genes show novel AS patterns when mouse transcripts are aligned to the human genome <abbrgrp>
					<abbr bid="B7">7</abbr>
				</abbrgrp>, and more than 10% of human loci exhibit conserved AS events in mouse <abbrgrp>
					<abbr bid="B8">8</abbr>
				</abbrgrp>. Another study applying the cross-species strategy to human, mouse and rat identified 758 novel cassette-on exons (ExonS) as well as 167 novel retained introns (IntronR). RT-PCR validated 50&#8211;80% of tested events, indicating the impressive potential of the cross-species method in identifying novel AS events <abbrgrp>
					<abbr bid="B9">9</abbr>
				</abbrgrp>. In plants, cross-species transcripts have been used mainly for gene annotation. For example, transcript assemblies from 185 species were mapped to the <it>Os </it>genome, confirming about 90% of gene predictions plus about 500 novel genes <abbrgrp>
					<abbr bid="B10">10</abbr>
				</abbrgrp>. Similarly, approximately 850 novel genes and 1,000 novel AS events were annotated in <it>Os </it>by aligning ESTs from seven plant species <abbrgrp>
					<abbr bid="B11">11</abbr>
				</abbrgrp>. The AS events supported by cross-species transcripts are likely to be functional, as they are conserved between species.</p>
			<p>Experimental studies provide additional insight into the function of AS in plants. A wide range of plant genes with diverse functions are regulated through AS, including (but not limited to) genes involved in transcription, splicing, photosynthesis, disease resistance, stress, flowering and grain quality (reviewed in <abbrgrp>
					<abbr bid="B12">12</abbr>
					<abbr bid="B13">13</abbr>
				</abbrgrp>). Genes involved in splicing, especially in splicing regulation, seem to have a higher frequency of AS <abbrgrp>
					<abbr bid="B14">14</abbr>
				</abbrgrp>. Several recent studies have revealed that serine/arginine-rich (SR) protein transcripts exhibit extensive levels of AS and that some AS patterns are conserved between <it>At </it>and <it>Os </it>
				<abbrgrp>
					<abbr bid="B15">15</abbr>
					<abbr bid="B16">16</abbr>
					<abbr bid="B17">17</abbr>
					<abbr bid="B18">18</abbr>
				</abbrgrp>. Maize SR protein transcripts are also alternatively spliced <abbrgrp>
					<abbr bid="B19">19</abbr>
					<abbr bid="B20">20</abbr>
				</abbrgrp>. Temperature stress (cold and heat) as well as hormone treatment can change the AS patterns of SR proteins in <it>At</it>, suggesting an important role for AS in the stress response <abbrgrp>
					<abbr bid="B15">15</abbr>
				</abbrgrp>. One <it>At </it>U2AF35 homolog (atU2AF35a) is alternatively spliced by removing non-canonical introns with repeated borders in the 3'-end of the coding region. Changing the expression of U2AF35 homologs alters the splicing pattern of the FCA gene and, in turn, causes variation in flowering time <abbrgrp>
					<abbr bid="B21">21</abbr>
				</abbrgrp>. The U1-70K gene encodes a core protein in U1 small nuclear ribonucleoproteins (snRNP). The sixth intron of U1-70K can be retained in <it>At </it>
				<abbrgrp>
					<abbr bid="B22">22</abbr>
				</abbrgrp>, an event conserved between <it>At </it>and <it>Os </it>
				<abbrgrp>
					<abbr bid="B1">1</abbr>
				</abbrgrp>. Recently, the IntronR event was experimentally confirmed in <it>Os </it>and maize <abbrgrp>
					<abbr bid="B23">23</abbr>
				</abbrgrp>.</p>
			<p>Over 400 genes in 54 plant species are now known to be alternatively spliced <abbrgrp>
					<abbr bid="B24">24</abbr>
				</abbrgrp>. Only a few AS events, however, have been reported in legumes (<it>Fabaceae</it>), one of the largest and most important plant families. In <it>Lotus japonicus </it>(<it>Lj</it>), a phytochelatin synthase gene (LjPCS2) can be alternatively spliced, with one isoform present in nodules (LjPCS2-7N) and another isoform in roots (LjPCS2-7R). The two isoforms encode proteins differing in only five amino acids, yet one protein (LjPCS2-7N) confers cadmium (Cd) tolerance while the other does not, at least not when ectopically expressed in yeast cells <abbrgrp>
					<abbr bid="B25">25</abbr>
				</abbrgrp>. A nodule specific gene (LjNOD70) shows an IntronR event in <it>Lj</it>, where the spliced isoform is less abundant in nodules <abbrgrp>
					<abbr bid="B26">26</abbr>
				</abbrgrp>. Six sucrose synthase genes exist in <it>At</it>, <it>Os </it>and <it>Lj</it>, but only the <it>Lj </it>homolog (LjSUS2) is alternatively spliced <abbrgrp>
					<abbr bid="B27">27</abbr>
				</abbrgrp>. In soybean (<it>Glycine max</it>, <it>Gm</it>), a nodule specific gene (GmPGN) has been identified through EST data mining. Experiments confirmed the tissue specificity and also revealed AS events for this gene <abbrgrp>
					<abbr bid="B28">28</abbr>
				</abbrgrp>. In kidney bean (<it>Phaseolus vulgaris</it>), a single gene (PvSBE2) can be alternatively spliced to produce two starch-branching enzyme isoforms, each with distinct characteristics and subcellular localization <abbrgrp>
					<abbr bid="B29">29</abbr>
				</abbrgrp>. A highly abundant novel giant retroelement (<it>Orge</it>) of pea (<it>Pisum sativum</it>) is partially spliced, probably regulating the ratio of full-length protein, as the retained intron causes truncation <abbrgrp>
					<abbr bid="B30">30</abbr>
				</abbrgrp>.</p>
			<p>Two legume plants, <it>Medicago truncatula </it>(<it>Mt</it>) and <it>L. japonicus </it>(<it>Lj</it>), have large-scale genome sequencing projects in progress <abbrgrp>
					<abbr bid="B31">31</abbr>
				</abbrgrp>. In late 2006, the <it>Medicago </it>genome sequence consortium (MGSC) constructed a partial genome assembly based on 1,996 Bacterial Artificial Chromosome (BAC) clone sequences as a basis for constructing draft pseudochromosomes. A total of 42,358 genes were annotated by the International <it>Medicago </it>Genome Annotation Group (IMGAG) <abbrgrp>
					<abbr bid="B32">32</abbr>
				</abbrgrp>, representing ~60% of all <it>Mt </it>genes. The data has been released as Mt1.0, available at <abbrgrp>
					<abbr bid="B33">33</abbr>
				</abbrgrp>. In parallel, <it>Lj </it>has 1,394 Transformation-competent Artificial Chromosomes (TACs) in GenBank (as of mid-2006), with 488 of them at phase 3 (finished). Both legume model plants have relatively large EST collections (over 150,000 sequences). There are also large numbers of transcript sequences from other legume species, especially soybean. These features make <it>Mt </it>and <it>Lj </it>ideal for computational comparison of AS events in legumes and other plants.</p>
			<p>In this study, all available transcript sequences from legumes were aligned to <it>Mt </it>and <it>Lj </it>BAC/TAC sequences. <it>At </it>and <it>Os </it>transcript sequences were also aligned to their own genome sequences for comparison. The frequency of alternatively spliced genes is very similar across the different plant species as long as the number of ESTs used as a basis for analysis is standardized across species. In the case of <it>Mt</it>, about 10% of expressed genes are alternatively spliced at current EST coverage, with IntronR the most abundant type. Novel and conserved AS events can be identified if cross-species ESTs are aligned to the genome. These results provide a basis for analyzing AS events conserved in all plants as well as those found in legumes only. This is the first large-scale analysis of AS using EST-genome alignments in plants other than <it>At </it>and <it>Os</it>, and it is also the first detailed comparison using cross-species transcript sequences in plants.</p>
		</sec>
		<sec>
			<st>
				<p>Results</p>
			</st>
			<sec>
				<st>
					<p>Characteristics of legume exons and introns</p>
				</st>
				<p>Two computer programs, GeneSeqer <abbrgrp>
						<abbr bid="B34">34</abbr>
					</abbrgrp> and GMAP <abbrgrp>
						<abbr bid="B35">35</abbr>
					</abbrgrp>, produced largely similar results for the alignment of EST sequences to their native genomes for the <it>Mt</it>, <it>Lj</it>, <it>At</it>, and <it>Os </it>data sets. To reduce the likelihood of alignment artifacts as a result of ambiguities, only the commonly predicted alignments from the two programs were used in further analyses. Moreover, highly stringent criteria (>95% sequence identity, >80% transcript coverage) were used to limit the possibility of transcript mapping to non-cognate, diverged locations in the incompletely sequenced genomes. Approximately one half and one third of the species-specific EST sets could be aligned to the current <it>Mt </it>and <it>Lj </it>genome sequences, respectively, roughly reflecting the coverage of the whole genomes by their current sequence assemblies. For <it>Lj</it>, ~15% of the transcript sequences were mapped to finished (phase 3) BAC/TACs. Unless stated otherwise, our analyses for <it>Lj </it>were based solely on this subset. As shown in Table <tblr tid="T1">1</tblr>, a total of 11,516 and 3,298 genes/transcription units (TU, as defined in METHODS) were identified in <it>Mt </it>and <it>Lj</it>, respectively, with 74% and 57% of them having multiple EST support. The average number of ESTs per gene/TU was 10 and 7 in <it>Mt </it>and <it>Lj</it>, respectively, compared with 26 and 30 in <it>At </it>and <it>Os</it>.</p>
				<tbl id="T1">
					<title>
						<p>Table 1</p>
					</title>
					<caption>
						<p>Transcript alignments, intron and exon features in plants</p>
					</caption>
					<tblbdy cols="5">
						<r>
							<c>
								<p/>
							</c>
							<c ca="left">
								<p>Medicago</p>
							</c>
							<c ca="left">
								<p>Lotus<sup>#</sup>
								</p>
							</c>
							<c ca="left">
								<p>Arabidopsis</p>
							</c>
							<c ca="left">
								<p>Rice</p>
							</c>
						</r>
						<r>
							<c cspan="5">
								<hr/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>EST/cDNA total</p>
							</c>
							<c ca="left">
								<p>225,920</p>
							</c>
							<c ca="left">
								<p>150,855</p>
							</c>
							<c ca="left">
								<p>691,516</p>
							</c>
							<c ca="left">
								<p>1,009,754</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Mapped to genome^</p>
							</c>
							<c ca="left">
								<p>104,382 (46.2%)</p>
							</c>
							<c ca="left">
								<p>22,144 (14.7%)*</p>
							</c>
							<c ca="left">
								<p>589,254 (85.2%)</p>
							</c>
							<c ca="left">
								<p>916,825 (90.8%)</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Transcription unit (TU)/Genes</p>
							</c>
							<c ca="left">
								<p>11,516</p>
							</c>
							<c ca="left">
								<p>3,298</p>
							</c>
							<c ca="left">
								<p>22,518</p>
							</c>
							<c ca="left">
								<p>31,044</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>MultiEST TU/Genes</p>
							</c>
							<c ca="left">
								<p>8,544 (74.2%)</p>
							</c>
							<c ca="left">
								<p>1,879 (57.0%)</p>
							</c>
							<c ca="left">
								<p>19,857 (88.2%)</p>
							</c>
							<c ca="left">
								<p>26,859 (86.5%)</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Average (Median) ESTs/gene</p>
							</c>
							<c ca="left">
								<p>9.8 (4)</p>
							</c>
							<c ca="left">
								<p>6.9 (2)</p>
							</c>
							<c ca="left">
								<p>26.3 (11)</p>
							</c>
							<c ca="left">
								<p>30.1 (10)</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Number of Introns</p>
							</c>
							<c ca="left">
								<p>32,860</p>
							</c>
							<c ca="left">
								<p>4,357</p>
							</c>
							<c ca="left">
								<p>97,095</p>
							</c>
							<c ca="left">
								<p>107,162</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Average (Median) intron size</p>
							</c>
							<c ca="left">
								<p>472 (218)</p>
							</c>
							<c ca="left">
								<p>458 (215)</p>
							</c>
							<c ca="left">
								<p>171 (101)</p>
							</c>
							<c ca="left">
								<p>438 (164)</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Long intron (>1000 nt)</p>
							</c>
							<c ca="left">
								<p>12.7%</p>
							</c>
							<c ca="left">
								<p>10.9%</p>
							</c>
							<c ca="left">
								<p>0.7%</p>
							</c>
							<c ca="left">
								<p>10.7%</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Number of internal exons</p>
							</c>
							<c ca="left">
								<p>24,600</p>
							</c>
							<c ca="left">
								<p>2,717</p>
							</c>
							<c ca="left">
								<p>78,911</p>
							</c>
							<c ca="left">
								<p>83,668</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Average (Median) internal exon</p>
							</c>
							<c ca="left">
								<p>140 (108)</p>
							</c>
							<c ca="left">
								<p>127 (100)</p>
							</c>
							<c ca="left">
								<p>164 (114)</p>
							</c>
							<c ca="left">
								<p>175 (113)</p>
							</c>
						</r>
					</tblbdy>
					<tblfn>
						<p>^ Transcript sequences are required to have >95% identity and >80% coverage to be considered as mapped.</p>
						<p>
							<sup># </sup>Lotus data are based on the ESTs aligned to finished TACs (phase 3).</p>
						<p>* A total of 48,691 (32.3% of 150,855) transcript sequences can be mapped to <it>Lj </it>TACs in all phases, including phase 1, phase 2 and phase 3.</p>
					</tblfn>
				</tbl>
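				<p>The mapping filter described above can be illustrated with a minimal Python sketch. It assumes each alignment has already been reduced to an EST identifier, a locus label, an identity fraction and a coverage fraction; the class <it>Alignment</it>, the function names and the rule that two programs "agree" when they report the same locus label are all hypothetical simplifications for illustration, not the actual GeneSeqer or GMAP output formats or the procedure used in this study.</p>

```python
from dataclasses import dataclass

# Thresholds taken from the text: >95% identity, >80% transcript coverage.
MIN_IDENTITY = 0.95
MIN_COVERAGE = 0.80


@dataclass(frozen=True)
class Alignment:
    """Hypothetical, simplified record of one EST-to-genome alignment."""
    est_id: str
    locus: str        # illustrative locus label, e.g. a BAC name plus start
    identity: float   # fraction of matching bases, 0..1
    coverage: float   # fraction of the transcript that is aligned, 0..1


def passes(a: Alignment) -> bool:
    """True when the alignment exceeds both stringency thresholds."""
    return a.identity > MIN_IDENTITY and a.coverage > MIN_COVERAGE


def consensus_mapped(program_a, program_b):
    """Return sorted EST ids placed at the same locus by both programs,
    with both alignments passing the stringency filter (assumed rule)."""
    kept_b = {(a.est_id, a.locus) for a in program_b if passes(a)}
    return sorted(
        {a.est_id for a in program_a
         if passes(a) and (a.est_id, a.locus) in kept_b}
    )
```

				<p>Under these assumptions, an EST is counted as mapped only if both programs place it at the same locus and both alignments exceed the identity and coverage cut-offs.</p>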
				<p>We compared intron/exon features revealed by EST alignments in the four species. The intron size distribution was quite similar in <it>Mt </it>and <it>Lj</it>, with a mean intron size around 460&#8211;470 nt and median approximately 220 nt in both species. Legume introns are therefore significantly longer than in <it>At </it>(mean 171 nt, median 101 nt) and slightly longer than <it>Os </it>introns (mean 438 nt, median 164 nt). As shown in Figure <figr fid="F1">1A</figr>, the intron size distributions have a peak near 90 nt in all four species. <it>Mt </it>and <it>Lj </it>have fewer introns shorter than 150 nt but more introns longer than 200 nt compared with <it>At </it>and <it>Os</it>. <it>At </it>introns are clearly the shortest of the four plants. Fewer than 1% of introns are longer than 1,000 nt in <it>At</it>, while this number is over 10% in the other plant species. Exon size tends to be similar among the four plant species, with legume exons slightly shorter than <it>At </it>and <it>Os </it>exons. In <it>Mt </it>and <it>Lj</it>, the mean internal exon sizes are 140 and 127 nt, respectively, with the median sizes about 108 nt and 100 nt. <it>At </it>and <it>Os </it>have internal exons with a mean of 164 nt and 175 nt and a median of 113 nt and 114 nt. Figure <figr fid="F1">1B</figr> shows that the size distributions of exons in <it>Mt</it>, <it>At </it>and <it>Os </it>all display a peak at around 80 nt. <it>Lj </it>data is less consistent due to its small sample size. In contrast to introns, the frequency of exons smaller than 150 nt is higher in <it>Mt </it>and <it>Lj </it>than in <it>At </it>and <it>Os</it>, while the frequency of exons longer than 200 nt is lower in legumes. Overall, legumes have longer introns but slightly shorter exons than <it>At </it>and <it>Os</it>. Generally speaking, plant introns are longer than exons. 
More than 40% of introns in <it>Mt</it>, <it>Lj </it>and <it>Os </it>are longer than 300 nt, while fewer than 10% of exons are that long.</p>
				<fig id="F1">
					<title>
						<p>Figure 1</p>
					</title>
					<caption>
						<p>Size distributions of introns and internal exons in plants</p>
					</caption>
					<text>
						<p>
							<b>Size distributions of introns and internal exons in plants</b>. The x-axis indicates the size of either introns (A) or internal exons (B). Each number except the last one is labeled with the upper bound (e.g., 100 nt comprises size 51&#8211;100 nt). The y-axis indicates the fraction of total introns (A) or internal exons (B) for a given size range of intron or internal exon. The insets show a detailed distribution of smaller (&lt;300 nt) introns (A) or internal exons (B). The bin size is 10, and 100 nt comprises size 91&#8211;100 nt for the insets.</p>
					</text>
					<graphic file="1471-2229-8-17-1"/>
				</fig>
				<p>As noted previously <abbrgrp>
						<abbr bid="B1">1</abbr>
						<abbr bid="B36">36</abbr>
					</abbrgrp>, the GC-content of introns and exons is ~5% lower in <it>At </it>than in <it>Os</it>. The GC-content of legume introns and exons is very similar to that of <it>At</it>, although <it>Mt </it>has slightly lower GC-content than either <it>At </it>or <it>Lj </it>in both intronic and exonic regions (see Additional file <supplr sid="S1">1</supplr>, Supplementary Table S1 and Supplementary Figure S2). G-content and A-content are similar in all species including <it>Os</it>, although <it>Os </it>introns are relatively more C-rich and less U-rich. There is more variation in the distribution of U- (T-) and A-content than in G- or C-content in all species (see Additional file <supplr sid="S1">1</supplr>, Supplementary Figure S3). The difference in GC-content between introns and exons is about 10% in all four species, with <it>Mt </it>showing the largest difference of 11.7% and <it>Os </it>showing the smallest, 9.6% (see Additional file <supplr sid="S1">1</supplr>, Supplementary Table S1).</p>
			</sec>
			<sec>
				<st>
					<p>Different plant species have similar levels of alternatively spliced genes</p>
				</st>
				<p>Previous studies revealed that approximately 20% of expressed genes are alternatively spliced in <it>At </it>and <it>Os</it>, with half of the AS events being intron retention (IntronR) <abbrgrp>
						<abbr bid="B1">1</abbr>
					</abbrgrp>. When we re-examined AS frequency in <it>At </it>and <it>Os </it>for this study, we also found a frequency of around 20%. However, the total number of transcript sequences increased by 80&#8211;200% due to the increased sizes of the EST data sets in these species. In the case of <it>Mt </it>and <it>Lj</it>, the number of ESTs available for analysis was much lower. Consistent with this, the observed fraction of alternatively spliced genes was much lower, just 9.6% in <it>Mt </it>and 2.8% in <it>Lj </it>(Table <tblr tid="T2">2</tblr>). Examples of alternatively spliced genes in <it>Mt </it>are shown in Additional file <supplr sid="S1">1</supplr>, Supplementary Figure S1. All the AS data are deposited and viewable at the ASIP site <abbrgrp>
						<abbr bid="B37">37</abbr>
					</abbrgrp>.</p>
				<tbl id="T2">
					<title>
						<p>Table 2</p>
					</title>
					<caption>
						<p>Comparison of alternative splicing events and frequencies in plants</p>
					</caption>
					<tblbdy cols="5">
						<r>
							<c>
								<p/>
							</c>
							<c ca="left">
								<p>Medicago</p>
							</c>
							<c ca="left">
								<p>Lotus<sup>#</sup>
								</p>
							</c>
							<c ca="left">
								<p>Arabidopsis</p>
							</c>
							<c ca="left">
								<p>Rice</p>
							</c>
						</r>
						<r>
							<c cspan="5">
								<hr/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>AltD</p>
							</c>
							<c ca="left">
								<p>204 (13.5%)</p>
							</c>
							<c ca="left">
								<p>18 (15.7%)</p>
							</c>
							<c ca="left">
								<p>818 (11.3%)</p>
							</c>
							<c ca="left">
								<p>1,165 (9.6%)</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>AltA</p>
							</c>
							<c ca="left">
								<p>350 (23.1%)</p>
							</c>
							<c ca="left">
								<p>37 (32.2%)</p>
							</c>
							<c ca="left">
								<p>1,785 (24.7%)</p>
							</c>
							<c ca="left">
								<p>2,377 (19.5%)</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>AltP</p>
							</c>
							<c ca="left">
								<p>21 (1.4%)</p>
							</c>
							<c ca="left">
								<p>2 (1.7%)</p>
							</c>
							<c ca="left">
								<p>106 (1.5%)</p>
							</c>
							<c ca="left">
								<p>306 (2.5%)</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>ExonS</p>
							</c>
							<c ca="left">
								<p>162 (10.7%)</p>
							</c>
							<c ca="left">
								<p>10 (8.7%)</p>
							</c>
							<c ca="left">
								<p>445 (6.2%)</p>
							</c>
							<c ca="left">
								<p>1,332 (10.9%)</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>IntronR</p>
							</c>
							<c ca="left">
								<p>778 (51.3%)</p>
							</c>
							<c ca="left">
								<p>48 (41.7%)</p>
							</c>
							<c ca="left">
								<p>4,062 (56.3%)</p>
							</c>
							<c ca="left">
								<p>7,011 (57.5%)</p>
							</c>
						</r>
						<r>
							<c cspan="5">
								<hr/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Total</p>
							</c>
							<c ca="left">
								<p>1,515</p>
							</c>
							<c ca="left">
								<p>115</p>
							</c>
							<c ca="left">
								<p>7,216</p>
							</c>
							<c ca="left">
								<p>12,191</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>AS genes</p>
							</c>
							<c ca="left">
								<p>1,107 (9.6%)</p>
							</c>
							<c ca="left">
								<p>92 (2.8%)</p>
							</c>
							<c ca="left">
								<p>4,497 (20.0%)</p>
							</c>
							<c ca="left">
								<p>6,313 (20.3%)</p>
							</c>
						</r>
					</tblbdy>
					<tblfn>
						<p>Percentages in parentheses for each alternative splicing type are the portion relative to the total events. Percentages for AS genes are the portion of alternatively spliced genes relative to the total number of expressed genes (genes/TU) in Table 1.</p>
						<p>
							<sup># </sup>Lotus data are based on the ESTs aligned to finished TACs (phase 3).</p>
					</tblfn>
				</tbl>
				<p>To compare the frequency of alternative splicing between different species, earlier studies relied on 10 randomly selected ESTs per gene as a basis for estimating AS frequency <abbrgrp>
						<abbr bid="B4">4</abbr>
					</abbrgrp>. Here, only a small fraction (10&#8211;20%) of legume genes were covered by 10 or more ESTs, so this approach was not practical. Instead, we plotted the AS frequency for all groups of genes with similar EST coverage in different species, as shown in Figure <figr fid="F2">2</figr>. <it>Mt </it>categories with fewer than 80 genes total were removed to reduce noise due to small sample size, and <it>Lj </it>data are not included at all, as sample size was uniformly too small. When analyzed in this way, the fractions of alternatively spliced genes are similar regardless of species for nearly all size classes. For genes with four ESTs (the median EST number per gene in <it>Mt</it>), the observed AS frequency is 6&#8211;12% in <it>Mt</it>, <it>At</it>, and <it>Os </it>alike. For genes with nine to 11 ESTs (the median EST number per gene in <it>Os </it>and <it>At</it>), 15&#8211;23% are alternatively spliced. In general, the fraction of alternatively spliced genes keeps increasing with increasing transcript coverage, eventually reaching 66% in <it>Os </it>and 46% in <it>At </it>for genes with hundreds of ESTs, levels similar to those observed in mammals <abbrgrp>
						<abbr bid="B38">38</abbr>
						<abbr bid="B39">39</abbr>
					</abbrgrp>. Interestingly, the AS level in <it>Os </it>is consistently over 10% higher than in <it>At </it>in genes with more than 40 supporting ESTs.</p>
				<fig id="F2">
					<title>
						<p>Figure 2</p>
					</title>
					<caption>
						<p>Correlation between AS frequency and EST coverage</p>
					</caption>
					<text>
						<p>
							<b>Correlation between AS frequency and EST coverage</b>. The x-axis indicates groups of genes with certain numbers of ESTs. The primary y-axis for the bar graph indicates total number of genes within each group. The secondary y-axis for the line graph indicates the fraction of alternatively spliced genes for the group. Note that different bin sizes were used to keep the number of genes in each group greater than 500 in <it>At </it>and <it>Os</it>. AS data from groups with fewer than 80 genes in <it>Mt </it>were removed to reduce noise. <it>Lj </it>data are not shown as only the first six groups have more than 80 genes.</p>
					</text>
					<graphic file="1471-2229-8-17-2"/>
				</fig>
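				<p>The grouping behind Figure 2 can be sketched as a simple binning procedure. The Python fragment below is an illustrative sketch only, not the code used in this study: the function name as_fraction_by_coverage, the bin edges, and the input layout (one pair of EST count and AS flag per gene) are assumptions made for the example. Only the minimum group size of 80 genes is taken from the text.</p>

```python
from bisect import bisect_right
from collections import defaultdict


def as_fraction_by_coverage(genes, bin_edges, min_genes=80):
    """Group genes by EST count and return the AS fraction per group.

    genes     -- iterable of (est_count, is_as) pairs, one per gene
    bin_edges -- sorted lower bounds of the EST-count bins (hypothetical)
    min_genes -- groups with fewer genes are dropped to reduce noise
    """
    totals = defaultdict(int)
    spliced = defaultdict(int)
    for est_count, is_as in genes:
        # Index of the last bin whose lower bound does not exceed est_count.
        idx = bisect_right(bin_edges, est_count) - 1
        if idx == -1:
            continue  # EST count lies below the first bin
        totals[bin_edges[idx]] += 1
        spliced[bin_edges[idx]] += int(is_as)
    # Keep only groups that are large enough to give a stable fraction.
    return {
        lower: spliced[lower] / totals[lower]
        for lower in sorted(totals)
        if totals[lower] >= min_genes
    }
```

				<p>Each key of the returned dictionary is the lower bound of an EST-count group, and each value is the fraction of genes in that group with at least one AS event. Groups below the size threshold are omitted rather than reported with a noisy estimate.</p>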
			</sec>
			<sec>
				<st>
					<p>IntronR is the most abundant AS type in legumes</p>
				</st>
				<p>As shown in Table <tblr tid="T2">2</tblr>, the proportions of different AS types are similar in <it>Mt</it>, <it>At </it>and <it>Os</it>. (<it>Lj </it>data are also listed but are not included in the analysis as only ~100 AS events were identified). More than half of AS events in plants are IntronR, 6&#8211;11% are ExonS, and the remaining 30&#8211;40% involve different splice sites (AltD/A/P). These numbers are quite similar to those observed previously <abbrgrp>
						<abbr bid="B1">1</abbr>
					</abbrgrp>. <it>Mt </it>has a slightly lower ratio of IntronR (51%) and a higher ratio of AltD (13%) compared with <it>At </it>and <it>Os</it>. Different levels of EST coverage have little effect on the composition of AS events. As shown in Additional file <supplr sid="S1">1</supplr> (Supplementary Figure S4), the ratios of different AS types remain largely constant across all EST levels, particularly in <it>At </it>and <it>Os</it>. IntronR is the most abundant at all levels, with a relatively lower ratio in <it>Mt</it>. The ExonS ratio is consistently lower in <it>At </it>than in <it>Os </it>(and <it>Mt</it>), while the AltA ratio is higher.</p>
				<p>To minimize false AS events caused by sequencing errors or contamination in the EST collection, we repeated the above analysis for the subset of AS events that are supported by at least two transcript sequences <abbrgrp>
						<abbr bid="B40">40</abbr>
					</abbrgrp>. As shown in Figure <figr fid="F3">3</figr>, the ratio of IntronR decreased ~5% in all plants in this subset. <it>Mt </it>has the lowest ratio of IntronR (45%), 6&#8211;7% lower than in <it>At </it>and <it>Os</it>. The ratio of ExonS remains unchanged compared with the full data set. In <it>Mt </it>and <it>Os</it>, 10&#8211;11% of AS events are ExonS compared with 7% in <it>At</it>. The AltD ratio in <it>Mt </it>increased significantly to 21% in the subset, nearly double the ratio in <it>At </it>and <it>Os</it>. In <it>At</it>, the AltA ratio is ~30% compared with 23% in <it>Mt </it>and <it>Os</it>. Similar tendencies were observed for subset data with even more transcripts supporting each isoform. Both the full and subset data indicate that <it>Mt </it>has a lower ratio of IntronR and a higher ratio of AltD, and that <it>At </it>has a lower ratio of ExonS but a higher ratio of AltA.</p>
				<fig id="F3">
					<title>
						<p>Figure 3</p>
					</title>
					<caption>
						<p>Ratio of different AS types in a reliable subset of AS events</p>
					</caption>
					<text>
						<p>
							<b>Ratio of different AS types in a reliable subset of AS events</b>. The reliable data set consisted of AS events with multiple supporting ESTs for each isoform. IntronR is still the most abundant AS type in the subset. The error bar represents the ratio for each AS type in the full data set described in Table 2.</p>
					</text>
					<graphic file="1471-2229-8-17-3"/>
				</fig>
			</sec>
			<sec>
				<st>
					<p>Cross-species EST alignment in Medicago reveals hundreds of novel AS events</p>
				</st>
				<p>Even "reliable" AS events (as defined above) may not necessarily be functional. Because conservation is usually a good indicator of function, we deployed a cross-species approach similar to large-scale methods used previously in mammals to identify functional AS events <abbrgrp>
						<abbr bid="B7">7</abbr>
						<abbr bid="B9">9</abbr>
					</abbrgrp>. All available EST sequences from <it>Lj</it>, <it>Gm</it>, and other legume species were aligned against <it>Mt </it>BACs. One concern with the cross-species approaches has been a potentially high error rate <abbrgrp>
						<abbr bid="B7">7</abbr>
					</abbrgrp>. Here, even using an identity cutoff as high as 80%, hundreds of AS events were identified from either GeneSeqer or GMAP alignments alone, with approximately 40% of events consistent between the programs. Our analysis used only the events identified by both programs, to reduce false positives arising from alignment errors. As shown in Table <tblr tid="T3">3</tblr>, 10&#8211;20% of the non-<it>Mt </it>legume transcript sequences could be mapped to <it>Mt </it>BACs and clustered to a total of 7,896 non-redundant genes, 81% of which also have <it>Mt </it>EST support. Approximately 70% of the introns identified from cross-species EST alignments were consistent with <it>Mt </it>EST-supported introns. The gene structures derived from cross-species EST and <it>Mt </it>EST alignments were mostly consistent, demonstrating the value of cross-species ESTs in genome annotation <abbrgrp>
						<abbr bid="B10">10</abbr>
					</abbrgrp>. In this analysis, a total of 307 <it>Mt </it>genes (3.9%) were found to be alternatively spliced, with 248 genes having no evidence of AS from <it>Mt </it>ESTs alone. If these novel AS events are included, the estimated frequency of alternatively spliced <it>Mt </it>genes increases from 9.6% to 10.4%. Interestingly, many more AS events were identified from soybean ESTs than from <it>Lj </it>ESTs, despite the similar evolutionary distances of <it>Mt-Gm </it>and <it>Mt</it>-<it>Lj</it>. <it>At </it>and <it>Os </it>EST sequences were also applied in a comparable cross-species analysis, but only 1% of them could be mapped using the same criteria. No reliable AS events were deduced from <it>At </it>and <it>Os </it>transcript sequences.</p>
				<tbl id="T3">
					<title>
						<p>Table 3</p>
					</title>
					<caption>
						<p>Cross-species EST alignments in Medicago</p>
					</caption>
					<tblbdy cols="9">
						<r>
							<c ca="left">
								<p>Species</p>
							</c>
							<c ca="left">
								<p>EST/cDNA</p>
							</c>
							<c ca="left">
								<p>Mapped to <it>Mt </it>BACs</p>
							</c>
							<c ca="left">
								<p>Genes</p>
							</c>
							<c ca="left">
								<p>Genes without <it>Mt </it>EST</p>
							</c>
							<c ca="left">
								<p>AS Genes</p>
							</c>
							<c ca="left">
								<p>Novel AS*</p>
							</c>
							<c ca="left">
								<p>Predicted introns</p>
							</c>
							<c ca="left">
								<p>Consistent introns^</p>
							</c>
						</r>
						<r>
							<c cspan="9">
								<hr/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Lotus</p>
							</c>
							<c ca="left">
								<p>150,855</p>
							</c>
							<c ca="left">
								<p>15,542 (10.3%)</p>
							</c>
							<c ca="left">
								<p>2,955</p>
							</c>
							<c ca="left">
								<p>367 (12.4%)</p>
							</c>
							<c ca="left">
								<p>12 (3.3%)</p>
							</c>
							<c ca="left">
								<p>8</p>
							</c>
							<c ca="left">
								<p>5,606</p>
							</c>
							<c ca="left">
								<p>4,256</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Soybean</p>
							</c>
							<c ca="left">
								<p>359,834</p>
							</c>
							<c ca="left">
								<p>42,665 (11.9%)</p>
							</c>
							<c ca="left">
								<p>5,810</p>
							</c>
							<c ca="left">
								<p>925 (15.9%)</p>
							</c>
							<c ca="left">
								<p>242 (4.2%)</p>
							</c>
							<c ca="left">
								<p>201</p>
							</c>
							<c ca="left">
								<p>16,758</p>
							</c>
							<c ca="left">
								<p>11,420</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Other legumes</p>
							</c>
							<c ca="left">
								<p>127,684</p>
							</c>
							<c ca="left">
								<p>26,547 (20.8%)</p>
							</c>
							<c ca="left">
								<p>5,335</p>
							</c>
							<c ca="left">
								<p>700 (13.1%)</p>
							</c>
							<c ca="left">
								<p>69 (1.3%)</p>
							</c>
							<c ca="left">
								<p>50</p>
							</c>
							<c ca="left">
								<p>13,052</p>
							</c>
							<c ca="left">
								<p>9,926</p>
							</c>
						</r>
						<r>
							<c cspan="9">
								<hr/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Total</p>
							</c>
							<c ca="left">
								<p>638,373</p>
							</c>
							<c ca="left">
								<p>84,754 (13.3%)</p>
							</c>
							<c ca="left">
								<p>7,896</p>
							</c>
							<c ca="left">
								<p>1,475 (18.7%)</p>
							</c>
							<c ca="left">
								<p>307 (3.9%)</p>
							</c>
							<c ca="left">
								<p>248</p>
							</c>
							<c ca="left">
								<p>23,179</p>
							</c>
							<c ca="left">
								<p>15,506</p>
							</c>
						</r>
					</tblbdy>
					<tblfn>
						<p>* Novel AS genes are genes not identified as alternatively spliced from <it>Mt </it>ESTs.</p>
						<p>^ Consistent introns are introns predicted from cross-species ESTs that are also supported by <it>Mt </it>ESTs.</p>
					</tblfn>
				</tbl>
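				<p>The consensus step described above, in which only AS events reported from both GeneSeqer and GMAP alignments are kept, amounts to a set intersection. The following Python fragment is a minimal sketch under stated assumptions: the function name consensus_events and the event representation (a tuple of gene identifier, AS type, and genomic start and end coordinates) are hypothetical and are not taken from the ASpipe pipeline.</p>

```python
def consensus_events(geneseqer_events, gmap_events):
    """Return the AS events reported by both aligners.

    Each event is a hashable tuple, assumed here to be
    (gene_id, as_type, start, end) in genomic coordinates.
    Two events agree only if all four fields are identical.
    """
    return set(geneseqer_events).intersection(gmap_events)


# Hypothetical example: one event is shared, two are aligner-specific.
geneseqer = [("geneA", "AltA", 1200, 1205), ("geneB", "IntronR", 300, 410)]
gmap = [("geneA", "AltA", 1200, 1205), ("geneC", "ExonS", 50, 115)]
shared = consensus_events(geneseqer, gmap)
```

				<p>Requiring exact agreement of coordinates is the strictest possible matching rule; a real pipeline might tolerate small coordinate differences, which this sketch does not attempt.</p>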
				<p>Altogether, 367 cross-species AS events were identified from legume cross-species EST alignment, including 35.7% IntronR, 16.9% ExonS, 16.1% AltD, 29.1% AltA, and 2.2% AltP (Table <tblr tid="T4">4</tblr>). Compared with AS events identified using <it>Mt </it>ESTs alone, the cross-species AS events display a relatively lower ratio of IntronR and higher ratios of ExonS, AltD, and AltA. As most of the cross-species AS events are likely conserved between <it>Mt </it>and the native species of the EST, the ratio of each AS type in cross-species AS events could be interpreted to represent the ratio of functional AS events. However, the ratio of IntronR could have been underestimated by cross-species EST alignments because intron sequences are not as well-conserved as exons, even in closely related species. Thus, some cross-species ESTs retaining introns from their native species might have been filtered by the 80% identity cutoff. The location and outcome of cross-species AS events and same-species AS events are compared in Additional file <supplr sid="S1">1</supplr> (Supplementary Table S2).</p>
				<tbl id="T4">
					<title>
						<p>Table 4</p>
					</title>
					<caption>
						<p>AS events predicted from cross-species EST alignment in Medicago</p>
					</caption>
					<tblbdy cols="7">
						<r>
							<c ca="left">
								<p>Species</p>
							</c>
							<c ca="left">
								<p>AS events</p>
							</c>
							<c ca="left">
								<p>AltD</p>
							</c>
							<c ca="left">
								<p>AltA</p>
							</c>
							<c ca="left">
								<p>AltP</p>
							</c>
							<c ca="left">
								<p>ExonS</p>
							</c>
							<c ca="left">
								<p>IntronR</p>
							</c>
						</r>
						<r>
							<c cspan="7">
								<hr/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Lotus</p>
							</c>
							<c ca="left">
								<p>12</p>
							</c>
							<c ca="left">
								<p>2 (16.7%)</p>
							</c>
							<c ca="left">
								<p>6 (50.0%)</p>
							</c>
							<c ca="left">
								<p>1 (8.3%)</p>
							</c>
							<c ca="left">
								<p>2 (16.7%)</p>
							</c>
							<c ca="left">
								<p>1 (8.3%)</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Soybean</p>
							</c>
							<c ca="left">
								<p>276</p>
							</c>
							<c ca="left">
								<p>40 (14.5%)</p>
							</c>
							<c ca="left">
								<p>75 (27.2%)</p>
							</c>
							<c ca="left">
								<p>5 (1.8%)</p>
							</c>
							<c ca="left">
								<p>53 (19.2%)</p>
							</c>
							<c ca="left">
								<p>103 (37.3%)</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Other legumes</p>
							</c>
							<c ca="left">
								<p>87</p>
							</c>
							<c ca="left">
								<p>20 (23.0%)</p>
							</c>
							<c ca="left">
								<p>26 (29.9%)</p>
							</c>
							<c ca="left">
								<p>2 (2.3%)</p>
							</c>
							<c ca="left">
								<p>7 (8.0%)</p>
							</c>
							<c ca="left">
								<p>32 (36.8%)</p>
							</c>
						</r>
						<r>
							<c cspan="7">
								<hr/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>Total</p>
							</c>
							<c ca="left">
								<p>367</p>
							</c>
							<c ca="left">
								<p>59 (16.1%)</p>
							</c>
							<c ca="left">
								<p>107 (29.1%)</p>
							</c>
							<c ca="left">
								<p>8 (2.2%)</p>
							</c>
							<c ca="left">
								<p>62 (16.9%)</p>
							</c>
							<c ca="left">
								<p>131 (35.7%)</p>
							</c>
						</r>
					</tblbdy>
				</tbl>
				<p>Approximately 90% of cross-species AS events are located in open reading frames (ORFs), much higher than the fraction (70&#8211;75%) in same-species AS events. There seem to be more cross-species and same-species AS events in the 5'-UTR than in the 3'-UTR (data not shown and <abbrgrp>
						<abbr bid="B1">1</abbr>
					</abbrgrp>). For AS events in ORFs, the fractions of translation-readthrough events, where some amino acids are added to or removed from the protein without changing the reading frame, are similar (20&#8211;24%) in cross-species and same-species events. AltA has the highest translation-readthrough ratio (35&#8211;40%), and IntronR has the lowest (2&#8211;10%). Intriguingly, the ratio of AS events producing substrates for nonsense-mediated decay (NMD) <abbrgrp>
						<abbr bid="B41">41</abbr>
					</abbrgrp> is higher in cross-species AS events than in same-species AS events. Nearly half of the cross-species AS events produce NMD substrates, compared with 30&#8211;40% in same-species AS events.</p>
			</sec>
			<sec>
				<st>
					<p>Conserved AS events identified from cross-species EST alignments in legumes</p>
				</st>
				<p>To identify AS events with direct evidence of conservation in multiple species, two approaches were employed: (1) align all legume ESTs to <it>Lj </it>TACs to identify conserved AS events predicted by the same ESTs in both <it>Mt </it>and <it>Lj</it>; (2) identify conserved AS events in <it>Mt </it>with EST evidence from multiple legume species, all showing the same AS pattern. A total of 242 AS events conserved between <it>Mt </it>and <it>Lj </it>were identified through method (1), including 92 (38.0%) IntronR, 26 (10.7%) ExonS, 78 (32.2%) AltA, 41 (16.9%) AltD, and 5 (2.1%) AltP events. These AS events are viewable at the ASIP website. Method (2) identified 22 completely conserved AS events in <it>Mt </it>(see Additional file <supplr sid="S1">1</supplr>, Supplementary Table S3). Nine of the 22 genes also have close <it>At </it>and/or <it>Os </it>homologs sharing the same AS pattern. For instance, <it>Mt </it>hypothetical protein AC156627_1 has both soybean and <it>Mt </it>EST support for an AltA event in the first ORF intron, whereby one isoform utilizes an alternative acceptor site 5 nt upstream (AACAG) of the constitutive acceptor site (AGCAG), producing a substrate possibly subject to NMD. The <it>At </it>homologs (At5g25360.1 and At1g15350.1) and the <it>Os </it>homolog (LOC_Os02g10720) all have exactly the same AS pattern, including the alternative acceptor sites. This gene seems to be plant-specific, as non-plant homologs cannot be identified. Another example of a completely conserved AS event is the <it>Mt </it>AP2 domain-containing protein AC151460_3, where the 3'-UTR intron can be retained. One <it>At </it>homolog and three <it>Os </it>homologs also have the same intron retained. There are also some AS events conserved in legumes but not observed in <it>At </it>and <it>Os</it>. One example is AC124951_11, a highly expressed carbonic anhydrase gene with the 3'-UTR intron alternatively spliced (AltD) in legume species. The AltD event is conserved in all legume species (<it>Mt</it>, <it>Lj</it>, <it>Gm</it>, and others), but not in <it>At </it>and <it>Os </it>even though hundreds of ESTs exist, indicating that this AS event is probably legume-specific.</p>
				<p>One example of a completely conserved ExonS event occurs in an enoyl-CoA hydratase/isomerase gene (<it>Mt</it>: AC145449_47). As shown in Figure <figr fid="F4">4A</figr>, the IMGAG-annotated gene structure for AC145449_47 contains 11 exons, each with strong EST support. Exon3 (65 nt) and Exon4 (53 nt) are mutually exclusive. In one isoform, Exon3 is retained and Exon4 is skipped (<it>Mt</it>: 7206545, 90656179; <it>Lj: </it>45578881; Lupine: 27458685). In another isoform, Exon4 is retained and Exon3 is skipped (<it>Mt</it>: 7567285, 11904359, 13596489, 33106093; <it>Lj</it>: 7719575). The two mRNA isoforms therefore encode two proteins (418 aa and 414 aa) differing slightly in their predicted enoyl-CoA hydratase domain (ECH, pfam00378). No isoform contains both exons, although it is possible to skip both (<it>Mt</it>: 83667352). Two genes in <it>At </it>(At4g13360 and At3g24360), one gene in <it>Os </it>(LOC_Os06g39344) and one in <it>Lj </it>(<it>Lj</it>TC_2465, AP006370.1: 88858&#8211;94512) are the closest homologs to AC145449_47. Exactly the same AS pattern was observed in all the homologous genes except for At4g13360, where the 65-nt exon (Exon3) was retained constitutively and no trace of the 53-nt exon can be found in the corresponding region (Figure <figr fid="F4">4C&#8211;E</figr>). Sequence comparison revealed several nucleotide bases in degenerate codon positions that are conserved in all four species (Figure <figr fid="F4">4B</figr>). These bases may contribute to the recognition (or skipping) of the exon.</p>
				<fig id="F4">
					<title>
						<p>Figure 4</p>
					</title>
					<caption>
						<p>Completely conserved ExonS event in plant enoyl-CoA hydratase/isomerase genes</p>
					</caption>
					<text>
						<p>
							<b>Completely conserved ExonS event in plant enoyl-CoA hydratase/isomerase genes</b>. <b>A</b>: same-species and cross-species EST alignments in <it>Mt </it>gene locus AC145499_47. Filled boxes and arrows indicate exons, and lines indicate introns. Green open or filled boxes indicate exons skipped or retained in certain ESTs. The top black scale indicates coordinates for the gene locus on BAC (AC145499). The blue bar represents the IMGAG annotated gene model, with the green triangle representing the protein translation start codon and the red triangle representing the stop codon. Red bars represent individual same-species EST alignments. Purple bars represent <it>Lj </it>ESTs, dark yellow bars represent soybean ESTs, and gray bars represent ESTs from other legume species. <b>B</b>. Multiple sequence alignments of the mutually exclusive exons. E3 indicates Exon 3 and E4 indicates Exon 4. At2E3 refers to the exon in the second copy of the <it>At </it>gene (At4g13360). Amino acids encoded by <it>Mt </it>sequences are listed at the top of the sequence alignment. Degenerate positions (where a nucleotide change does not change the amino acid) that are conserved in all exons are highlighted in colors. <b>C</b>. EST alignment in the second copy of the <it>At </it>gene (At4g13360). Only exon E3 exists in this gene and no ExonS can be detected. <b>D, E</b>. EST alignment in <it>At </it>and <it>Os </it>genes where the ExonS pattern is completely conserved.</p>
					</text>
					<graphic file="1471-2229-8-17-4"/>
				</fig>
			</sec>
		</sec>
		<sec>
			<st>
				<p>Discussion</p>
			</st>
			<sec>
				<st>
					<p>Comparison of AS frequencies in different species</p>
				</st>
				<p>In this study, alignment of current EST and genomic sequences revealed that ~10% of expressed genes are alternatively spliced in <it>Mt </it>compared with 20% in <it>At </it>and <it>Os</it>. This difference is mainly due to the lower EST coverage found in <it>Mt</it>. We demonstrated that the AS frequencies in the three plants are essentially similar when adjusted for genes having comparable EST numbers. This conclusion differs from that of a recent study based on gapped alignments of EST pairs, in which a greater degree of variation was observed among plant species <abbrgrp>
						<abbr bid="B3">3</abbr>
					</abbrgrp>. Interpretation of EST-only data can be confounded by extensive gene duplication events. With more plant genome sequences becoming available, it should soon be possible to more precisely address the intriguing questions concerning the extent and evolution of AS in plants.</p>
				<p>Because alternatively spliced isoforms are usually in low abundance, the chance of capturing them in a small EST collection is low, making it difficult to estimate AS frequencies accurately. Suppose that, for a functional event, a fraction <it>p </it>of transcripts is alternatively spliced. Assuming that ESTs sample transcripts independently, the probability of observing the AS event with <it>n </it>ESTs covering the alternative splice site is 1 - (1 - <it>p</it>)<sup>
						<it>n</it>
					</sup>. For example, if an alternatively spliced isoform were generated <it>p </it>= 10% of the time, <it>n </it>= 10 transcript sequences would give a 65% probability of observing this event, and 22 transcript sequences would be required to have >90% probability of observing the event. Our results show that the AS frequency for genes with small numbers of ESTs is similar in <it>Mt</it>, <it>At</it>, and <it>Os</it>, suggesting that they all have similar levels of functional AS events.</p>
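				<p>The detection probability above is easy to evaluate numerically. The following Python fragment is a minimal sketch that assumes each EST is an independent draw from the transcript pool; the function names detection_probability and ests_needed are introduced here for illustration only.</p>

```python
def detection_probability(p, n):
    """Probability that at least one of n ESTs shows the minor isoform.

    p -- fraction of transcripts that are alternatively spliced
    n -- number of ESTs covering the alternative splice site
    """
    return 1.0 - (1.0 - p) ** n


def ests_needed(p, target):
    """Smallest n whose detection probability exceeds target.

    Requires p and target to lie strictly between 0 and 1.
    """
    n = 1
    while target >= detection_probability(p, n):
        n += 1
    return n


print(round(detection_probability(0.10, 10), 2))  # 0.65
print(ests_needed(0.10, 0.90))                    # 22
```

				<p>With <it>p </it>= 10%, the sketch reproduces the two values quoted in the text: a 65% chance of detection with 10 ESTs, and 22 ESTs for a chance above 90%.</p>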
				<p>In cases where AS isoforms are even lower in abundance, greater numbers of transcripts would clearly be necessary to detect the event. Nevertheless, <it>Os </it>seems to have a higher frequency of AS in genes with >30 ESTs than either <it>Mt </it>or <it>At</it>. Focusing on genes with >40 ESTs only, the AS frequency in <it>Os </it>is consistently (>10%) higher than in <it>At</it>. For this analysis, we did not include transcripts from <it>Os </it>subspecies <it>indica </it>in order to eliminate the possibility that the higher AS frequency is falsely caused by cross-subspecies ESTs. In any case, the error rates from EST sequencing or genome contamination are probably similar in all three plants. Consequently, <it>Os </it>does seem to have higher levels of low-abundance AS events than <it>At </it>(or <it>Mt</it>). Some of the low-abundance events may be splicing errors captured in EST libraries constructed from plant tissues under various growth conditions, so the higher level of low-abundance AS events in <it>Os </it>could indicate higher error rates for the <it>Os </it>spliceosome.</p>
				<p>Not surprisingly, observed AS frequency is highly correlated with EST numbers in all three plants. Highly expressed genes (genes with large numbers of ESTs) are more likely to be detected as alternatively spliced. Over 60% and 40% of genes with more than 500 ESTs are alternatively spliced in <it>Os </it>and <it>At</it>, respectively. This is comparable to the level in humans <abbrgrp>
						<abbr bid="B42">42</abbr>
					</abbrgrp>. Half of human genes are alternatively spliced by the criterion that AS isoforms occur in at least 1% of the observed transcripts, but only 20% of human genes are alternatively spliced if the required abundance level is increased to >10% <abbrgrp>
						<abbr bid="B42">42</abbr>
					</abbrgrp>. This frequency is notably similar to the frequency in plants under the same abundance level, suggesting that the frequency of regulated AS events in plants may not be significantly lower than in mammals.</p>
			</sec>
			<sec>
				<st>
					<p>Splicing errors and functional AS events</p>
				</st>
				<p>A clear difference between AS in plants and mammals is the predominance of IntronR in plants and ExonS in mammals. Both model legumes, <it>Mt </it>and <it>Lj</it>, have 40&#8211;50% of AS events as IntronR, a level noticeably lower than in <it>At </it>and <it>Os</it>, but still much higher than in mammals. Similar to the situation in <it>At </it>and <it>Os </it>
					<abbrgrp>
						<abbr bid="B1">1</abbr>
					</abbrgrp>, introns shorter than 70 nt are more likely to be retained in legumes (data not shown). The spliceosome is a large dynamic RNA-protein complex involving hundreds of proteins. If an intron is too small, the assembly and structural rearrangement of the spliceosome will be constrained, which may lead to inefficient splicing and IntronR <abbrgrp>
						<abbr bid="B1">1</abbr>
					</abbrgrp>. As the size of introns is considerably larger in <it>Mt </it>and <it>Lj</it>, fewer introns will be retained due to steric hindrance, possibly leading to a lower frequency of IntronR in legumes. These data also suggest that some AS events may be splicing errors. As we proposed in <abbrgrp>
						<abbr bid="B1">1</abbr>
					</abbrgrp>, the most common splicing error in plants is probably a failure to recognize and splice out introns, so IntronR should be the most common AS type. In mammals, where introns are defined through an exon recognition mechanism, a failure to recognize some exons, and therefore skip them, is likely the most common error. Consequently, ExonS is the most common AS type in humans.</p>
				<p>Observed AS events are a mixture of functional AS events and splicing errors. Other types of error, such as sequencing errors, genome contamination, and alignment errors, will also contribute to the predicted level of AS events. Two alignment programs (GeneSeqer and GMAP) were applied and only common AS events were used in this study to minimize alignment errors. Genome contamination could be minimized by elimination of ESTs retaining all predicted introns. Distinguishing functional AS events from splicing errors, however, is not an easy task. We attempted to achieve this goal by two methods. First, we selected AS events with each isoform supported by multiple transcripts. As splicing errors are expected to occur at low frequency, the chances they will be captured in two distinct transcripts are low. In this data set, the frequency of IntronR is slightly lower, but still the highest among the five AS types, indicating that IntronR is indeed the most abundant regulated AS result. The second method is to look for conserved AS events through cross-species EST comparison and orthologous gene comparison. A few AS events were completely conserved in <it>Mt</it>, <it>Lj</it>, <it>At </it>and <it>Os</it>.</p>
				<p>Functional AS events, however, may not always be conserved. As a dynamic process, splicing requires hundreds of proteins as well as some snRNAs to function accurately <abbrgrp>
						<abbr bid="B14">14</abbr>
					</abbrgrp>. Mutations in both <it>trans</it>- and <it>cis</it>-elements on target genes will impact splicing patterns. Depending on when the mutation and fixation event occurs, functional AS events can be shared among closely related species or be lineage-specific. The AltD event in 3'-UTR of the highly expressed carbonic anhydrase gene (AC124951_11) may be a good example shared by legume species. Lineage-specific functional AS events are difficult to define from EST data alone.</p>
			</sec>
			<sec>
				<st>
					<p>Centralized data place and standard data set for ASIP</p>
				</st>
				<p>As more plant genomes and ESTs are being sequenced, more AS events will be identified in the future. It is important to have a centralized place to store and compare all AS data. In animal systems, a comprehensive database, ASAP <abbrgrp>
						<abbr bid="B43">43</abbr>
					</abbrgrp> includes AS data from 16 sequenced animals, which makes a comparison across different animal species straightforward. Such a database is also needed in plants, as the study of splicing signals and alternative splicing is just starting. The AS data identified in this study have been deposited in the ASIP database at PlantGDB <abbrgrp>
						<abbr bid="B37">37</abbr>
					</abbrgrp>, where previous AS data are stored and can be easily compared <abbrgrp>
						<abbr bid="B1">1</abbr>
					</abbrgrp>. Moreover, a database collecting genes related to splicing in <it>At</it>, animals and yeast is available through the SRGD database at PlantGDB <abbrgrp>
						<abbr bid="B14">14</abbr>
						<abbr bid="B44">44</abbr>
					</abbrgrp>. In the future, the database will be expanded to <it>Os </it>and other sequenced plant genomes including <it>Mt</it>, <it>Lj </it>and poplar. The analysis programs and plant genome browsers available at PlantGDB should facilitate the deep mining of AS data in plants. A core data set in which the AS events are conserved in all sequenced plants will be extremely useful for understanding the function of AS events, as well as the signals and regulation of this important and intriguing phenomenon.</p>
			</sec>
		</sec>
		<sec>
			<st>
				<p>Conclusion</p>
			</st>
			<p>As in <it>At </it>and <it>Os</it>, AS events are widespread in the two model legumes <it>Mt </it>and <it>Lj</it>. Thousands of AS events were identified in <it>Mt </it>through a combination of same- and cross-species EST alignments. The frequency of alternatively spliced genes is similar across different plant species when the number of ESTs is standardized. Compared with mammals, plants are thought to have a relatively low frequency of alternatively spliced genes. Our results indicate that this assessment may be due in part to the comparatively low EST coverage in plant species. Among all five AS types discussed, IntronR is the most abundant in different subsets of genes, as previously observed in <it>At </it>and <it>Os</it>. We also identified hundreds of novel and conserved AS events through cross-species EST alignments. This is the first study in plants using cross-species ESTs to explore AS. For species with large EST collections but scant genome sequence data, including wheat and barley, aligning their ESTs to a closely related reference genome, such as <it>Os</it>, should shed light on alternative splicing in these species.</p>
		</sec>
		<sec>
			<st>
				<p>Methods</p>
			</st>
			<sec>
				<st>
					<p>Data sets</p>
				</st>
				<p>The <it>Medicago </it>Genome Sequence Consortium (MGSC) release 1.0, consisting of the 1,826 BACs analyzed in this study, was downloaded from the <it>Medicago </it>genome sequencing project website <abbrgrp>
						<abbr bid="B45">45</abbr>
					</abbrgrp>. The assembly comprises a total of 186.2 Mb of non-redundant genome sequence, an estimated 38&#8211;47% of the entire genome and 55&#8211;58% of total gene space <abbrgrp>
						<abbr bid="B46">46</abbr>
					</abbrgrp>. All other sequence data sets used in this study were current as of July 17, 2006, the cutoff date for BACs incorporated into the Mt1.0 genome assembly. For <it>Lotus japonicus</it>, 1,394 BAC/TACs were downloaded from the NCBI <abbrgrp>
						<abbr bid="B47">47</abbr>
					</abbrgrp> nucleotide database using the query "txid34305 [ORGN:noexp] AND HTG [KYWD]". <it>Arabidopsis </it>genome sequences and gene annotation (TAIR release 6.0) were downloaded from the GenBank FTP site <abbrgrp>
						<abbr bid="B48">48</abbr>
					</abbrgrp>, and rice genome sequences and gene annotation (TIGR release 4.0) were downloaded from the TIGR FTP site <abbrgrp>
						<abbr bid="B49">49</abbr>
					</abbrgrp>.</p>
				<p>All EST sequences (including full-length cDNAs) were retrieved from the GenBank nucleotide database. Sets of 225,920 <it>Mt </it>and 150,855 <it>Lj </it>transcript sequences were collected using the queries (txid3880 [ORGN] AND "biomol mrna" [PROP]) and (txid34305 [ORGN] AND "biomol mrna" [PROP]), respectively. Soybean transcript sequences (359,834) were retrieved using the query (txid3847 [ORGN] AND "biomol mrna" [PROP]), and 127,684 transcript sequences from all other legumes were retrieved using the query (txid3803 [ORGN:exp] NOT txid3880 [ORGN] NOT txid34305 [ORGN] NOT txid3847 [ORGN] AND "biomol mrna" [PROP]). For <it>At</it>, 691,516 transcript sequences were retrieved using the query (txid3702 [ORGN] AND "biomol mrna" [PROP] AND srcdb_ddbj/embl/genbank [PROP]). For <it>Os</it>, 1,009,574 ESTs from the <it>japonica </it>cultivar-group were retrieved using the query (txid39947 [ORGN] AND "biomol mrna" [PROP] AND srcdb_ddbj/embl/genbank [PROP]). We intentionally excluded transcript sequences from the <it>indica </it>cultivar-group to reduce possible false positive alignments caused by differences between the two <it>Os </it>cultivar-groups.</p>
			</sec>
			<sec>
				<st>
					<p>Spliced alignment of transcript to genome sequences</p>
				</st>
				<p>The legume transcript sequences were mapped to the <it>Mt </it>and <it>Lj </it>BAC sets using the two computer programs GeneSeqer <abbrgrp>
						<abbr bid="B34">34</abbr>
					</abbrgrp> and GMAP <abbrgrp>
						<abbr bid="B35">35</abbr>
					</abbrgrp>. The splice site models for GeneSeqer were set to <it>Medicago</it>-specific parameters using the program option "-s Medicago". Default parameters were used for all other options. Default alignment parameters were used for GMAP. For <it>At </it>and <it>Os</it>, only GMAP alignments were performed locally, and GeneSeqer alignments derived from a larger data set were downloaded from PlantGDB <abbrgrp>
						<abbr bid="B50">50</abbr>
					</abbrgrp>.</p>
				<p>GMAP and GeneSeqer output alignment files were processed by a pipeline (ASpipe1.0, available through SourceForge <abbrgrp>
						<abbr bid="B51">51</abbr>
					</abbrgrp>) developed from Perl and shell scripts used in a previous study <abbrgrp>
						<abbr bid="B1">1</abbr>
					</abbrgrp>. ASpipe extracts coordinates and scores for high-quality introns, exons and alignments from the original program outputs and stores them in MySQL 5.0 databases. For same-species EST alignments, the criteria for high-quality alignments were >95% sequence identity and >80% coverage (defined as the portion of the transcript sequence aligned to the genomic sequence). The high identity cutoff (95%) minimizes false mapping of transcript sequences to incomplete genomes. For cross-species transcript alignments, the identity cutoff was decreased to 80%, which selects reliable alignments from divergent transcript sequences. Redundant EST alignments in <it>Mt </it>were removed by comparison with the non-redundant gene list provided for Mt1.0 <abbrgrp>
						<abbr bid="B33">33</abbr>
					</abbrgrp>. Exons mapped with >95% and >80% sequence identity were considered reliably identified exons for same-species and cross-species mappings, respectively. Introns with reliable neighboring exons on both ends were considered reliably identified introns. A transcription unit (TU) was defined as a contiguous genomic region to which transcript sequences were mapped and clustered. Annotated gene models may contain multiple TUs. For <it>Mt</it>, <it>At </it>and <it>Os</it>, annotated genes were used as the basis for analysis. For <it>Lj</it>, where no gene annotation is available, TUs were the basis for analysis.</p>
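				<p>The cutoffs described above can be illustrated with a minimal Python sketch. It is not the ASpipe code: the names Alignment, is_high_quality and reliable_intron_indices are hypothetical, identity and coverage are assumed to be stored as fractions between 0 and 1, and the 80% coverage requirement is assumed to apply unchanged to cross-species alignments.</p>
				<p><![CDATA[
```python
from dataclasses import dataclass
from typing import List

# Cutoffs taken from the text; all are applied as strict "greater than" tests.
SAME_SPECIES_MIN_IDENTITY = 0.95
CROSS_SPECIES_MIN_IDENTITY = 0.80
MIN_COVERAGE = 0.80  # assumption: same coverage cutoff for cross-species alignments


@dataclass(frozen=True)
class Alignment:
    """Hypothetical record for one transcript-to-genome alignment."""
    transcript_id: str
    identity: float      # fraction of identical aligned bases, 0..1
    coverage: float      # fraction of the transcript that is aligned, 0..1
    cross_species: bool  # True if the transcript comes from another species


def identity_cutoff(cross_species: bool) -> float:
    """Return the identity cutoff for same-species or cross-species data."""
    if cross_species:
        return CROSS_SPECIES_MIN_IDENTITY
    return SAME_SPECIES_MIN_IDENTITY


def is_high_quality(aln: Alignment) -> bool:
    """Apply the identity and coverage criteria to a whole alignment."""
    return (aln.identity > identity_cutoff(aln.cross_species)
            and aln.coverage > MIN_COVERAGE)


def reliable_intron_indices(exon_identities: List[float],
                            cross_species: bool) -> List[int]:
    """Return i for every intron between exon i and exon i + 1 whose two
    flanking exons both exceed the identity cutoff."""
    cutoff = identity_cutoff(cross_species)
    ok = [identity > cutoff for identity in exon_identities]
    return [i for i in range(len(ok) - 1) if ok[i] and ok[i + 1]]
```
]]></p>
				<p>In this sketch an alignment at exactly 95% identity is rejected for same-species data, because the criterion is stated as a strict inequality.</p>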
			</sec>
			<sec>
				<st>
					<p>Identification of alternative splicing (AS) and conserved AS events</p>
				</st>
				<p>The coordinates of reliable introns and exons were compared in a pairwise fashion to identify candidates for AS events. For intron/intron comparisons, if two introns had the same 3'-end but a different 5'-end, this event was classified as AltD. If two introns differed only in the 3'-ends, this event was classified as AltA. AltP events refer to introns overlapping with each other but with both 5'- and 3'-ends differing. For intron/exon comparisons, if an intron was completely covered by an exon, the event was classified as IntronR. If an exon was completely covered by an intron, the event was classified as ExonS. ExonS events involving terminal exons and the AltA/D/P events related to ExonS events were removed. The process and algorithm for identifying and analyzing AS events are described in more detail in <abbrgrp>
						<abbr bid="B1">1</abbr>
					</abbrgrp>. AS events identified from cross-species EST alignment were labeled as "cross-species AS events". Correspondingly, the events from same-species EST alignment were referred to as "same-species AS events".</p>
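				<p>The pairwise classification rules above can be sketched in Python as follows. This is an illustration only, not the ASpipe implementation. It assumes forward-strand features given as inclusive (start, end) pairs, so that the start of an intron is its 5'-end and the end is its 3'-end; the function names are hypothetical. The later filtering of ExonS events involving terminal exons, and of the AltA/D/P events related to ExonS events, is not shown.</p>
				<p><![CDATA[
```python
from typing import Optional, Tuple

# (start, end), inclusive, forward-strand coordinates (assumption).
Interval = Tuple[int, int]


def classify_intron_pair(a: Interval, b: Interval) -> Optional[str]:
    """Classify two introns as AltD, AltA or AltP, or return None."""
    if a == b:
        return None  # identical introns are not an AS event
    same_5p = a[0] == b[0]
    same_3p = a[1] == b[1]
    if same_3p and not same_5p:
        return "AltD"  # shared 3'-end, different 5'-end (donor site)
    if same_5p and not same_3p:
        return "AltA"  # shared 5'-end, different 3'-end (acceptor site)
    # Both ends differ: AltP only if the two introns overlap.
    overlap = a[0] <= b[1] and b[0] <= a[1]
    return "AltP" if overlap else None


def classify_intron_exon(intron: Interval, exon: Interval) -> Optional[str]:
    """Classify an intron/exon pair as IntronR or ExonS, or return None."""
    if exon[0] <= intron[0] and intron[1] <= exon[1]:
        return "IntronR"  # intron completely covered by the exon
    if intron[0] <= exon[0] and exon[1] <= intron[1]:
        return "ExonS"    # exon completely covered by the intron
    return None
```
]]></p>
				<p>For reverse-strand features the roles of start and end would be swapped before applying the same tests.</p>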
				<p>Conserved AS events were identified in two ways: (1) comparing cross-species AS events with same-species AS events and with cross-species AS events from other species; (2) identifying orthologous gene pairs between <it>Mt </it>and <it>Lj </it>and comparing their AS events. In the first method, the <it>Mt </it>genome coordinates of the AS events predicted from ESTs of multiple species were compared. Only events with identical coordinates for the alternatively processed intron(s)/exon(s) were regarded as completely conserved. In the second method, orthologous genes were identified by searching for ESTs mapped to both the <it>Mt </it>and <it>Lj </it>genomes. In some cases, orthologs in <it>At </it>and <it>Os </it>were identified by reciprocal BLAST using annotated protein sequences from <it>At</it>, <it>Mt </it>and <it>Os</it>. Gene structures and AS events of orthologous genes were then compared to identify conserved AS events.</p>
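				<p>The first method, matching events by identical genome coordinates, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the pipeline code: each event is represented as a hypothetical (species, event type, coordinates) tuple, the species label is the source of the supporting ESTs, and requiring the event type to match in addition to the coordinates is an assumption of the sketch.</p>
				<p><![CDATA[
```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Coords = Tuple[int, ...]         # genome coordinates of the AS event
Event = Tuple[str, str, Coords]  # (EST source species, event type, coords)


def completely_conserved(
    events: Iterable[Event],
) -> Dict[Tuple[str, Coords], Set[str]]:
    """Group events by (event type, coordinates) and keep those groups
    that are supported by ESTs from at least two species."""
    species_by_event: Dict[Tuple[str, Coords], Set[str]] = defaultdict(set)
    for species, event_type, coords in events:
        species_by_event[(event_type, coords)].add(species)
    return {
        key: species
        for key, species in species_by_event.items()
        if len(species) >= 2
    }
```
]]></p>
				<p>Events whose coordinates differ by even one base fall into separate groups here, which mirrors the requirement for identical coordinates.</p>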
			</sec>
		</sec>
		<sec>
			<st>
				<p>Abbreviations</p>
			</st>
			<p>AltA, Alternative Acceptor site; AltD, Alternative Donor site; AltP, Alternative Position (both donor and acceptor sites are different); AS, Alternative Splicing; <it>At</it>, <it>Arabidopsis thaliana</it>; EST, expressed sequence tag; ExonS, Exon Skipping; IntronR, Intron Retention; <it>Lj</it>, <it>Lotus japonicus</it>; <it>Mt</it>, <it>Medicago truncatula</it>; NMD, nonsense-mediated decay; ORF, open reading frame; <it>Os</it>, <it>Oryza sativa</it>.
			</p>
		</sec>
		<sec>
			<st>
				<p>Authors' contributions</p>
			</st>
			<p>BBW conceived of the study, performed research, analyzed data and drafted the manuscript. MOT participated in data analysis and web page creation. VB participated in data analysis and presentation and helped to draft the manuscript. NDY participated in the design of this study, coordinated data analysis and helped to draft the manuscript. All authors read and approved the final manuscript.</p>
		</sec>
	</bdy>
   <bm>
		<ack>
			<sec>
				<st>
					<p>Acknowledgements</p>
				</st>
				<p>BBW and MOT were supported by National Science Foundation grants DBI-0321460 and DBI-0606966 to NDY. Data generated in this study are hosted at and publicly available through the ASIP database at PlantGDB <abbrgrp>
						<abbr bid="B37">37</abbr>
					</abbrgrp>, funded through NSF grant DBI-0606909 to VB.</p>
			</sec>
		</ack>
		<refgrp>
			<bibl id="B1">
				<title>
					<p>Genomewide comparative analysis of alternative splicing in plants</p>
				</title>
				<aug>
					<au>
						<snm>Wang</snm>
						<fnm>BB</fnm>
					</au>
					<au>
						<snm>Brendel</snm>
						<fnm>V</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>2006</pubdate>
				<volume>103</volume>
				<issue>18</issue>
				<fpage>7175</fpage>
				<lpage>7180</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">1459036</pubid>
						<pubid idtype="pmpid" link="fulltext">16632598</pubid>
						<pubid idtype="doi">10.1073/pnas.0602039103</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B2">
				<title>
					<p>Comprehensive analysis of alternative splicing in rice and comparative analyses with Arabidopsis</p>
				</title>
				<aug>
					<au>
						<snm>Campbell</snm>
						<fnm>MA</fnm>
					</au>
					<au>
						<snm>Haas</snm>
						<fnm>BJ</fnm>
					</au>
					<au>
						<snm>Hamilton</snm>
						<fnm>JP</fnm>
					</au>
					<au>
						<snm>Mount</snm>
						<fnm>SM</fnm>
					</au>
					<au>
						<snm>Buell</snm>
						<fnm>CR</fnm>
					</au>
				</aug>
				<source>BMC Genomics</source>
				<pubdate>2006</pubdate>
				<volume>7</volume>
				<fpage>327</fpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">1769492</pubid>
						<pubid idtype="pmpid" link="fulltext">17194304</pubid>
						<pubid idtype="doi">10.1186/1471-2164-7-327</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B3">
				<title>
					<p>Comparative cross-species alternative splicing in plants</p>
				</title>
				<aug>
					<au>
						<snm>Ner-Gaon</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Leviatan</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Rubin</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Fluhr</snm>
						<fnm>R</fnm>
					</au>
				</aug>
				<source>Plant Physiol</source>
				<pubdate>2007</pubdate>
				<volume>144</volume>
				<issue>3</issue>
				<fpage>1632</fpage>
				<lpage>1641</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">1914131</pubid>
						<pubid idtype="pmpid" link="fulltext">17496110</pubid>
						<pubid idtype="doi">10.1104/pp.107.098640</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B4">
				<title>
					<p>Different levels of alternative splicing among eukaryotes</p>
				</title>
				<aug>
					<au>
						<snm>Kim</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Magen</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Ast</snm>
						<fnm>G</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2007</pubdate>
				<volume>35</volume>
				<issue>1</issue>
				<fpage>125</fpage>
				<lpage>131</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">1802581</pubid>
						<pubid idtype="pmpid" link="fulltext">17158149</pubid>
						<pubid idtype="doi">10.1093/nar/gkl924</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B5">
				<title>
					<p>Genome wide identification and classification of alternative splicing based on EST data</p>
				</title>
				<aug>
					<au>
						<snm>Gupta</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Zink</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Korn</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Vingron</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Haas</snm>
						<fnm>SA</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>2004</pubdate>
				<volume>20</volume>
				<fpage>2579</fpage>
				<lpage>2585</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/bth288</pubid>
						<pubid idtype="pmpid" link="fulltext">15117759</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B6">
				<title>
					<p>Whole-genome microarray in Arabidopsis facilitates global analysis of retained introns</p>
				</title>
				<aug>
					<au>
						<snm>Ner-Gaon</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Fluhr</snm>
						<fnm>R</fnm>
					</au>
				</aug>
				<source>DNA Res</source>
				<pubdate>2006</pubdate>
				<volume>13</volume>
				<issue>3</issue>
				<fpage>111</fpage>
				<lpage>121</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/dnares/dsl003</pubid>
						<pubid idtype="pmpid" link="fulltext">16980712</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B7">
				<title>
					<p>Detection of novel splice forms in human and mouse using cross-species approach</p>
				</title>
				<aug>
					<au>
						<snm>Kan</snm>
						<fnm>Z</fnm>
					</au>
					<au>
						<snm>Castle</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Johnson</snm>
						<fnm>JM</fnm>
					</au>
					<au>
						<snm>Tsinoremas</snm>
						<fnm>NF</fnm>
					</au>
				</aug>
				<source>Pac Symp Biocomput</source>
				<pubdate>2004</pubdate>
				<volume>9</volume>
				<fpage>42</fpage>
				<lpage>53</lpage>
			</bibl>
			<bibl id="B8">
				<title>
					<p>Transcriptome and genome conservation of alternative splicing events in humans and mice</p>
				</title>
				<aug>
					<au>
						<snm>Sugnet</snm>
						<fnm>CW</fnm>
					</au>
					<au>
						<snm>Kent</snm>
						<fnm>WJ</fnm>
					</au>
					<au>
						<snm>Ares</snm>
						<fnm>M</fnm>
						<suf>Jr</suf>
					</au>
					<au>
						<snm>Haussler</snm>
						<fnm>D</fnm>
					</au>
				</aug>
				<source>Pac Symp Biocomput</source>
				<pubdate>2004</pubdate>
				<fpage>66</fpage>
				<lpage>77</lpage>
				<xrefbib>
					<pubid idtype="pmpid">14992493</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B9">
				<title>
					<p>Identification and evolutionary analysis of novel exons and alternative splicing events using cross-species EST-to-genome comparisons in human, mouse and rat</p>
				</title>
				<aug>
					<au>
						<snm>Chen</snm>
						<fnm>FC</fnm>
					</au>
					<au>
						<snm>Chen</snm>
						<fnm>CJ</fnm>
					</au>
					<au>
						<snm>Ho</snm>
						<fnm>JY</fnm>
					</au>
					<au>
						<snm>Chuang</snm>
						<fnm>TJ</fnm>
					</au>
				</aug>
				<source>BMC Bioinformatics</source>
				<pubdate>2006</pubdate>
				<volume>7</volume>
				<fpage>136</fpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">1479377</pubid>
						<pubid idtype="pmpid" link="fulltext">16536879</pubid>
						<pubid idtype="doi">10.1186/1471-2105-7-136</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B10">
				<title>
					<p>Improvement of whole-genome annotation of cereals through comparative analyses</p>
				</title>
				<aug>
					<au>
						<snm>Zhu</snm>
						<fnm>W</fnm>
					</au>
					<au>
						<snm>Buell</snm>
						<fnm>CR</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>2007</pubdate>
				<volume>17</volume>
				<issue>3</issue>
				<fpage>299</fpage>
				<lpage>310</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">1800921</pubid>
						<pubid idtype="pmpid" link="fulltext">17284677</pubid>
						<pubid idtype="doi">10.1101/gr.5881807</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B11">
				<title>
					<p>Plant Gene and Alternatively Spliced Variant Annotator. A plant genome annotation pipeline for rice gene and alternatively spliced variant identification with cross-species expressed sequence tag conservation from seven plant species</p>
				</title>
				<aug>
					<au>
						<snm>Chen</snm>
						<fnm>FC</fnm>
					</au>
					<au>
						<snm>Wang</snm>
						<fnm>SS</fnm>
					</au>
					<au>
						<snm>Chaw</snm>
						<fnm>SM</fnm>
					</au>
					<au>
						<snm>Huang</snm>
						<fnm>YT</fnm>
					</au>
					<au>
						<snm>Chuang</snm>
						<fnm>TJ</fnm>
					</au>
				</aug>
				<source>Plant Physiol</source>
				<pubdate>2007</pubdate>
				<volume>143</volume>
				<issue>3</issue>
				<fpage>1086</fpage>
				<lpage>1095</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">1820933</pubid>
						<pubid idtype="pmpid" link="fulltext">17220363</pubid>
						<pubid idtype="doi">10.1104/pp.106.092460</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B12">
				<title>
					<p>Alternative Splicing of Pre-Messenger RNAs in Plants in the Genomic Era</p>
				</title>
				<aug>
					<au>
						<snm>Reddy</snm>
						<fnm>AS</fnm>
					</au>
				</aug>
				<source>Annu Rev Plant Biol</source>
				<pubdate>2007</pubdate>
				<volume>58</volume>
				<fpage>267</fpage>
				<lpage>294</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1146/annurev.arplant.58.032806.103754</pubid>
						<pubid idtype="pmpid" link="fulltext">17222076</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B13">
				<title>
					<p>Nuclear pre-mRNA splicing in plants</p>
				</title>
				<aug>
					<au>
						<snm>Reddy</snm>
						<fnm>ASN</fnm>
					</au>
				</aug>
				<source>Critical Rev Plant Sci</source>
				<pubdate>2001</pubdate>
				<volume>20</volume>
				<fpage>523</fpage>
				<lpage>571</lpage>
				<xrefbib>
					<pubid idtype="doi">10.1016/S0735-2689(01)80004-6</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B14">
				<title>
					<p>The ASRG database: identification and survey of Arabidopsis thaliana genes involved in pre-mRNA splicing</p>
				</title>
				<aug>
					<au>
						<snm>Wang</snm>
						<fnm>BB</fnm>
					</au>
					<au>
						<snm>Brendel</snm>
						<fnm>V</fnm>
					</au>
				</aug>
				<source>Genome Biol</source>
				<pubdate>2004</pubdate>
				<volume>5</volume>
				<issue>12</issue>
				<fpage>R102</fpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">545797</pubid>
						<pubid idtype="pmpid" link="fulltext">15575968</pubid>
						<pubid idtype="doi">10.1186/gb-2004-5-12-r102</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B15">
				<title>
					<p>Alternative splicing of pre-mRNAs of Arabidopsis serine/arginine-rich proteins: regulation by hormones and stresses</p>
				</title>
				<aug>
					<au>
						<snm>Palusa</snm>
						<fnm>SG</fnm>
					</au>
					<au>
						<snm>Ali</snm>
						<fnm>GS</fnm>
					</au>
					<au>
						<snm>Reddy</snm>
						<fnm>AS</fnm>
					</au>
				</aug>
				<source>Plant J</source>
				<pubdate>2007</pubdate>
				<volume>49</volume>
				<issue>6</issue>
				<fpage>1091</fpage>
				<lpage>1107</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">17319848</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B16">
				<title>
					<p>Evolutionary conservation and regulation of particular alternative splicing events in plant SR proteins</p>
				</title>
				<aug>
					<au>
						<snm>Kalyna</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Lopato</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Voronin</snm>
						<fnm>V</fnm>
					</au>
					<au>
						<snm>Barta</snm>
						<fnm>A</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2006</pubdate>
				<volume>34</volume>
				<issue>16</issue>
				<fpage>4395</fpage>
				<lpage>4405</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">1636356</pubid>
						<pubid idtype="pmpid" link="fulltext">16936312</pubid>
						<pubid idtype="doi">10.1093/nar/gkl570</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B17">
				<title>
					<p>Survey of conserved alternative splicing events of mRNAs encoding SR proteins in land plants</p>
				</title>
				<aug>
					<au>
						<snm>Iida</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Go</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>Mol Biol Evol</source>
				<pubdate>2006</pubdate>
				<volume>23</volume>
				<issue>5</issue>
				<fpage>1085</fpage>
				<lpage>1094</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/molbev/msj118</pubid>
						<pubid idtype="pmpid" link="fulltext">16520337</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B18">
				<title>
					<p>The serine/arginine-rich protein family in rice plays important roles in constitutive and alternative splicing of pre-mRNA</p>
				</title>
				<aug>
					<au>
						<snm>Isshiki</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Tsumoto</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Shimamoto</snm>
						<fnm>K</fnm>
					</au>
				</aug>
				<source>Plant Cell</source>
				<pubdate>2006</pubdate>
				<volume>18</volume>
				<issue>1</issue>
				<fpage>146</fpage>
				<lpage>158</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">1323490</pubid>
						<pubid idtype="pmpid" link="fulltext">16339852</pubid>
						<pubid idtype="doi">10.1105/tpc.105.037069</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B19">
				<title>
					<p>Two novel arginine/serine (SR) proteins in maize are differentially spliced and utilize non-canonical splice sites</p>
				</title>
				<aug>
					<au>
						<snm>Gupta</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Wang</snm>
						<fnm>BB</fnm>
					</au>
					<au>
						<snm>Stryker</snm>
						<fnm>GA</fnm>
					</au>
					<au>
						<snm>Zanetti</snm>
						<fnm>ME</fnm>
					</au>
					<au>
						<snm>Lal</snm>
						<fnm>SK</fnm>
					</au>
				</aug>
				<source>Biochim Biophys Acta</source>
				<pubdate>2005</pubdate>
				<volume>1728</volume>
				<issue>3</issue>
				<fpage>105</fpage>
				<lpage>114</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">15780972</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B20">
				<title>
					<p>ASF/SF2-like maize pre-mRNA splicing factors affect splice site utilization and their transcripts are alternatively spliced</p>
				</title>
				<aug>
					<au>
						<snm>Gao</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Gordon-Kamm</snm>
						<fnm>WJ</fnm>
					</au>
					<au>
						<snm>Lyznik</snm>
						<fnm>LA</fnm>
					</au>
				</aug>
				<source>Gene</source>
				<pubdate>2004</pubdate>
				<volume>339</volume>
				<fpage>25</fpage>
				<lpage>37</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/j.gene.2004.06.047</pubid>
						<pubid idtype="pmpid" link="fulltext">15363843</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B21">
				<title>
					<p>Molecular characterization and phylogeny of U2AF35 homologs in plants</p>
				</title>
				<aug>
					<au>
						<snm>Wang</snm>
						<fnm>BB</fnm>
					</au>
					<au>
						<snm>Brendel</snm>
						<fnm>V</fnm>
					</au>
				</aug>
				<source>Plant Physiol</source>
				<pubdate>2006</pubdate>
				<volume>140</volume>
				<issue>2</issue>
				<fpage>624</fpage>
				<lpage>636</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">1361329</pubid>
						<pubid idtype="pmpid" link="fulltext">16407443</pubid>
						<pubid idtype="doi">10.1104/pp.105.073858</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B22">
				<title>
					<p>Structure and expression of a plant U1 snRNP 70K gene: alternative splicing of U1 snRNP 70K pre-mRNAs produces two different transcripts</p>
				</title>
				<aug>
					<au>
						<snm>Golovkin</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Reddy</snm>
						<fnm>AS</fnm>
					</au>
				</aug>
				<source>Plant Cell</source>
				<pubdate>1996</pubdate>
				<volume>8</volume>
				<issue>8</issue>
				<fpage>1421</fpage>
				<lpage>1435</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">161266</pubid>
						<pubid idtype="pmpid" link="fulltext">8776903</pubid>
						<pubid idtype="doi">10.1105/tpc.8.8.1421</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B23">
				<title>
					<p>Alternative splicing expression of U1 snRNP 70K gene is evolutionary conserved between different plant species</p>
				</title>
				<aug>
					<au>
						<snm>Gupta</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Ciungu</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Jameson</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Lal</snm>
						<fnm>SK</fnm>
					</au>
				</aug>
				<source>DNA Seq</source>
				<pubdate>2006</pubdate>
				<volume>17</volume>
				<issue>4</issue>
				<fpage>254</fpage>
				<lpage>261</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1080/10425170600856642</pubid>
						<pubid idtype="pmpid">17312944</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B24">
				<title>
					<p>Database and analyses of known alternatively spliced genes in plants</p>
				</title>
				<aug>
					<au>
						<snm>Zhou</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Zhou</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Ye</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Dong</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Xu</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Cai</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Zhang</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Wei</snm>
						<fnm>L</fnm>
					</au>
				</aug>
				<source>Genomics</source>
				<pubdate>2003</pubdate>
				<volume>82</volume>
				<issue>6</issue>
				<fpage>584</fpage>
				<lpage>595</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0888-7543(03)00204-0</pubid>
						<pubid idtype="pmpid" link="fulltext">14611800</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B25">
				<title>
					<p>Phytochelatin synthases of the model legume Lotus japonicus. A small multigene family with differential response to cadmium and alternatively spliced variants</p>
				</title>
				<aug>
					<au>
						<snm>Ramos</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Clemente</snm>
						<fnm>MR</fnm>
					</au>
					<au>
						<snm>Naya</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Loscos</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Perez-Rontome</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Sato</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Tabata</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Becana</snm>
						<fnm>M</fnm>
					</au>
				</aug>
				<source>Plant Physiol</source>
				<pubdate>2007</pubdate>
				<volume>143</volume>
				<issue>3</issue>
				<fpage>1110</fpage>
				<lpage>1118</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">1820930</pubid>
						<pubid idtype="pmpid" link="fulltext">17208961</pubid>
						<pubid idtype="doi">10.1104/pp.106.090894</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B26">
				<title>
					<p>The Lotus japonicus LjNOD70 nodulin gene encodes a protein with similarities to transporters</p>
				</title>
				<aug>
					<au>
						<snm>Szczyglowski</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Kapranov</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Hamburger</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>de Bruijn</snm>
						<fnm>FJ</fnm>
					</au>
				</aug>
				<source>Plant Mol Biol</source>
				<pubdate>1998</pubdate>
				<volume>37</volume>
				<issue>4</issue>
				<fpage>651</fpage>
				<lpage>661</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1023/A:1006043428636</pubid>
						<pubid idtype="pmpid" link="fulltext">9687069</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B27">
				<title>
					<p>TILLING Mutants of Lotus japonicus Reveal that Nitrogen Assimilation and Fixation can Occur in the Absence of Nodule-enhanced Sucrose Synthase</p>
				</title>
				<aug>
					<au>
						<snm>Horst</snm>
						<fnm>I</fnm>
					</au>
					<au>
						<snm>Welham</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Kelly</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Kaneko</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Sato</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Tabata</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Parniske</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Wang</snm>
						<fnm>TL</fnm>
					</au>
				</aug>
				<source>Plant Physiol</source>
				<pubdate>2007</pubdate>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">17468221</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B28">
				<title>
					<p>Structure, expression, and mapping of two nodule-specific genes identified by mining public soybean EST databases</p>
				</title>
				<aug>
					<au>
						<snm>Jeong</snm>
						<fnm>SC</fnm>
					</au>
					<au>
						<snm>Yang</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Park</snm>
						<fnm>JY</fnm>
					</au>
					<au>
						<snm>Han</snm>
						<fnm>KS</fnm>
					</au>
					<au>
						<snm>Yu</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Hwang</snm>
						<fnm>TY</fnm>
					</au>
					<au>
						<snm>Hur</snm>
						<fnm>CG</fnm>
					</au>
					<au>
						<snm>Kim</snm>
						<fnm>SH</fnm>
					</au>
					<au>
						<snm>Park</snm>
						<fnm>PB</fnm>
					</au>
					<au>
						<snm>Kim</snm>
						<fnm>HM</fnm>
					</au>
					<au>
						<snm>Park</snm>
						<fnm>YI</fnm>
					</au>
					<au>
						<snm>Liu</snm>
						<fnm>JR</fnm>
					</au>
				</aug>
				<source>Gene</source>
				<pubdate>2006</pubdate>
				<volume>383</volume>
				<fpage>71</fpage>
				<lpage>80</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/j.gene.2006.07.015</pubid>
						<pubid idtype="pmpid" link="fulltext">16973305</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B29">
				<title>
					<p>Differential characteristics and subcellular localization of two starch-branching enzyme isoforms encoded by a single gene in Phaseolus vulgaris L</p>
				</title>
				<aug>
					<au>
						<snm>Hamada</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Ito</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Hiraga</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Inagaki</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Nozaki</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Isono</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Yoshimoto</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Takeda</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Matsui</snm>
						<fnm>H</fnm>
					</au>
				</aug>
				<source>J Biol Chem</source>
				<pubdate>2002</pubdate>
				<volume>277</volume>
				<issue>19</issue>
				<fpage>16538</fpage>
				<lpage>16546</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1074/jbc.M110497200</pubid>
						<pubid idtype="pmpid" link="fulltext">11864975</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B30">
				<title>
					<p>Highly abundant pea LTR retrotransposon Ogre is constitutively transcribed and partially spliced</p>
				</title>
				<aug>
					<au>
						<snm>Neumann</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Pozarkova</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Macas</snm>
						<fnm>J</fnm>
					</au>
				</aug>
				<source>Plant Mol Biol</source>
				<pubdate>2003</pubdate>
				<volume>53</volume>
				<issue>3</issue>
				<fpage>399</fpage>
				<lpage>410</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1023/B:PLAN.0000006945.77043.ce</pubid>
						<pubid idtype="pmpid" link="fulltext">14750527</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B31">
				<title>
					<p>Sequencing the genespaces of Medicago truncatula and Lotus japonicus</p>
				</title>
				<aug>
					<au>
						<snm>Young</snm>
						<fnm>ND</fnm>
					</au>
					<au>
						<snm>Cannon</snm>
						<fnm>SB</fnm>
					</au>
					<au>
						<snm>Sato</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Kim</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Cook</snm>
						<fnm>DR</fnm>
					</au>
					<au>
						<snm>Town</snm>
						<fnm>CD</fnm>
					</au>
					<au>
						<snm>Roe</snm>
						<fnm>BA</fnm>
					</au>
					<au>
						<snm>Tabata</snm>
						<fnm>S</fnm>
					</au>
				</aug>
				<source>Plant Physiol</source>
				<pubdate>2005</pubdate>
				<volume>137</volume>
				<issue>4</issue>
				<fpage>1174</fpage>
				<lpage>1181</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">1088310</pubid>
						<pubid idtype="pmpid" link="fulltext">15824279</pubid>
						<pubid idtype="doi">10.1104/pp.104.057034</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B32">
				<title>
					<p>Annotating the genome of Medicago truncatula</p>
				</title>
				<aug>
					<au>
						<snm>Town</snm>
						<fnm>CD</fnm>
					</au>
				</aug>
				<source>Curr Opin Plant Biol</source>
				<pubdate>2006</pubdate>
				<volume>9</volume>
				<issue>2</issue>
				<fpage>122</fpage>
				<lpage>127</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/j.pbi.2006.01.004</pubid>
						<pubid idtype="pmpid" link="fulltext">16458040</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B33">
				<title>
					<p>Medicago genome sequence release 1.0</p>
				</title>
				<url>http://www.medicago.org/genome/downloads/Mt1/</url>
			</bibl>
			<bibl id="B34">
				<title>
					<p>Gene structure prediction from consensus spliced alignment of multiple ESTs matching the same genomic locus</p>
				</title>
				<aug>
					<au>
						<snm>Brendel</snm>
						<fnm>V</fnm>
					</au>
					<au>
						<snm>Xing</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Zhu</snm>
						<fnm>W</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>2004</pubdate>
				<volume>20</volume>
				<issue>7</issue>
				<fpage>1157</fpage>
				<lpage>1169</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/bth058</pubid>
						<pubid idtype="pmpid" link="fulltext">14764557</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B35">
				<title>
					<p>GMAP: a genomic mapping and alignment program for mRNA and EST sequences</p>
				</title>
				<aug>
					<au>
						<snm>Wu</snm>
						<fnm>TD</fnm>
					</au>
					<au>
						<snm>Watanabe</snm>
						<fnm>CK</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>2005</pubdate>
				<volume>21</volume>
				<issue>9</issue>
				<fpage>1859</fpage>
				<lpage>1875</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/bti310</pubid>
						<pubid idtype="pmpid" link="fulltext">15728110</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B36">
				<title>
					<p>Different effects of intron nucleotide composition and secondary structure on pre-mRNA splicing in monocot and dicot plants</p>
				</title>
				<aug>
					<au>
						<snm>Goodall</snm>
						<fnm>GJ</fnm>
					</au>
					<au>
						<snm>Filipowicz</snm>
						<fnm>W</fnm>
					</au>
				</aug>
				<source>EMBO J</source>
				<pubdate>1991</pubdate>
				<volume>10</volume>
				<issue>9</issue>
				<fpage>2635</fpage>
				<lpage>2644</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">452964</pubid>
						<pubid idtype="pmpid">1868837</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B37">
				<title>
					<p>Alternative Splicing In Plants (ASIP)</p>
				</title>
				<url>http://www.plantgdb.org/ASIP/</url>
			</bibl>
			<bibl id="B38">
				<title>
					<p>Genome-wide survey of human alternative pre-mRNA splicing with exon junction microarrays</p>
				</title>
				<aug>
					<au>
						<snm>Johnson</snm>
						<fnm>JM</fnm>
					</au>
					<au>
						<snm>Castle</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Garrett-Engele</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Kan</snm>
						<fnm>Z</fnm>
					</au>
					<au>
						<snm>Loerch</snm>
						<fnm>PM</fnm>
					</au>
					<au>
						<snm>Armour</snm>
						<fnm>CD</fnm>
					</au>
					<au>
						<snm>Santos</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Schadt</snm>
						<fnm>EE</fnm>
					</au>
					<au>
						<snm>Stoughton</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Shoemaker</snm>
						<fnm>DD</fnm>
					</au>
				</aug>
				<source>Science</source>
				<pubdate>2003</pubdate>
				<volume>302</volume>
				<issue>5653</issue>
				<fpage>2141</fpage>
				<lpage>2144</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1126/science.1090100</pubid>
						<pubid idtype="pmpid" link="fulltext">14684825</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B39">
				<title>
					<p>Alternative splicing and genome complexity</p>
				</title>
				<aug>
					<au>
						<snm>Brett</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Pospisil</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Valcarcel</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Reich</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Bork</snm>
						<fnm>P</fnm>
					</au>
				</aug>
				<source>Nat Genet</source>
				<pubdate>2002</pubdate>
				<volume>30</volume>
				<issue>1</issue>
				<fpage>29</fpage>
				<lpage>30</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/ng803</pubid>
						<pubid idtype="pmpid" link="fulltext">11743582</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B40">
				<title>
					<p>Features of Arabidopsis Genes and Genome Discovered using Full-length cDNAs</p>
				</title>
				<aug>
					<au>
						<snm>Alexandrov</snm>
						<fnm>NN</fnm>
					</au>
					<au>
						<snm>Troukhan</snm>
						<fnm>ME</fnm>
					</au>
					<au>
						<snm>Brover</snm>
						<fnm>VV</fnm>
					</au>
					<au>
						<snm>Tatarinova</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Flavell</snm>
						<fnm>RB</fnm>
					</au>
					<au>
						<snm>Feldmann</snm>
						<fnm>KA</fnm>
					</au>
				</aug>
				<source>Plant Mol Biol</source>
				<pubdate>2006</pubdate>
				<volume>60</volume>
				<issue>1</issue>
				<fpage>69</fpage>
				<lpage>85</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1007/s11103-005-2564-9</pubid>
						<pubid idtype="pmpid" link="fulltext">16463100</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B41">
				<title>
					<p>Evidence for the widespread coupling of alternative splicing and nonsense-mediated mRNA decay in humans</p>
				</title>
				<aug>
					<au>
						<snm>Lewis</snm>
						<fnm>BP</fnm>
					</au>
					<au>
						<snm>Green</snm>
						<fnm>RE</fnm>
					</au>
					<au>
						<snm>Brenner</snm>
						<fnm>SE</fnm>
					</au>
				</aug>
				<source>Proc Natl Acad Sci USA</source>
				<pubdate>2003</pubdate>
				<volume>100</volume>
				<issue>1</issue>
				<fpage>189</fpage>
				<lpage>192</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">140922</pubid>
						<pubid idtype="pmpid" link="fulltext">12502788</pubid>
						<pubid idtype="doi">10.1073/pnas.0136770100</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B42">
				<title>
					<p>Selecting for functional alternative splices in ESTs</p>
				</title>
				<aug>
					<au>
						<snm>Kan</snm>
						<fnm>Z</fnm>
					</au>
					<au>
						<snm>States</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Gish</snm>
						<fnm>W</fnm>
					</au>
				</aug>
				<source>Genome Res</source>
				<pubdate>2002</pubdate>
				<volume>12</volume>
				<issue>12</issue>
				<fpage>1837</fpage>
				<lpage>1845</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">187565</pubid>
						<pubid idtype="pmpid" link="fulltext">12466287</pubid>
						<pubid idtype="doi">10.1101/gr.764102</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B43">
				<title>
					<p>The ASAP II database: analysis and comparative genomics of alternative splicing in 15 animal species</p>
				</title>
				<aug>
					<au>
						<snm>Kim</snm>
						<fnm>N</fnm>
					</au>
					<au>
						<snm>Alekseyenko</snm>
						<fnm>AV</fnm>
					</au>
					<au>
						<snm>Roy</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Lee</snm>
						<fnm>C</fnm>
					</au>
				</aug>
				<source>Nucleic Acids Res</source>
				<pubdate>2007</pubdate>
				<volume>35</volume>
				<issue>Database issue</issue>
				<fpage>D93</fpage>
				<lpage>D98</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">1669709</pubid>
						<pubid idtype="pmpid" link="fulltext">17108355</pubid>
						<pubid idtype="doi">10.1093/nar/gkl884</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B44">
				<title>
					<p>Splicing Related Genes Database (SRGD)</p>
				</title>
				<url>http://www.plantgdb.org/SRGD</url>
			</bibl>
			<bibl id="B45">
				<title>
					<p>Medicago genome sequencing project</p>
				</title>
				<url>http://www.medicago.org/genome/</url>
			</bibl>
			<bibl id="B46">
				<title>
					<p>Medicago Genome Sequence Release 1.0 white book</p>
				</title>
				<url>http://www.medicago.org/genome/downloads/Mt1/Mt1.0.pdf</url>
			</bibl>
			<bibl id="B47">
				<title>
					<p>National Center for Biotechnology Information (NCBI)</p>
				</title>
				<url>http://www.ncbi.nlm.nih.gov/</url>
			</bibl>
			<bibl id="B48">
				<title>
					<p>NCBI Arabidopsis Genome Sequence FTP Site</p>
				</title>
				<url>ftp://ftp.ncbi.nih.gov/genomes/Arabidopsis_thaliana/</url>
			</bibl>
			<bibl id="B49">
				<title>
					<p>TIGR Rice Genome Sequences Release 4.0 FTP site</p>
				</title>
				<url>ftp://ftp.tigr.org/pub/data/Eukaryotic_Projects/o_sativa/annotation_dbs/pseudomolecules/version_4.0/</url>
			</bibl>
			<bibl id="B50">
				<title>
					<p>Comparative plant genomics resources at PlantGDB</p>
				</title>
				<aug>
					<au>
						<snm>Dong</snm>
						<fnm>Q</fnm>
					</au>
					<au>
						<snm>Lawrence</snm>
						<fnm>CJ</fnm>
					</au>
					<au>
						<snm>Schlueter</snm>
						<fnm>SD</fnm>
					</au>
					<au>
						<snm>Wilkerson</snm>
						<fnm>MD</fnm>
					</au>
					<au>
						<snm>Kurtz</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Lushbough</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Brendel</snm>
						<fnm>V</fnm>
					</au>
				</aug>
				<source>Plant Physiol</source>
				<pubdate>2005</pubdate>
				<volume>139</volume>
				<issue>2</issue>
				<fpage>610</fpage>
				<lpage>618</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">1255980</pubid>
						<pubid idtype="pmpid" link="fulltext">16219921</pubid>
						<pubid idtype="doi">10.1104/pp.104.059212</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B51">
				<title>
					<p>ASpipe project at SourceForge</p>
				</title>
				<url>https://sourceforge.net/projects/aspipe/</url>
			</bibl>
		</refgrp>
	</bm>
</art>
