<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>gb-2008-9-11-r162</ui>
   <ji>GBJ</ji>
   <fm>
      <dochead>Research</dochead>
      <bibl>
         <title>
            <p>Proteomics studies confirm the presence of alternative protein isoforms on a large scale</p>
         </title>
         <aug>
            <au id="A1" ca="yes">
               <snm>Tress</snm>
               <mi>L</mi>
               <fnm>Michael</fnm>
               <insr iid="I1"/>
               <email>mtress@cnio.es</email>
            </au>
            <au id="A2">
               <snm>Bodenmiller</snm>
               <fnm>Bernd</fnm>
               <insr iid="I2"/>
               <email>bodenmiller@imsb.biol.ethz.ch</email>
            </au>
            <au id="A3">
               <snm>Aebersold</snm>
               <fnm>Ruedi</fnm>
               <insr iid="I2"/>
               <insr iid="I3"/>
               <insr iid="I4"/>
               <insr iid="I5"/>
               <email>aebersold@imsb.biol.ethz.ch</email>
            </au>
            <au id="A4">
               <snm>Valencia</snm>
               <fnm>Alfonso</fnm>
               <insr iid="I1"/>
               <email>valencia@cnio.es</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre (CNIO), C. Melchor Fernandez Almagro, Madrid 28029, Spain</p>
            </ins>
            <ins id="I2">
               <p>Institute of Molecular Systems Biology, ETH, Wolfgang-Pauli-Str., 8093 Zurich, Switzerland</p>
            </ins>
            <ins id="I3">
               <p>Institute for Systems Biology, Seattle, WA 98103, USA</p>
            </ins>
            <ins id="I4">
               <p>Competence Center for Systems Physiology and Metabolic Diseases, ETH Zurich, 8093 Zurich, Switzerland</p>
            </ins>
            <ins id="I5">
               <p>Faculty of Science, University of Zurich, 8057 Zurich, Switzerland</p>
            </ins>
         </insg>
         <source>Genome Biology</source>
         <issn>1465-6906</issn>
         <pubdate>2008</pubdate>
         <volume>9</volume>
         <issue>11</issue>
         <fpage>R162</fpage>
         <url>http://genomebiology.com/2008/9/11/R162</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">19017398</pubid>
               <pubid idtype="doi">10.1186/gb-2008-9-11-r162</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>13</day>
               <month>6</month>
               <year>2008</year>
            </date>
         </rec>
         <revrec>
            <date>
               <day>29</day>
               <month>9</month>
               <year>2008</year>
            </date>
         </revrec>
         <acc>
            <date>
               <day>18</day>
               <month>11</month>
               <year>2008</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>18</day>
               <month>11</month>
               <year>2008</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2008</year>
         <collab>Tress et al.; licensee BioMed Central Ltd.</collab>
         <note>This is an open access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <shorttitle>
         <p>Proteomic analysis of alternative splicing</p>
      </shorttitle>
      <shortabs>
         <p>Stably expressed alternatively-spliced protein isoforms are produced on a genome-wide scale in Drosophila.</p>
      </shortabs>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Alternative splicing of messenger RNA permits the formation of a wide range of mature RNA transcripts and has the potential to generate a diverse spectrum of functional proteins. Although there is extensive evidence for large scale alternative splicing at the transcript level, there have been no comparable studies demonstrating the existence of alternatively spliced protein isoforms.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>Recent advances in proteomics technology have allowed us to carry out a comprehensive identification of protein isoforms in <it>Drosophila</it>. The analysis of this proteomic data confirmed the presence of multiple alternative gene products for over a hundred <it>Drosophila </it>genes.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusions</p>
               </st>
               <p>We demonstrate that proteomics techniques can detect the expression of stable alternative splice isoforms on a genome-wide scale. Many of these alternative isoforms are likely to have regions that are disordered in solution, and specific proteomics methodologies may be required to identify these peptides.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="BMC" subtype="man_spc_id" id="30010010">Genome studies</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010015">Model organisms</classification>
         <classification type="BMC" subtype="man_spc_id" id="30010016">Molecular biology</classification>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>The alternative splicing of pre-messenger RNA (pre-mRNA) allows for the generation of diverse mature RNA transcripts from a single pre-mRNA <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr></abbrgrp>. Recent studies have estimated that more than 60% of multi-exon human genes <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr></abbrgrp> and at least 40% of <it>Drosophila </it>genes <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> can produce differently spliced mRNA transcripts. The extent of alternative splicing of transcripts has led to suggestions that its purpose is to expand functional complexity in the cell <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr></abbrgrp> and that alternative splicing may be one of the keys to understanding the discrepancy between the number of genes and functional complexity <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. Alternative splicing events within protein coding regions can generate a range of protein isoforms with altered structure and biological function <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr></abbrgrp> and, therefore, alternative splicing has the potential to expand the cellular protein repertoire. However, there is still some controversy about the degree of impact that individual alternative splicing events can have <it>in vivo </it>on the range of conventional protein functions <abbrgrp><abbr bid="B11">11</abbr><abbr bid="B13">13</abbr></abbrgrp>.</p>
         <p>Considerable supporting evidence exists for the expression of multiple alternative mRNA transcripts. The expression of many differently spliced mRNA transcripts is strongly supported by both microarray data <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> and by cDNA and expressed sequence tag (EST) sequence evidence <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. There is overwhelming evidence for the expression of transcripts even when these might encode protein sequences with unusual evolutionary or structural features, but it is much more difficult to demonstrate the existence of alternative variants at the protein level. To date, most evidence for the translation of alternative splice variants as stable proteins has come from individual experiments. One well known example is the <it>Dscam </it>gene, which codes for an axon guidance receptor involved in the formation of synaptic branching patterns in neural circuit development <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. It has four sets of mutually exclusive alternative exons that code for three immunoglobulin-like domains and a trans-membrane domain that could theoretically generate 38,016 different protein isoforms. It has been shown that the expression of different <it>Dscam </it>isoforms affects the recognition of mechanosensory neurons <abbrgrp><abbr bid="B15">15</abbr></abbrgrp> and two <it>Dscam </it>isoforms have been crystallized <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. Another important example in <it>Drosophila </it>is the <it>Sex lethal </it>gene. Alternative splicing of this gene determines whether the fly will be male or female <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. <it>Sex lethal </it>encodes an RNA-binding protein that forms part of a complex regulatory cascade <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>. The male-specific isoform of <it>Sex lethal </it>is an inactive truncated protein.</p>
         <p>Incontrovertible evidence for the expression of alternative protein variants ought to be available from proteomics technologies, but until recently these methods have only been able to identify a fraction of the peptide ions present in protease (tryptic) digests. This has hindered the analysis of protein isoforms. However, one recent study was able to show that 16 pairs of alternative protein isoforms were expressed in humans <abbrgrp><abbr bid="B19">19</abbr></abbrgrp> based on peptide data from the Peptide Atlas <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. Two recent large scale proteomics studies have generated extensive, high quality peptide catalogs from the <it>Drosophila melanogaster </it>proteome. The first was able to match peptides to almost 7,000 proteins (50% of the <it>Drosophila </it>genome), a level of coverage that has not been reached for any other complex eukaryote <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. It was achieved using a novel iterative strategy that maximized sample diversity. The second study detected phosphorylated peptides representing 3,500 <it>Drosophila </it>proteins <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>. The two studies are complementary; only a fraction of the peptides detected were present in both studies. Both studies also used the same protein database to assign peptide sequences to the generated tandem mass spectra, facilitating the comparison of the two datasets.</p>
         <p>The extent and coverage of these two sets of peptides has allowed us to perform the first large-scale analysis of alternative splicing at the protein level. This analysis demonstrates that the expression of protein isoforms is widespread and points the way towards further research in this area. The results presented here should prompt further studies to generate and analyze proteomic data sets in the search for protein isoforms expressed in different organisms.</p>
      </sec>
      <sec>
         <st>
            <p>Results and discussion</p>
         </st>
         <p>Our analysis was based on the peptides detected in two proteomics studies <abbrgrp><abbr bid="B21">21</abbr><abbr bid="B22">22</abbr></abbrgrp>. The 'Brunner set' consisted of 32,729 non-overlapping peptides from the <it>D. melanogaster </it>proteome <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. The 'Bodenmiller set' contained 10,118 high-confidence phosphorylated peptides <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>. There were significant differences in the collection methods. In the Brunner analysis the protein samples came from experiments carried out under a wide range of distinct conditions and developmental stages of <it>Drosophila</it>, while the samples used in the Bodenmiller analysis were from a single <it>Drosophila </it>cell line grown under just five different conditions. In both studies the peptides were identified by searching against <it>in silico </it>trypsin digests of the FlyBase <it>D. melanogaster </it>proteome <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>.</p>
         <sec>
            <st>
               <p>Identifying splice isoform unique peptides</p>
            </st>
            <p>In the first step of the analysis we searched the more than 42,000 peptides identified in the two proteomics studies for peptides that unambiguously indicated the presence of two or more splice isoforms from the same gene. Peptides were matched to the proteins in FlyBase (release 5.4). Since FlyBase was used to identify the peptides from both studies, the peptides could be mapped back directly to proteins in FlyBase using a simple Perl script. Each of the more than 42,000 peptides mapped to at least one gene in FlyBase.</p>
            <p>FlyBase release 5.4 contains 15,181 genes, of which 14,141 are predicted to be protein coding. The release contains a total of 20,823 protein coding transcripts and 17,961 of the polypeptide gene products are unique. The 2,762 transcripts that are alternatively spliced in the 3' or 5' untranslated regions were not considered in this study, since they produce identical translated products and cannot be distinguished with peptide data alone. A total of 2,406 protein-coding genes (17.01%) code for more than one gene product.</p>
            <p>It is important to note that the peptides in the Brunner and Bodenmiller experiments were identified using FlyBase. This means that neither of the studies could characterize peptides that did not match FlyBase sequences. As a result, our analysis of splice isoforms had to be limited to the alternative splice isoforms annotated in FlyBase. Therefore, the highest possible detectable alternative splicing rate from the two experiments was 17.01%.</p>
            <p>After matching the Brunner and Bodenmiller peptides to the unique proteins in FlyBase, we searched for genes that had two or more alternative isoforms confirmed by the peptide evidence. Genes with confirmed alternative protein isoforms were those for which it was possible to map peptides to regions unique to two or more alternative isoforms. In other words, these were genes for which the detected peptides could unequivocally demonstrate the presence of two gene products with distinct protein sequences (for example, Figure <figr fid="F1">1</figr>).</p>
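            <p>The confirmation criterion described above can be illustrated with a minimal Python sketch. This is not the Perl script used in the study: the function names, the toy sequences and the simplified rule (a gene counts as confirmed when two detected peptides map to non-overlapping sets of its isoforms, and peptides matching more than one gene are ignored) are assumptions made for illustration only.</p>
            <p><![CDATA[
```python
from collections import defaultdict
from itertools import combinations


def isoform_sets_by_gene(peptides, isoforms):
    """Map each gene to the isoform sets matched by its detected peptides.

    isoforms: dict of isoform id -> (gene id, protein sequence).
    Peptides that match isoforms of more than one gene are ignored as ambiguous.
    """
    by_gene = defaultdict(list)
    for pep in peptides:
        hits = {iso for iso, (_, seq) in isoforms.items() if pep in seq}
        genes = {isoforms[iso][0] for iso in hits}
        if len(genes) == 1:
            by_gene[genes.pop()].append(frozenset(hits))
    return by_gene


def genes_with_alternative_isoforms(peptides, isoforms):
    """Return genes for which two peptides map to non-overlapping isoform sets."""
    confirmed = set()
    for gene, sets in isoform_sets_by_gene(peptides, isoforms).items():
        # Two peptides with disjoint isoform sets cannot come from one product.
        if any(a.isdisjoint(b) for a, b in combinations(sets, 2)):
            confirmed.add(gene)
    return confirmed


# Hypothetical toy data: geneA has two isoforms with different carboxyl termini.
toy_isoforms = {
    "geneA-PA": ("geneA", "MKTAYIAKQRLSDEGHK"),
    "geneA-PB": ("geneA", "MKTAYIAKQRVWNPTCR"),
    "geneB-PA": ("geneB", "MSTLLPEKGGAVR"),
}
toy_peptides = ["LSDEGHK", "VWNPTCR", "QR", "GGAVR"]
print(genes_with_alternative_isoforms(toy_peptides, toy_isoforms))
```
]]></p>
            <p>In this toy example only geneA is reported, because one peptide is unique to each of its two isoforms; the shared peptide on its own would not be sufficient.</p>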
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Schematic representation of the <it>LOLA </it>gene</p>
               </caption>
               <text>
                  <p>Schematic representation of the <it>LOLA </it>gene. The figure shows a representation of the seven variants detected by the two analyses. Coding exons are shown in shades of gray and the positions of the BTB and zinc-finger domains are marked in color. Introns are not to scale. <it>LOLA </it>variants have an invariant amino-terminal region, but different carboxyl termini. Despite the differences, all these carboxy-terminal regions contain paired zinc-finger domains. Peptides detected for this gene are shown as vertical dashed lines, and are highlighted in orange when the peptide crosses the exon boundary. All the peptides detected in the two analyses were in the variable carboxy-terminal regions.</p>
               </text>
               <graphic file="gb-2008-9-11-r162-1"/>
            </fig>
            <p>The peptide evidence from the Brunner set confirmed multiple alternative isoforms for 76 genes and the evidence from the Bodenmiller set confirmed multiple alternative isoforms for 60 genes. There was a certain amount of overlap between the two experiments - 19 genes had multiple isoforms confirmed by the peptides from both analyses. In addition, when the two sets of peptides were combined there was evidence for alternative gene products from another 13 genes. In total, we were able to demonstrate that 130 separate <it>Drosophila </it>genes expressed at least two alternative isoforms (Additional data file 1). While this is only a small proportion of the genes annotated as expressing alternative protein isoforms, the figure is considerably higher than that reported in any previous study of which we are aware.</p>
            <p>Those genes for which it was possible to show the presence of three or more distinct gene products were particularly interesting. Five genes - <it>SNF4A-gamma</it>, <it>Akap200</it>, <it>14-3-3-epsilon</it>, <it>mod(mdg4)</it>, and <it>LOLA </it>- each expressed at least four distinct gene products. By combining the peptide data it was possible to show that one gene, <it>LOLA</it>, expressed at least seven different isoforms (Figure <figr fid="F1">1</figr>). The 26 splice isoforms of <it>LOLA </it>annotated in FlyBase have very different carboxyl termini, but two-thirds of them include two zinc-finger domains. All seven confirmed <it>LOLA </it>splice isoforms have both carboxy-terminal zinc-finger domains (Figure <figr fid="F2">2</figr>). Of the five genes shown to have more than two distinct gene products, there are individual studies for <it>LOLA </it><abbrgrp><abbr bid="B24">24</abbr></abbrgrp>, <it>SNF4A-gamma </it><abbrgrp><abbr bid="B25">25</abbr></abbrgrp> and <it>mod(mdg4) </it><abbrgrp><abbr bid="B26">26</abbr></abbrgrp> that predict the presence of multiple splice isoforms.</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Alignment of <it>LOLA </it>isoforms</p>
               </caption>
               <text>
                  <p>Alignment of <it>LOLA </it>isoforms. The alignment of the carboxy-terminal C<sub>2</sub>H<sub>2 </sub>DNA-binding zinc finger domains of the seven detected <it>LOLA </it>isoforms (CG12052). Six isoforms have two C<sub>2</sub>H<sub>2 </sub>zinc-finger domains, isoform PP has a C<sub>2</sub>H<sub>2 </sub>DNA-binding zinc-finger domain and a C<sub>2</sub>HC zinc finger domain. Zinc-binding residues (cysteines and histidines) are marked in red, and structurally important residues marked in green. The symbols below the alignment indicate the degree of conservation of the aligned residues: asterisk, completely conserved column; colon, highly conserved column; single dot, some conservation.</p>
               </text>
               <graphic file="gb-2008-9-11-r162-2"/>
            </fig>
            <p>There was no evidence of the expression of alternative splice isoforms of the <it>Dscam </it>protein that was mentioned in the introduction, but the two studies did show the expression of different isoforms of <it>Sex lethal </it>(for the tandem mass spectra of the phosphopeptides for the two isoforms see Additional data files 2-4). Other genes that are predicted to have differently functioning splice isoforms include <it>CTBP</it>, thought to be important in development and to have two variants that are conserved across all insect species <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>, <it>Eif2 </it>and <it>Su(var)3-9</it>, which fuse in <it>Drosophila </it>and are expected to create two different gene products <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>, and <it>Polychaetoid </it><abbrgrp><abbr bid="B29">29</abbr></abbrgrp> and <it>Thioredoxin reductase </it><abbrgrp><abbr bid="B30">30</abbr></abbrgrp>, both of which are predicted to have gene products whose cellular locations are controlled by alternative splicing.</p>
         </sec>
         <sec>
            <st>
               <p>Simulated peptide detection</p>
            </st>
            <p>While the two studies present irrefutable evidence that alternatively spliced transcripts are expressed as proteins, the total number of genes with confirmed alternative products from the two experiments is small. We could confirm the expression of alternative protein isoforms for just 1.1% of the 6,980 unambiguously identified genes in the Brunner set, while the Bodenmiller study showed that 1.8% of the 3,472 identified genes express more than one protein isoform.</p>
            <p>However, this apparently low level of alternative splicing at the protein level has to be seen in the context of the relatively low coverage of the <it>Drosophila </it>proteome by the two experiments. Although the peptides detected by the Brunner study did confirm the presence of gene products for more than half of the <it>Drosophila </it>genes, the identified peptides covered just 5% of the amino acid residues in the <it>Drosophila </it>proteome. This obviously decreases the chances of finding peptides that unambiguously correspond to alternatively spliced regions.</p>
            <p>To assess whether the low rates of detection of alternative isoforms are significant, we carried out simulated <it>in silico </it>peptide identification experiments. These <it>in silico </it>experiments determine the expected rates of detection assuming that all peptides are equally detectable, even though some proteins may be more abundant or more easily detectable than others. The simulations cannot tell us what the real rate of alternative splicing at the protein level is, since the maximum detectable rate is limited by the rate of alternative splicing found in FlyBase (in this particular case 17%; see above). However, they do provide an estimate of the number of alternative isoforms that we would have expected the two experiments to detect.</p>
            <p>For the comparison with the Brunner analysis we drew 37,279 peptides at random from an <it>in silico </it>trypsin digest of the <it>D. melanogaster </it>proteome based on FlyBase release 5.4. The simulation was performed 1,000 times. The results showed that a random selection of 37,279 peptides from the <it>in silico </it>digest would be expected to confirm the expression of alternatively spliced isoforms for a mean of 242.75 genes (standard deviation of 9.23). By way of contrast, the peptides identified in the Brunner analysis confirmed multiple alternatively spliced isoforms for just 76 genes.</p>
            <p>For the Bodenmiller study we drew 10,118 peptides at random from the <it>in silico </it>trypsin digest and again the simulation was performed 1,000 times. From the random drawing we were able to show that 10,118 peptides would be expected to confirm expression of distinct splice isoforms for a mean of 56.24 genes (standard deviation of 5.05). The Bodenmiller analysis confirmed multiple alternatively spliced gene products for 60 genes.</p>
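            <p>The random-draw procedure described above can be sketched in Python as follows. This is an illustrative reconstruction under stated assumptions, not the code used in the study: the cleavage rule (after K or R, but not before P), the minimum peptide length, the exclusion of peptides shared between genes, the confirmation rule (two drawn peptides mapping to non-overlapping isoform sets) and all function names are hypothetical choices.</p>
            <p><![CDATA[
```python
import random
import statistics
from collections import defaultdict
from itertools import combinations


def tryptic_peptides(sequence, min_len=6):
    """Cleave after K or R unless followed by P; keep peptides of min_len or more."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return [p for p in peptides if len(p) >= min_len]


def build_digest(isoforms, min_len=6):
    """Map each in silico peptide to (gene, frozenset of isoforms producing it)."""
    owners = defaultdict(set)
    for iso_id, (gene, seq) in isoforms.items():
        for pep in tryptic_peptides(seq, min_len):
            owners[pep].add((gene, iso_id))
    digest = {}
    for pep, pairs in owners.items():
        genes = {gene for gene, _ in pairs}
        if len(genes) == 1:  # assumption: drop peptides shared between genes
            digest[pep] = (genes.pop(), frozenset(iso for _, iso in pairs))
    return digest


def count_confirmed(drawn, digest):
    """Count genes with two drawn peptides mapping to disjoint isoform sets."""
    by_gene = defaultdict(list)
    for pep in drawn:
        gene, iso_set = digest[pep]
        by_gene[gene].append(iso_set)
    return sum(
        any(a.isdisjoint(b) for a, b in combinations(sets, 2))
        for sets in by_gene.values()
    )


def simulate(digest, n_draw, n_rounds=1000, seed=0):
    """Return mean and standard deviation of confirmed genes over random draws."""
    rng = random.Random(seed)
    pool = sorted(digest)
    counts = [
        count_confirmed(rng.sample(pool, n_draw), digest) for _ in range(n_rounds)
    ]
    return statistics.mean(counts), statistics.stdev(counts)


# Hypothetical toy proteome: g1 has two isoforms, g2 has one.
toy = {
    "g1-PA": ("g1", "MAAAAAKCCCCCCKDDDDDDK"),
    "g1-PB": ("g1", "MAAAAAKEEEEEEKFFFFFFK"),
    "g2-PA": ("g2", "MGGGGGKHHHHHHK"),
}
toy_digest = build_digest(toy)
print(simulate(toy_digest, n_draw=4, n_rounds=100))
```
]]></p>
            <p>Sampling without replacement with equal weights mirrors the assumption stated above that all peptides are equally detectable; weighting the draw by abundance would be a different model.</p>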
            <p>The simulations allowed us to show that the Brunner analysis detects a little under a third of the number of genes that would be expected to produce alternative isoforms. There are several possible explanations for this. One reason may be that many isoforms are only expressed in certain tissues or certain stages of development. Given the wide range of cell types and developmental states that were used in this experiment, this explanation seems less likely. A second possibility is that the transcripts predicted from cDNA and EST evidence are simply not all translated and the transcript evidence is an overestimation of the real number of proteins expressed in the cell. Another explanation might be that many alternative isoforms may only be expressed in very low quantities and are less easily detected.</p>
            <p>By way of contrast with the results from the Brunner simulations, the number of genes with confirmed multiple alternatively spliced gene products in the Bodenmiller analysis (60) is almost exactly what would be expected from the Bodenmiller simulations. It is somewhat surprising that it was the Bodenmiller analysis, and not the Brunner analysis, that detected the rate of alternative splicing expected from the random simulations. The prevailing view of alternative splicing assumes that alternative isoforms are expressed in distinct tissues and developmental stages; therefore, we would expect to confirm a higher rate of alternative splicing with the Brunner analysis, where many cell types and developmental stages were interrogated, than in the Bodenmiller analysis, where only a single cell type was tested.</p>
            <p>The difference in the frequency of alternative splicing detected by the two studies is enlightening. The Bodenmiller analysis identified proportionally more alternative isoforms than the Brunner analysis (1.8% of genes detected in the Bodenmiller study had alternative protein isoforms, compared to 1.1% detected in the Brunner experiments) and this is even clearer from the simulations. The contrasting results from the two analyses suggest that methods such as those used in the Bodenmiller analysis are more sensitive when it comes to detecting certain alternative isoforms.</p>
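            <p>One way to read the two simulation results side by side is to express each observed count as a fraction of the simulated mean and as a distance from that mean in units of the simulated standard deviation. The short Python sketch below does this using only the figures quoted above; the function name is hypothetical, and the standardized distance is offered as an illustration rather than as a test performed in this study.</p>
            <p><![CDATA[
```python
def compare_to_simulation(observed, sim_mean, sim_sd):
    """Return (observed / expected, standardized distance from the simulated mean)."""
    ratio = observed / sim_mean
    z_score = (observed - sim_mean) / sim_sd
    return ratio, z_score


# Genes with confirmed alternative isoforms versus simulated mean and SD.
brunner = compare_to_simulation(76, 242.75, 9.23)
bodenmiller = compare_to_simulation(60, 56.24, 5.05)
print("Brunner: ratio %.3f, z %.2f" % brunner)
print("Bodenmiller: ratio %.3f, z %.2f" % bodenmiller)
```
]]></p>
            <p>With these inputs the Brunner count is about 0.31 of the simulated mean and roughly 18 standard deviations below it, whereas the Bodenmiller count lies within one standard deviation of its simulated mean.</p>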
            <p>While some peptides were identified by both studies, on the whole the peptides recognized by the two analyses are different. This is clear from the residue composition of the peptides detected by the two analyses. Although the residue composition of the peptides found in the Brunner analysis is similar to the residue composition of the <it>Drosophila </it>proteome - except that there are fewer basic residues and more acidic residues (Figure <figr fid="F3">3b</figr>) - the peptides detected in the Bodenmiller analysis have a quite distinctive composition (Figure <figr fid="F3">3a</figr>). The Bodenmiller analysis specifically selected peptides that were phosphorylated. The peptides detected in these experiments have substantially more serine residues (unsurprising because almost 90% of the phosphorylated residues are serines), but also many more proline (5.5-7.6%), asparagine (4.7-5.7%), glycine (6.2-7.2%) and aspartate residues (5.2-5.9%). All these values are significantly higher than expected according to Chi-squared tests (<it>p</it>-values &lt; 0.005).</p>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>Composition of peptides identified in the Brunner and Bodenmiller studies</p>
               </caption>
               <text>
                  <p>Composition of peptides identified in the Brunner and Bodenmiller studies. FlyBase residue composition was calculated from FlyBase release 5.4. <b>(a) </b>Comparison of the percentage of each amino acid found in the Bodenmiller peptides and in the <it>Drosophila </it>proteome. <b>(b) </b>Comparison of the proportion of each amino acid in the Brunner peptides and the <it>Drosophila </it>proteome. The three sets of proteins differed most in the proportion of hydrophobic and disorder-promoting residues. <b>(c) </b>Comparison of the percentage of each type of residue in five different sets of peptides. Hydrophobic residues were C, F, I, L, M, V, W and Y. Disorder promoting residues (Extreme LDR) were A, D, E, G, P, N and S (according to Romero <it>et al</it>. <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>). BodenDisc is the subset of peptides that could be used to discriminate one isoform from another in the Bodenmiller analysis; BrunnDisc is the subset of peptides that could be used to discriminate one isoform from another in the Brunner analysis. The Brunner discriminating peptides had markedly fewer hydrophobic residues and markedly more disorder promoting residues than the whole set of Brunner peptides and the <it>Drosophila </it>proteome.</p>
               </text>
               <graphic file="gb-2008-9-11-r162-3"/>
            </fig>
            <p>In addition, all hydrophobic residues are markedly under-represented: the values for cysteine (1.83% of the residues in the <it>Drosophila </it>proteome and just 0.63% of the Bodenmiller peptides), phenylalanine (3.47 and 2.18%), isoleucine (4.9 and 3.71%), leucine (8.94 and 6.69%), methionine (2.33 and 1.14%), valine (5.9 and 5.34%), tryptophan (0.98 and 0.35%) and tyrosine (2.92 and 1.57%) are significantly lower than expected (all Chi-squared <it>p</it>-values &lt; 0.005). The peptides from the Bodenmiller set are considerably less hydrophobic than normal - just one in five of the Bodenmiller residues is hydrophobic, compared to one in three residues in the <it>Drosophila </it>proteome (Figure <figr fid="F3">3c</figr>).</p>
            <p>This residue composition of the Bodenmiller peptides is typical of regions that are disordered in solution. It is well known that protein segments with few hydrophobic residues and many polar residues are likely to be disordered <abbrgrp><abbr bid="B31">31</abbr><abbr bid="B32">32</abbr></abbrgrp>. Studies also suggest that phosphorylated residues tend to be more frequent in flexible, unstructured segments and linkers <abbrgrp><abbr bid="B33">33</abbr><abbr bid="B34">34</abbr></abbrgrp>. Taken together, this information strongly suggests that many of the Bodenmiller peptides, as well as being in exposed regions on the surface of proteins, will be disordered when in solution. Indeed, where the Bodenmiller peptides can be mapped to known structures, most map to regions on the surface or regions known to be disordered in solution (Figure <figr fid="F4">4</figr>).</p>
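            <p>The "one in five" and "one in three" proportions follow directly from summing the per-residue percentages quoted above. The Python sketch below performs that sum; the variable and function names are hypothetical, and the only inputs are the percentages given in the text.</p>
            <p><![CDATA[
```python
# Percentages quoted in the text: (Drosophila proteome, Bodenmiller peptides).
HYDROPHOBIC = {
    "C": (1.83, 0.63),
    "F": (3.47, 2.18),
    "I": (4.9, 3.71),
    "L": (8.94, 6.69),
    "M": (2.33, 1.14),
    "V": (5.9, 5.34),
    "W": (0.98, 0.35),
    "Y": (2.92, 1.57),
}


def hydrophobic_totals(table=HYDROPHOBIC):
    """Sum the hydrophobic percentages for the proteome and the peptide set."""
    proteome = sum(p for p, _ in table.values())
    peptides = sum(b for _, b in table.values())
    return round(proteome, 2), round(peptides, 2)


proteome_total, bodenmiller_total = hydrophobic_totals()
print(proteome_total, bodenmiller_total)
```
]]></p>
            <p>The totals are 31.27% for the proteome and 21.61% for the Bodenmiller peptides, that is, close to one residue in three and one residue in five, respectively.</p>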
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>Mapping of detected phosphorylation sites to known structures</p>
               </caption>
               <text>
                  <p>Mapping of detected phosphorylation sites to known structures. Peptides detected in the Bodenmiller analysis were mapped to highly similar, known structures. The three-dimensional structures are shown in orange spacefilling representation except where the peptides map to the structures (shown as black ribbons). Detected phosphorylation sites are shown as black dots. <b>(a) </b>The alternative isoforms generated from <it>shaggy </it>are 76% sequence identical to human glycogen synthase kinase 3 beta [PDB:<ext-link ext-link-type="pdb" ext-link-id="1j1c">1j1c</ext-link>]. The peptides detected in the analysis covered two regions. The amino-terminal region includes 22 residues that are known to be disordered in solution and are therefore not shown. <b>(b) </b>The structure for <it>Drosophila </it>fructose-1,6-bisphosphate aldolase has been solved and the Bodenmiller analysis finds peptides covering two regions - both are on the surface of the structure. <b>(c) </b><it>Drosophila </it>moesin has 75% identity to fall armyworm moesin, for which the three-dimensional structure has been solved [PDB:<ext-link ext-link-type="pdb" ext-link-id="2i1j">2i1j</ext-link>]; in addition to the residues marked as found in the Bodenmiller analysis, a further 15 residues that were detected are not shown in the figure because they are disordered in solution. <b>(d) </b>The isoforms generated from <it>alphabet </it>are 52% identical to human phosphatase 2C [PDB:<ext-link ext-link-type="pdb" ext-link-id="1a6q">1a6q</ext-link>]. The analysis detected two peptides, one of which also coincided with the 14 disordered carboxy-terminal residues of the template (not shown).</p>
               </text>
               <graphic file="gb-2008-9-11-r162-4"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>False positive rates</p>
            </st>
            <p>It is widely recognized that a certain proportion of the peptides identified by proteomics techniques can be false positives. However, both the Brunner and Bodenmiller studies have low rates of false positives. The Brunner analysis has a false positive rate of approximately 5% and the Bodenmiller analysis has a false positive rate of 1-4%. Even if 10% of these peptides were to be false positives (twice the determined value), there would still be considerably more than 100 genes with evidence of alternative splicing at the protein level from the two studies. In any case, a small number of false positives will not affect the main conclusions of this study. Most alternative transcripts do seem to produce alternative gene products and many of these alternative isoforms may have regions that are disordered in solution.</p>
         </sec>
         <sec>
            <st>
               <p>Re-analysis of the spectra</p>
            </st>
            <p>Given the depth and range of peptides detected in the two studies, we might also have expected to be able to uncover the expression of peptides not in FlyBase, such as those from predicted genes and transcripts (isoforms), translated pseudogenes and small RNAs, or in principle any other peptide produced by the 6-frame translation of the fly genome. A complete re-analysis of the spectra from the Bodenmiller study was beyond the scope of this paper, but we were able to carry out an initial re-analysis against a locally generated database that contained 903,842 peptides from translated transcripts from predicted gene models, translated pseudogenes and translated miscellaneous functional RNA. The re-analysis identified seven peptides that mapped exclusively to predicted gene models and two peptides that were linked to the miscellaneous RNA. There was no evidence for the expression of any of the pseudogenes as peptides.</p>
            <p>Three of the predicted gene models (genscan_masked:gene254366, genscan_masked:gene247065, and genscan_masked:gene245985) were not similar to any sequences in the UniProt database. One <it>ab initio </it>predicted gene model (genscan_masked:gene264127) did match a unique sequence in UniProt, but only because the prediction itself had been erroneously included in the UniProt sequence database.</p>
            <p>One peptide mapped to four different predictions (genscan_masked:gene266459, genie_masked:gene1703010, genie_masked:gene1402427 and genscan_masked:gene1391762) that were 40% identical to a putative gag-pol protein (<it>Drosophila ananassae</it>). The remaining two predicted gene models identified by the spectra might be alternative variants of <it>vav </it>(both genie_masked:gene1736185 and genscan_masked:gene267148) and <it>lethal (2) 05510 </it>(genscan_masked:gene263593) but have yet to be annotated as variants in FlyBase.</p>
            <p>Of the two miscellaneous transcripts identified by the re-analysis of the spectra, one is a piece of rRNA (FBtr0114214) that is not similar to anything in the UniProt database and the other maps to a piece of small nucleolar RNA (snoRNA; <it>Or-aca1</it>, FBtr0113530) that, when translated, is 71% identical (but over just 20 residues) to the hypothetical protein SNOG_09564 (<it>Phaeosphaeria nodorum SN15</it>). <it>Or-aca1 </it>is located inside an intron of <it>ribosomal protein S16</it>.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <sec>
            <st>
               <p>Genome-wide expression of alternative isoforms</p>
            </st>
            <p>We have been able to demonstrate conclusive evidence for the genome-wide expression of alternative splice variants at the protein level and have shown that distinct proteins are indeed produced from alternative splice variants. The results from the two large-scale proteomics studies on which our analysis is based showed that the expression of alternative gene products is extensive. These studies confirmed the presence of multiple alternative isoforms for over a hundred genes. Moreover, the alternative isoforms detected in these two studies were sufficiently stable <it>in vivo </it>and produced in sufficient quantities to be detectable in proteomics workflows. Even though the current technical limitations of proteomics studies allowed the recovery of just a small fraction of the potential alternative isoforms (less than 2% of <it>Drosophila </it>genes were identified with alternative protein isoforms), the results were enough to estimate the presence of alternative splicing in the genome and to propose that most, if not all, recorded alternative variants are likely to be expressed at the protein level in some form.</p>
         </sec>
         <sec>
            <st>
               <p>Phosphopeptide detection techniques are more sensitive</p>
            </st>
            <p>The comparison of the two proteomics studies showed that the level of expression of alternatively spliced variants in the general proteomics analysis of Brunner was less than would be expected, but that the expression levels of alternative gene products in the Bodenmiller experiment, which specifically targeted and identified phosphopeptides, corresponded to the levels of expression predicted by FlyBase. The higher proportion of alternatively spliced gene products detected in the Bodenmiller analysis is most probably related to the sensitivity of the analysis of charged phosphopeptides. One of the effects of the sensitivity to phosphorylation sites is that the identified peptides have a significant compositional bias. The peptides have many fewer hydrophobic residues and markedly more polar residues, suggesting that many of these phosphorylation sites are in regions that are disordered in corresponding protein structures.</p>
            <p>This observation is interesting since it may have implications for our understanding of the structural and functional consequences of splicing. The detailed analysis of the potential effects of alternative splicing in proteins shows that alternative splicing would be expected to lead to substantial rearrangements in the corresponding structures <abbrgrp><abbr bid="B11">11</abbr></abbrgrp> and it is unlikely that the large changes introduced by alternative splicing events will generate regions that fold with a stable hydrophobic core. It has previously been suggested that a substantial proportion of alternative gene products are unstructured in solution <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>. A corollary to this is that there are only eight known pairs of alternative splice isoforms in the Protein Data Bank (PDB) structural database <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>. In five of these pairs the regions resulting from alternative splicing events are disordered. It may be that many alternative splicing events result in proteins that are, at least in part, unstructured and flexible in solution. If alternative splicing events are related to disordered regions and phosphorylated residues are more frequent in these unstructured and flexible regions, then it follows that the disordered regions resulting from alternative splicing events will be more easily detected by methods that detect phosphorylated peptides. Therefore, it is not surprising that the Bodenmiller analysis was able to detect a higher proportion of splice isoforms.</p>
            <p>The results of this analysis have shown that proteomics data can indeed be used to investigate the extent of alternative splicing at the protein level. The Bodenmiller analysis detected peptides that differed from those in the Brunner analysis because it specifically isolated phosphorylated peptides from whole cell lysate, which suggests a methodology for carrying out further experiments to detect alternative splicing at the protein level.</p>
            <p>Perhaps surprisingly though, our initial re-analysis of the liquid chromatography tandem mass spectrometry data using databases of predicted transcripts, translated pseudogenes and small RNAs failed to reveal any significant new findings. Unfortunately, while we were able to detect some evidence for the expression of small RNAs and predicted gene models, the number of novel identified peptides fell within the estimated false positive rate. A complete re-analysis of the spectra might have produced more interesting results. However, our initial re-analysis made it clear that any proteomics study of the expression of functional aspects of the genome would be complicated by a series of logistical challenges. For example, 6-frame translations of genome transcripts would create an enormous search space that would result in extensive and impracticable database search times. In addition, peptide detection sensitivity decreases as database size increases (especially if most added sequences are likely not to be real peptides) and is strongly reduced in the case of a 6-frame translated database.</p>
            <p>Nevertheless, what is and is not expressed as protein is an interesting scientific question that needs to be addressed and our study points the way to further experiments of this nature. Future proteomics studies could address the question by searching against additional databases organized for the purpose, but specific methods to deal with the false positive problem still need to be developed and, ideally, the detected peptides would need to be confirmed using independent methods.</p>
            <p>The fact that we can show that alternative transcripts are translated into proteins at the predicted rate is a great step forward and shows the importance of proteomics in validating predicted transcripts. Of course, showing that the alternative splice isoforms are indeed expressed as stable proteins is only the first step in assessing the functional role of alternative variants. In order to broaden our understanding of the role of genomic protein diversity, further experimental approaches are needed. We feel that these results will serve as an important point of reference for these experiments.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Materials and methods</p>
         </st>
         <p>Our analysis was based on the peptides detected in two proteomics studies <abbrgrp><abbr bid="B21">21</abbr><abbr bid="B22">22</abbr></abbrgrp>. The first, the 'Brunner set' (for more details, see <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>) consisted of 32,729 non-overlapping peptides that could be uniquely attributed to a single gene product from the <it>D. melanogaster </it>proteome. In total, the experimentally observed peptides contained sufficient information to identify 6,980 proteins unambiguously. The second dataset, the 'Bodenmiller set' (see <abbrgrp><abbr bid="B22">22</abbr></abbrgrp> for more details) contained 10,118 high-confidence phosphorylated peptides from 3,472 gene models.</p>
         <p>Even though the peptides detected in the two studies were identified based on tandem mass spectrometry, there were significant differences in the collection methods. In the Brunner analysis the protein samples from which the peptides were produced came from experiments carried out under a wide range of distinct conditions, including 5 developmental stages, 12 tissue types and 10 different fractionation techniques. Furthermore, a novel iterative data collection method <abbrgrp><abbr bid="B21">21</abbr></abbrgrp> resulted in a significant increase in coverage relative to previous studies. Cells used in the Bodenmiller analysis were from a single <it>Drosophila </it>cell line (Kc167), but cells were grown under five different conditions in order to maximize phosphorylation site identifications. Finally, phosphopeptides were isolated using three different phosphopeptide isolation methods <abbrgrp><abbr bid="B37">37</abbr></abbrgrp> prior to mass spectrometric analysis using a high mass accuracy Fourier transform ion cyclotron resonance mass spectrometer in order to maximize the number of identified phosphopeptides.</p>
         <p>In both studies the peptides were identified by searching against <it>in silico </it>trypsin digests of the FlyBase <it>D. melanogaster </it>proteome. For both methods the false positive rate of peptide identification was assessed using the statistical tool Peptide Prophet <abbrgrp><abbr bid="B38">38</abbr></abbrgrp> as well as a decoy database strategy and was found to be in the low percentage range. Most tandem mass spectra as well as their statistical analysis can be viewed in the PhosphoPep database <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>.</p>
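         <p>As a rough illustration of how a decoy database strategy can estimate a false positive rate, the sketch below builds reversed-sequence decoys and counts decoy hits among accepted matches. The use of reversed sequences, the score scale, the threshold and all function names are assumptions made for this example; they are not taken from either study.</p>

```python
def make_decoys(proteins):
    """Build one decoy per protein by reversing its sequence (an assumption)."""
    return {"DECOY_" + name: seq[::-1] for name, seq in proteins.items()}


def estimated_false_positive_rate(matches, threshold):
    """Estimate the error rate among accepted matches.

    matches is a list of (score, is_decoy) pairs from a search against the
    combined target and decoy database. Decoy hits above the threshold are
    taken as an estimate of how many accepted target hits are wrong.
    """
    accepted = [is_decoy for score, is_decoy in matches if score >= threshold]
    decoys = sum(accepted)
    targets = len(accepted) - decoys
    return decoys / targets if targets else 0.0


# Hypothetical scores, for illustration only.
EXAMPLE_MATCHES = [(0.99, False), (0.95, False), (0.90, True),
                   (0.80, False), (0.50, True)]
decoy_db = make_decoys({"protA": "MKPLR"})
strict = estimated_false_positive_rate(EXAMPLE_MATCHES, 0.95)
relaxed = estimated_false_positive_rate(EXAMPLE_MATCHES, 0.80)
```

         <p>Raising the score threshold lowers the estimated rate at the cost of accepting fewer peptides.</p>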
         <p>In order to make estimates for the expected rate of alternative splicing, we carried out simulated peptide detection. For this <it>in silico </it>peptide detection experiment the <it>D. melanogaster </it>proteome (dmel-all-translation-r5.4.fasta) was subjected to an <it>in silico </it>trypsin digest. Peptides in the <it>in silico </it>digest were generated by cutting after arginine and lysine residues, except where they were followed by proline residues. We carried out 1,000 peptide detection simulations by drawing 10,118 peptides (for the Bodenmiller analysis) or 37,279 peptides (for the Brunner set) at random from the peptides generated <it>in silico</it>.</p>
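         <p>The digest rule and the random draws described above can be sketched as follows. This is a minimal illustration, not the code used in the analysis: the toy proteome, the function names and, in particular, the criterion for counting a gene as having evidence of alternative isoforms (two drawn peptides that map to non-overlapping sets of isoforms of the same gene) are assumptions made for the example.</p>

```python
import random
from collections import defaultdict
from itertools import combinations


def trypsin_digest(sequence):
    """Cut after K or R, except where the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1:i + 2]  # empty string at the C-terminus
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start != len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide without K/R
    return peptides


def index_peptides(proteome):
    """Map each peptide to the (gene, isoform) pairs that contain it."""
    index = defaultdict(set)
    for gene, isoforms in proteome.items():
        for isoform, sequence in isoforms.items():
            for peptide in trypsin_digest(sequence):
                index[peptide].add((gene, isoform))
    return index


def genes_with_isoform_evidence(drawn, index):
    """Genes for which two drawn peptides map to disjoint isoform sets."""
    per_gene = defaultdict(list)
    for peptide in drawn:
        hits = index[peptide]
        genes = {gene for gene, _ in hits}
        if len(genes) == 1:  # ignore peptides shared between genes
            per_gene[genes.pop()].append(frozenset(iso for _, iso in hits))
    return {gene for gene, sets in per_gene.items()
            if any(a.isdisjoint(b) for a, b in combinations(sets, 2))}


def simulate(proteome, n_draw, n_runs=1000, seed=0):
    """Return, for each run, the number of genes with isoform evidence."""
    rng = random.Random(seed)
    index = index_peptides(proteome)
    pool = sorted(index)  # unique peptides generated in silico
    return [len(genes_with_isoform_evidence(rng.sample(pool, n_draw), index))
            for _ in range(n_runs)]


# Hypothetical two-gene proteome: gene -> isoform -> sequence.
TOY_PROTEOME = {
    "gene1": {"PA": "MAAAKCCCKDDDR", "PB": "MAAAKEEEKDDDR"},
    "gene2": {"PA": "MGGGKHHHR"},
}
counts = simulate(TOY_PROTEOME, n_draw=6, n_runs=5)
```

         <p>With a real proteome, the distribution of counts over the 1,000 runs gives the number of genes expected to show alternative isoforms if peptides were detected at random.</p>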
         <sec>
            <st>
               <p>Re-analysis of the spectra</p>
            </st>
            <p>We searched 154,509 spectra from the Bodenmiller dataset against a database that contained 903,842 peptides derived from 17,868 translated transcripts: 10,000 translated transcripts came from predicted gene models from FlyBase, 1,456 came from the 6-frame translation of pseudogenes from FlyBase, 3,818 from 6-frame translations of miscellaneous functional RNA (rRNA, small nuclear RNA (snRNA) and snoRNA) from FlyBase and 2,594 were generated from the 6-frame translation of transcripts predicted by Manak <it>et al</it>. <abbrgrp><abbr bid="B40">40</abbr></abbrgrp> to be functional.</p>
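            <p>For readers unfamiliar with the term, a 6-frame translation reads a nucleotide sequence in three frames on each strand. The sketch below shows the idea using the standard genetic code; the function names and the example sequence are hypothetical, and the handling of stop codons and incomplete codons here is an assumption rather than a description of how the search database was built.</p>

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate complete codons; unknown codons become X, stops become *."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translation(dna):
    """Return three forward-strand frames followed by three reverse-strand frames."""
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]  # reverse complement
    return [translate(strand[offset:])
            for strand in (dna, reverse)
            for offset in range(3)]


frames = six_frame_translation("ATGGCCTAA")
# frames == ["MA*", "WP", "GL", "LGH", "*A", "RP"]
```

            <p>Because every transcript contributes six candidate translations, most of which are unlikely to correspond to real proteins, the search space grows quickly.</p>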
         </sec>
      </sec>
      <sec>
         <st>
            <p>Abbreviations</p>
         </st>
         <p>EST: expressed sequence tag; PDB: Protein Data Bank; snoRNA: small nucleolar RNA.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>MLT designed and carried out the <it>in silico </it>experiments, analyzed the data and drafted the manuscript. BB performed the proteomics data analyses and contributed to the manuscript. RA designed the data collection and edited the manuscript. AV conceived of the study and edited the manuscript.</p>
      </sec>
      <sec>
         <st>
            <p>Additional data files</p>
         </st>
         <p>The following additional data are available with the online version of this paper. Additional data file <supplr sid="S1">1</supplr> lists genes with multiple isoforms detected in the Brunner and Bodenmiller studies. Additional data file <supplr sid="S2">2</supplr> provides example tandem mass spectra of phosphopeptides distinguishing between the <it>Sex lethal </it>isoforms. Additional data file <supplr sid="S3">3</supplr> is a table listing detected ion masses for <it>Sex lethal </it>isoforms PD, PI and PL. Additional data file <supplr sid="S4">4</supplr> is a table listing detected ion masses for <it>Sex lethal </it>isoforms PC, PG, PH, PJ, PN and PO.</p>
         <suppl id="S1">
            <title>
               <p>Additional data file 1</p>
            </title>
            <caption>
               <p>Genes with multiple isoforms detected in the Brunner and Bodenmiller studies</p>
            </caption>
            <text>
               <p>A list of all alternative isoforms confirmed by the Brunner and Bodenmiller analyses.</p>
            </text>
            <file name="gb-2008-9-11-r162-S1.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S2">
            <title>
               <p>Additional data file 2</p>
            </title>
            <caption>
               <p>Example tandem mass spectra of phosphopeptides distinguishing between the <it>Sex lethal </it>isoforms</p>
            </caption>
            <text>
               <p>Part 1A shows the phosphopeptide GFGMSHS*LPSGMSR, which is unique to the <it>Sex lethal </it>isoforms CG18350-PD, CG18350-PL, CG18350-PI. Part 1B shows the phosphopeptide GFGMS*HSLPSGMDTEFSFPSSSSR, which is unique to the <it>Sex lethal </it>isoforms CG18350-PG, CG18350-PH, CG18350-PO, CG18350-PC, CG18350-PJ, CG18350-PN. P is the Peptide Prophet score and corresponds to a &lt;1% false positive rate. In addition, all fragment ion masses are shown and detected ions are highlighted in red.</p>
            </text>
            <file name="gb-2008-9-11-r162-S2.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S3">
            <title>
               <p>Additional data file 3</p>
            </title>
            <caption>
               <p>Detected ion masses for <it>Sex lethal </it>isoforms PD, PI and PL</p>
            </caption>
            <text>
               <p>Fragment ion masses for the phosphopeptide GFGMSHS*LPSGMSR, which is unique to the <it>Sex lethal </it>isoforms CG18350-PD, CG18350-PL, CG18350-PI, are shown in tabular form. Detected ions are highlighted in red.</p>
            </text>
            <file name="gb-2008-9-11-r162-S3.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
         <suppl id="S4">
            <title>
               <p>Additional data file 4</p>
            </title>
            <caption>
               <p>Detected ion masses for <it>Sex lethal </it>isoforms PC, PG, PH, PJ, PN and PO</p>
            </caption>
            <text>
               <p>Fragment ion masses for the phosphopeptide GFGMS*HSLPSGMDTEFSFPSSSSR, which is unique to the <it>Sex lethal </it>isoforms CG18350-PG, CG18350-PH, CG18350-PO, CG18350-PC, CG18350-PJ, CG18350-PN, are shown in tabular form. Detected ions are highlighted in red.</p>
            </text>
            <file name="gb-2008-9-11-r162-S4.pdf">
               <p>Click here for file</p>
            </file>
         </suppl>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>This paper was financed by the BioSapiens Network of Excellence (grant number LSHG-CT-2003-503265), by Consolider BSC (grant number CSD2007-00050) and by the National Institute of Bioinformatics (INB), a platform of 'Genoma Espa&#241;a'. Bernd Bodenmiller is the recipient of a fellowship by the Boehringer Ingelheim Foundation.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Alternative pre-mRNA splicing: the logic of combinatorial control.</p>
            </title>
            <aug>
               <au>
                  <snm>Smith</snm>
                  <fnm>CWJ</fnm>
               </au>
               <au>
                  <snm>Valcarcel</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Trends Biochem Sci</source>
            <pubdate>2000</pubdate>
            <volume>25</volume>
            <fpage>381</fpage>
            <lpage>388</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0968-0004(00)01604-2</pubid>
                  <pubid idtype="pmpid" link="fulltext">10916158</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Protein diversity from alternative splicing: a challenge for bioinformatics and post-genome biology.</p>
            </title>
            <aug>
               <au>
                  <snm>Black</snm>
                  <fnm>DL</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>2000</pubdate>
            <volume>103</volume>
            <fpage>367</fpage>
            <lpage>370</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0092-8674(00)00128-8</pubid>
                  <pubid idtype="pmpid" link="fulltext">11081623</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>GENCODE: producing a reference annotation for ENCODE.</p>
            </title>
            <aug>
               <au>
                  <snm>Harrow</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Denoeud</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Frankish</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Reymond</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>CK</fnm>
               </au>
               <au>
                  <snm>Chrast</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Lagarde</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Gilbert</snm>
                  <fnm>JG</fnm>
               </au>
               <au>
                  <snm>Storey</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Swarbreck</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Rossier</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Ucla</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Hubbard</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Antonarakis</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Guigo</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2006</pubdate>
            <volume>7</volume>
            <fpage>S4</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1810553</pubid>
                  <pubid idtype="pmpid" link="fulltext">16925838</pubid>
                  <pubid idtype="doi">10.1186/gb-2006-7-s1-s4</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Genome-wide survey of human alternative pre-mRNA splicing with exon junction microarrays.</p>
            </title>
            <aug>
               <au>
                  <snm>Johnson</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Castle</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Garrett-Engele</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Kan</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Loerch</snm>
                  <fnm>PM</fnm>
               </au>
               <au>
                  <snm>Armour</snm>
                  <fnm>CD</fnm>
               </au>
               <au>
                  <snm>Santos</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Schadt</snm>
                  <fnm>EE</fnm>
               </au>
               <au>
                  <snm>Stoughton</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Shoemaker</snm>
                  <fnm>DD</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2003</pubdate>
            <volume>302</volume>
            <fpage>2141</fpage>
            <lpage>2144</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1090100</pubid>
                  <pubid idtype="pmpid" link="fulltext">14684825</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>The finished DNA sequence of human chromosome 12.</p>
            </title>
            <aug>
               <au>
                  <snm>Scherer</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Muzny</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Buhay</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Cree</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Ding</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Dugan-Rocha</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Gill</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Gunaratne</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Harris</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Hawes</snm>
                  <fnm>AC</fnm>
               </au>
               <au>
                  <snm>Hernandez</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Hodgson</snm>
                  <fnm>AV</fnm>
               </au>
               <au>
                  <snm>Hume</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Jackson</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Khan</snm>
                  <fnm>ZM</fnm>
               </au>
               <au>
                  <snm>Kovar-Smith</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Lewis</snm>
                  <fnm>LR</fnm>
               </au>
               <au>
                  <snm>Lozado</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Metzker</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Milosavljevic</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Miner</snm>
                  <fnm>GR</fnm>
               </au>
               <au>
                  <snm>Montgomery</snm>
                  <fnm>KT</fnm>
               </au>
               <au>
                  <snm>Morgan</snm>
                  <fnm>MB</fnm>
               </au>
               <au>
                  <snm>Nazareth</snm>
                  <fnm>LV</fnm>
               </au>
               <au>
                  <snm>Scott</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Sodergren</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Song</snm>
                  <fnm>XZ</fnm>
               </au>
               <au>
                  <snm>Steffen</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Lovering</snm>
                  <fnm>RC</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nature</source>
            <pubdate>2006</pubdate>
            <volume>440</volume>
            <fpage>346</fpage>
            <lpage>351</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature04569</pubid>
                  <pubid idtype="pmpid" link="fulltext">16541075</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>A gene expression map for the euchromatic genome of <it>Drosophila melanogaster</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Stolc</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Gauhar</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Mason</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Halasz</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>van Batenburg</snm>
                  <fnm>MF</fnm>
               </au>
               <au>
                  <snm>Rifkin</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Hua</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Herreman</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Tongprasit</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Barbano</snm>
                  <fnm>PE</fnm>
               </au>
               <au>
                  <snm>Bussemaker</snm>
                  <fnm>HJ</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>KP</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2004</pubdate>
            <volume>306</volume>
            <fpage>655</fpage>
            <lpage>660</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1101312</pubid>
                  <pubid idtype="pmpid" link="fulltext">15499012</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Alternative splicing: increasing diversity in the proteomic world.</p>
            </title>
            <aug>
               <au>
                  <snm>Graveley</snm>
                  <fnm>BR</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2001</pubdate>
            <volume>17</volume>
            <fpage>100</fpage>
            <lpage>107</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(00)02176-4</pubid>
                  <pubid idtype="pmpid" link="fulltext">11173120</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Alternative pre-mRNA splicing in the human system: unexpected role of repetitive sequences as regulatory elements.</p>
            </title>
            <aug>
               <au>
                  <snm>Hui</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Bindereif</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Biol Chem</source>
            <pubdate>2005</pubdate>
            <volume>386</volume>
            <fpage>1265</fpage>
            <lpage>1271</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1515/BC.2005.143</pubid>
                  <pubid idtype="pmpid" link="fulltext">16336120</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Function of alternative splicing.</p>
            </title>
            <aug>
               <au>
                  <snm>Stamm</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Ben-Ari</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Rafalska</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Tang</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Toiber</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Thanaraj</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Soreq</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Gene</source>
            <pubdate>2005</pubdate>
            <volume>344</volume>
            <fpage>1</fpage>
            <lpage>20</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.gene.2004.10.022</pubid>
                  <pubid idtype="pmpid" link="fulltext">15656968</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Alternative splicing and RNA selection pressure - evolutionary consequences for eukaryotic genomes.</p>
            </title>
            <aug>
               <au>
                  <snm>Xing</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Nat Rev Genet</source>
            <pubdate>2006</pubdate>
            <volume>7</volume>
            <fpage>499</fpage>
            <lpage>510</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nrg1896</pubid>
                  <pubid idtype="pmpid" link="fulltext">16770337</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>The implications of alternative splicing in the ENCODE protein complement.</p>
            </title>
            <aug>
               <au>
                  <snm>Tress</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Martelli</snm>
                  <fnm>PL</fnm>
               </au>
               <au>
                  <snm>Frankish</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Reeves</snm>
                  <fnm>GA</fnm>
               </au>
               <au>
                  <snm>Wesselink</snm>
                  <fnm>JJ</fnm>
               </au>
               <au>
                  <snm>Yeats</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Olason</snm>
                  <fnm>PL</fnm>
               </au>
               <au>
                  <snm>Albrecht</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hegyi</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Giorgetti</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Raimondo</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Lagarde</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Laskowski</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>L&#243;pez</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Sadowski</snm>
                  <fnm>MI</fnm>
               </au>
               <au>
                  <snm>Watson</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Fariselli</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Rossi</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Nagy</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kai</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>St&#248;rling</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Orsini</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Assenov</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Blankenburg</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Huthmacher</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Ram&#237;rez</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Schlicker</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Denoeud</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Jones</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Kerrien</snm>
                  <fnm>S</fnm>
               </au>
               <etal/>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2007</pubdate>
            <volume>104</volume>
            <fpage>5495</fpage>
            <lpage>5500</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1838448</pubid>
                  <pubid idtype="pmpid" link="fulltext">17372197</pubid>
                  <pubid idtype="doi">10.1073/pnas.0700800104</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>The (in)dependence of alternative splicing and gene duplication.</p>
            </title>
            <aug>
               <au>
                  <snm>Talavera</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Vogel</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Orozco</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Teichmann</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>de la Cruz</snm>
                  <fnm>X</fnm>
               </au>
            </aug>
            <source>PLoS Comput Biol</source>
            <pubdate>2007</pubdate>
            <volume>3</volume>
            <fpage>e33</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">17335345</pubid>
                  <pubid idtype="doi">10.1371/journal.pcbi.0030033</pubid>
                  <pubid idtype="pmcid">1808492</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Alternative splicing and protein structure evolution.</p>
            </title>
            <aug>
               <au>
                  <snm>Birzele</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Csaba</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Zimmer</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2008</pubdate>
            <volume>36</volume>
            <fpage>550</fpage>
            <lpage>558</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">18055499</pubid>
                  <pubid idtype="pmcid">2241867</pubid>
                  <pubid idtype="doi">10.1093/nar/gkm1054</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p><it>Drosophila</it> Dscam is an axon guidance receptor exhibiting extraordinary molecular diversity.</p>
            </title>
            <aug>
               <au>
                  <snm>Schmucker</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Clemens</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Shu</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Worby</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Xiao</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Muda</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Dixon</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Zipursky</snm>
                  <fnm>SL</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>2000</pubdate>
            <volume>101</volume>
            <fpage>671</fpage>
            <lpage>684</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0092-8674(00)80878-8</pubid>
                  <pubid idtype="pmpid" link="fulltext">10892653</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>The molecular diversity of Dscam is functionally required for neuronal wiring specificity in <it>Drosophila</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Chen</snm>
                  <fnm>BE</fnm>
               </au>
               <au>
                  <snm>Kondo</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Garnier</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Watson</snm>
                  <fnm>FL</fnm>
               </au>
               <au>
                  <snm>P&#252;ttmann-Holgado</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Lamar</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Schmucker</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Cell</source>
            <pubdate>2006</pubdate>
            <volume>125</volume>
            <fpage>607</fpage>
            <lpage>620</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.cell.2006.03.034</pubid>
                  <pubid idtype="pmpid" link="fulltext">16678102</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Structural basis of Dscam isoform specificity.</p>
            </title>
            <aug>
               <au>
                  <snm>Meijers</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Puettmann-Holgado</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Skiniotis</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Walz</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Schmucker</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2007</pubdate>
            <volume>449</volume>
            <fpage>487</fpage>
            <lpage>491</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature06147</pubid>
                  <pubid idtype="pmpid" link="fulltext">17721508</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Vive la diff&#233;rence: males vs females in flies vs worms.</p>
            </title>
            <aug>
               <au>
                  <snm>Cline</snm>
                  <fnm>TW</fnm>
               </au>
               <au>
                  <snm>Meyer</snm>
                  <fnm>BJ</fnm>
               </au>
            </aug>
            <source>Annu Rev Genet</source>
            <pubdate>1996</pubdate>
            <volume>30</volume>
            <fpage>637</fpage>
            <lpage>702</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">8982468</pubid>
                  <pubid idtype="doi">10.1146/annurev.genet.30.1.637</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Sex determination: controlling the master.</p>
            </title>
            <aug>
               <au>
                  <snm>Harrison</snm>
                  <fnm>DA</fnm>
               </au>
            </aug>
            <source>Curr Biol</source>
            <pubdate>2007</pubdate>
            <volume>17</volume>
            <fpage>R328</fpage>
            <lpage>R330</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.cub.2007.03.012</pubid>
                  <pubid idtype="pmpid" link="fulltext">17470347</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Improving gene annotation using peptide mass spectrometry.</p>
            </title>
            <aug>
               <au>
                  <snm>Tanner</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Shen</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Ng</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Florea</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Guig&#243;</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Briggs</snm>
                  <fnm>SP</fnm>
               </au>
               <au>
                  <snm>Bafna</snm>
                  <fnm>V</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2007</pubdate>
            <volume>17</volume>
            <fpage>231</fpage>
            <lpage>239</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1781355</pubid>
                  <pubid idtype="pmpid" link="fulltext">17189379</pubid>
                  <pubid idtype="doi">10.1101/gr.5646507</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>The PeptideAtlas project.</p>
            </title>
            <aug>
               <au>
                  <snm>Desiere</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Deutsch</snm>
                  <fnm>EW</fnm>
               </au>
               <au>
                  <snm>King</snm>
                  <fnm>NL</fnm>
               </au>
               <au>
                  <snm>Nesvizhskii</snm>
                  <fnm>AI</fnm>
               </au>
               <au>
                  <snm>Mallick</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Eng</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Eddes</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Loevenich</snm>
                  <fnm>SN</fnm>
               </au>
               <au>
                  <snm>Aebersold</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <volume>34</volume>
            <fpage>D655</fpage>
            <lpage>D658</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1347403</pubid>
                  <pubid idtype="pmpid" link="fulltext">16381952</pubid>
                  <pubid idtype="doi">10.1093/nar/gkj040</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>A high-quality catalog of the <it>Drosophila melanogaster</it> proteome.</p>
            </title>
            <aug>
               <au>
                  <snm>Brunner</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Ahrens</snm>
                  <fnm>CH</fnm>
               </au>
               <au>
                  <snm>Mohanty</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Baetschmann</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Loevenich</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Potthast</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Deutsch</snm>
                  <fnm>EW</fnm>
               </au>
               <au>
                  <snm>Panse</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>de Lichtenberg</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Rinner</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Pedrioli</snm>
                  <fnm>PG</fnm>
               </au>
               <au>
                  <snm>Malmstrom</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Koehler</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Schrimpf</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Krijgsveld</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Kregenow</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Heck</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Hafen</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Schlapbach</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Aebersold</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Nat Biotechnol</source>
            <pubdate>2007</pubdate>
            <volume>25</volume>
            <fpage>576</fpage>
            <lpage>583</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nbt1300</pubid>
                  <pubid idtype="pmpid" link="fulltext">17450130</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>PhosphoPep - a phosphoproteome resource for systems biology research in <it>Drosophila</it> Kc167 cells.</p>
            </title>
            <aug>
               <au>
                  <snm>Bodenmiller</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Malmstrom</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Gerrits</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Campbell</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Lam</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Schmidt</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Rinner</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Mueller</snm>
                  <fnm>LN</fnm>
               </au>
               <au>
                  <snm>Shannon</snm>
                  <fnm>PT</fnm>
               </au>
               <au>
                  <snm>Pedrioli</snm>
                  <fnm>PG</fnm>
               </au>
               <au>
                  <snm>Panse</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>HK</fnm>
               </au>
               <au>
                  <snm>Schlapbach</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Aebersold</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Mol Syst Biol</source>
            <pubdate>2007</pubdate>
            <volume>3</volume>
            <fpage>139</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2063582</pubid>
                  <pubid idtype="pmpid" link="fulltext">17940529</pubid>
                  <pubid idtype="doi">10.1038/msb4100182</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>FlyBase: integration and improvements to query tools.</p>
            </title>
            <aug>
               <au>
                  <snm>Wilson</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Goodman</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Strelets</snm>
                  <fnm>VB</fnm>
               </au>
               <au>
                  <cnm>the FlyBase Consortium</cnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2008</pubdate>
            <volume>36</volume>
            <fpage>D588</fpage>
            <lpage>D593</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2238994</pubid>
                  <pubid idtype="pmpid" link="fulltext">18160408</pubid>
                  <pubid idtype="doi">10.1093/nar/gkm930</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>A developmentally regulated splice variant from the complex <it>lola</it> locus encoding multiple different zinc finger domain proteins interacts with the chromosomal kinase JIL-1.</p>
            </title>
            <aug>
               <au>
                  <snm>Zhang</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Long</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Girton</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Johansen</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Johansen</snm>
                  <fnm>KM</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2003</pubdate>
            <volume>278</volume>
            <fpage>11696</fpage>
            <lpage>11704</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.M213269200</pubid>
                  <pubid idtype="pmpid" link="fulltext">12538650</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p><it>SNF4Agamma</it>, the <it>Drosophila</it> AMPK gamma subunit is required for regulation of developmental and stress-induced autophagy.</p>
            </title>
            <aug>
               <au>
                  <snm>Lippai</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Csik&#243;s</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Mar&#243;y</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Luk&#225;csovich</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Juh&#225;sz</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Sass</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Autophagy</source>
            <pubdate>2008</pubdate>
            <volume>4</volume>
            <fpage>476</fpage>
            <lpage>486</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">18285699</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Genetic and molecular complexity of the position effect variegation modifier <it>mod(mdg4)</it> in <it>Drosophila</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Buechner</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Roth</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Schotta</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Krauss</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Saumweber</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Reuter</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Dorn</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>2000</pubdate>
            <volume>155</volume>
            <fpage>141</fpage>
            <lpage>157</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1461079</pubid>
                  <pubid idtype="pmpid" link="fulltext">10790390</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Developmental expression and phylogenetic conservation of alternatively spliced forms of the carboxy-terminal binding protein corepressor.</p>
            </title>
            <aug>
               <au>
                  <snm>Mani-Telang</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Arnosti</snm>
                  <fnm>DN</fnm>
               </au>
            </aug>
            <source>Dev Genes Evol</source>
            <pubdate>2007</pubdate>
            <volume>217</volume>
            <fpage>127</fpage>
            <lpage>135</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1876751</pubid>
                  <pubid idtype="pmpid" link="fulltext">17120023</pubid>
                  <pubid idtype="doi">10.1007/s00427-006-0121-4</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Two genes become one: the genes encoding heterochromatin protein Su(var)3-9 and translation initiation factor subunit eIF-2gamma are joined to a dicistronic unit in holometabolic insects.</p>
            </title>
            <aug>
               <au>
                  <snm>Krauss</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Reuter</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Genetics</source>
            <pubdate>2000</pubdate>
            <volume>156</volume>
            <fpage>1157</fpage>
            <lpage>1167</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1461327</pubid>
                  <pubid idtype="pmpid" link="fulltext">11063691</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Localization of the <it>Drosophila</it> MAGUK protein Polychaetoid is controlled by alternative splicing.</p>
            </title>
            <aug>
               <au>
                  <snm>Wei</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Ellis</snm>
                  <fnm>HM</fnm>
               </au>
            </aug>
            <source>Mech Dev</source>
            <pubdate>2001</pubdate>
            <volume>100</volume>
            <fpage>217</fpage>
            <lpage>231</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0925-4773(00)00550-5</pubid>
                  <pubid idtype="pmpid" link="fulltext">11165479</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Heterogeneity within animal thioredoxin reductases. Evidence for alternative first exon splicing.</p>
            </title>
            <aug>
               <au>
                  <snm>Sun</snm>
                  <fnm>QA</fnm>
               </au>
               <au>
                  <snm>Zappacosta</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Factor</snm>
                  <fnm>VM</fnm>
               </au>
               <au>
                  <snm>Wirth</snm>
                  <fnm>PJ</fnm>
               </au>
               <au>
                  <snm>Hatfield</snm>
                  <fnm>DL</fnm>
               </au>
               <au>
                  <snm>Gladyshev</snm>
                  <fnm>VN</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2001</pubdate>
            <volume>276</volume>
            <fpage>3106</fpage>
            <lpage>3114</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.M004750200</pubid>
                  <pubid idtype="pmpid" link="fulltext">11060283</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Sequence complexity of disordered protein.</p>
            </title>
            <aug>
               <au>
                  <snm>Romero</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Obradovic</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Garner</snm>
                  <fnm>EC</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Dunker</snm>
                  <fnm>AK</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>2001</pubdate>
            <volume>42</volume>
            <fpage>38</fpage>
            <lpage>48</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/1097-0134(20010101)42:1&lt;38::AID-PROT50&gt;3.0.CO;2-3</pubid>
                  <pubid idtype="pmpid" link="fulltext">11093259</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>A practical overview of protein disorder prediction methods.</p>
            </title>
            <aug>
               <au>
                  <snm>Ferron</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Longhi</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Canard</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Karlin</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>2006</pubdate>
            <volume>65</volume>
            <fpage>1</fpage>
            <lpage>14</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/prot.21075</pubid>
                  <pubid idtype="pmpid" link="fulltext">16856179</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>A systematic comparative and structural analysis of protein phosphorylation sites based on the mtcPTM database.</p>
            </title>
            <aug>
               <au>
                  <snm>Jim&#233;nez</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Hegemann</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Hutchins</snm>
                  <fnm>JRA</fnm>
               </au>
               <au>
                  <snm>Peters</snm>
                  <fnm>J-M</fnm>
               </au>
               <au>
                  <snm>Durbin</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2007</pubdate>
            <volume>8</volume>
            <fpage>R90</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1929158</pubid>
                  <pubid idtype="pmpid" link="fulltext">17521420</pubid>
                  <pubid idtype="doi">10.1186/gb-2007-8-5-r90</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>The importance of intrinsic disorder for protein phosphorylation.</p>
            </title>
            <aug>
               <au>
                  <snm>Iakoucheva</snm>
                  <fnm>LM</fnm>
               </au>
               <au>
                  <snm>Radivojac</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>O'Connor</snm>
                  <fnm>TR</fnm>
               </au>
               <au>
                  <snm>Sikes</snm>
                  <fnm>JG</fnm>
               </au>
               <au>
                  <snm>Obradovic</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Dunker</snm>
                  <fnm>AK</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2004</pubdate>
            <volume>32</volume>
            <fpage>1037</fpage>
            <lpage>1049</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">373391</pubid>
                  <pubid idtype="pmpid" link="fulltext">14960716</pubid>
                  <pubid idtype="doi">10.1093/nar/gkh253</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Alternative splicing in concert with protein intrinsic disorder enables increased functional diversity in multicellular organisms.</p>
            </title>
            <aug>
               <au>
                  <snm>Romero</snm>
                  <fnm>PR</fnm>
               </au>
               <au>
                  <snm>Zaidi</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Fang</snm>
                  <fnm>YY</fnm>
               </au>
               <au>
                  <snm>Uversky</snm>
                  <fnm>VN</fnm>
               </au>
               <au>
                  <snm>Radivojac</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Oldfield</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Cortese</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Sickmeier</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>LeGall</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Obradovic</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Dunker</snm>
                  <fnm>AK</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci USA</source>
            <pubdate>2006</pubdate>
            <volume>103</volume>
            <fpage>8390</fpage>
            <lpage>8395</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1482503</pubid>
                  <pubid idtype="pmpid" link="fulltext">16717195</pubid>
                  <pubid idtype="doi">10.1073/pnas.0507916103</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <title>
               <p>The Protein Data Bank.</p>
            </title>
            <aug>
               <au>
                  <snm>Berman</snm>
                  <fnm>HM</fnm>
               </au>
               <au>
                  <snm>Westbrook</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Feng</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Gilliland</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Bhat</snm>
                  <fnm>TN</fnm>
               </au>
               <au>
                  <snm>Weissig</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Shindyalov</snm>
                  <fnm>IN</fnm>
               </au>
               <au>
                  <snm>Bourne</snm>
                  <fnm>PE</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2000</pubdate>
            <volume>28</volume>
            <fpage>235</fpage>
            <lpage>242</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">102472</pubid>
                  <pubid idtype="pmpid" link="fulltext">10592235</pubid>
                  <pubid idtype="doi">10.1093/nar/28.1.235</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B37">
            <title>
               <p>An integrated chemical, mass spectrometric and computational strategy for (quantitative) phosphoproteomics: application to <it>Drosophila melanogaster </it>Kc167 cells.</p>
            </title>
            <aug>
               <au>
                  <snm>Bodenmiller</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Mueller</snm>
                  <fnm>LN</fnm>
               </au>
               <au>
                  <snm>Pedrioli</snm>
                  <fnm>PG</fnm>
               </au>
               <au>
                  <snm>Pflieger</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>J&#252;nger</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Eng</snm>
                  <fnm>JK</fnm>
               </au>
               <au>
                  <snm>Aebersold</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Tao</snm>
                  <fnm>WA</fnm>
               </au>
            </aug>
            <source>Mol Biosyst</source>
            <pubdate>2007</pubdate>
            <volume>3</volume>
            <fpage>275</fpage>
            <lpage>286</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1039/b617545g</pubid>
                  <pubid idtype="pmpid" link="fulltext">17372656</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search.</p>
            </title>
            <aug>
               <au>
                  <snm>Keller</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Nesvizhskii</snm>
                  <fnm>AI</fnm>
               </au>
               <au>
                  <snm>Kolker</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Aebersold</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Anal Chem</source>
            <pubdate>2002</pubdate>
            <volume>74</volume>
            <fpage>5383</fpage>
            <lpage>5392</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1021/ac025747h</pubid>
                  <pubid idtype="pmpid">12403597</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>The PhosphoPep Database</p>
            </title>
            <url>http://www.phosphopep.org</url>
         </bibl>
         <bibl id="B40">
            <title>
               <p>Biological function of unannotated transcription during the early development of <it>Drosophila melanogaster</it>.</p>
            </title>
            <aug>
               <au>
                  <snm>Manak</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Dike</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Sementchenko</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Kapranov</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Biemar</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Long</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Cheng</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Bell</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Ghosh</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Piccolboni</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Gingeras</snm>
                  <fnm>TR</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2006</pubdate>
            <volume>38</volume>
            <fpage>1151</fpage>
            <lpage>1158</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng1875</pubid>
                  <pubid idtype="pmpid" link="fulltext">16951679</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
