This article is part of the supplement: Proceedings of the 21st International Conference on Genome Informatics (GIW2010)

Open Access Research

The utility of mass spectrometry-based proteomic data for validation of novel alternative splice forms reconstructed from RNA-Seq data: a preliminary assessment

Kang Ning1,3 and Alexey I Nesvizhskii1,2*

Author Affiliations

1 Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA

2 Center for Computational Biology and Medicine, University of Michigan, Ann Arbor, MI 48109, USA

3 BioEnergy Genome Center, Qingdao Institute of BioEnergy and Bioprocess Technology, Chinese Academy of Sciences, China

BMC Bioinformatics 2010, 11(Suppl 11):S14  doi:10.1186/1471-2105-11-S11-S14

Published: 14 December 2010

Abstract

Background

Most mass spectrometry (MS)-based proteomic studies depend on searching acquired tandem mass (MS/MS) spectra against databases of known protein sequences. In these experiments, however, a large number of high-quality spectra remain unassigned. These spectra may correspond to novel peptides not present in the database, especially those arising from novel alternative splice (AS) forms. Recently, fast and comprehensive profiling of mammalian genomes using deep sequencing (i.e. RNA-Seq) has become possible. MS-based proteomics can potentially be used as an aid for protein-level validation of novel AS events observed in RNA-Seq data.

Results

In this work, we have used publicly available mouse tissue proteomic and RNA-Seq datasets and have examined the feasibility of using MS data for the identification of novel AS forms by searching MS/MS spectra against translated mRNA sequences derived from RNA-Seq data. A significant correlation between the likelihood of identifying a peptide from MS/MS data and the number of reads in RNA-Seq data for the same gene was observed. Based on in silico experiments, it was also observed that only a fraction of novel AS forms identified from RNA-Seq had the corresponding junction peptide compatible with MS/MS sequencing. The number of novel peptides that were actually identified from MS/MS spectra was substantially lower than the number expected based on in silico analysis.
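The in silico check for a junction peptide "compatible with MS/MS sequencing" can be illustrated with a minimal Python sketch. It is not the authors' pipeline; it assumes trypsin digestion with no missed cleavages (cleavage after K or R, except before P) and a peptide length window of 7 to 30 residues as a stand-in for MS/MS detectability. The function names, the length window, and the example sequence are all hypothetical.

```python
def tryptic_digest(protein):
    """Fully digest a protein with trypsin rules (assumed: cleave after
    K or R unless the next residue is P; no missed cleavages).
    Returns (start, end, peptide) tuples with 0-based, end-exclusive indices."""
    peptides = []
    start = 0
    for i, aa in enumerate(protein):
        at_end = i == len(protein) - 1
        if at_end or (aa in "KR" and protein[i + 1] != "P"):
            peptides.append((start, i + 1, protein[start:i + 1]))
            start = i + 1
    return peptides


def junction_peptides(protein, junction_pos, min_len=7, max_len=30):
    """Return tryptic peptides that span a splice junction.

    junction_pos is the 0-based index of the first residue encoded after
    the junction. A peptide spans the junction if it contains residues on
    both sides of it. The length window is an assumed proxy for peptides
    that are practical to identify by MS/MS."""
    return [
        pep
        for start, end, pep in tryptic_digest(protein)
        if start < junction_pos < end and min_len <= len(pep) <= max_len
    ]


if __name__ == "__main__":
    # Hypothetical translated sequence of a novel AS form.
    translated = "MASKLLGGAVTEEPRDDK"
    print(junction_peptides(translated, 9))   # junction inside a long peptide
    print(junction_peptides(translated, 4))   # junction at a cleavage site
    print(junction_peptides(translated, 16))  # spanning peptide too short
```

Under these assumptions, a junction that coincides with a cleavage site, or that falls in a very short or very long tryptic peptide, yields no usable junction peptide, which is one way only a fraction of novel AS forms would be expected to be confirmable by MS/MS.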

Conclusions

The ability to confirm novel AS forms from MS/MS data in the dataset analyzed was found to be quite limited. This can be explained in part by low abundance of many novel transcripts, with the abundance of their corresponding protein products falling below the limit of detection by MS.