Open Access Methodology article

A compatible exon-exon junction database for the identification of exon skipping events using tandem mass spectrum data

Fan Mo*1, Xu Hong*1, Feng Gao2, Lin Du3, Jun Wang1, Gilbert S Omenn4,5 and Biaoyang Lin1

1Systems Biology Division, Zhejiang-California Nanosystems Institute (ZCNI) of Zhejiang University, Zhejiang University Huajiachi Campus, 268 Kaixuan Road, Hangzhou 310029, PR China

2Department of General Surgery, The Second Affiliated Hospital, ShanXi Medical University, 382 Wuyi Road, Taiyuan 030000, PR China

3College of Life Science, Zhejiang University Zijingang Campus, Zijinhua Road, Hangzhou 310058, PR China

4Center for Computational Medicine and Biology, National Center for Integrative Biomedical Informatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109-2218, USA

5Departments of Internal Medicine and Human Genetics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109-2218, USA

* Contributed equally

BMC Bioinformatics 2008, 9:537 doi:10.1186/1471-2105-9-537

Published: 16 December 2008

Abstract

Background

Alternative splicing is an important mechanism of gene regulation. An estimated 74% of multi-exon human genes undergo alternative splicing. High-throughput tandem mass spectrometry (MS/MS) provides valuable information for rapidly identifying potentially novel alternatively spliced protein products from experimental datasets. However, the ability to identify alternative splicing events through tandem mass spectrometry depends on the database against which the spectra are searched.

Results

We wrote scripts using Perl, BioPerl, MySQL and the Ensembl API, and used them to build a theoretical exon-exon junction protein database from the Ensembl Core Database. The database accounts for all possible combinations of exons for a gene while preserving the frame of translation (i.e., keeping only in-phase exon-exon combinations). Using our liver cancer MS/MS dataset, we identified a total of 488 non-redundant peptides that represent putative exon skipping events.
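As a rough illustration of the in-phase filtering idea, the following Python sketch enumerates exon pairs for a single gene and keeps only those whose phases are compatible. It is not the authors' Perl implementation. It assumes the Ensembl convention in which each exon carries a start phase and an end phase (0, 1 or 2, with -1 for non-coding ends), and it assumes that a junction is "in-phase" when the end phase of the upstream exon equals the start phase of the downstream exon. The `Exon` class, the `in_phase_junctions` function and the example exons are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Exon:
    """Hypothetical minimal exon record (not an Ensembl API object)."""
    exon_id: str
    phase: int      # start phase: 0, 1, 2, or -1 if the exon start is non-coding
    end_phase: int  # end phase: 0, 1, 2, or -1 if the exon end is non-coding


def in_phase_junctions(exons):
    """Return (upstream_id, downstream_id) pairs that keep the reading frame.

    `exons` must be listed in 5'-to-3' order. Every ordered pair is considered,
    so non-adjacent pairs correspond to candidate exon skipping events.
    Assumption: a pair is in-phase when upstream.end_phase == downstream.phase.
    """
    pairs = []
    for upstream, downstream in combinations(exons, 2):
        # Skip junctions that involve a non-coding exon boundary.
        if upstream.end_phase == -1 or downstream.phase == -1:
            continue
        if upstream.end_phase == downstream.phase:
            pairs.append((upstream.exon_id, downstream.exon_id))
    return pairs


# Made-up four-exon gene, used only to exercise the function.
exons = [
    Exon("E1", 0, 1),
    Exon("E2", 1, 0),
    Exon("E3", 0, 1),
    Exon("E4", 1, 2),
]
junctions = in_phase_junctions(exons)
for upstream_id, downstream_id in junctions:
    print(upstream_id, downstream_id)
```

In this made-up gene, the pair E1-E4 is the only non-adjacent in-phase combination, so it is the only junction that would represent a candidate exon skipping event. A full database build would additionally translate the sequence spanning each retained junction into peptide entries.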

Conclusion

Our exon-exon junction database provides the scientific community with an efficient means to identify novel alternatively spliced (exon skipping) protein isoforms using mass spectrometry data. This database will be useful in annotating genome structures using rapidly accumulating proteomics data.

