This article is part of the supplement: Selected articles from the International Conference on Intelligent Biology and Medicine (ICIBM 2013): Genomics

Open Access Research

Identification of gene fusions from human lung cancer mass spectrometry data

Han Sun1,2, Xiaobin Xing3, Jing Li2, Fengli Zhou4, Yunqin Chen2, Ying He1,2, Wei Li2, Guangwu Wei1, Xiao Chang5, Jia Jia2, Yixue Li1,2* and Lu Xie1,2*

Author Affiliations

1 Key Laboratory of Systems Biology, Shanghai Institutes for Biological Science, Chinese Academy of Sciences, Shanghai, 200031, China

2 Shanghai Center for Bioinformation Technology, Shanghai Academy of Science and Technology, Shanghai, 201203, China

3 Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, 69117, Germany

4 Department of Respiration, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, 510630, China

5 Department of Pediatrics, Division of Human Genetics, The Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA

BMC Genomics 2013, 14(Suppl 8):S5 doi:10.1186/1471-2164-14-S8-S5

Published: 9 December 2013



Tandem mass spectrometry (MS/MS) has been applied to identify proteins as an ultimate approach to confirming the original genome annotation. Identifying gene fusion proteins requires a special database containing peptides that span gene fusion breakpoints.


It is impractical to construct a database that includes all possible fusion peptides originating from potential breakpoints. Focusing on 6259 reported and predicted gene fusion pairs from ChimerDB 2.0 and the Cancer Gene Census, we created, for the first time, a database, CanProFu, that comprehensively annotates the fusion peptides formed by exon-exon linkage between the paired genes.
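To illustrate what exon-exon linkage means in practice, the following Python sketch joins the leading coding exons of an upstream gene to the trailing exons of a downstream gene, translates each combination, and keeps the tryptic peptide that spans the junction. It is not the CanProFu pipeline. The function names, the toy sequences, the assumption that the upstream exons start in frame at the start codon, and the simple rule of cleaving after K or R with no missed cleavages are all hypothetical choices made for this example.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(nt):
    """Translate an uppercase A/C/G/T string in frame 0 up to the first stop codon."""
    protein = []
    for i in range(0, len(nt) - 2, 3):
        aa = CODON_TABLE[nt[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def junction_peptide(upstream_cds, downstream_cds):
    """Return the tryptic peptide spanning the fusion junction, or None.

    Assumption: trypsin cleaves after every K or R, with no missed cleavages.
    """
    protein = translate(upstream_cds + downstream_cds)
    # Index of the first residue that uses at least one downstream nucleotide.
    junction = len(upstream_cds) // 3
    if junction == 0 or junction >= len(protein):
        return None  # nothing upstream, or a stop codon before the junction
    if protein[junction - 1] in "KR":
        return None  # cleavage falls exactly on the junction
    start = junction - 1
    while start > 0 and protein[start - 1] not in "KR":
        start -= 1
    end = junction
    while end < len(protein) - 1 and protein[end] not in "KR":
        end += 1
    return protein[start:end + 1]


def fusion_peptides(upstream_exons, downstream_exons):
    """Enumerate exon-exon linkages between two genes.

    Keys are (last upstream exon kept, first downstream exon kept), 1-based.
    """
    peptides = {}
    for i in range(1, len(upstream_exons) + 1):
        for j in range(len(downstream_exons)):
            pep = junction_peptide("".join(upstream_exons[:i]),
                                   "".join(downstream_exons[j:]))
            if pep is not None:
                peptides[(i, j + 1)] = pep
    return peptides


# Toy coding exons, invented for illustration only.
upstream_exons = ["ATGGCTAAA", "GGTGCT"]
downstream_exons = ["GATCTTCGT", "TGGTAA"]
result = fusion_peptides(upstream_exons, downstream_exons)
print(result)
```

Restricting the enumeration to exon boundaries of known or predicted fusion pairs keeps the search space small, which is the point of the design described above: the number of candidate peptides grows with the product of the exon counts of each pair rather than with every possible nucleotide breakpoint.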


Applying this database to mass spectrometry datasets of 40 human non-small cell lung cancer (NSCLC) samples and 39 normal lung samples with stringent search criteria, we identified 19 unique fusion peptides characterizing gene fusion events. Among them, 11 gene fusion events were found only in NSCLC samples. In addition, 4 alternative splicing events were characterized in cancerous or normal lung samples.


The database and workflow in this work can be flexibly applied to other MS/MS-based human cancer experiments to detect gene fusions as potential disease biomarkers or drug targets.