This article is part of the supplement: Statistical mass spectrometry-based proteomics

Open Access Review

Computational approaches to protein inference in shotgun proteomics

Yong Fuga Li and Predrag Radivojac*

Author affiliations

School of Informatics and Computing, Indiana University, Bloomington, 150 S. Woodlawn Avenue, Bloomington, Indiana 47405, USA

Citation and License

BMC Bioinformatics 2012, 13(Suppl 16):S4  doi:10.1186/1471-2105-13-S16-S4

Published: 5 November 2012

Abstract

Shotgun proteomics has recently emerged as a powerful approach to characterizing proteomes in biological samples. Its overall objective is to identify the form and quantity of each protein in a high-throughput manner by coupling liquid chromatography with tandem mass spectrometry. As a consequence of its high-throughput nature, shotgun proteomics faces challenges with respect to the analysis and interpretation of experimental data. Among such challenges, the identification of proteins present in a sample has been recognized as an important computational task. This task generally consists of (1) assigning experimental tandem mass spectra to peptides derived from a protein database, and (2) mapping assigned peptides to proteins and quantifying the confidence of identified proteins. Protein identification is fundamentally a statistical inference problem, and a number of methods have been proposed to address its challenges. In this review we categorize current approaches into rule-based, combinatorial optimization and probabilistic inference techniques, and present them using integer programming and Bayesian inference frameworks. We also discuss the main challenges of protein identification and propose potential solutions with the goal of spurring innovative research in this area.
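
To make step (2) concrete, the following is a minimal sketch of one combinatorial view of the problem: choosing a small set of proteins that together explain all identified peptides. It is an illustration only, not a method taken from this review. It uses a greedy heuristic for set cover, which approximates rather than solves the corresponding integer program, and it ignores peptide confidence scores. The function name `greedy_protein_cover` and the toy protein and peptide labels are hypothetical.

```python
from typing import Dict, List, Set


def greedy_protein_cover(
    protein_to_peptides: Dict[str, Set[str]],
    identified_peptides: Set[str],
) -> List[str]:
    """Greedily pick proteins until every identified peptide is explained.

    Assumption: each protein is represented only by the set of peptides it
    can produce; scores and probabilities are deliberately left out.
    """
    uncovered = set(identified_peptides)
    chosen: List[str] = []
    while uncovered:
        # Pick the protein explaining the most still-uncovered peptides.
        # Iterating over sorted names makes tie-breaking deterministic.
        best = max(
            sorted(protein_to_peptides),
            key=lambda p: len(protein_to_peptides[p] & uncovered),
        )
        gained = protein_to_peptides[best] & uncovered
        if not gained:
            # Remaining peptides map to no protein in the database.
            break
        chosen.append(best)
        uncovered -= gained
    return chosen


# Toy data (hypothetical): P2's peptides are a subset of P1's.
database = {
    "P1": {"pepA", "pepB", "pepC"},
    "P2": {"pepB", "pepC"},
    "P3": {"pepD"},
}
observed = {"pepA", "pepB", "pepC", "pepD"}
result = greedy_protein_cover(database, observed)
print(result)
```

In this toy case P2 is never selected because every peptide it could explain is already explained by P1, which illustrates why shared peptides make protein-level conclusions ambiguous.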