This article is part of the supplement: Statistical mass spectrometry-based proteomics

Research (Open Access)

Statistical protein quantification and significance analysis in label-free LC-MS experiments with complex designs

Timothy Clough (1), Safia Thaminy (2,3), Susanne Ragg (4), Ruedi Aebersold (2,5) and Olga Vitek (1,6)*

Author Affiliations

1 Department of Statistics, Purdue University, West Lafayette, IN, USA

2 Department of Biology, Institute of Molecular Systems Biology, ETH Zürich, Switzerland

3 Institute for Systems Biology, Seattle, WA, USA

4 School of Medicine, Indiana University, Indianapolis, IN, USA

5 Faculty of Science, University of Zürich, Switzerland

6 Department of Computer Science, Purdue University, West Lafayette, IN, USA

BMC Bioinformatics 2012, 13(Suppl 16):S6  doi:10.1186/1471-2105-13-S16-S6

Published: 5 November 2012



Liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS) is widely used for quantitative proteomic investigations. The typical output of such studies is a list of identified and quantified peptides. The biological and clinical interest is, however, usually focused on quantitative conclusions at the protein level. Furthermore, many investigations ask complex biological questions by studying multiple interrelated experimental conditions. Therefore, there is a need in the field for generic statistical models to quantify protein levels even in complex study designs.


We propose a general statistical modeling approach for protein quantification in arbitrarily complex experimental designs, such as time course studies or studies involving multiple experimental factors. The approach summarizes the quantitative experimental information from all the features and all the conditions that pertain to a protein. It enables both protein significance analysis between conditions and protein quantification in individual samples or conditions. We implement the approach in MSstats, an open-source R-based software package suitable for researchers with limited statistics and programming background.
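To make the idea of summarizing all features and conditions of one protein concrete, here is a minimal Python sketch. It is not the MSstats implementation and does not use its API. It assumes a balanced, complete design and a purely additive fixed-effects model (log2 intensity = overall mean + feature effect + condition effect + error), whereas the approach described here uses linear mixed effects models that can include further terms. The function names, the data layout, and the example numbers are all hypothetical.

```python
from math import sqrt
from statistics import mean


def fit_additive_model(data):
    """Fit log2 intensity = grand mean + feature effect + condition effect.

    `data` maps (feature, condition) -> list of log2 intensities, one per run.
    Assumption: every feature is observed in every condition with the same
    number of runs, so least-squares estimates reduce to marginal means.
    """
    features = sorted({f for f, _ in data})
    conditions = sorted({c for _, c in data})
    grand = mean(y for ys in data.values() for y in ys)
    feat_eff = {
        f: mean(y for c in conditions for y in data[(f, c)]) - grand
        for f in features
    }
    cond_eff = {
        c: mean(y for f in features for y in data[(f, c)]) - grand
        for c in conditions
    }
    # Residual variance pooled over all features and conditions of the protein.
    resid_ss = sum(
        (y - (grand + feat_eff[f] + cond_eff[c])) ** 2
        for (f, c), ys in data.items()
        for y in ys
    )
    n_obs = sum(len(ys) for ys in data.values())
    df = n_obs - (1 + (len(features) - 1) + (len(conditions) - 1))
    return cond_eff, resid_ss / df, df


def compare_conditions(data, cond_a, cond_b):
    """Return (log2 fold change a - b, standard error, t statistic, df)."""
    cond_eff, sigma2, df = fit_additive_model(data)
    n_per_cond = sum(len(ys) for (_, c), ys in data.items() if c == cond_a)
    log2fc = cond_eff[cond_a] - cond_eff[cond_b]
    se = sqrt(sigma2 * 2 / n_per_cond)
    return log2fc, se, log2fc / se, df


# Hypothetical protein with two features, two conditions, two runs each.
example = {
    ("feature1", "healthy"): [10.0, 10.2],
    ("feature1", "disease"): [11.0, 11.2],
    ("feature2", "healthy"): [12.0, 12.2],
    ("feature2", "disease"): [13.0, 13.2],
}
log2fc, se, t_stat, df = compare_conditions(example, "disease", "healthy")
print(f"log2 fold change: {log2fc:.3f}, SE: {se:.3f}, t: {t_stat:.2f}, df: {df}")
```

Because the residual variance is pooled across every feature and run of the protein, the comparison between conditions draws on more degrees of freedom than an analysis of each feature alone would, which is the intuition behind modeling all features and conditions simultaneously.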


Using two experimental investigations with complex designs as examples, we demonstrate that simultaneous statistical modeling of all the relevant features and conditions yields higher sensitivity of protein significance analysis and higher accuracy of protein quantification than commonly employed alternatives. The software is available online.

Keywords: Label-free LC-MS/MS; linear mixed effects models; protein quantification; quantitative proteomics; statistical design of experiments