Open Access Highly Accessed Methodology article

Isotope pattern deconvolution for peptide mass spectrometry by non-negative least squares/least absolute deviation template matching

Martin Slawski1*, Rene Hussong2,3, Andreas Tholey4, Thomas Jakoby4, Barbara Gregorius2,4, Andreas Hildebrandt2,5 and Matthias Hein1

Author Affiliations

1 Department of Computer Science, Saarland University, Saarbrücken, Germany

2 Center for Bioinformatics, Saarland University, Saarbrücken, Germany

3 Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Esch-sur-Alzette, Luxembourg

4 Division for Systematic Proteome Research, Institute for Experimental Medicine, Kiel, Germany

5 Institut für Informatik, Johannes-Gutenberg-Universität, Mainz, Germany

BMC Bioinformatics 2012, 13:291  doi:10.1186/1471-2105-13-291

Published: 8 November 2012

Abstract

Background

The robust identification of isotope patterns originating from peptides analyzed by mass spectrometry (MS) is often significantly hampered by noise artifacts and by the interference of overlapping patterns arising, for example, from post-translational modifications. Because the classification of recorded data points as either ‘noise’ or ‘signal’ lies at the root of essentially every proteomic application, the quality of the automated processing of mass spectra can significantly influence how the data are interpreted within a given biological context.

Results

We propose non-negative least squares/non-negative least absolute deviation regression to fit a raw spectrum by templates imitating isotope patterns. In a carefully designed validation scheme, we show that the method exhibits excellent performance in pattern picking. It is demonstrated that the method is able to disentangle complicated overlaps of patterns.
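To make the idea of fitting a spectrum with non-negative template amplitudes concrete, the following is a minimal sketch and not the IPPD implementation. The Gaussian peak shape, the fixed isotope heights, the candidate list, and the function names `isotope_template` and `nnls_projected_gradient` are all illustrative assumptions. The solver is plain projected gradient descent, chosen only to keep the example dependency-free; in practice a dedicated routine such as `scipy.optimize.nnls` would be the more usual choice.

```python
import math

def isotope_template(mz_grid, position, charge, heights=(1.0, 0.6, 0.25), width=0.05):
    """Hypothetical template: Gaussian peaks spaced 1/charge apart.

    The heights are made-up values; real isotope distributions depend on mass.
    """
    spacing = 1.0 / charge
    return [
        sum(h * math.exp(-0.5 * ((mz - position - k * spacing) / width) ** 2)
            for k, h in enumerate(heights))
        for mz in mz_grid
    ]

def nnls_projected_gradient(templates, spectrum, n_iter=2000):
    """Minimise ||spectrum - sum_j beta_j * templates[j]||^2 subject to beta >= 0."""
    p = len(templates)
    # Gram matrix of the templates and their correlations with the spectrum.
    gram = [[sum(a * b for a, b in zip(ti, tj)) for tj in templates] for ti in templates]
    corr = [sum(a * b for a, b in zip(ti, spectrum)) for ti in templates]
    # The trace bounds the largest eigenvalue, so 1/trace is a safe step size.
    step = 1.0 / sum(gram[j][j] for j in range(p))
    beta = [0.0] * p
    for _ in range(n_iter):
        grad = [sum(gram[i][j] * beta[j] for j in range(p)) - corr[i] for i in range(p)]
        # Gradient step followed by projection onto the non-negative orthant.
        beta = [max(0.0, b - step * g) for b, g in zip(beta, grad)]
    return beta

# Synthetic, noise-free spectrum: two overlapping charge-1 patterns one unit apart.
mz_grid = [999.0 + 0.02 * i for i in range(251)]
candidates = [(1000.0, 1), (1001.0, 1), (1000.0, 2)]  # (position, charge)
templates = [isotope_template(mz_grid, pos, z) for pos, z in candidates]
spectrum = [3.0 * a + 1.5 * b for a, b in zip(templates[0], templates[1])]

beta = nnls_projected_gradient(templates, spectrum)
for (pos, z), b in zip(candidates, beta):
    print(f"position={pos:.1f} charge={z} amplitude={b:.3f}")
```

In this toy setting the two charge-1 templates receive amplitudes close to 3.0 and 1.5, while the competing charge-2 template, whose peaks partly coincide with the true patterns, is driven to zero by the fit and the non-negativity constraint.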

Conclusions

We find that regularization is not necessary to prevent overfitting and that thresholding is an effective and user-friendly way to perform feature selection. The proposed method avoids problems inherent in regularization-based approaches, comes with a set of well-interpretable parameters whose default configuration is shown to generalize well without the need for fine-tuning, and is applicable to spectra of different platforms. The R package IPPD implements the method and is available from the Bioconductor platform (http://bioconductor.fhcrc.org/help/bioc-views/devel/bioc/html/IPPD.html).
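Thresholding as a feature-selection step can be illustrated with a short sketch. The function name `threshold_coefficients`, the amplitude values, and the threshold of 0.5 are hypothetical; how the threshold is actually chosen or normalized in IPPD is not specified here.

```python
def threshold_coefficients(coefficients, threshold):
    """Keep only the templates whose fitted amplitude exceeds the threshold."""
    return {j: b for j, b in enumerate(coefficients) if b > threshold}

# Made-up non-negative amplitudes, e.g. from a fit to a noisy spectrum.
fitted = [3.02, 0.04, 1.47, 0.0, 0.11]
selected = threshold_coefficients(fitted, threshold=0.5)
print(selected)
```

Only the templates at indices 0 and 2 survive in this example; the small amplitudes, which would typically be attributed to noise, are discarded without any penalty term entering the fit itself.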