Isotope pattern deconvolution for peptide mass spectrometry by non-negative least squares/least absolute deviation template matching
1 Department of Computer Science, Saarland University, Saarbrücken, Germany
2 Center for Bioinformatics, Saarland University, Saarbrücken, Germany
3 Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Esch-sur-Alzette, Luxembourg
4 Division for Systematic Proteome Research, Institute for Experimental Medicine, Kiel, Germany
5 Institut für Informatik, Johannes-Gutenberg-Universität, Mainz, Germany
BMC Bioinformatics 2012, 13:291 doi:10.1186/1471-2105-13-291
Published: 8 November 2012
Robust identification of isotope patterns originating from peptides analyzed by mass spectrometry (MS) is often significantly hampered by noise artifacts and by interference from overlapping patterns arising, e.g., from post-translational modifications. Because the classification of recorded data points as either ‘noise’ or ‘signal’ lies at the root of essentially every proteomic application, the quality of the automated processing of mass spectra can significantly influence how the data are interpreted within a given biological context.
We propose non-negative least squares/non-negative least absolute deviation regression to fit a raw spectrum with templates imitating isotope patterns. In a carefully designed validation scheme, we show that the method exhibits excellent performance in pattern picking and that it is able to disentangle complicated overlaps of patterns.
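To make the idea of template fitting concrete, the following is a minimal sketch of the least squares variant only, not the IPPD implementation. It assumes toy Gaussian templates with made-up relative isotope heights, a noiseless synthetic spectrum, and a plain coordinate-descent solver; the names gaussian_template and nnls_coordinate_descent, the peak width, and all numeric values are hypothetical choices for illustration. Two overlapping patterns (charge 1 and charge 2 starting at the same m/z) are mixed, and a third decoy template is offered to the fit.

```python
import math

def gaussian_template(mz_grid, start_mz, charge, heights, width=0.05):
    """Toy isotope template: Gaussian peaks spaced 1/charge apart in m/z.

    Assumption: neighbouring isotope peaks are roughly 1 Da apart, and
    `heights` are made-up relative intensities, not a real isotope model.
    """
    spacing = 1.0 / charge
    return [
        sum(h * math.exp(-0.5 * ((x - (start_mz + k * spacing)) / width) ** 2)
            for k, h in enumerate(heights))
        for x in mz_grid
    ]

def nnls_coordinate_descent(columns, y, sweeps=500):
    """Minimise ||y - sum_j beta_j * columns[j]||^2 subject to beta_j >= 0.

    Each coordinate is updated by its exact one-dimensional minimiser and
    then clipped at zero, which enforces the non-negativity constraint.
    """
    beta = [0.0] * len(columns)
    residual = list(y)
    norms = [sum(v * v for v in col) for col in columns]
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            if norms[j] == 0.0:
                continue  # an all-zero template cannot explain any signal
            grad = sum(c * r for c, r in zip(col, residual))
            new = max(0.0, beta[j] + grad / norms[j])
            delta = new - beta[j]
            if delta != 0.0:
                residual = [r - delta * c for r, c in zip(residual, col)]
                beta[j] = new
    return beta

# Hypothetical m/z grid and toy relative isotope heights.
mz_grid = [1000.0 + 0.01 * i for i in range(351)]
heights = [1.0, 0.6, 0.2]

templates = [
    gaussian_template(mz_grid, 1000.5, 1, heights),  # charge 1 pattern
    gaussian_template(mz_grid, 1000.5, 2, heights),  # overlapping charge 2 pattern
    gaussian_template(mz_grid, 1001.0, 1, heights),  # decoy, absent from the spectrum
]

# Noiseless synthetic spectrum: 3 x first template + 2 x second template.
spectrum = [3.0 * a + 2.0 * b for a, b in zip(templates[0], templates[1])]

beta = nnls_coordinate_descent(templates, spectrum)
print([round(b, 4) for b in beta])
```

In this toy setting the fitted amplitudes separate the two overlapping patterns and leave the decoy template at essentially zero. Real spectra contain noise, so small spurious amplitudes would be expected and would need a subsequent selection step.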
We find that regularization is not necessary to prevent overfitting and that thresholding is an effective and user-friendly way to perform feature selection. The proposed method avoids problems inherent in regularization-based approaches, comes with a set of well-interpretable parameters whose default configuration is shown to generalize well without the need for fine-tuning, and is applicable to spectra of different platforms. The R package IPPD implements the method and is available from the Bioconductor platform (http://bioconductor.fhcrc.org/help/bioc-views/devel/bioc/html/IPPD.html).
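As an illustration of thresholding as a feature selection step, the sketch below keeps only those templates whose fitted amplitude is large relative to a noise level. The thresholding rule actually used by the method is not specified in this abstract; the signal-to-noise criterion, the function name threshold_select, the parameter min_ratio, and the amplitude values are all assumptions made for this example.

```python
def threshold_select(amplitudes, noise_level, min_ratio=3.0):
    """Return indices of templates whose amplitude / noise_level >= min_ratio.

    Assumption: a single global noise level is given; a real pipeline
    might estimate it from the data instead.
    """
    if noise_level <= 0:
        raise ValueError("noise_level must be positive")
    return [j for j, a in enumerate(amplitudes) if a / noise_level >= min_ratio]

# Hypothetical non-negative amplitudes from a template fit on a noisy spectrum.
fitted = [3.02, 0.04, 1.97, 0.11]

kept = threshold_select(fitted, noise_level=0.1, min_ratio=3.0)
print(kept)
```

The only quantity to choose here is the ratio cut-off, which has a direct interpretation; this is in contrast to a regularization parameter, whose effect on the selected patterns is less direct.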