Open Access Research article

Stronger findings from mass spectral data through multi-peak modeling

Tommi Suvitaival1, Simon Rogers2 and Samuel Kaski1,3*

Author Affiliations

1 Helsinki Institute for Information Technology HIIT, Department of Information and Computer Science, Aalto University, 00076 Espoo, Finland

2 School of Computing Science, University of Glasgow, G12 8QQ, Glasgow, UK

3 Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki, Helsinki, Finland

BMC Bioinformatics 2014, 15:208  doi:10.1186/1471-2105-15-208

Published: 19 June 2014



Mass spectrometry-based metabolomic analysis depends upon the identification of spectral peaks by their mass and retention time. Statistical analysis that follows the identification currently relies on one main peak of each compound. However, a compound present in the sample typically produces several spectral peaks due to its isotopic properties and the ionization process of the mass spectrometer device. In this work, we investigate the extent to which these additional peaks can be used to increase the statistical strength of differential analysis.


We present a Bayesian approach for integrating data of multiple detected peaks that come from one compound. We demonstrate the approach through a simulated experiment and validate it on ultra performance liquid chromatography-mass spectrometry (UPLC-MS) experiments for metabolomics and lipidomics. Peaks that are likely to be associated with one compound can be clustered by the similarity of their chromatographic shape. Changes of concentration between sample groups can be inferred more accurately when multiple peaks are available.
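The idea of grouping peaks by the similarity of their chromatographic shape can be illustrated with a minimal sketch. This is not the Bayesian clustering model of the paper; it swaps in a simple greedy grouping based on Pearson correlation between intensity profiles. The function names, the 0.9 threshold, and the synthetic Gaussian-shaped profiles are all assumptions made for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length intensity profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no shape information
    return cov / (sx * sy)

def group_peaks_by_shape(profiles, threshold=0.9):
    """Greedy grouping: a peak joins the first cluster whose seed peak
    it correlates with at or above the (assumed) threshold."""
    clusters = []  # each cluster is a list of peak indices; index 0 is the seed
    for i, profile in enumerate(profiles):
        for cluster in clusters:
            if pearson(profiles[cluster[0]], profile) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters

def gaussian_shape(center, width, scale, n_points=30):
    """Synthetic chromatographic peak sampled at integer retention-time steps."""
    return [scale * math.exp(-0.5 * ((t - center) / width) ** 2)
            for t in range(n_points)]

# Hypothetical data: two compounds, each producing a main peak and a weaker
# secondary peak that elutes with the same shape.
profiles = [
    gaussian_shape(10, 2.0, 1000.0),  # compound A, main peak
    gaussian_shape(10, 2.0, 120.0),   # compound A, isotope peak
    gaussian_shape(20, 3.0, 800.0),   # compound B, main peak
    gaussian_shape(20, 3.0, 60.0),    # compound B, adduct peak
]
clusters = group_peaks_by_shape(profiles)
print(clusters)
```

Peaks from the same compound differ in intensity but share an elution profile, so a scale-invariant measure such as correlation groups them even when the secondary peak is much weaker than the main one.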


When the sample size is limited, the proposed multi-peak approach improves the accuracy of inferring covariate effects. An R implementation and data are available online.
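Why extra peaks help at small sample sizes can be seen in a toy simulation. The sketch below is not the paper's Bayesian model: it simply averages the peaks of one compound and compares the resulting estimate of a group difference with the estimate from a single peak. It assumes that all peaks of a compound share the same underlying (log-scale) concentration and carry independent measurement noise; every parameter value is hypothetical.

```python
import random
import statistics

def simulate_errors(n_per_group=5, n_peaks=4, effect=1.0,
                    bio_sd=0.3, noise_sd=1.0, n_reps=2000, seed=0):
    """Mean absolute error of the estimated group difference when using
    one peak versus the average of all peaks of the same compound."""
    rng = random.Random(seed)
    single_err, multi_err = [], []
    for _ in range(n_reps):
        means_single, means_multi = [], []
        for shift in (0.0, effect):  # control group, then treated group
            single_vals, multi_vals = [], []
            for _ in range(n_per_group):
                # Shared biological level of the compound in this sample
                level = shift + rng.gauss(0.0, bio_sd)
                # Each peak observes that level with independent noise
                peaks = [level + rng.gauss(0.0, noise_sd)
                         for _ in range(n_peaks)]
                single_vals.append(peaks[0])               # main peak only
                multi_vals.append(statistics.fmean(peaks)) # pooled peaks
            means_single.append(statistics.fmean(single_vals))
            means_multi.append(statistics.fmean(multi_vals))
        single_err.append(abs(means_single[1] - means_single[0] - effect))
        multi_err.append(abs(means_multi[1] - means_multi[0] - effect))
    return statistics.fmean(single_err), statistics.fmean(multi_err)

single_mae, multi_mae = simulate_errors()
print(f"single-peak error: {single_mae:.3f}, multi-peak error: {multi_mae:.3f}")
```

Under these assumptions, pooling reduces only the peak-level measurement noise, not the biological variation between samples, so the benefit is largest when measurement noise dominates and samples are few.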

ANOVA-type modeling; Bayesian modeling; Clustering; Mass spectrometry; Metabolomics; Lipidomics; Nonparametric Bayes