Open Access Methodology article

A factorization method for the classification of infrared spectra

Carsten Henneges1*, Pavel Laskov1, Endang Darmawan2, Jürgen Backhaus2, Bernd Kammerer3 and Andreas Zell1

Author Affiliations

1 Zentrum für Bioinformatik Tübingen, Eberhard Karls Universität Tübingen, Sand 13, Tübingen, Germany

2 Institut für Instrumentelle Analytik und Bioanalytik, Hochschule Mannheim, Paul-Wittsack-Straße 10, Mannheim, Germany

3 Zentrum für Biosystemanalyse ZBSA, Albert-Ludwigs-Universität Freiburg, Habsburgerstraße 49, Freiburg, Germany

BMC Bioinformatics 2010, 11:561  doi:10.1186/1471-2105-11-561

Published: 15 November 2010

Abstract

Background

Bioinformatics data analysis often deals with additive mixtures of signals for which only class labels are known. In that setting, the overall goal is to estimate class-related signals for data mining purposes. A natural application is the metabolic monitoring of patients using infrared spectroscopy: within an infrared spectrum, each compound contributes quantitatively to the measurement.
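
To make the additive-mixture assumption concrete, the following minimal sketch builds a synthetic spectrum as a weighted sum of component spectra and recovers the weights by ordinary least squares. The Gaussian band shapes, the three components, the concentration values, and the noise level are all hypothetical and chosen only for illustration. The least-squares step is a generic stand-in that assumes the component spectra are known; it is not the factorization method proposed in the article.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical, normalized wavenumber axis with 200 sample points.
wavenumbers = np.linspace(0.0, 1.0, 200)


def band(center, width):
    """Gaussian absorption band (illustrative shape, not real IR data)."""
    return np.exp(-0.5 * ((wavenumbers - center) / width) ** 2)


# Three hypothetical pure-compound spectra, one per row: shape (3, 200).
components = np.stack([band(0.25, 0.03), band(0.50, 0.05), band(0.80, 0.04)])

# Hypothetical quantitative contribution of each compound.
concentrations = np.array([0.7, 0.2, 1.5])

# Additive mixture: the measured spectrum is the weighted sum of the components.
spectrum = concentrations @ components

# Add a small amount of measurement noise.
noisy = spectrum + rng.normal(scale=0.01, size=spectrum.shape)

# With known components, the contributions follow from ordinary least squares.
estimated, *_ = np.linalg.lstsq(components.T, noisy, rcond=None)
```

In this toy setting the component spectra are given, so the contributions can be read off directly. The problem addressed in the article is harder, because only class labels of the mixtures are known.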

Results

In this work, we propose a novel technique for additive signal factorization that allows learning from classified samples. We define a composite loss function for this task and analytically derive a closed-form equation such that training a model reduces to searching for an optimal threshold vector. Our experiments, carried out on synthetic and clinical data, show a sensitivity of up to 0.958 and a specificity of up to 0.841 for a 15-class disease classification problem. By using class and regression information in parallel, our algorithm outperforms a linear SVM in training settings with many classes and few samples.

Conclusions

The presented factorization method provides a simple generative model and therefore represents a first step towards predictive factorization methods.