A factorization method for the classification of infrared spectra
1 Zentrum für Bioinformatik Tübingen, Eberhard Karls Universität Tübingen, Sand 13, Tübingen, Germany
2 Institut für Instrumentelle Analytik und Bioanalytik, Hochschule Mannheim, Paul-Wittsack-Straße 10, Mannheim, Germany
3 Zentrum für Biosystemanalyse ZBSA, Albert-Ludwigs-Universität Freiburg, Habsburgerstraße 49, Freiburg, Germany
BMC Bioinformatics 2010, 11:561 doi:10.1186/1471-2105-11-561
Published: 15 November 2010
Bioinformatics data analysis often deals with additive mixtures of signals for which only class labels are known. The overall goal is then to estimate class-related signals for data mining purposes. A suitable application is the metabolic monitoring of patients using infrared spectroscopy: within an infrared spectrum, each compound contributes quantitatively to the measurement.
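The additive mixture assumption can be illustrated with a minimal sketch. This is not the factorization method of the paper, only the forward model it presupposes: a measured spectrum is taken to be a concentration-weighted sum of component spectra. The Gaussian band shape, the wavenumber grid, the concentrations, and the names `gaussian_band` and `mix_spectra` are all assumptions made for illustration.

```python
import math

def gaussian_band(wavenumbers, center, width):
    """Hypothetical absorption band, modelled here as a Gaussian peak."""
    return [math.exp(-0.5 * ((w - center) / width) ** 2) for w in wavenumbers]

def mix_spectra(concentrations, components):
    """Additive mixture: weight each component spectrum by its concentration
    and sum the contributions point by point."""
    n_points = len(components[0])
    return [
        sum(c * comp[i] for c, comp in zip(concentrations, components))
        for i in range(n_points)
    ]

# Illustrative grid and two made-up compounds (assumed values, not real data).
wavenumbers = list(range(1000, 1201, 50))  # 1000, 1050, ..., 1200
components = [
    gaussian_band(wavenumbers, center=1050, width=30),
    gaussian_band(wavenumbers, center=1150, width=30),
]

# Spectrum of a sample containing both compounds at different concentrations.
spectrum = mix_spectra([2.0, 0.5], components)
```

Under this assumption, the spectrum of a sample equals the sum of the spectra its compounds would produce on their own, which is the property a factorization into class-related signals relies on.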
In this work, we propose a novel technique for additive signal factorization that allows learning from classified samples. We define a composed loss function for this task and analytically derive a closed-form equation, such that training a model reduces to searching for an optimal threshold vector. Our experiments, carried out on synthetic and clinical data, show a sensitivity of up to 0.958 and a specificity of up to 0.841 for a 15-class disease classification problem. By using class and regression information in parallel, our algorithm outperforms a linear SVM in training scenarios with many classes and few training samples.
The presented factorization method provides a simple generative model and therefore represents a first step towards predictive factorization methods.