This article is part of the supplement: 22nd International Conference on Genome Informatics: Systems Biology

Open Access Proceedings

A high performance profile-biomarker diagnosis for mass spectral profiles

Henry Han

Author Affiliations

Department of Mathematics and Bioinformatics, Eastern Michigan University, Ypsilanti MI, 48197, USA

The Laboratory for High Performance Computing in Bioinformatics, Eastern Michigan University, Ypsilanti, MI 48197, USA

BMC Systems Biology 2011, 5(Suppl 2):S5  doi:10.1186/1752-0509-5-S2-S5

Published: 14 December 2011

Abstract

Background

Although mass spectrometry-based proteomics shows exciting promise for diagnosing complex diseases, it remains a research field rather than a clinical routine because of limitations in diagnostic accuracy and data reproducibility. Comparatively little work has addressed high-performance proteomic pattern classification, relative to the effort devoted to improving data reproducibility.

Methods

In this study, we present a novel machine learning approach to achieve clinical-level disease diagnosis from mass spectral data. Following our local and global feature selection framework, we propose multi-resolution independent component analysis, a novel feature selection algorithm that tackles the high dimensionality of mass spectra. We also develop high-performance classifiers by embedding multi-resolution independent component analysis in linear discriminant analysis and support vector machines.
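The abstract does not spell out the algorithm, so the following is only an illustrative sketch of what a multi-resolution preprocessing step could look like, not the author's implementation. It assumes that "multi-resolution" refers to a wavelet decomposition in which fine-scale detail coefficients are suppressed before component analysis; the Haar wavelet, the function names, and the `cutoff` parameter are all assumptions introduced here for illustration. The independent component analysis and classifier stages are not shown.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_decompose(signal, levels):
    """Orthonormal Haar transform.

    Returns (approximation, details), where details[0] holds the finest
    (level 1) coefficients. Assumes len(signal) is divisible by 2**levels.
    """
    approx = list(signal)
    details = []
    for _ in range(levels):
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx), 2)]
        details.append([(a - b) / SQRT2 for a, b in pairs])
        approx = [(a + b) / SQRT2 for a, b in pairs]
    return approx, details


def haar_reconstruct(approx, details):
    """Invert haar_decompose, working from the coarsest level back to the finest."""
    current = list(approx)
    for detail in reversed(details):
        rebuilt = []
        for a, d in zip(current, detail):
            rebuilt.append((a + d) / SQRT2)
            rebuilt.append((a - d) / SQRT2)
        current = rebuilt
    return current


def suppress_fine_details(signal, levels, cutoff):
    """Zero the detail coefficients at levels 1..cutoff and reconstruct.

    `cutoff` is a hypothetical tuning parameter: larger values discard more
    fine-scale (local) structure and keep only coarser (global) trends.
    """
    approx, details = haar_decompose(signal, levels)
    kept = [
        [0.0] * len(d) if level <= cutoff else d
        for level, d in enumerate(details, start=1)
    ]
    return haar_reconstruct(approx, kept)
```

With `cutoff=1`, each adjacent pair of intensities is replaced by its mean (up to floating-point rounding), which damps point-to-point noise while leaving broader peaks in place. In a full pipeline under these assumptions, the smoothed profiles would then be passed to an independent component analysis routine and the resulting components to a classifier.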

Results

Our multi-resolution independent component-based support vector machines not only achieve clinical-level classification accuracy but also overcome the weaknesses of traditional peak-selection-based biomarker discovery. In addition to a rigorous theoretical analysis, we demonstrate our method’s superiority by comparing it with nine state-of-the-art classification and regression algorithms on six heterogeneous mass spectral profiles.

Conclusions

Our work suggests an alternative, machine learning-based direction for moving mass spectral proteomic technologies into clinical routine by treating an input profile as a ‘profile-biomarker’, and it also has positive implications for large-scale ‘omics’ data mining. Related source code and data sets can be found at: https://sites.google.com/site/heyaumbioinformatics/home/proteomics