This article is part of the supplement: Third International Workshop on Data and Text Mining in Bioinformatics (DTMBio) 2009

Open Access Proceedings

Multivariate classification of urine metabolome profiles for breast cancer diagnosis

Younghoon Kim1, Imhoi Koo2, Byung Hwa Jung3, Bong Chul Chung3 and Doheon Lee1*

Author Affiliations

1 Department of Bio and Brain Engineering, KAIST, Daejeon, South Korea

2 Korea Institute of Oriental Medicine, Daejeon, South Korea

3 Bioanalysis and Biotransformation Research Center, KIST, Chengryang, Seoul, South Korea

BMC Bioinformatics 2010, 11(Suppl 2):S4 doi:10.1186/1471-2105-11-S2-S4

Published: 16 April 2010



Urine-based diagnostic techniques are non-invasive, inexpensive, and easy to perform in clinical settings. Urinary metabolites, as the end products of cellular processes, are closely linked to phenotypes, which makes the urine metabolome useful for marker discovery and clinical applications. However, classification studies of the urine metabolome have so far used only univariate methods. Because multiple genes or proteins are involved in the development of complex diseases such as breast cancer, multiple compounds, including metabolites, are likely to be associated with these diseases, and multivariate methods are needed to identify such sets of metabolite markers. Moreover, because combinatorial effects among markers can strongly affect disease development, and because individuals differ in genetic makeup and cancers progress heterogeneously, a single marker is not enough to identify cancer.


We proposed classification models based on multivariate classification techniques and developed an analysis procedure for classification studies of metabolome data. Using this strategy, we identified five potential urinary biomarkers for breast cancer with high accuracy; four of these candidates could not be identified by univariate methods alone. We also proposed potential diagnosis rules to support clinical decision making. In addition, we showed that combinatorial effects among multiple biomarkers can enhance discriminative power for breast cancer.
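The point about combinatorial effects can be illustrated with a small synthetic sketch. It is not the authors' procedure and uses no data from the study. The marker names `a` and `b`, the shared "background" term (standing in for something like urine dilution), and the helper functions are all hypothetical. The accuracies are in-sample, with no cross-validation. Under these assumptions, neither marker separates the classes well on its own, while a simple combination of the two does.

```python
import random


def best_threshold_accuracy(values, labels):
    """In-sample accuracy of the best single cutoff on one variable.

    Both directions are tried (high values -> class 1, or low values -> class 1).
    """
    n = len(values)
    best = 0.0
    for cut in sorted(set(values)):
        correct_hi = sum((v >= cut) == bool(y) for v, y in zip(values, labels))
        best = max(best, correct_hi / n, (n - correct_hi) / n)
    return best


def make_synthetic_profiles(n_per_class=100, seed=0):
    """Hypothetical two-marker profiles; NOT real metabolome data."""
    rng = random.Random(seed)
    rows, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            # Large variation shared by both markers (assumed, e.g. dilution).
            background = rng.gauss(0.0, 3.0)
            # Marker a carries a small class-dependent shift; marker b does not.
            a = background + rng.gauss(0.0, 0.15) + (1.0 if label else 0.0)
            b = background + rng.gauss(0.0, 0.15)
            rows.append((a, b))
            labels.append(label)
    return rows, labels


rows, labels = make_synthetic_profiles()

# Univariate view: each marker is evaluated on its own.
acc_a = best_threshold_accuracy([a for a, _ in rows], labels)
acc_b = best_threshold_accuracy([b for _, b in rows], labels)

# Multivariate view: a rule on the combination a - b cancels the shared term.
acc_combined = best_threshold_accuracy([a - b for a, b in rows], labels)

print(f"marker a alone: {acc_a:.2f}")
print(f"marker b alone: {acc_b:.2f}")
print(f"a - b combined: {acc_combined:.2f}")
```

In this toy setting, marker `b` carries no class information by itself, yet it becomes useful once it is considered together with `a`. A univariate screen would discard such a marker.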


In this study, we showed that multivariate classification is needed to diagnose breast cancer precisely. After further validation in independent cohorts and experimental confirmation, these marker candidates will likely lead to clinically applicable assays for earlier diagnosis of breast cancer.