Open Access Research article

Identification of biomarkers for genotyping Aspergilli using non-linear methods for clustering and classification

Irene Kouskoumvekaki1, Zhiyong Yang2, Svava Ó Jónsdóttir1, Lisbeth Olsson2 and Gianni Panagiotou2*

Author Affiliations

1 Center for Biological Sequence Analysis, BioCentrum-DTU, Building 208, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark

2 Center for Microbial Biotechnology, BioCentrum-DTU, Building 223, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark

BMC Bioinformatics 2008, 9:59  doi:10.1186/1471-2105-9-59

Published: 28 January 2008

Abstract

Background

In this study, we used an exhaustive metabolite profiling approach to search for biomarkers in recombinant Aspergillus nidulans strains (mutants that produce the polyketide 6-methylsalicylic acid) for application in metabolic engineering.

Results

More than 450 metabolites were detected and subsequently used in the analysis. Our approach analyzes the metabolic profiling data in two steps: an initial non-linear, unsupervised analysis with Self-Organizing Maps (SOM) to identify similarities and differences among the metabolic profiles of the studied strains, followed by a supervised analysis that trains a classifier on the selected biomarkers. The analysis identified seven putative biomarkers that were able to cluster the samples according to their genotype. A Support Vector Machine was then used to construct a predictive model based on the seven biomarkers, which correctly classified 14 of the 16 samples of the different A. nidulans strains.
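The unsupervised first step can be illustrated with a minimal Self-Organizing Map written in plain Python. This is only a sketch under stated assumptions, not the authors' implementation: the 3x3 grid, the learning-rate and neighbourhood schedules, and the function names `train_som` and `best_matching_unit` are all hypothetical choices, and the data are synthetic stand-ins (two made-up groups of 8 samples with 7 features each), not the measured metabolite profiles.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid coordinate of the node whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )


def train_som(samples, rows=3, cols=3, epochs=200, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small rectangular SOM and return {(row, col): weight_vector}."""
    rng = random.Random(seed)
    dim = len(samples[0])
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)               # learning rate decays linearly
        sigma = sigma0 * (1.0 - frac) + 0.3   # neighbourhood width shrinks toward 0.3
        for x in rng.sample(samples, len(samples)):  # shuffled pass over the data
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                # Gaussian neighbourhood on the grid, centred on the winning node
                d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights


# Synthetic stand-in data (NOT from the study): two hypothetical strains,
# 8 samples each, 7 features each, drawn around different mean levels.
rng = random.Random(1)
strain_a = [[rng.gauss(0.2, 0.05) for _ in range(7)] for _ in range(8)]
strain_b = [[rng.gauss(0.8, 0.05) for _ in range(7)] for _ in range(8)]

weights = train_som(strain_a + strain_b)
nodes_a = {best_matching_unit(weights, x) for x in strain_a}
nodes_b = {best_matching_unit(weights, x) for x in strain_b}
print("strain A maps to nodes:", sorted(nodes_a))
print("strain B maps to nodes:", sorted(nodes_b))
```

Samples with similar profiles land on the same or neighbouring map nodes, which is how a SOM exposes groupings without using genotype labels. The supervised second step would then fit a classifier on the selected features, for example with an off-the-shelf SVM implementation such as scikit-learn's `SVC`; that choice is an assumption here, since the abstract does not name the software used.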

Conclusion

Our study demonstrates that metabolite profiling can be used to classify filamentous fungi and to identify metabolic engineering targets, and it draws attention to the development of a common database for storing metabolomics data.