This article is part of the supplement: Proceedings of the 2012 International Conference on Intelligent Computing (ICIC 2012)
Diagnostic prediction of complex diseases using phase-only correlation based on virtual sample template
Applied Bioinformatics Laboratory, the University of Kansas, 2034 Becker Drive, Lawrence, KS 66047, USA
BMC Bioinformatics 2013, 14(Suppl 8):S11 doi:10.1186/1471-2105-14-S8-S11Published: 9 May 2013
Complex diseases induce perturbations to interaction and regulation networks in living systems, resulting in dynamic equilibrium states that differ for different diseases and also normal states. Thus identifying gene expression patterns corresponding to different equilibrium states is of great benefit to the diagnosis and treatment of complex diseases. However, it remains a major challenge to deal with the high dimensionality and small size of available complex disease gene expression datasets currently used for discovering gene expression patterns.
Here we present a phase-only correlation (POC) based classification method for recognizing the type of complex diseases. First, a virtual sample template is constructed for each subclass by averaging all samples of each subclass in a training dataset. Then the label of a test sample is determined by measuring the similarity between the test sample and each template. This novel method can detect the similarity of overall patterns emerged from the differentially expressed genes or proteins while ignoring small mismatches.
The experimental results obtained on seven publicly available complex disease datasets including microarray and protein array data demonstrate that the proposed POC-based disease classification method is effective and robust for diagnosing complex diseases with regard to the number of initially selected features, and its recognition accuracy is better than or comparable to other state-of-the-art machine learning methods. In addition, the proposed method does not require parameter tuning and data scaling, which can effectively reduce the occurrence of over-fitting and bias.