Open Access Highly Accessed Methodology article

Multi-view singular value decomposition for disease subtyping and genetic associations

Jiangwen Sun1, Jinbo Bi1* and Henry R Kranzler2

Author Affiliations

1 Department of Computer Science and Engineering, University of Connecticut, 371 Fairfield Way, Storrs, CT 06269, USA

2 Treatment Research Center, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA

BMC Genetics 2014, 15:73  doi:10.1186/1471-2156-15-73

Published: 17 June 2014

Abstract

Background

Accurate classification of patients with a complex disease into subtypes has important implications for medicine and healthcare. Using more homogeneous disease subtypes in genetic association analysis can facilitate the detection of new genetic variants that are not detectable using the undifferentiated disease phenotype. Subtype differentiation can also improve diagnostic classification, which can in turn inform clinical decision making and treatment matching. Currently, the most sophisticated methods for disease subtyping perform cluster analysis using only patients’ clinical features. Without guidance from genetic information, the resultant subtypes are likely to be suboptimal, and efforts at genetic association may fail.

Results

We propose a multi-view matrix decomposition approach that integrates clinical features with genetic markers to detect confirmatory evidence for a disease subtype. This approach groups patients into clusters that are consistent between the clinical and genetic dimensions of the data; it simultaneously identifies the clinical features that define the subtype and the genotypes associated with the subtype. A simulation study validated the proposed approach, showing that it identified the hypothesized subtypes and their associated features. When compared with recent biclustering and multi-view data analysis methods on real-life disease data, the proposed approach identified clinical subtypes of a disease that differed from each other more significantly in the genetic markers, demonstrating its superior performance.
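The abstract does not spell out the decomposition itself, so the following is only a minimal illustrative sketch of the general idea, not the authors' algorithm. It assumes two views (a clinical matrix and a genetic matrix) that share the same patients as rows, fits one rank-one factor per view with a single shared sparse patient vector `u`, and uses relative soft-thresholding as a stand-in sparsity device. The function names, the threshold fractions, and the synthetic Gaussian data are all hypothetical choices made for illustration.

```python
import numpy as np


def soft_threshold(x, frac):
    """Shrink x toward zero by frac * max|x|, zeroing the small entries."""
    lam = frac * np.max(np.abs(x))
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)


def unit(x):
    """Scale x to unit length (returned unchanged if it is all zeros)."""
    norm = np.linalg.norm(x)
    return x / norm if norm > 0 else x


def shared_sparse_rank_one(views, frac_u=0.5, frac_v=0.5, n_iter=20):
    """Alternating rank-one fit with ONE patient vector u shared by all views.

    Nonzero entries of u mark patients in the candidate subtype; nonzero
    entries of each v mark the features of that view tied to the subtype.
    """
    # Assumed initialisation: leading left singular vector of the stacked views.
    u = np.linalg.svd(np.hstack(views), full_matrices=False)[0][:, 0]
    for _ in range(n_iter):
        # Per-view feature loadings given the current patient vector.
        vs = [unit(soft_threshold(X.T @ u, frac_v)) for X in views]
        # Shared patient vector: evidence is summed across views.
        u = unit(soft_threshold(sum(X @ v for X, v in zip(views, vs)), frac_u))
    return u, vs


# Synthetic data (hypothetical): 60 patients, the first 20 form a subtype
# that is elevated on 3 clinical features and 2 genetic markers.
rng = np.random.default_rng(0)
n_patients, n_subtype = 60, 20
clinical = rng.normal(0.0, 0.5, size=(n_patients, 8))
genetic = rng.normal(0.0, 0.5, size=(n_patients, 10))
clinical[:n_subtype, :3] += 3.0
genetic[:n_subtype, :2] += 3.0

u, (v_clin, v_gen) = shared_sparse_rank_one([clinical, genetic])
subtype_patients = np.flatnonzero(u)       # rows selected in both views
clinical_features = np.flatnonzero(v_clin)  # features defining the subtype
genetic_markers = np.flatnonzero(v_gen)     # markers associated with it
```

Sharing `u` across the views is what forces the patient grouping to be consistent between the clinical and genetic data in this sketch; real genotype data would be allele counts rather than Gaussian scores and would need appropriate coding and scaling first.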

Conclusions

The proposed algorithm is an effective alternative to the disease subtyping methods employed to date and outperformed them in our comparisons. Integrating phenotypic features with genetic markers in the subtyping analysis is a promising approach for concurrently identifying disease subtypes and their genetic associations.

Keywords:
Genotype-phenotype association; Multi-view data analysis; Subtyping; Biclustering; Matrix decomposition