Open Access Research article

Improving gene expression data interpretation by finding latent factors that co-regulate gene modules with clinical factors

Tianwei Yu¹ and Yun Bai²

Author Affiliations

1 Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA, USA

2 Department of Pharmaceutical Sciences, School of Pharmacy, Philadelphia College of Osteopathic Medicine, Suwanee, GA, USA

BMC Genomics 2011, 12:563  doi:10.1186/1471-2164-12-563

Published: 16 November 2011



In the analysis of high-throughput data with a clinical outcome, researchers mostly focus on genes/proteins that show first-order relations with the clinical outcome. While this approach yields biomarkers and biological mechanisms that are easily interpretable, it may miss information that is important to the understanding of disease mechanism and/or treatment response. Here we test the hypothesis that unobserved factors can be mobilized by the living system to coordinate the response to the clinical factors.


We developed a computational method named Guided Latent Factor Discovery (GLFD) to identify hidden factors that act in combination with the observed clinical factors to control gene modules. In simulation studies, the method recovered masked factors effectively. Using real microarray data, we demonstrate that the method identifies biologically relevant latent factors and extracts more information than an analysis restricted to the first-order response to the clinical outcome.
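The idea of recovering a masked factor can be illustrated with a small simulation. The sketch below is not the GLFD algorithm, whose details are not given in this abstract; it swaps in a much simpler stand-in technique (regress out the observed clinical factor, then take the leading principal component of the residuals via power iteration). The additive model, the parameter values, and all function names (`simulate`, `residualize`, `leading_sample_scores`, `pearson`) are assumptions made purely for illustration.

```python
import math
import random


def simulate(n_samples=80, n_genes=30, noise=0.5, seed=0):
    """Simulate one gene module driven by an observed factor x and a hidden factor z.

    Assumed toy model (not taken from the paper): gene = a*x + b*z + noise.
    """
    rng = random.Random(seed)
    x = [float(i % 2) for i in range(n_samples)]          # observed binary clinical factor
    z = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]   # masked (hidden) factor
    genes = []
    for _ in range(n_genes):
        a = rng.uniform(-1.0, 1.0)                           # first-order clinical effect
        b = rng.uniform(0.5, 1.5) * rng.choice([-1.0, 1.0])  # hidden-factor effect
        genes.append([a * xi + b * zi + rng.gauss(0.0, noise)
                      for xi, zi in zip(x, z)])
    return x, z, genes


def residualize(y, x):
    """Return residuals of the simple least-squares fit y ~ intercept + x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return [(yi - my) - slope * (xi - mx) for xi, yi in zip(x, y)]


def leading_sample_scores(rows, n_iter=100):
    """Leading principal component scores (one per sample) by power iteration.

    rows is a genes-by-samples list of residual vectors.
    """
    v = list(rows[0])  # non-constant start vector lying in the row space
    for _ in range(n_iter):
        loadings = [sum(r * s for r, s in zip(row, v)) for row in rows]
        v = [sum(l * row[i] for l, row in zip(loadings, rows))
             for i in range(len(v))]
        norm = math.sqrt(sum(s * s for s in v))
        v = [s / norm for s in v]
    return v


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov / math.sqrt(va * vb)


if __name__ == "__main__":
    x, z, genes = simulate()
    residuals = [residualize(g, x) for g in genes]
    estimate = leading_sample_scores(residuals)
    # The sign of a principal component is arbitrary, so report |r|.
    print("abs. correlation with masked factor:", abs(pearson(estimate, z)))
```

In this toy setting the estimated scores track the masked factor because the hidden signal is shared across the whole module while the noise is independent per gene. Real expression data would also need module detection and a way to handle factors that interact with the clinical variable, which this sketch does not attempt.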


Finding latent factors using GLFD brings extra insight into the mechanisms of the disease/drug response. The R code of the method is available at webcite.