Open Access Highly Accessed Research article

MCA: Multiresolution Correlation Analysis, a graphical tool for subpopulation identification in single-cell gene expression data

Justin Feigelman (1,2), Fabian J Theis (1,2) and Carsten Marr (1,*)

Author Affiliations

1 Institute of Computational Biology, Helmholtz Zentrum München, Ingolstädter Landstrasse 1, 85764 Neuherberg, Germany

2 Department of Mathematics, Technische Universität München, Boltzmannstrasse 3, 85747 Garching, Germany

BMC Bioinformatics 2014, 15:240  doi:10.1186/1471-2105-15-240

Published: 11 July 2014

Abstract

Background

Biological data often originate from samples containing mixtures of subpopulations, corresponding e.g. to distinct cellular phenotypes. However, identification of distinct subpopulations may be difficult if biological measurements yield distributions that are not easily separable.

Results

We present Multiresolution Correlation Analysis (MCA), a method for visually identifying subpopulations based on the local pairwise correlation between covariates, without needing to define an a priori interaction scale. We demonstrate that MCA facilitates the identification of differentially regulated subpopulations in simulated data from a small gene regulatory network, followed by application to previously published single-cell qPCR data from mouse embryonic stem cells. We show that MCA recovers previously identified subpopulations, provides additional insight into the underlying correlation structure, reveals potentially spurious compartmentalizations, and provides insight into novel subpopulations.
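As a rough illustration of local pairwise correlation examined at several scales at once, the following Python sketch orders samples by a third covariate and computes the Pearson correlation of two covariates in sliding windows of several widths. The function name `multires_correlation`, the sliding-window scheme, and the synthetic data are assumptions made for illustration only; they are not taken from the MCA implementation.

```python
import numpy as np

def multires_correlation(x, y, sort_by, widths):
    """Hypothetical helper: Pearson correlation of x and y in sliding
    windows over samples ordered by the covariate `sort_by`.

    Returns a dict mapping each window width to an array holding one
    correlation value per window start position.
    """
    order = np.argsort(sort_by)
    xs = np.asarray(x, dtype=float)[order]
    ys = np.asarray(y, dtype=float)[order]
    n = len(xs)
    result = {}
    for w in widths:
        if w < 3 or w > n:
            continue  # skip windows too small or too large to be meaningful
        corrs = np.full(n - w + 1, np.nan)
        for start in range(n - w + 1):
            xw = xs[start:start + w]
            yw = ys[start:start + w]
            # Correlation is undefined when either covariate is constant
            if xw.std() > 0 and yw.std() > 0:
                corrs[start] = np.corrcoef(xw, yw)[0, 1]
        result[w] = corrs
    return result

# Synthetic example (assumed data): two subpopulations distinguished by z,
# with x and y positively coupled when z < 0 and negatively coupled otherwise.
rng = np.random.default_rng(0)
n = 200
z = rng.uniform(-1.0, 1.0, n)
x = rng.normal(size=n)
y = np.where(z < 0, x, -x) + 0.1 * rng.normal(size=n)

res = multires_correlation(x, y, sort_by=z, widths=[50, 200])
low_z_corr = res[50][0]    # window covering the samples with the lowest z
high_z_corr = res[50][-1]  # window covering the samples with the highest z
global_corr = res[200][0]  # single window spanning all samples
```

In this toy setting the narrow windows at either end of the ordering show strong correlations of opposite sign, while the window spanning all samples shows only a weak correlation. This is the kind of structure that a single global correlation coefficient would hide.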

Conclusions

MCA is a useful method for the identification of subpopulations in low-dimensional expression data, such as those emerging from qPCR or FACS measurements. With MCA it is possible to investigate the robustness of covariate correlations with respect to subpopulations, graphically identify outliers, and identify factors contributing to differential regulation between pairs of covariates. MCA thus provides a framework for the investigation of expression correlations for genes of interest and for biological hypothesis generation.

Keywords:
Multiresolution; Correlation; Subpopulation identification; qPCR analysis