Research article

A test for comparing two groups of samples when analyzing multiple omics profiles

Nimisha Chaturvedi (1,4)*, Jelle J Goeman (2,6), Judith M Boer (3,4), Wessel N van Wieringen (1,5) and Renée X de Menezes (1,4)

Author Affiliations

1 Epidemiology and Biostatistics, VU University Medical Center, Amsterdam, The Netherlands

2 Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Leiden, The Netherlands

3 Department of Pediatric Oncology/Hematology, Erasmus MC-Sophia Children’s Hospital, Rotterdam, The Netherlands

4 Netherlands Bioinformatics Center, Nijmegen, The Netherlands

5 Department of Mathematics, VU University Amsterdam, Amsterdam, The Netherlands

6 Biostatistics, Department for Health Evidence, Radboud University Medical Center, Nijmegen, The Netherlands


BMC Bioinformatics 2014, 15:236  doi:10.1186/1471-2105-15-236

Published: 8 July 2014



A number of statistical models have been proposed for studying the association between gene expression and copy number data in integrated analysis. The next step is to compare association patterns between different groups of samples.


We propose a method, named dSIM, to find differences in association between copy number and gene expression when comparing two groups of samples. First, we use ridge regression to correct for the baseline associations between copy number and gene expression. Second, the global test is applied to the corrected data to find differences in association patterns between the two groups. In a simulation study, we show that dSIM detects differences even in small genomic regions. We also apply dSIM to two publicly available breast cancer datasets and identify chromosome arms where copy-number-led regulation of gene expression differs between estrogen receptor positive and estrogen receptor negative samples. Despite differing genomic coverage, some of the selected arms are identified in both datasets.
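The two steps described above (a ridge correction followed by a global test) can be illustrated with a small sketch. This is not the authors' R implementation: the function names, the penalty value `lam`, the simulated data, the use of residual-by-copy-number products as the "corrected data", and the permutation null are all assumptions made here for illustration only. The statistic is a simplified score statistic in the spirit of the global test.

```python
import numpy as np

def ridge_residuals(X, y, lam=1.0):
    """Residuals of y after a closed-form ridge fit on column-centered X."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(Xc.shape[1]), Xc.T @ yc)
    return yc - Xc @ beta

def global_test_stat(Z, g):
    """Simplified score statistic Q = r' Z Z' r / p, with r = g - mean(g)."""
    r = g - g.mean()
    s = Z.T @ r
    return float(s @ s) / Z.shape[1]

def permutation_pvalue(Z, g, n_perm=500, seed=0):
    """Permutation p-value: shuffle group labels and recompute Q."""
    rng = np.random.default_rng(seed)
    observed = global_test_stat(Z, g)
    null = np.array([global_test_stat(Z, rng.permutation(g))
                     for _ in range(n_perm)])
    return (1 + np.sum(null >= observed)) / (n_perm + 1)

# Hypothetical toy data: 80 samples, 5 copy-number probes, 1 expressed gene.
rng = np.random.default_rng(1)
n, p = 80, 5
group = np.repeat([0.0, 1.0], n // 2)          # two groups of 40 samples
copy_number = rng.normal(size=(n, p))
slope = np.where(group == 1, 2.0, 0.5)         # association differs by group
expression = slope * copy_number[:, 0] + rng.normal(scale=0.3, size=n)

# Step 1: remove the baseline (pooled) association with ridge regression.
resid = ridge_residuals(copy_number, expression, lam=1.0)

# Step 2: test whether what remains is associated with group membership.
# Assumption: residual x copy-number products capture group-specific slopes.
Z = resid[:, None] * (copy_number - copy_number.mean(axis=0))
p_value = permutation_pvalue(Z, group)
print(p_value)
```

In this toy setting the pooled ridge fit absorbs the average slope, so the residuals carry opposite-signed copy number dependence in the two groups, which is what the second step picks up. A real analysis would work per chromosome arm across many genes and choose the penalty from the data.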


We developed a flexible and robust method for studying association differences between two groups of samples while integrating genomic data from different platforms. dSIM can be used with most types of microarray and sequencing data, including methylation and microRNA expression. The method is implemented in R and will be made part of the Bioconductor package SIM.

Keywords: Group effect; Joint analysis; Penalized regression