BMC Bioinformatics


Research article (Open Access, Highly Accessed)

Integrated analysis of DNA copy number and gene expression microarray data using gene sets

Renée X Menezes (1,2,3,4)*, Marten Boetzer (1), Melle Sieswerda (1), Gert-Jan B van Ommen (1,4) and Judith M Boer (1,3,4)*

Author Affiliations

1 Center for Human and Clinical Genetics, Leiden University Medical Center, PO Box 9600, 2300 RC Leiden, The Netherlands

2 Pediatric Oncology Laboratory, Erasmus Medical Center, Rotterdam, The Netherlands

3 BioRange, Netherlands Bioinformatics Centre, Nijmegen, The Netherlands

4 Center for Medical Systems Biology, Leiden, The Netherlands


BMC Bioinformatics 2009, 10:203 doi:10.1186/1471-2105-10-203

Published: 29 June 2009

Abstract

Background

Genes that play an important role in tumorigenesis are expected to show an association between DNA copy number and RNA expression. Optimal power to detect such associations can be achieved only by analysing copy number and gene expression jointly. Furthermore, some copy number changes extend over large chromosomal regions and affect the expression levels of multiple resident genes.

Results

We propose to analyse copy number and expression array data using gene sets rather than individual genes. The proposed model is robust and sensitive. As an illustration, we re-analysed two publicly available, independent breast cancer datasets, which yielded similar patterns of association between gene dosage and gene expression levels despite having been generated on different platforms. Our comparisons show a clear advantage of using the expression of sets of genes to detect associations with long-spanning, low-amplitude copy number aberrations. In addition, our model accommodates additional explanatory variables and does not require a mapping between copy number and expression probes.
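To make the idea of testing a gene set as a whole more concrete, the following is a minimal sketch on simulated data. It is not the model proposed in this article: it swaps in a simple permutation test on the summed squared covariances between one copy-number profile and every gene in a set. The function names (`center`, `set_statistic`, `permutation_p_value`, `simulate`), the effect size and the noise levels are all hypothetical choices made for illustration.

```python
import random


def center(values):
    """Return the values with their mean subtracted."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def set_statistic(cn_centered, genes_centered):
    """Sum over genes of the squared covariance with the copy-number profile."""
    n = len(cn_centered)
    total = 0.0
    for gene in genes_centered:
        cov = sum(c * e for c, e in zip(cn_centered, gene)) / n
        total += cov * cov
    return total


def permutation_p_value(copy_number, expression_set, n_perm=1000, seed=0):
    """Permutation p-value for association between one copy-number
    profile and the expression of a whole gene set (illustrative only)."""
    rng = random.Random(seed)
    cn = center(copy_number)
    genes = [center(g) for g in expression_set]
    observed = set_statistic(cn, genes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(cn)  # break the sample pairing; centering is preserved
        if set_statistic(cn, genes) >= observed:
            hits += 1
    # Add-one correction so the p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


def simulate(n_samples=40, n_genes=20, effect=1.0, seed=1):
    """Simulated data (assumption): half the samples carry a low-amplitude
    gain, and every gene in the set responds weakly to gene dosage."""
    rng = random.Random(seed)
    copy_number = [
        (0.3 if i < n_samples // 2 else 0.0) + rng.gauss(0, 0.05)
        for i in range(n_samples)
    ]
    expression_set = [
        [effect * c + rng.gauss(0, 0.3) for c in copy_number]
        for _ in range(n_genes)
    ]
    return copy_number, expression_set


copy_number, expression_set = simulate()
p_signal = permutation_p_value(copy_number, expression_set)
print(p_signal)
```

Because the statistic pools evidence across all genes in the set, many individually weak dosage effects can add up to a clear signal, which is the intuition behind analysing sets rather than single genes. Note also that the copy-number profile and the expression rows are paired only through the samples, so no probe-to-probe mapping is needed in this sketch.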

Conclusion

We developed a general and flexible tool for the integration of multiple microarray datasets, and showed how identifying genes whose expression is affected by copy number aberrations provides a powerful approach to prioritizing putative targets for functional validation.