How to get the most from microarray data: advice from reverse genomics
1 Department of Genitourinary Medical Oncology, Unit 1374, The University of Texas MD Anderson Cancer Center, 1155 Pressler Street, Houston, TX 77030-3721, USA
2 Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
3 Department of Community and Family Medicine, Geisel School of Medicine, Dartmouth College, Lebanon, NH, USA
4 Department of Epidemiology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
5 Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
BMC Genomics 2014, 15:223 doi:10.1186/1471-2164-15-223
Published: 21 March 2014
Whole-genome profiling of gene expression is a powerful tool for identifying cancer-associated genes. Genes that are differentially expressed between normal and tumorous tissues are usually considered to be cancer associated. We recently demonstrated that analyzing interindividual variation in gene expression can also be useful for identifying cancer-associated genes. The goal of this study was to identify the best predictor of known cancer-associated genes that can be derived from microarray data.
We found that the traditional approach to identifying cancer genes, which relies on differentially expressed genes, is not very efficient. Analyzing the interindividual variation of gene expression in tumor samples identifies cancer-associated genes more effectively. The results were consistent across four major types of cancer: breast, colorectal, lung, and prostate. For validation, we used cancer-associated genes reported recently (2011–2012) and found that novel cancer-associated genes are best identified by elevated variance of gene expression in tumor samples.
The finding that high interindividual variation of gene expression in tumor tissues is the best predictor of cancer-associated genes likely reflects tumor heterogeneity at the gene level. Computer simulation demonstrates that, when tumors are heterogeneous, assessing variance in tumors identifies cancer genes better than comparing expression between normal and tumor tissues. Our results thus challenge the current paradigm that comparing mean expression between normal and tumorous tissues is the best approach to identifying cancer-associated genes: ranking genes by their interindividual variation in expression performed better and should improve the chances of identifying cancer-associated genes.
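The heterogeneity argument can be illustrated with a toy simulation. The sketch below is not the authors' simulation; it is a minimal illustration under assumed conditions. Every gene has standard-normal expression in normal tissue, and each cancer gene is altered in only a fraction of tumors, with the direction of the change (up or down) varying from tumor to tumor. All names and parameter values (`simulate`, `auc`, `frac_altered`, `shift`, the sample sizes) are hypothetical. Genes are ranked either by the absolute Welch t statistic (tumor versus normal mean) or by the variance across tumor samples, and each ranking is scored by the area under the ROC curve for recovering the simulated cancer genes.

```python
import random
import statistics


def simulate(n_genes=1000, n_cancer=100, n_normal=40, n_tumor=40,
             frac_altered=0.3, shift=3.0, seed=0):
    """Toy model (assumed, not from the paper): returns per-gene labels,
    |Welch t| for tumor vs normal, and the variance across tumor samples."""
    rng = random.Random(seed)
    labels, abs_t, tumor_var = [], [], []
    for g in range(n_genes):
        is_cancer = g < n_cancer
        normal = [rng.gauss(0.0, 1.0) for _ in range(n_normal)]
        tumor = []
        for _ in range(n_tumor):
            x = rng.gauss(0.0, 1.0)
            # Heterogeneity: only some tumors alter a given cancer gene,
            # and the direction of the change differs between tumors.
            if is_cancer and rng.random() < frac_altered:
                x += shift if rng.random() < 0.5 else -shift
            tumor.append(x)
        var_n = statistics.variance(normal)
        var_t = statistics.variance(tumor)
        # Welch t statistic for the difference in mean expression.
        diff = statistics.mean(tumor) - statistics.mean(normal)
        t = diff / (var_n / n_normal + var_t / n_tumor) ** 0.5
        labels.append(is_cancer)
        abs_t.append(abs(t))
        tumor_var.append(var_t)
    return labels, abs_t, tumor_var


def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.
    Tied scores receive their average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, is_pos in zip(ranks, labels) if is_pos)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


if __name__ == "__main__":
    labels, abs_t, tumor_var = simulate()
    print("AUC, ranking by |t| (mean difference):", round(auc(labels, abs_t), 3))
    print("AUC, ranking by tumor variance:       ", round(auc(labels, tumor_var), 3))
```

In this setup the up- and down-shifts roughly cancel in the tumor mean, so the mean comparison has little power while the tumor variance of cancer genes is inflated. If alterations were instead present in most tumors and always in the same direction, the mean comparison would be expected to perform far better, so the sketch illustrates the mechanism rather than its size in real data.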