An integrated analysis of molecular aberrations in NCI-60 cell lines
Institute of Statistical Science, Academia Sinica, Taipei, Taiwan
BMC Bioinformatics 2010, 11:495 doi:10.1186/1471-2105-11-495
Published: 6 October 2010
Cancer is a complex disease in which various types of molecular aberrations drive the development and progression of malignancies. Large-scale screening of multiple types of molecular aberrations (e.g., mutations, copy number variations, DNA methylation, gene expression) is becoming increasingly important in the prognosis and study of cancer. Consequently, a computational model that integrates multiple types of information is essential for analyzing such comprehensive data.
We propose an integrated modeling framework to identify the statistical and putative causal relations between various molecular aberrations and gene expressions in cancer. To reduce spurious associations among the massive number of probed features, we sequentially applied three layers of logistic regression models with increasing complexity and uncertainty regarding the possible mechanisms connecting molecular aberrations and gene expressions. Layer 1 models associate gene expressions with the molecular aberrations on the same loci. Layer 2 models associate expressions with aberrations on different loci that have known mechanistic links to the expressed genes. Layer 3 models associate expressions with nonlocal aberrations that have unknown mechanistic links. We applied the layered models to the integrated datasets of NCI-60 cancer cell lines and validated the results with large-scale statistical analysis. Furthermore, we discovered or reaffirmed the following prominent links: (1) Protein expressions are generally consistent with mRNA expressions. (2) Several gene expressions are modulated by composite local aberrations. For instance, CDKN2A expression is repressed by either frame-shift mutations or DNA methylation. (3) Amplification of chromosome 6q in leukemia elevates the expression of MYB, and the downstream targets of MYB on other chromosomes are up-regulated accordingly. (4) Amplification of chromosome 3p and hypo-methylation of PAX3 together elevate MITF expression in melanoma, which up-regulates the downstream targets of MITF. (5) Mutations of TP53 are negatively associated with the expressions of its direct target genes.
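The sequential use of the three layers can be illustrated with a minimal sketch. It is not the implementation used in this study: it assumes binarized expression and aberration calls, a single predictor per model, and that a gene is passed to the next layer only when no aberration in the current layer improves the log-likelihood over an intercept-only model by a cutoff. The function names, the cutoff `min_gain`, and the toy data are all hypothetical.

```python
import math

def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, steps=2000, lr=0.5):
    """Fit P(y=1) = sigmoid(w*x + b) for one binary predictor by gradient ascent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            err = y - _sigmoid(w * x + b)
            gw += err * x
            gb += err
        w += lr * gw / n
        b += lr * gb / n
    return w, b

def log_likelihood(xs, ys, w, b):
    ll = 0.0
    for x, y in zip(xs, ys):
        p = min(max(_sigmoid(w * x + b), 1e-12), 1.0 - 1e-12)
        ll += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return ll

def first_explaining_layer(expression, layers, min_gain=2.0):
    """Try the layers in order and return (layer, feature, gain, weight) for
    the best aberration in the first layer that passes the cutoff, else None.
    min_gain is an arbitrary illustrative cutoff (assumption)."""
    zeros = [0] * len(expression)
    # Intercept-only baseline: with an all-zero predictor only b is fitted.
    null_ll = log_likelihood(zeros, expression, *fit_logistic(zeros, expression))
    for layer_name, features in layers:
        best = None
        for feature_name, xs in features.items():
            w, b = fit_logistic(xs, expression)
            gain = log_likelihood(xs, expression, w, b) - null_ll
            if gain >= min_gain and (best is None or gain > best[2]):
                best = (layer_name, feature_name, gain, w)
        if best is not None:
            return best  # explained here; later layers are not examined
    return None

# Toy data for eight hypothetical cell lines (invented, not NCI-60 values).
expression = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = gene over-expressed
layers = [
    ("layer1", {"local_copy_gain": [1, 0, 1, 0, 1, 0, 1, 0]}),
    ("layer2", {"regulator_amplified": [1, 1, 1, 0, 0, 0, 0, 0]}),
    ("layer3", {"distant_aberration": [1, 1, 0, 0, 0, 0, 1, 1]}),
]
result = first_explaining_layer(expression, layers)
print(result[:2])
```

In this toy example the local aberration carries no information about expression, so the gene falls through to the second layer, where the amplification of a hypothetical regulator is selected with a positive weight. Stopping at the first layer that explains the expression is one simple way to favor the more direct mechanisms over nonlocal associations.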
The results on the NCI-60 data support the utility of the layered models for the growing volume of cancer genomic data. Experimental validation of selected prominent links and application of the layered modeling framework to other integrated datasets will be carried out subsequently.