This article is part of the supplement: Proceedings of the Fifth Annual MCBIOS Conference. Systems Biology: Bridging the Omics.

Open Access Proceedings

From microarray to biology: an integrated experimental, statistical and in silico analysis of how the extracellular matrix modulates the phenotype of cancer cells

Mikhail G Dozmorov1, Kimberly D Kyker1, Paul J Hauser1, Ricardo Saban4, David D Buethe1, Igor Dozmorov3, Michael B Centola4, Daniel J Culkin1 and Robert E Hurst1,2

1Department of Urology, Oklahoma University Health Sciences Centre, Oklahoma City, OK 73104, USA

2Department of Biochemistry and Molecular Biology, Oklahoma University Health Sciences Centre, Oklahoma City, OK 73104, USA

3Microarray Core Facility, Oklahoma Medical Research Foundation, Oklahoma City, OK 73104, USA

4Department of Physiology, Oklahoma University Health Sciences Centre, Oklahoma City, OK 73104, USA

BMC Bioinformatics 2008, 9(Suppl 9):S4 doi:10.1186/1471-2105-9-S9-S4

Published: 12 August 2008

Abstract

A statistically robust and biologically based approach to the analysis of microarray data is described. It integrates independent biological knowledge and data with a global F-test for finding genes of interest, which minimizes the need for replicates when the approach is used for hypothesis generation. First, each microarray is normalized to its noise level around zero, and the microarray dataset is then globally adjusted by robust linear regression. Second, genes of interest that capture significant responses to experimental conditions are selected by finding those that show significantly higher variance than genes showing only technical variability. Clustering the expression data and identifying expression-independent properties of the genes of interest, including upstream transcriptional regulatory elements (TREs), ontologies, and networks or pathways, organizes the data into a biologically meaningful system. We demonstrate that when the number of genes of interest is inconveniently large, identifying a subset of "beacon genes" representing the largest changes will identify pathways or networks altered by biological manipulation. The entire dataset is then used to complete the picture outlined by the "beacon genes." This allows construction of a structured model of a system that can generate biologically testable hypotheses. We illustrate this approach by comparing cells cultured on plastic or on an extracellular matrix, organizing a dataset of over 2,000 genes of interest from a genome-wide scan of transcription. The resulting model was confirmed by comparing the predicted pattern of TREs with experimental determination of active transcription factors.
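
The normalization and variance-based selection steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the toy data, the technical variance and the critical F value are all hypothetical, chosen only to show the idea of comparing each gene's variance across conditions with the technical variance. The robust linear regression adjustment is not reproduced here, and in practice the critical value would be taken from the F distribution for the relevant degrees of freedom.

```python
from statistics import mean, stdev, variance


def normalize_to_noise(intensities, noise_intensities):
    """Center and scale one array so that its noise has mean 0 and SD 1.

    `noise_intensities` is assumed to hold signals from non-expressed
    (background-level) genes on the same array.
    """
    noise_mean = mean(noise_intensities)
    noise_sd = stdev(noise_intensities)
    return [(x - noise_mean) / noise_sd for x in intensities]


def select_genes_of_interest(expression, technical_variance, f_critical):
    """Return genes whose variance across conditions exceeds the
    technical variance by more than the critical F ratio."""
    selected = []
    for gene, values in expression.items():
        ratio = variance(values) / technical_variance  # F-like variance ratio
        if ratio > f_critical:
            selected.append(gene)
    return selected


# Toy data, invented for illustration only.
expression = {
    "gene_a": [1.0, 1.1, 0.9, 1.0],  # varies only at the technical level
    "gene_b": [0.5, 3.0, 0.4, 3.2],  # responds to the experimental condition
}
technical_variance = 0.01  # assumed estimate of technical variability
f_critical = 8.0           # placeholder; use the F distribution in practice

genes = select_genes_of_interest(expression, technical_variance, f_critical)
print(genes)
```

With these invented numbers only `gene_b` passes the threshold, because its variance across conditions is far larger than the assumed technical variance.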

