Open Access Research article

Predicting environmental chemical factors associated with disease-related gene expression data

Chirag J Patel1,2,3 and Atul J Butte1,2,3*

Author affiliations

1 Department of Pediatrics, Stanford University School of Medicine, Stanford, CA 94305, USA

2 Center for Biomedical Informatics Research, Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA

3 Lucile Packard Children's Hospital, 725 Welch Road, Palo Alto, CA 94304, USA

Citation and License

BMC Medical Genomics 2010, 3:17  doi:10.1186/1755-8794-3-17

Published: 6 May 2010

Abstract

Background

Many common diseases arise from an interaction between environmental and genetic factors. Our knowledge of gene-environment interactions is growing, but frameworks that use preexisting, publicly available data to associate these interactions with disease have been lacking. Integrating freely available environment-gene interaction data with disease phenotype data would allow hypothesis generation about potential environmental associations with disease.

Methods

We integrated publicly available, disease-specific gene expression microarray data with curated chemical-gene interaction data to systematically predict environmental chemicals associated with disease. We derived chemical-gene signatures for 1,338 environmental chemicals from the Comparative Toxicogenomics Database (CTD). Using an enrichment test, we associated these chemical-gene signatures with differentially expressed genes from datasets in the Gene Expression Omnibus (GEO).
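The enrichment step can be illustrated with a minimal sketch. The abstract does not say which enrichment test was used, so the code below assumes a one-sided hypergeometric test, a common choice for gene-set overlap. The function names, the toy gene identifiers, and the chemical labels `chem_A` and `chem_B` are hypothetical and are not taken from CTD or GEO.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, signature_size, de_size, overlap):
    """Return P(X >= overlap) under a hypergeometric null.

    Assumption: de_size genes are drawn without replacement from a universe
    of universe_size genes, of which signature_size belong to the
    chemical-gene signature.
    """
    max_overlap = min(signature_size, de_size)
    total = comb(universe_size, de_size)
    # Sum the upper tail: every overlap at least as large as the one observed.
    tail = sum(
        comb(signature_size, k) * comb(universe_size - signature_size, de_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


def rank_chemicals(signatures, de_genes, universe):
    """Rank chemicals by enrichment of their signatures among DE genes."""
    universe = set(universe)
    de = set(de_genes) & universe
    results = []
    for chemical, genes in signatures.items():
        signature = set(genes) & universe
        overlap = len(signature & de)
        p = hypergeom_enrichment_p(len(universe), len(signature), len(de), overlap)
        results.append((chemical, overlap, p))
    # Smallest p-value first, i.e. strongest enrichment first.
    return sorted(results, key=lambda row: row[2])


# Toy data (hypothetical): 20 measured genes, 5 of them differentially expressed.
universe = [f"g{i}" for i in range(20)]
de_genes = ["g0", "g1", "g2", "g3", "g10"]
signatures = {
    "chem_A": ["g0", "g1", "g2", "g3", "g4"],       # shares 4 genes with the DE set
    "chem_B": ["g15", "g16", "g17", "g18", "g19"],  # shares none
}

for chemical, overlap, p in rank_chemicals(signatures, de_genes, universe):
    print(f"{chemical}: overlap={overlap}, p={p:.4g}")
```

Because many chemical signatures are tested against each dataset, a real analysis would also need a multiple-testing correction across chemicals, which this sketch omits.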

Results

We verified our analytic method by accurately identifying chemicals applied to samples and cell lines. We then predicted both known and novel environmental chemicals, such as estradiol and bisphenol A, associated with prostate, lung, and breast cancers.

Conclusions

We have developed a scalable statistical method that identifies possible environmental associations with disease using publicly available data, and we have validated some of these associations against the literature.