Meta-analysis of breast cancer microarray studies in conjunction with conserved cis-elements suggest patterns for coordinate regulation
1 Division of Information Sciences, City of Hope National Medical Center, Duarte, CA 91010, USA
2 Department o f Molecular Biology, City of Hope and Beckman Research Institute, Duarte, CA 91010, USA
3 Division of Molecular Medicine, City of Hope and Beckman Research Institute, Duarte, CA 91010, USA
4 Department of Computer and Information Science, Norwegian University of Science and Technology, NO-7489 Trondheim, Norway
5 Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, NO-7489 Trondheim, Norway
BMC Bioinformatics 2008, 9:63 doi:10.1186/1471-2105-9-63Published: 28 January 2008
Gene expression measurements from breast cancer (BrCa) tumors are established clinical predictive tools to identify tumor subtypes, identify patients showing poor/good prognosis, and identify patients likely to have disease recurrence. However, diverse breast cancer datasets in conjunction with diagnostic clinical arrays show little overlap in the sets of genes identified. One approach to identify a set of consistently dysregulated candidate genes in these tumors is to employ meta-analysis of multiple independent microarray datasets. This allows one to compare expression data from a diverse collection of breast tumor array datasets generated on either cDNA or oligonucleotide arrays.
We gathered expression data from 9 published microarray studies examining estrogen receptor positive (ER+) and estrogen receptor negative (ER-) BrCa tumor cases from the Oncomine database. We performed a meta-analysis and identified genes that were universally up or down regulated with respect to ER+ versus ER- tumor status. We surveyed both the proximal promoter and 3' untranslated regions (3'UTR) of our top-ranking genes in each expression group to test whether common sequence elements may contribute to the observed expression patterns. Utilizing a combination of known transcription factor binding sites (TFBS), evolutionarily conserved mammalian promoter and 3'UTR motifs, and microRNA (miRNA) seed sequences, we identified numerous motifs that were disproportionately represented between the two gene classes suggesting a common regulatory network for the observed gene expression patterns.
Some of the genes we identified distinguish key transcripts previously seen in array studies, while others are newly defined. Many of the genes identified as overexpressed in ER- tumors were previously identified as expression markers for neoplastic transformation in multiple human cancers. Moreover, our motif analysis identified a collection of specific cis-acting target sites which may collectively play a role in the differential gene expression patterns observed in ER+ versus ER- breast cancer tumors. Importantly, the gene sets and associated DNA motifs provide a starting point with which to explore the mechanistic basis for the observed expression patterns in breast tumors.