Open Access, Highly Accessed, Research article

Integrated analysis of independent gene expression microarray datasets improves the predictability of breast cancer outcome

Zhe Zhang1, Dechang Chen3 and David A Fenstermacher2

1 Department of Biomedical Engineering, University of North Carolina, Chapel Hill NC 27506, USA

2 Research Informatics, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL 33612, USA

3 Department of Preventive Medicine and Biometrics, Uniformed Services University of the Health Sciences, Bethesda, MD 20814, USA

BMC Genomics 2007, 8:331 doi:10.1186/1471-2164-8-331

Published: 20 September 2007

Additional files

Additional file 1:

Data analysis demo. This file includes a step-by-step description of the data analysis procedure used in this study.

Format: DOC Size: 141KB Download file

Additional file 2:

Complete gene lists of expression profiles. This file presents the complete expression profiles derived from breast cancer microarray datasets using a bootstrap procedure. In each table, 'ID' is the UniGene accession of a gene and 'Name' is its symbol. 'Count' is the number of bootstrap re-samplings in which the gene was ranked within the top 100, and 'Weight' is the Z statistic of the gene obtained from the Wilcoxon Rank Sum Test (RST) applied to the data of all patients in the corresponding dataset. Tables 1, 2 and 3 give, respectively, the 60-gene lists derived from the two independent breast cancer datasets and from their combination. These lists represent gene expression profiles corresponding to the 3-year recurrence outcome of breast cancer; the list in Table 3 is the one recommended by this study. The counts are based on 10,000 re-samplings.

Format: DOC Size: 214KB Download file
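
The 'Count' and 'Weight' columns described above can be illustrated with a small sketch. This is not the authors' code: the function names `rank_sum_z` and `bootstrap_top_counts`, the toy data, the choice to rank genes by the absolute Z statistic, and the plain (unstratified) re-sampling of patients are all assumptions made for illustration. The Z statistic uses the normal approximation without a tie correction.

```python
import math
import random


def rank_sum_z(group_a, group_b):
    """Z statistic of the Wilcoxon rank-sum test for group_a versus group_b
    (normal approximation, average ranks for ties, no tie correction)."""
    pooled = [(v, 0) for v in group_a] + [(v, 1) for v in group_b]
    pooled.sort(key=lambda p: p[0])
    n = len(pooled)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank of the tied block
        for k in range(i, j + 1):
            ranks[k] = avg
        i = j + 1
    n_a, n_b = len(group_a), len(group_b)
    w = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    mean = n_a * (n + 1) / 2
    sd = math.sqrt(n_a * n_b * (n + 1) / 12)
    return (w - mean) / sd


def bootstrap_top_counts(expr, labels, top_k, n_resamples, seed=0):
    """Count how often each gene ranks within the top_k by |Z| across
    bootstrap re-samplings of patients (assumed ranking criterion).

    expr:   dict mapping gene -> list of expression values, one per patient
    labels: list of 0/1 outcome labels, one per patient
    """
    rng = random.Random(seed)
    n = len(labels)
    counts = {gene: 0 for gene in expr}
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        pos = [i for i in idx if labels[i] == 1]
        neg = [i for i in idx if labels[i] == 0]
        if not pos or not neg:
            continue  # a re-sample missing one outcome group is skipped
        scores = {
            gene: abs(rank_sum_z([v[i] for i in pos], [v[i] for i in neg]))
            for gene, v in expr.items()
        }
        for gene in sorted(scores, key=scores.get, reverse=True)[:top_k]:
            counts[gene] += 1
    return counts


# Toy data (hypothetical): gene "A" separates the outcome groups, gene "B" is flat.
toy_expr = {
    "A": [10, 11, 12, 13, 0, 1, 2, 3],
    "B": [5, 5, 5, 5, 5, 5, 5, 5],
}
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_counts = bootstrap_top_counts(toy_expr, toy_labels, top_k=1, n_resamples=50)
```

In this toy setting the flat gene always has Z = 0, so only gene "A" accumulates counts. With real data the study reports counts over 10,000 re-samplings and a top-100 cutoff.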

Additional file 3:

Functional classification of top-ranked genes. The 60 genes top-ranked by the combined dataset [see Additional File 2] were mapped to pre-defined functional gene sets using DAVID. Table 1 lists the significantly enriched gene sets, the number of genes each set shares with the 60 genes, and the p values of Fisher's exact test. Table 2 lists the overlapping genes for several key gene sets.

Format: DOC Size: 120KB Download file
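
As a rough illustration of the enrichment p values mentioned above, the sketch below computes a one-sided Fisher's exact test from the hypergeometric distribution. It is a plain textbook version, not necessarily the exact computation DAVID performs. The function name `enrichment_p_value`, the background size, and the example numbers are hypothetical.

```python
from math import comb


def enrichment_p_value(overlap, list_size, set_size, background_size):
    """One-sided Fisher's exact test for over-representation.

    Returns the probability of observing at least `overlap` genes from a
    functional gene set of `set_size` genes when `list_size` genes are drawn
    at random from a background of `background_size` genes.
    """
    max_overlap = min(list_size, set_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(set_size, k) * comb(background_size - set_size, list_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical example: 8 of 60 top-ranked genes fall in a 200-gene set,
# against an assumed background of 10,000 genes.
p = enrichment_p_value(overlap=8, list_size=60, set_size=200, background_size=10000)
```

A small p value indicates that the gene set shares more genes with the list than random sampling from the assumed background would be expected to produce.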

Additional file 4:

Validation of prognostic profiles by Veridex dataset. This file provides additional validation results obtained from the Veridex dataset. SEP scores of patients were calculated with the 51 genes in two expression profiles and their previously derived weights. Table 1 lists the resulting scores together with the follow-up and ER status data provided by the original study. Tables 2a and 2b list the correlation between each of the 51 genes in both profiles and the 3-year prognosis of Veridex patients. The p values are based on the one-sided Wilcoxon Rank Sum test.

Format: DOC Size: 719KB Download file