Integrated analysis of independent gene expression microarray datasets improves the predictability of breast cancer outcome

1 Department of Biomedical Engineering, University of North Carolina, Chapel Hill, NC 27506, USA
2 Research Informatics, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL 33612, USA
3 Department of Preventive Medicine and Biometrics, Uniformed Services University of the Health Sciences, Bethesda, MD 20814, USA
BMC Genomics 2007, 8:331 doi:10.1186/1471-2164-8-331
Additional files

Additional file 1: Data analysis demo. This file includes a step-by-step description of the data analysis procedure used in this study. Format: DOC. Size: 141KB.

Additional file 2: Complete gene lists of expression profiles. This file presents the complete expression profiles derived from breast cancer microarray datasets using a bootstrap procedure. In each table, 'ID' is the Unigene accession of a gene and 'Name' is its symbol. 'Count' is the number of times a gene was ranked within the top 100 across bootstrap re-samplings, and 'Weight' is the Z statistic of the gene obtained from the Wilcoxon Rank Sum Test (RST) applied to the data of all patients in the corresponding dataset. Tables 1, 2, and 3 give, respectively, the 60-gene lists derived from the two independent breast cancer datasets and from their combination. These lists represent gene expression profiles corresponding to the 3-year recurrence outcome of breast cancer; the list in Table 3 is the one recommended by this study. The counts are based on 10,000 re-samplings. Format: DOC. Size: 214KB.

Additional file 3: Functional classification of top-ranked genes. The 60 genes top-ranked by the combined dataset [see Additional File 2] were mapped to pre-defined functional gene sets using DAVID. Table 1 lists the significantly enriched gene sets, with the number of genes each set shares with the given 60 genes and the p values from Fisher's exact test. Table 2 lists the overlapping genes for several key gene sets. Format: DOC. Size: 120KB.

Additional file 4: Validation of prognostic profiles by Veridex dataset. This file provides extra validation results obtained from the Veridex dataset. SEP scores of patients were calculated with 51 genes in two expression profiles and their previously derived weights. Table 1 lists the resulting scores together with the follow-up and ER status data provided by the original study. Tables 2a and 2b list the correlation between each of the 51 genes in both profiles and the 3-year prognosis of Veridex patients. The p values are based on the one-sided Wilcoxon Rank Sum test. Format: DOC. Size: 719KB.
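The 'Count' and 'Weight' columns described for Additional file 2 can be illustrated with a minimal Python sketch. It is not the authors' code: the function names (`average_ranks`, `rank_sum_z`, `bootstrap_top_counts`), the ranking of genes by the absolute value of Z, the unstratified resampling of patients, and the normal approximation without tie correction are all assumptions made for illustration. The study itself used the top 100 genes and 10,000 re-samplings; the parameters here are left to the caller.

```python
import random


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_z(x, y):
    """Wilcoxon rank-sum Z for group x versus group y.

    Uses the normal approximation; the variance is NOT tie-corrected
    (an assumption of this sketch).
    """
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    w = sum(ranks[:n1])                        # rank sum of group x
    mu = n1 * (n1 + n2 + 1) / 2                # expected rank sum under H0
    sigma = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    return (w - mu) / sigma


def bootstrap_top_counts(expr, labels, top_k, n_resamples, seed=0):
    """Count how often each gene is ranked in the top_k by |Z|.

    expr:   dict mapping gene ID -> list of expression values, one per patient
    labels: list of 0/1 outcome labels, one per patient (hypothetical coding)
    """
    if len(set(labels)) < 2:
        raise ValueError("both outcome groups must be present")
    rng = random.Random(seed)
    n = len(labels)
    counts = {gene: 0 for gene in expr}
    done = 0
    while done < n_resamples:
        # Resample patients with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        pos = [i for i in idx if labels[i] == 1]
        neg = [i for i in idx if labels[i] == 0]
        if not pos or not neg:
            continue  # skip resamples that lost one outcome group
        scores = {
            gene: abs(rank_sum_z([v[i] for i in pos], [v[i] for i in neg]))
            for gene, v in expr.items()
        }
        for gene in sorted(scores, key=scores.get, reverse=True)[:top_k]:
            counts[gene] += 1
        done += 1
    return counts
```

In this sketch, calling `rank_sum_z` on all patients of a dataset would give a per-gene value analogous to 'Weight', and `bootstrap_top_counts(expr, labels, top_k=100, n_resamples=10000)` would give values analogous to 'Count'.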



