Open Access Research article

Identification of reproducible gene expression signatures in lung adenocarcinoma

Tzu-Pin Lu1,2, Eric Y Chuang2,3 and James J Chen1,4*

Author Affiliations

1 Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, Food and Drug Administration, Jefferson, Little Rock, Arkansas, USA

2 YongLin Biomedical Engineering Center, National Taiwan University, Taipei, Taiwan

3 Graduate Institute of Biomedical Engineering and Bioinformatics, National Taiwan University, Taipei, Taiwan

4 Graduate Institute of Biostatistics and Biostatistics Center, China Medical University, Taichung, Taiwan

BMC Bioinformatics 2013, 14:371  doi:10.1186/1471-2105-14-371

Published: 26 December 2013



Lung cancer is the leading cause of cancer-related death worldwide. Tremendous research efforts have been devoted to improving treatment procedures, but the average five-year overall survival rate is still less than 20%. Many biomarkers have been identified for predicting survival; however, translating these findings into clinical practice remains challenging because of their inconsistency and irreproducibility. In this study, we propose an approach that identifies predictive genes through pathways.


The microarrays from Shedden et al. were used as the training set, and the log-rank test was performed to select potential signature genes. We focused on 24 cancer-related pathways from 4 biological databases. A scoring scheme was developed using the Cox proportional hazards regression model, and patients were divided into two groups at the median score. The predictability and generalizability of the resulting gene sets were then evaluated by 2-fold cross-validation and by a resampling test in 4 independent datasets, respectively. A set of 16 genes related to apoptosis execution showed good predictability and generalizability in more than 700 lung adenocarcinoma patients and was reproducible in 4 independent datasets. This signature set outperformed 6 other published signatures. Furthermore, the risk scores derived from the set were associated with the efficacy of the anti-cancer drug ZD-6474, which targets EGFR.
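
The scoring and grouping steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the gene names, Cox coefficients, and survival values are hypothetical toy data, and the risk score is assumed to be a coefficient-weighted sum of expression values. The log-rank statistic is computed here to compare the two risk groups, which is one plausible use of the test under these assumptions.

```python
from statistics import median


def risk_scores(expression, coefficients):
    """Weighted sum of expression values, one score per patient.

    Assumption: the score is a linear combination of gene expression
    with Cox regression coefficients as weights.
    """
    return [sum(coefficients[g] * sample[g] for g in coefficients)
            for sample in expression]


def median_split(scores):
    """Label each patient 'high' or 'low' risk relative to the median score."""
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


def logrank_chi2(times, events, groups):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    event_times = sorted({t for t, e in zip(times, events) if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        # Patients still under observation just before time t.
        at_risk = [(ti, e, g) for ti, e, g in zip(times, events, groups)
                   if ti >= t]
        n = len(at_risk)
        n_high = sum(1 for _, _, g in at_risk if g == "high")
        d = sum(1 for ti, e, _ in at_risk if ti == t and e)
        d_high = sum(1 for ti, e, g in at_risk
                     if ti == t and e and g == "high")
        # Observed minus expected events in the high-risk group.
        obs_minus_exp += d_high - d * n_high / n
        if n > 1:
            # Hypergeometric variance of the event count at time t.
            variance += (d * (n_high / n) * (1 - n_high / n)
                         * (n - d) / (n - 1))
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0


# Hypothetical toy data: two genes, four patients.
coefficients = {"GENE_A": 0.5, "GENE_B": -0.2}
expression = [
    {"GENE_A": 2.0, "GENE_B": 1.0},
    {"GENE_A": 0.0, "GENE_B": 1.0},
    {"GENE_A": 4.0, "GENE_B": 0.0},
    {"GENE_A": 1.0, "GENE_B": 5.0},
]
times = [5, 20, 3, 25]    # follow-up time per patient
events = [1, 1, 1, 0]     # 1 = death observed, 0 = censored

scores = risk_scores(expression, coefficients)
groups = median_split(scores)
chi2 = logrank_chi2(times, events, groups)
print(groups, round(chi2, 3))
```

In practice the coefficients would be estimated from the training set by a survival analysis library rather than set by hand, and the chi-square statistic would be converted to a p-value using the chi-square distribution with one degree of freedom.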


In summary, we presented a new approach to identify reproducible survival predictors for lung adenocarcinoma, and the identified genes may serve as both prognostic and predictive biomarkers in the future.

Lung adenocarcinoma; Microarray; Pathway analysis; Prognostic biomarker; Predictive biomarker