Research article

Relation between smoking history and gene expression profiles in lung adenocarcinomas

Johan Staaf1,2, Göran Jönsson1,2, Mats Jönsson1, Anna Karlsson1, Sofi Isaksson1, Annette Salomonsson1, Helen M Pettersson3, Maria Soller4, Sven-Börje Ewers1, Leif Johansson5, Per Jönsson6 and Maria Planck1*

Author Affiliations

1 Department of Oncology, Clinical Sciences, Lund University and Skåne University Hospital, Barngatan 2:1, SE-22185, Lund, Sweden

2 CREATE Health Strategic Center for Translational Cancer Research, Lund University, BMC C13, SE 221 84, Lund, Sweden

3 Center for Molecular Pathology, Department of Laboratory Medicine, Lund University, SE 20502, Malmö, Sweden

4 Department of Clinical Genetics, Lund University and Regional Laboratories Region Skåne, SE 22185, Lund, Sweden

5 Department of Pathology, Lund University and Regional Laboratories Region Skåne, SE 22185, Lund, Sweden

6 Department of Thoracic Surgery, Clinical Sciences, Lund University and Skåne University Hospital, SE 22185, Lund, Sweden


BMC Medical Genomics 2012, 5:22  doi:10.1186/1755-8794-5-22

Published: 7 June 2012



Lung cancer is the leading cause of cancer death worldwide. Tobacco use is the major pathogenic factor, but not all lung cancers are attributable to smoking. Specifically, lung cancer in never-smokers has been suggested to represent a disease entity distinct from lung cancer arising in smokers, owing to differences in etiology, natural history and response to specific treatment regimens. However, the genetic aberrations that differ between lung carcinomas of smokers and never-smokers remain largely unclear.


Unsupervised gene expression analysis of 39 primary lung adenocarcinomas was performed using Illumina HT-12 microarrays. Results from the unsupervised analysis were validated in six external adenocarcinoma data sets (n=687) and in six data sets comprising normal airway epithelial or normal lung tissue specimens (n=467). Supervised gene expression analysis of smokers versus never-smokers was performed in seven adenocarcinoma data sets, and the results were validated in the six normal data sets.


Initial unsupervised analysis of 39 adenocarcinomas identified two subgroups, one of which harbored all never-smokers. The resulting gene expression signature subsequently identified never-smokers with 79-100% sensitivity in external adenocarcinoma data sets and with 76-88% sensitivity in the normal materials. A notable fraction of current/former smokers were grouped with never-smokers. Intriguingly, supervised analysis of never-smokers versus smokers in seven adenocarcinoma data sets generated similar results. Overlap in classification between the two approaches was high, indicating that both identify a common set of samples from current/former smokers as potential never-smokers. The gene signature from unsupervised analysis included several genes implicated in lung tumorigenesis, immune-response associated pathways, genes previously associated with smoking, and marker genes for alveolar type II pneumocytes. In contrast, the best classifier from supervised analysis comprised genes strongly associated with proliferation, as well as genes previously associated with smoking.
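To make the reported metrics concrete, the following minimal Python sketch shows how sensitivity for never-smokers and the overlap between two classification approaches could be computed from per-sample labels. It is illustrative only and is not the authors' pipeline: the function names (`sensitivity`, `classification_overlap`, `smokers_called_never_smokers`) and the ten toy samples are assumptions invented for this example, not data from the study.

```python
# Illustrative sketch only: toy labels, hypothetical helper names.
NEVER = "never-smoker"
SMOKER = "smoker"


def sensitivity(true_labels, predicted_labels, positive=NEVER):
    """Fraction of truly positive samples that were also predicted positive."""
    pairs = list(zip(true_labels, predicted_labels))
    true_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    false_neg = sum(1 for t, p in pairs if t == positive and p != positive)
    if true_pos + false_neg == 0:
        raise ValueError("no positive samples in true_labels")
    return true_pos / (true_pos + false_neg)


def classification_overlap(calls_a, calls_b):
    """Fraction of samples given the same class by two approaches."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("call lists must be non-empty and of equal length")
    same = sum(1 for a, b in zip(calls_a, calls_b) if a == b)
    return same / len(calls_a)


def smokers_called_never_smokers(true_labels, predicted_labels):
    """Indices of smokers that a classifier grouped with never-smokers."""
    return [
        i
        for i, (t, p) in enumerate(zip(true_labels, predicted_labels))
        if t == SMOKER and p == NEVER
    ]


# Toy data (assumed): 5 never-smokers followed by 5 current/former smokers.
true_status = [NEVER] * 5 + [SMOKER] * 5
unsupervised_calls = [NEVER] * 4 + [SMOKER] + [NEVER] * 2 + [SMOKER] * 3
supervised_calls = [NEVER] * 5 + [NEVER] + [SMOKER] * 4

sens_unsup = sensitivity(true_status, unsupervised_calls)
overlap = classification_overlap(unsupervised_calls, supervised_calls)
regrouped = smokers_called_never_smokers(true_status, unsupervised_calls)
```

In this toy setup, sensitivity counts only how many true never-smokers are recovered, so smokers that are grouped with never-smokers do not lower it; they are tracked separately by the third helper.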


Based on gene expression profiling, we demonstrate that never-smokers can be identified with high sensitivity in both tumor material and normal airway epithelial specimens. Our results indicate that tumors arising in never-smokers, together with a subset of tumors from smokers, represent a distinct entity of lung adenocarcinomas. Taken together, these analyses provide further insight into the transcriptional patterns occurring in lung adenocarcinoma stratified by smoking history.

Keywords: Lung cancer; Smoking; Gene expression analysis; Adenocarcinoma; EGFR; Never-smokers; Immune response