This article is part of the supplement: Tenth International Conference on Bioinformatics. First ISCB Asia Joint Conference 2011 (InCoB/ISCB-Asia 2011): Computational Biology

Open Access Proceedings

A robust tool for discriminative analysis and feature selection in paired samples impacts the identification of the genes essential for reprogramming lung tissue to adenocarcinoma

Swee Heng Toh1, Philip Prathipati1, Efthimios Motakis1, Chee Keong Kwoh2, Surya Pavan Yenamandra1 and Vladimir A Kuznetsov1*

Author Affiliations

1 Bioinformatics Institute, A-STAR, Singapore

2 School of Computer Engineering, Nanyang Technological University, Singapore

BMC Genomics 2011, 12(Suppl 3):S24  doi:10.1186/1471-2164-12-S3-S24

Published: 30 November 2011

Additional files

Additional file 1:

Table S1. Master table annotated with gene symbols from lung AC meta-gene signatures, default and relaxed lung AC gene signatures, and signatures derived by other methods cited in the literature. A: Characteristics of the data set and final results. B: Read-me file. C: Summary statistics. D: AC signature derived by SAM software from U95A data. S1e: Tables of the 604-gene PT-AT consensus signature, the results of GO analysis, and the immunity and cell cycle gene signatures discriminating AC from normal lung tissues.

Format: XLS Size: 10MB

Open Data

Additional file 2:

Table S2. Distribution of the number of false classifications of the 27 paired samples. Comparison of the classification accuracy of the extremely discriminative 2,829 probe sets (ECD) with the top-level 2,829 probe sets identified by the standard Wilcoxon signed-rank test (WT), EDGE, PAM, and Student’s t-test. The ECD signature was selected using MWT on cross-normalized signal intensities with a 100% accuracy criterion and a bootstrap p-value cut-off < 0.05. The ECD classifier was derived using the cross-normalized dataset as input, whereas the classifiers derived using PAM, EDGE, WT and the t-test used the original MAS5-normalized data as input. However, the classification accuracy, in terms of the number of probe sets with 2 or more anomalous fold-changes, was estimated using the MAS5-normalized dataset.

Format: PDF Size: 120KB

Open Data

Additional file 3:

Comparison of different feature selection methods. Analysis of discriminative ability using the original MAS5-normalized data. The two-way hierarchical cluster analysis of the 2,829 probe sets and the 27 pairs of normal lung and lung AC samples demonstrates the ability of the selected methods to separate lung AC from normal samples (Supplementary Figures S1-S4). All of the methods, with the exception of the t-test and the Limma paired test, produced a near-perfect separation of the two classes; however, ECD provides a more biologically reasonable grouping of the genes (see Results). Figure S1. Two-way hierarchical cluster analysis of the MAS5-normalized expression values of 2,829 probe sets identified by the standard Wilcoxon test. Figure S2. Two-way hierarchical cluster analysis of the cross-normalized expression values of the 2,829 probe sets identified by EDGE. Figure S3. Two-way hierarchical cluster analyses of (A) the MAS5-normalized expression values and (B) the cross-normalized expression values of 2,829 probe sets identified using the Student’s t-test. Figure S4. Two-way hierarchical cluster analyses of (A) the MAS5-normalized expression values and (B) the cross-normalized expression values of 2,829 probe sets identified using the Limma paired t-test (Smyth, G. K., 2005).

Format: PDF Size: 546KB

Open Data

Additional file 4:

Table S3. Meta-gene signatures associated with lung AC, curated from the literature with the corresponding references.

Format: PDF Size: 430KB

Open Data