Open Access Research article

Transcriptome classification reveals molecular subtypes in psoriasis

Chrysanthi Ainali1,2, Najl Valeyev3, Gayathri Perera2, Andrew Williams2, Johann E Gudjonsson4, Christos A Ouzounis1,5,6, Frank O Nestle2* and Sophia Tsoka1*

Author Affiliations

1 Centre for Bioinformatics, Department of Informatics, School of Natural and Mathematical Sciences, King’s College London, Strand, London, WC2R 2LS, UK

2 St John’s Institute of Dermatology, Division of Genetics and Molecular Medicine, King’s College London, Tower Wing, Guy’s Hospital, Great Maze Pond, London, SE1 9RT, UK

3 Centre for Systems, Dynamics and Control, College of Engineering, Mathematics and Physical Science, University of Exeter, Exeter, EX4 4QF, UK

4 Department of Dermatology, School of Medicine, University of Michigan, Box 0932, Ann Arbor, MI 48109-0932, USA

5 Present address: Computational Genomics Unit, Institute of Agrobiotechnology, Centre for Research & Technology Hellas, Thessaloniki, Greece

6 Present address: Donnelly Centre for Cellular & Biomolecular Research, University of Toronto, 160 College Street, Toronto, ON, M5S 3E1, Canada

BMC Genomics 2012, 13:472  doi:10.1186/1471-2164-13-472

Published: 12 September 2012

Additional files

Additional File 1:

The ‘core’ set of genes defined through differential expression analysis: positively (130) and negatively (76) differentially expressed genes in psoriatic samples of the GAIN dataset.

Format: DOC Size: 32KB

Open Data

Additional File 2:

Supplementary methods.

Format: DOC Size: 50KB

Additional File 3:

A multidimensional scaling (MDS) plot showing the distinction of psoriatic cases into two groups, PP01 (red) and PP02 (black), as obtained after RF clustering and classification.

Format: JPEG Size: 310KB

Additional File 4:

Markov Cluster Algorithm (MCL) applied to the psoriatic sub-group tissue sample networks to extract clusters of gene expression. Both networks consisted of 36 clusters; the largest clusters (number of nodes > 8) in each network are shown and denoted by colour. Pathway enrichment for these clusters is shown in Tables 3 and 4 for the PP01 and PP02 networks, respectively.

Format: JPG Size: 561KB

Additional File 5:

Genes identified as most informative after classification of skin disease phenotypes. The Gini Index (GI) was used as the variable importance measure and was estimated for each gene per group from random forest classification, so as to prioritise genes in terms of their ability to discriminate distinct molecular patterns. After training of the random forest classifier, GI is derived for each gene across all trees, and the ranking of genes with GI ≥ 0.02 is shown here for each skin group.

Format: JPEG Size: 1.7MB
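As a minimal illustrative sketch (not the authors' implementation), the Python below shows the quantity that underlies a Gini-based importance score: the decrease in Gini impurity obtained by splitting samples on one gene's expression threshold. A random forest accumulates such decreases for each gene over all of its trees. The expression values, the threshold and the helper names `gini_impurity` and `gini_decrease` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a list of class labels: 1 - sum of squared class fractions."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def gini_decrease(values, labels, threshold):
    """Impurity decrease when samples are split on value <= threshold."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    n = len(labels)
    weighted = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(labels) - weighted


# Hypothetical toy data: six samples, two classes (labels borrowed from the captions).
classes = ["NN", "NN", "NN", "PP01", "PP01", "PP01"]
informative_gene = [1.0, 1.2, 0.9, 3.1, 3.4, 2.9]    # separates the classes cleanly
uninformative_gene = [1.0, 3.0, 1.1, 3.2, 0.9, 2.8]  # mixes the classes

print(gini_decrease(informative_gene, classes, 2.0))
print(gini_decrease(uninformative_gene, classes, 2.0))
```

A gene whose threshold separates the classes cleanly yields a large decrease, while a gene that mixes the classes yields a decrease close to zero; ranking genes by the accumulated decrease gives an ordering of the kind described in the caption.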

Additional File 6:

Results of text mining.

Format: TXT Size: 1KB

Additional File 7:

A multidimensional scaling plot of psoriasis datasets from Gudjonsson et al. 2010 [18] (A) and Yao et al. 2008 [36] (B) to illustrate grouping of samples according to random forest clustering. Two distinct psoriatic groups are identified in involved tissue (PP01 green and PP02 purple), while NN and PN samples largely co-localise. Overall, clustering is comparable to that of the GAIN data shown in Figure 4.

Format: JPEG Size: 2MB

Additional File 8:

Graphical representation to illustrate the relationship between 19 highly discriminative genes and disease sub-groups according to the Gini Index calculated from the forest of decision trees in the Gudjonsson dataset. The green band represents the first psoriatic group (PP01), light blue corresponds to the second psoriatic sub-group (PP02), yellow corresponds to healthy individuals (NN) and light green represents the non-lesional cases (PN). These bands are arranged clockwise and are followed by purple to orange rectangular bands that represent relevant genes. Genes and skin groups are ordered according to shared pairing links, as described previously.

Format: JPEG Size: 397KB

Additional File 9:

Graphical representation to illustrate the relationship between 27 highly discriminative genes and disease sub-groups according to the Gini Index calculated from RF for the Yao dataset. Light blue to green rectangular bands represent the four skin types (PP01: light blue, PP02: blue, NN: light green, PN: green) and are followed by purple to orange rectangular bands representing relevant genes (arranged clockwise). Genes and skin groups are ordered according to shared pairing links. The plot gives an overview of the patterns of informative genes for prediction of each disease class.

Format: JPEG Size: 94KB

Additional File 10:

Pathway enrichment in the Gudjonsson dataset.

Format: XLS Size: 15KB

Additional File 11:

Pathway enrichment in the Yao dataset.

Format: XLS Size: 23KB

Additional File 12:

Example of a decision tree for classification of tissue samples into the appropriate disease classes. The heatmap illustrates expression values for 25 genes across 108 tissue samples and represents part of the heatmap shown in Figure 2. A decision tree is a tree-like structure that relates gene expression measurements to sample phenotype class, with a view to deriving a predictive model. Nodes (rectangles) in the tree represent a test on gene expression to derive a decision on a sample’s class, edges (arrows) indicate the expression level of the variable that can best distinguish the samples, and leaves (or terminal nodes, shown as circles) represent class predictions. The path from the root to each terminal node equates to a list of conditions, in the form of gene expression rules, that relate tissue samples to disease phenotype class.

Format: JPEG Size: 722KB
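The root-to-leaf reading of a decision tree described in this caption can be sketched in a few lines of Python. This is an illustration only, not the tree in the additional file: the gene names `GENE_A` and `GENE_B`, the thresholds and the tuple layout are all hypothetical.

```python
# Internal node: (gene, threshold, subtree_if_low, subtree_if_high).
# Leaf: a class label string.
TOY_TREE = (
    "GENE_A", 2.0,
    "NN",                              # GENE_A <= 2.0
    ("GENE_B", 5.0, "PP01", "PP02"),   # GENE_A > 2.0, then test GENE_B
)


def classify(tree, sample):
    """Walk from the root to a leaf.

    Returns the predicted class and the list of expression rules
    (one per node visited) that led to that prediction.
    """
    rules = []
    node = tree
    while not isinstance(node, str):
        gene, threshold, low, high = node
        if sample[gene] <= threshold:
            rules.append(f"{gene} <= {threshold}")
            node = low
        else:
            rules.append(f"{gene} > {threshold}")
            node = high
    return node, rules


# Hypothetical expression values for one tissue sample.
sample = {"GENE_A": 3.5, "GENE_B": 6.1}
predicted_class, path_rules = classify(TOY_TREE, sample)
print(predicted_class, path_rules)
```

Each entry in `path_rules` corresponds to one node test on the path, so the full list is the set of gene expression conditions that assigns the sample to its class.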

Additional File 13:

Correlation between the two variable importance measures of gene selection, Gini Index and mean decrease in accuracy.

Format: JPEG Size: 230KB