Hierarchical clusterings and principal component analyses of ExpressionSets output by virtualArray. A hierarchical clustering based on Euclidean distance matrices was calculated on the combined dataset from three different platforms. Samples from GSE23402 are marked in red, samples from GSE26428 in green and samples from GSE28688 in blue. ESC, human embryonic stem cells; iPSC, human induced pluripotent stem cells. A, clustering of combined data without batch effect removal; B, clustering of combined data with batch effects removed in non-supervised mode; C, clustering of combined data with batch effects removed in supervised mode. Direct analysis of the combined dataset exhibits strong batch effects (A), which can be reduced by the use of EBM in non-supervised mode (B). The benefit of the supervised mode can be seen in the PCA plots (D, E) but not in the hierarchical clusterings (C). Principal component analyses were performed on the combined dataset after batch effect removal in non-supervised (D) and supervised mode (E), respectively.
Heider and Alt BMC Bioinformatics 2013 14:75 doi:10.1186/1471-2105-14-75
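The kind of analysis summarized in this caption can be illustrated with a minimal Python sketch on synthetic data. It is not the virtualArray workflow: the expression matrix, the batch shifts, the complete-linkage choice and the helper names `cluster_samples` and `pca_scores` are all assumptions made for illustration, and the per-batch mean centering is a deliberately naive stand-in for the empirical Bayes batch effect removal used in the paper.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)

# Synthetic stand-in for a combined dataset: 200 genes x 18 samples,
# six samples from each of three hypothetical batches (platforms).
n_genes, per_batch = 200, 6
batch = np.repeat([0, 1, 2], per_batch)
expr = rng.normal(0.0, 1.0, size=(n_genes, batch.size))
# Each batch adds its own gene-wise offset: a simple model of a batch effect.
expr += rng.normal(0.0, 3.0, size=(n_genes, 3))[:, batch]


def cluster_samples(matrix, k):
    """Cluster samples (columns) hierarchically on Euclidean distances."""
    dist = pdist(matrix.T, metric="euclidean")
    tree = linkage(dist, method="complete")  # linkage method is an assumption
    return fcluster(tree, t=k, criterion="maxclust")


def pca_scores(matrix, n_components=2):
    """Project samples onto the leading principal components via SVD."""
    centered = matrix.T - matrix.T.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


# Without correction, the samples group by batch.
labels_raw = cluster_samples(expr, k=3)
scores_raw = pca_scores(expr)

# Naive correction (NOT empirical Bayes): center every gene within each batch.
corrected = expr.copy()
for b in np.unique(batch):
    cols = batch == b
    corrected[:, cols] -= corrected[:, cols].mean(axis=1, keepdims=True)

labels_corr = cluster_samples(corrected, k=3)
scores_corr = pca_scores(corrected)
```

In this toy setting the uncorrected clusters coincide with the batches, whereas after centering the batch means of the principal component scores vanish. Real data would additionally carry biological group differences, which simple centering can distort.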