BMC Bioinformatics Volume 3
Research article
Clustering of the SOM easily reveals distinct gene expression patterns: results of a reanalysis of lymphoma study
Junbai Wang1, Jan Delabie2, Hans Christian Aasheim3, Erlend Smeland3 and Ola Myklebost1
1Departments of Tumor Biology, Norwegian Radium Hospital, N0310 Oslo, Norway
2Department of Pathology, Norwegian Radium Hospital, N0310 Oslo, Norway
3Department of Immunology, Norwegian Radium Hospital, N0310 Oslo, Norway
BMC Bioinformatics 2002, 3:36
doi:10.1186/1471-2105-3-36
Published: 24 November 2002
Abstract
Background
A method to evaluate and analyze the massive data generated by series of microarray experiments is of utmost importance for revealing hidden patterns of gene expression. Because microarray gene expression profiles are complex and high-dimensional, reducing the dimensionality of the raw expression data and selecting the features needed for tasks such as classification of disease samples remain a challenge. To address this problem we propose a two-level analysis. First, a self-organizing map (SOM) is used. The SOM is a vector quantization method that simplifies and reduces the dimensionality of the original measurements and visualizes each individual tumor sample in a SOM component plane. Next, hierarchical clustering and K-means clustering are used to identify patterns of gene expression useful for classification of samples.
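The two-level analysis described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that genes are the input vectors and samples are the dimensions, uses synthetic random data, a hypothetical 4 x 4 map, and a plain NumPy K-means on the SOM codebook (hierarchical clustering is omitted for brevity). The function names `train_som` and `kmeans` and all parameter values are illustrative assumptions.

```python
import numpy as np

def train_som(data, rows=4, cols=4, n_iter=2000, lr0=0.5, sigma0=2.0, seed=0):
    """Level 1: online SOM. Returns a codebook of shape (rows*cols, n_features)."""
    rng = np.random.default_rng(seed)
    # Grid coordinates (row, col) of every map unit, in row-major order.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise the codebook from randomly chosen input vectors.
    codebook = data[rng.integers(0, len(data), rows * cols)].astype(float)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                   # decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 0.5       # shrinking neighbourhood width
        x = data[rng.integers(0, len(data))]      # one random input vector
        bmu = np.argmin(((codebook - x) ** 2).sum(axis=1))  # best-matching unit
        d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)          # grid distance to BMU
        h = np.exp(-d2 / (2.0 * sigma ** 2))                # Gaussian neighbourhood
        codebook += lr * h[:, None] * (x - codebook)
    return codebook

def kmeans(points, k, n_iter=50, seed=0):
    """Level 2: plain K-means on the SOM codebook vectors. Returns one label per unit."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(n_iter):
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):               # skip empty clusters
                centers[j] = points[labels == j].mean(axis=0)
    # Final assignment against the updated centers.
    dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1)

# Synthetic stand-in for an expression matrix: 200 genes x 6 samples (assumed sizes).
rng = np.random.default_rng(1)
genes = rng.normal(size=(200, 6))

codebook = train_som(genes)                       # (16, 6): one prototype per map unit
unit_labels = kmeans(codebook, k=4)               # cluster label for each map unit
# One component plane per sample: that sample's value at every unit of the 4 x 4 map.
component_planes = codebook.T.reshape(6, 4, 4)
```

Clustering the 16 codebook vectors rather than the 200 raw profiles is the point of the intermediate SOM step: the second level works on a small, smoothed set of prototypes, and each component plane can be drawn as a heat map for one sample.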
Results
We tested the two-level analysis on public data from diffuse large B-cell lymphomas. The analysis easily distinguished four major gene expression patterns without the need for supervision: a germinal center-related, a proliferation, an inflammatory and a plasma cell differentiation-related gene expression pattern. The first three patterns matched those described in the original publication, which used supervised clustering analysis, whereas the fourth was novel.
Conclusions
Our study shows that by using SOM as an intermediate step to analyze genome-wide gene expression data, the gene expression patterns can more easily be revealed. The "expression display" by the SOM component plane summarises the complicated data in a way that allows the clinician to evaluate the classification options rather than giving a fixed diagnosis.