Spectral algorithms for dimension reduction and clustering, based on the singular value decomposition, are known to be useful in a range of areas of science and engineering. This work is motivated by the analysis of microarray data from a number of different samples, which leads naturally to a bipartite graph framework. Spectral bi-partitioning of microarray data was first considered in . In this work we generalise the ideas in  in order to present further theoretical support, and we investigate a multiclass dataset with the aim of revealing a complex picture of oral cancer progression.
Materials and methods
A simple and informative derivation of a spectral algorithm for reordering a weighted bipartite graph is presented. We start with a discrete optimization problem, then add constraints and relax it into a tractable continuous analogue. Natural data preprocessing forms part of the algorithm. The singular vectors can be used not only for gene/sample reordering but also for identifying informative genes. The data set pertains to the gene expression profiles of different cell cultures (samples) isolated from normal oral tissue (N) and from biopsies of different stages of oral cancer: dysplasias (D), primary (P), metastasised (M) and recurrent (R) cancers.
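The reordering step can be illustrated with a minimal sketch. The abstract does not specify the preprocessing or the exact relaxation, so the degree normalisation used here (scaling by the inverse square roots of the row and column sums) is an assumption borrowed from standard spectral bipartite partitioning, and the function name `spectral_reorder` and the toy matrix `W` are hypothetical. Rows play the role of genes and columns the role of samples.

```python
import numpy as np


def spectral_reorder(W):
    """Reorder rows and columns of a non-negative weight matrix W.

    Assumption: the matrix is preprocessed by degree normalisation,
    Wn = D_row^{-1/2} W D_col^{-1/2}; the source does not state which
    preprocessing it uses.
    """
    W = np.asarray(W, dtype=float)
    row_deg = W.sum(axis=1)
    col_deg = W.sum(axis=0)
    if np.any(row_deg <= 0) or np.any(col_deg <= 0):
        raise ValueError("every row and column needs positive total weight")

    # Degree-normalised matrix (assumed preprocessing step).
    Wn = W / np.sqrt(np.outer(row_deg, col_deg))

    # Singular values are returned in descending order.
    U, s, Vt = np.linalg.svd(Wn, full_matrices=False)

    # The leading singular pair is trivial under this normalisation;
    # the second pair carries the partitioning information.
    u2 = U[:, 1] / np.sqrt(row_deg)
    v2 = Vt[1, :] / np.sqrt(col_deg)

    # Sorting by the vector entries places similar rows (and columns) together.
    return np.argsort(u2), np.argsort(v2), u2, v2


# Toy weights: rows 0 and 2 are tied to columns 0 and 2,
# rows 1 and 3 are tied to columns 1 and 3.
W = np.array([
    [5.0, 0.1, 4.0, 0.1],
    [0.1, 5.0, 0.1, 4.0],
    [4.0, 0.1, 5.0, 0.1],
    [0.1, 4.0, 0.1, 5.0],
])

row_order, col_order, u2, v2 = spectral_reorder(W)
print("row order:", row_order)
print("column order:", col_order)
```

In this toy case the signs of the entries of `u2` and `v2` separate the two groups, and sorting by those entries places each group in a contiguous block. Under the same assumptions, rows whose `u2` entries have large magnitude are natural candidates for informative genes, although the selection criterion actually used in the work is not stated here.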
Currently, oral cancer is seen as following a single progression mechanism. Here, the bi-partitioning algorithm applied to the gene expression matrix revealed a major distinction (V2 in Figure 1) between mortal and immortal samples (empty and full symbols, respectively), suggesting that oral cancer can develop by two pathways. Moreover, those dysplasias that are known to have progressed to cancer (green symbols) closely resemble cancer samples. This is of note because clinicians currently lack diagnostic tools to predict which dysplasias are likely to develop into cancer. The algorithm also identifies atypical cultures: a) the extended-lifespan dysplasia D17EL lacks expression of p16INK4A, b) whereas the immortal dysplasia D38 continues to express p16INK4A, and c) the immortal cancer P68, which clustered closest to the immortal dysplasias, retains wild-type p53. More details can be found in .
Figure 1. Oral cancer: spectral analysis
Immortality is a dominant factor influencing the overall gene expression profiles of both cancers and dysplasias, and outcome data for the different carcinoma cultures indicate that immortality is associated with poorer prognosis. Analysis of an additional data set shows that gene expression changes consistently associated with immortality can be identified in vivo as well as in vitro.
J Comput Appl Math
online 2 June 2006.