School of Computer Science and Engineering, Soongsil University, Seoul 156-743, Korea

Department of Clinical Epidemiology and Biostatistics, Asan Medical Centre, Korea

University of Ulsan College of Medicine, Seoul 138-736, Korea

Abstract

Background

Multidimensional scaling (MDS) is a widely used approach to dimensionality reduction. It has been applied to feature selection and visualization in various areas. Among diverse MDS methods, the classical MDS is a simple and theoretically sound solution for projecting data objects onto a low dimensional space while preserving the original distances among them as much as possible. However, it is not trivial to apply it to genome-scale data (e.g., microarray gene expression profiles) on regular desktop computers, because of its high computational complexity.

Results

We implemented a highly efficient software application, called CFMDS (CUDA-based Fast MultiDimensional Scaling), which produces an approximate solution of the classical MDS based on CUDA (compute unified device architecture) and the divide-and-conquer principle. CUDA is a parallel computing architecture exploiting the power of the GPU (graphics processing unit). The divide-and-conquer principle was adopted to circumvent the limited memory of typical graphics cards. Our application has been tested on various benchmark datasets, including microarrays, and compared with classical MDS algorithms implemented in C# and MATLAB. In our experiments, CFMDS was more than a hundred times faster on large data than these general solutions. Regarding the quality of dimensionality reduction, our approximate solutions were as good as those from the general solutions: Pearson's correlation coefficients between them were larger than 0.9.

Conclusions

CFMDS is an expeditious solution for the data dimensionality reduction problem. It is especially useful for efficient processing of genome-scale data consisting of several thousands of objects in several minutes.

Background

Multidimensional scaling (MDS) is a technique for representing objects (or data points) in a low-dimensional space based on their similarity. The main purposes of MDS include exploratory data analysis by visualization and feature selection for subsequent analysis such as classification. In bioinformatics and related areas, MDS has been applied to diverse problems such as gene expression pattern visualization

Among various MDS methods, the classical MDS is based on the idea of finding coordinates appropriate for describing dissimilarities as distances

One problem with CUDA is the relatively small memory size of most graphics cards (usually less than 1 gigabyte). General graphics cards do not have sufficient memory for storing and processing large-scale datasets containing tens of thousands of data points. To circumvent this problem, we exploit a well-known engineering principle, divide-and-conquer. The divide-and-conquer approach to the classical multidimensional scaling has drawn much attention for reducing its computational complexity and has been applied in serial computing environments
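To make the memory constraint concrete, the following minimal sketch estimates the size of a dense dissimilarity matrix. The function name and the choice of 4 bytes per value (single precision) are assumptions made for illustration; the space needed for intermediate matrices and the eigendecomposition workspace comes on top of this figure.

```python
def dissimilarity_matrix_bytes(n, bytes_per_value=4):
    """Bytes needed to store a dense n x n dissimilarity matrix.

    bytes_per_value=4 assumes single-precision floats (an assumption,
    not a detail taken from the paper).
    """
    return n * n * bytes_per_value


if __name__ == "__main__":
    for n in (2_000, 10_000, 30_000):
        gib = dissimilarity_matrix_bytes(n) / 2**30
        print(f"{n:>6} objects: {gib:.2f} GiB for the matrix alone")
```

Under these assumptions, a matrix for tens of thousands of objects already exceeds 1 gigabyte before any working storage is counted.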

Implementation

We implemented CFMDS by extending our previous work

One-shot MDS

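In the one-shot mode, the classical MDS on an $n \times n$ dissimilarity matrix $\mathbf{D}$ proceeds as follows:

1. $\mathbf{D}^{(2)} = [d_{ij}^{2}]$, where $d_{ij}$ is the element of $\mathbf{D}$ on the $i$-th row and $j$-th column.

2. $\mathbf{J} = \mathbf{I} - n^{-1}\mathbf{1}\mathbf{1}^{T}$, where $\mathbf{I}$ is the identity matrix and $\mathbf{1}$ denotes the column vector of $n$ ones.

3. $\mathbf{B} = -\frac{1}{2}\,\mathbf{J}\,\mathbf{D}^{(2)}\,\mathbf{J}$.

4. Calculate the first $m$ eigenvectors $\mathbf{e}_{1}, \mathbf{e}_{2}, \ldots, \mathbf{e}_{m}$ and the corresponding largest eigenvalues $\lambda_{1}, \lambda_{2}, \ldots, \lambda_{m}$ of $\mathbf{B}$.

5. Calculate the $m \times n$ coordinate matrix $\mathbf{X} = \mathbf{\Lambda}_{m}^{1/2}\,\mathbf{E}_{m}^{T}$, where $\mathbf{E}_{m} = [\mathbf{e}_{1}, \ldots, \mathbf{e}_{m}]$ and $\mathbf{\Lambda}_{m} = \mathrm{diag}(\lambda_{1}, \ldots, \lambda_{m})$.

Each column of $\mathbf{X}$ corresponds to the coordinate of each data point in the reduced ($m$-dimensional) space.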
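The one-shot steps can be sketched on the CPU with NumPy. This is an illustrative sketch of the textbook classical MDS procedure, not the CUDA implementation of CFMDS; the function name `classical_mds` and the clipping of small negative eigenvalues to zero are assumptions made here.

```python
import numpy as np


def classical_mds(D, m):
    """Classical MDS: return an m x n coordinate matrix for an n x n dissimilarity matrix."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    D2 = D ** 2                                   # squared dissimilarities
    J = np.eye(n) - np.ones((n, n)) / n           # centering matrix
    B = -0.5 * J @ D2 @ J                         # double centering
    eigvals, eigvecs = np.linalg.eigh(B)          # B is symmetric; eigenvalues ascend
    idx = np.argsort(eigvals)[::-1][:m]           # indices of the m largest eigenvalues
    lam = np.clip(eigvals[idx], 0.0, None)        # guard against tiny negative values
    E = eigvecs[:, idx]                           # n x m matrix of eigenvectors
    return np.sqrt(lam)[:, None] * E.T            # each column is one embedded object


# Usage: points that truly live in 2-D are recovered up to rotation and reflection,
# so their pairwise distances are reproduced.
rng = np.random.default_rng(0)
P = rng.normal(size=(6, 2))
D = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)
X = classical_mds(D, 2)
D_hat = np.linalg.norm(X.T[:, None, :] - X.T[None, :, :], axis=-1)
```

The dense eigendecomposition dominates the cost, which is the part a GPU implementation would aim to accelerate.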

Divide-and-conquer MDS

The divide-and-conquer MDS based on


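**Process of divide-and-conquer mode**. First, a dissimilarity matrix is randomly decomposed into submatrices $\mathbf{D}_{1}, \ldots, \mathbf{D}_{p}$, and $\mathbf{M}_{align}$ is constructed from the objects sampled from them. The one-shot MDS is applied to $\mathbf{D}_{1}, \ldots, \mathbf{D}_{p}$ and $\mathbf{M}_{align}$, producing $\mathbf{dMDS}_{1}, \ldots, \mathbf{dMDS}_{p}$ and $\mathbf{mMDS}$, respectively. After that, the objects sampled from each of $\mathbf{D}_{1}, \ldots, \mathbf{D}_{p}$ are extracted from these results, giving $\mathrm{sub}\mathbf{dMDS}_{1}, \ldots, \mathrm{sub}\mathbf{dMDS}_{p}$ and $\mathbf{mMDS}_{1}, \ldots, \mathbf{mMDS}_{p}$. For each pair $\mathrm{sub}\mathbf{dMDS}_{i}$ and $\mathbf{mMDS}_{i}$, a linear transformation $\mathbf{A}_{i}$ is found that minimizes $\lVert \mathbf{A}_{i}\,\mathrm{sub}\mathbf{dMDS}_{i} - \mathbf{mMDS}_{i} \rVert$ in the $L^{2}$ norm. The linearly transformed objects $\mathbf{A}_{i}\,\mathbf{dMDS}_{i}$ replace $\mathbf{dMDS}_{i}$, and the resulting $\mathbf{dMDS}_{1}, \ldots, \mathbf{dMDS}_{p}$ are combined.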

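1. Randomly decompose an $n \times n$ dissimilarity matrix $\mathbf{D}_{all}$ into $p$ submatrices $\mathbf{D}_{1}, \mathbf{D}_{2}, \ldots, \mathbf{D}_{p}$.

2. Sample $s$ objects from each submatrix.

3. Merge the sampled objects and construct a new dissimilarity submatrix $\mathbf{M}_{align}$.

4. Apply the one-shot MDS method to $\mathbf{D}_{1}, \mathbf{D}_{2}, \ldots, \mathbf{D}_{p}$ and $\mathbf{M}_{align}$, obtaining $\mathbf{dMDS}_{1}, \mathbf{dMDS}_{2}, \ldots, \mathbf{dMDS}_{p}$ and $\mathbf{mMDS}$, respectively.

5. Extract the objects sampled at step 2 from the above results, obtaining $\mathrm{sub}\mathbf{dMDS}_{1}, \mathrm{sub}\mathbf{dMDS}_{2}, \ldots, \mathrm{sub}\mathbf{dMDS}_{p}$ and $\mathbf{mMDS}_{1}, \mathbf{mMDS}_{2}, \ldots, \mathbf{mMDS}_{p}$.

6. For each pair $\mathrm{sub}\mathbf{dMDS}_{i}$ and $\mathbf{mMDS}_{i}$, calculate $\arg\min_{\mathbf{A}_{i}} \lVert \mathbf{A}_{i}\,\mathrm{sub}\mathbf{dMDS}_{i} - \mathbf{mMDS}_{i} \rVert$ in the $L^{2}$ norm.

7. Linearly transform the objects of $\mathbf{D}_{i}$ by computing $\mathbf{A}_{i}\,\mathbf{dMDS}_{i}$, which gives the aligned $\mathbf{dMDS}_{i}$.

8. Combine the aligned $\mathbf{dMDS}_{1}, \mathbf{dMDS}_{2}, \ldots, \mathbf{dMDS}_{p}$.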
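The eight steps above can be sketched in NumPy as follows. This is a CPU illustration under stated assumptions, not the CFMDS code: the function names are hypothetical, only uniform random sampling is shown, and the alignment is fitted as an affine map (a row of ones is appended so that a translation can be absorbed), which is a choice made here because each sub-solution is centered on its own centroid.

```python
import numpy as np


def classical_mds(D, m):
    """One-shot classical MDS; returns an m x n coordinate matrix."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:m]
    return np.sqrt(np.clip(w[idx], 0.0, None))[:, None] * V[:, idx].T


def divide_and_conquer_mds(D_all, m, p, s, seed=0):
    """Approximate classical MDS using p random blocks with s sampled objects each."""
    rng = np.random.default_rng(seed)
    n = D_all.shape[0]
    blocks = np.array_split(rng.permutation(n), p)                 # step 1
    positions = [rng.choice(len(b), size=s, replace=False)         # step 2
                 for b in blocks]
    merged = np.concatenate([b[pos] for b, pos in zip(blocks, positions)])
    M_align = D_all[np.ix_(merged, merged)]                        # step 3
    mMDS = classical_mds(M_align, m)                               # step 4 (samples)
    X = np.empty((m, n))
    for i, (b, pos) in enumerate(zip(blocks, positions)):
        dMDS = classical_mds(D_all[np.ix_(b, b)], m)               # step 4 (block i)
        sub = dMDS[:, pos]                                         # step 5
        target = mMDS[:, i * s:(i + 1) * s]
        # Step 6: least-squares fit of A_i (m x (m+1), affine by assumption).
        H = np.vstack([sub, np.ones((1, s))])
        A_T, *_ = np.linalg.lstsq(H.T, target.T, rcond=None)
        A = A_T.T
        # Step 7: transform every object of the block into the common frame.
        X[:, b] = A @ np.vstack([dMDS, np.ones((1, len(b)))])
    return X                                                       # step 8


# Usage on synthetic 2-D points, where the alignment can be exact.
rng = np.random.default_rng(1)
P = rng.normal(size=(60, 2))
D = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)
X = divide_and_conquer_mds(D, m=2, p=3, s=8)
D_hat = np.linalg.norm(X.T[:, None, :] - X.T[None, :, :], axis=-1)
```

Each call to `classical_mds` touches only one block or the merged sample matrix, so the largest matrix held at once is governed by the block size and by `p * s` rather than by `n`.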

Since the size of each submatrix is determined by the available memory of the graphics card, the number of submatrices

Results

CFMDS has been tested using five benchmark datasets. Table

Benchmark datasets

| Dataset | Source | Number of Attributes | Number of Instances | Pearson's Median Skewness Coefficient | Coefficient of Variation |
|---|---|---|---|---|---|
| IRIS | UCI ML Repository | 4 | 150 | 0.34 | 0.64 |
| Dermatology | UCI ML Repository | 33 | 366 | -0.61 | 0.42 |
| | GEO | 4,000 | 2,000 | 0.94 | 1.08 |
| | GEO | 1,000 | 9,300 | 0.73 | 0.56 |
| MNIST | MNIST | 784 | 10,000 | -0.13 | 0.14 |

UCI ML Repository: UCI Machine Learning Repository.
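For reference, the last two columns of the benchmark table are standard descriptive statistics. The sketch below uses their textbook definitions; the text does not state which values they were computed over or whether the sample or population standard deviation was used, so the function names, the sample standard deviation (`ddof=1`), and the toy input are all assumptions.

```python
import numpy as np


def pearson_median_skewness(values):
    """Pearson's median (second) skewness coefficient: 3 * (mean - median) / std."""
    v = np.asarray(values, dtype=float)
    return 3.0 * (v.mean() - np.median(v)) / v.std(ddof=1)


def coefficient_of_variation(values):
    """Coefficient of variation: std / mean."""
    v = np.asarray(values, dtype=float)
    return v.std(ddof=1) / v.mean()


# Usage on a small right-skewed toy sample (not data from the paper).
toy = [1.0, 2.0, 3.0, 4.0, 10.0]
skew = pearson_median_skewness(toy)
cv = coefficient_of_variation(toy)
```

A positive skewness coefficient indicates a longer right tail, and a larger coefficient of variation indicates a wider spread relative to the mean.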

Experimental setting

| Dataset | Size of Dissimilarity Matrix | No. of Submatrices (p) | No. of Samples in Each Submatrix (s) |
|---|---|---|---|
| IRIS | 150 × 150 | 3 | 20 |
| Dermatology | 366 × 366 | 3 | 60 |
| | 2,000 × 2,000 | 10 | 100 |
| | 9,300 × 9,300 | 10 | 150 |
| MNIST | 10,000 × 10,000 | 10 | 150 |

These parameters were set for the comparison experiments of the divide-and-conquer mode of CFMDS. In practice, the CFMDS application automatically detects the available memory size and determines these parameters accordingly. For IRIS, Dermatology, and

Execution time of CFMDS

The execution time was compared to demonstrate the speed-up of the proposed application. Figure


**Comparison results of execution time**. Average running time in seconds is shown. The y-axis is in log scale. Random (MaxMin) means the divide-and-conquer mode of CFMDS with

Accuracy of CFMDS

To examine the accuracy of the divide-and-conquer mode of CFMDS, we used Pearson's correlation coefficient between the results of the classical MDS and CFMDS. More precisely, for each method we generated a vector consisting of the Euclidean distances between every pair of objects in the reduced-dimensional space, and then calculated Pearson's correlation coefficient between the two vectors. The closer the correlation coefficient is to 1, the more similar the result of the divide-and-conquer mode of CFMDS is to that of the classical MDS. The accuracy comparison results are shown in Figure
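The accuracy measure described above can be sketched as follows. The function names are hypothetical, and the embeddings are assumed to be stored with one object per column. Because only pairwise distances are compared, the measure is unaffected by rotation, reflection, or translation of either embedding.

```python
import numpy as np


def pairwise_distance_vector(X):
    """Euclidean distances between all object pairs; X is m x n with objects as columns."""
    pts = np.asarray(X, dtype=float).T
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    upper = np.triu_indices(pts.shape[0], k=1)    # each unordered pair once
    return dist[upper]


def mds_agreement(X_ref, X_approx):
    """Pearson's correlation between the pairwise-distance vectors of two embeddings."""
    a = pairwise_distance_vector(X_ref)
    b = pairwise_distance_vector(X_approx)
    return np.corrcoef(a, b)[0, 1]


# Usage: a rotated copy of an embedding agrees perfectly; a perturbed copy does not.
rng = np.random.default_rng(0)
X_ref = rng.normal(size=(2, 50))
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]])
X_rot = R @ X_ref
X_noisy = X_ref + rng.normal(size=X_ref.shape)
r_rot = mds_agreement(X_ref, X_rot)
r_noisy = mds_agreement(X_ref, X_noisy)
```

For n objects the distance vector has n(n-1)/2 entries, so for the largest benchmark sizes a blockwise computation would be needed to keep memory use moderate.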


**Comparison results of accuracy**. Pearson's correlation coefficient was used as accuracy. The mean value and standard deviation from 100 independent simulation results are shown. Random (MaxMin) means the divide-and-conquer mode of CFMDS with

However, CFMDS with

Discussion

We implemented a software application, CFMDS (CUDA-based Fast MultiDimensional Scaling), for efficient dimensionality reduction of large-scale genomic data. CFMDS combines the CUDA programming library with a divide-and-conquer strategy to handle several thousand features within several minutes on a commodity PC equipped with a graphics card. CUDA serves as the parallel computing method, and the divide-and-conquer principle circumvents the small memory size of typical graphics cards. By combining these two techniques, CFMDS enables a regular PC with a CUDA-capable graphics card to handle large-scale genomic dimensionality reduction problems that could otherwise be executed efficiently only on high-performance computers. The simulation results confirmed that our approach can perform MDS more than a hundred times faster, with comparable accuracy, for genome-scale data. Therefore, CFMDS is especially useful for visualizing and analyzing data consisting of several thousand objects within several minutes. We implemented two sampling options for the divide-and-conquer mode of CFMDS such as

Availability and requirements

Project name: CFMDS

Project home page:

Operating system(s): Windows XP or higher (32-bit and 64-bit), Linux (tested on Ubuntu Linux 9.04, Red Hat Enterprise Linux 5.3/4.7, Fedora 11)

Programming language: CUDA

Other requirements: an NVIDIA GPU with CUDA support, CUDA toolkit 2.3 (the CUDA 3.0 toolkit is not yet supported), and the latest version of the CULA basic libraries

License: GNU GPL v2

Any restrictions to use by non-academics: none

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

S.P. developed the software application and performed the experiments. S.-Y.S. wrote the manuscript and discussed the results. K.-B.H. led the project and wrote the article. All of the authors have read and approved the final manuscript.

Acknowledgements

K.-B.H. was supported by the Soongsil University Research Fund and by the Proteogenomic Research Program and Basic Science Research Program (2012R1A1A2039822) through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology. S.-Y.S. was supported by Basic Science Research Program (2012R1A1A2002804) through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology.

This article has been published as part of