<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art><ui>1471-2105-13-S17-S23</ui><ji>1471-2105</ji><fm>
<dochead>Proceedings</dochead>
<bibl>
<title>
<p>CFMDS: CUDA-based fast multidimensional scaling for genome-scale data</p>
</title>
<aug>
<au id="A1"><snm>Park</snm><fnm>Sungin</fnm><insr iid="I1"/><email>sipark@ml.ssu.ac.kr</email></au>
<au ca="yes" id="A2"><snm>Shin</snm><fnm>Soo-Yong</fnm><insr iid="I2"/><insr iid="I3"/><email>sooyong.shin@amc.seoul.kr</email></au>
<au ca="yes" id="A3"><snm>Hwang</snm><fnm>Kyu-Baek</fnm><insr iid="I1"/><email>kbhwang@ssu.ac.kr</email></au>
</aug>
<insg>
<ins id="I1"><p>School of Computer Science and Engineering, Soongsil University, Seoul 156-743, Korea</p></ins>
<ins id="I2"><p>Department of Clinical Epidemiology and Biostatistics, Asan Medical Centre, Korea</p></ins>
<ins id="I3"><p>University of Ulsan College of Medicine, Seoul 138-736, Korea</p></ins>
</insg>
<source>BMC Bioinformatics</source>


<supplement><title><p>Eleventh International Conference on Bioinformatics (InCoB2012): Bioinformatics</p></title><editor>Shoba Ranganathan, Christian Sch&#246;nbach, Sissades Tongsima, Jonathan Chan and Tin Wee Tan</editor><sponsor><note>The articles in this supplement were supported by funding agencies as detailed in the Acknowledgement section of each article</note></sponsor><note>Proceedings</note></supplement><conference><title><p>Asia Pacific Bioinformatics Network (APBioNet) Eleventh International Conference on Bioinformatics (InCoB2012)</p></title><location>Bangkok, Thailand</location><date-range>3-5 October 2012</date-range><url>http://www.incob2012.org/</url></conference><issn>1471-2105</issn>
<pubdate>2012</pubdate>
<volume>13</volume>
<issue>Suppl 17</issue>
<fpage>S23</fpage>
<url>http://www.biomedcentral.com/1471-2105/13/S17/S23</url>
<xrefbib><pubidlist><pubid idtype="pmpid">23282007</pubid><pubid idtype="doi">10.1186/1471-2105-13-S17-S23</pubid></pubidlist></xrefbib>
</bibl>
<history><pub><date><day>13</day><month>12</month><year>2012</year></date></pub></history>
<cpyrt><year>2012</year><collab>Park et al.; licensee BioMed Central Ltd.</collab><note>This is an open access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note></cpyrt>
<abs>
<sec>
<st>
<p>Abstract</p>
</st>
<sec>
<st>
<p>Background</p>
</st>
<p>Multidimensional scaling (MDS) is a widely used approach to dimensionality reduction. It has been applied to feature selection and visualization in various areas. Among the diverse MDS methods, the classical MDS is a simple and theoretically sound solution for projecting data objects onto a low-dimensional space while preserving the original distances among them as much as possible. However, because of its high computational complexity, it is not trivial to apply to genome-scale data (e.g., microarray gene expression profiles) on regular desktop computers.</p>
</sec>
<sec>
<st>
<p>Results</p>
</st>
<p>We implemented a highly efficient software application, CFMDS (CUDA-based Fast MultiDimensional Scaling), which produces an approximate solution of the classical MDS using CUDA (compute unified device architecture) and the divide-and-conquer principle. CUDA is a parallel computing architecture that exploits the power of the GPU (graphics processing unit). The divide-and-conquer principle was adopted to circumvent the limited memory of typical graphics cards. The application was tested on various benchmark datasets, including microarrays, and compared with classical MDS algorithms implemented in C# and MATLAB. In our experiments, CFMDS was more than a hundred times faster than these general solutions on large data. Regarding the quality of dimensionality reduction, our approximate solutions were close to those of the general solutions: Pearson's correlation coefficients between them were larger than 0.9.</p>
</sec>
<sec>
<st>
<p>Conclusions</p>
</st>
<p>CFMDS is a fast solution to the data dimensionality reduction problem. It is especially useful for genome-scale data: datasets consisting of several thousand objects can be processed within several minutes.</p>
</sec>
</sec>
</abs>
</fm><bdy>
<sec>
<st>
<p>Background</p>
</st>
<p>Multidimensional scaling (MDS) is a technique for representing objects (or data points) in a low-dimensional space based on their similarity. The main purposes of MDS include exploratory data analysis by visualization and feature selection for subsequent analyses such as classification. In bioinformatics and related areas, MDS has been applied to diverse problems such as gene expression pattern visualization <abbrgrp>
<abbr bid="B1">1</abbr>
<abbr bid="B2">2</abbr>
</abbrgrp>, drug response profiling <abbrgrp>
<abbr bid="B3">3</abbr>
</abbrgrp>, and p53 transactivation prediction <abbrgrp>
<abbr bid="B4">4</abbr>
</abbrgrp>.</p>
<p>Among various MDS methods, the classical MDS is based on the idea of finding coordinates appropriate for describing dissimilarities as distances <abbrgrp>
<abbr bid="B5">5</abbr>
</abbrgrp>. The classical MDS finds coordinates by a set of matrix operations. Roughly speaking, it decomposes the squared distance matrix by solving an eigenpair problem, whose complexity is proportional to the cube of the number of data points <abbrgrp>
<abbr bid="B6">6</abbr>
</abbrgrp>. This heavy computational burden is a bottleneck for fast processing of large-scale datasets with thousands of objects. Meanwhile, massively parallel processing on graphics processing units (GPUs) for general computing applications, also known as GPGPU (general-purpose computation on graphics processing units), has emerged as a practical option for accelerating computationally intensive jobs on ordinary desktop computers equipped with a graphics card <abbrgrp>
<abbr bid="B7">7</abbr>
</abbrgrp>. CUDA (compute unified device architecture), developed by NVIDIA, Inc., is one of the most widely used frameworks for GPGPU <abbrgrp>
<abbr bid="B8">8</abbr>
</abbrgrp>. In the CUDA environment, linear algebra packages such as CUBLAS <abbrgrp>
<abbr bid="B8">8</abbr>
</abbrgrp> and CULA <abbrgrp>
<abbr bid="B9">9</abbr>
</abbrgrp> are provided. In bioinformatics, CUDA has been deployed for diverse applications such as sequence alignment <abbrgrp>
<abbr bid="B10">10</abbr>
<abbr bid="B11">11</abbr>
<abbr bid="B12">12</abbr>
</abbrgrp>, protein substructure search <abbrgrp>
<abbr bid="B13">13</abbr>
</abbrgrp>, RNA microarray analysis <abbrgrp>
<abbr bid="B14">14</abbr>
</abbrgrp>, and a non-classical MDS <abbrgrp>
<abbr bid="B15">15</abbr>
</abbrgrp>.</p>
<p>One problem with CUDA is the relatively small memory of most graphics cards (usually less than 1 gigabyte). Typical graphics cards do not have sufficient memory for storing and processing large-scale datasets containing tens of thousands of data points. To circumvent this problem, we exploit a well-known engineering principle, divide-and-conquer. The divide-and-conquer approach to classical multidimensional scaling has drawn much attention as a way of reducing computational complexity and has been applied in serial computing environments <abbrgrp>
<abbr bid="B6">6</abbr>
<abbr bid="B16">16</abbr>
</abbrgrp>.</p>
</sec>
<sec>
<st>
<p>Implementation</p>
</st>
<p>We implemented CFMDS by extending our previous work <abbrgrp>
<abbr bid="B17">17</abbr>
</abbrgrp>. Our software application has two operating modes. If the graphics card has sufficient memory for reading and processing all data points, it runs in "one-shot" mode. When the available memory is insufficient, it operates in "divide-and-conquer" mode and produces an approximate solution. The available memory size is detected automatically, and the operating mode is selected accordingly.</p>
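<p>As a rough illustration of why the mode switch is needed, the following sketch estimates the memory footprint of a dense dissimilarity matrix and picks a mode from it. It is not the selection logic of CFMDS: single-precision storage (4 bytes per value), the function names, and the <it>work_matrices </it>allowance for intermediate matrices are all assumptions made for this example.</p>

```python
def dissimilarity_matrix_bytes(n, bytes_per_value=4):
    """Bytes needed for one dense n x n matrix (4 bytes = single precision, assumed)."""
    return n * n * bytes_per_value


def choose_mode(n, free_bytes, work_matrices=3):
    """Pick an operating mode from a rough memory estimate.

    work_matrices is a hypothetical allowance for D, B and eigenvector
    storage; the real requirement depends on the linear algebra library.
    """
    needed = work_matrices * dissimilarity_matrix_bytes(n)
    return "one-shot" if free_bytes >= needed else "divide-and-conquer"


if __name__ == "__main__":
    card = 512 * 1024 ** 2  # a hypothetical card with 512 MiB of free memory
    for n in (150, 2000, 10000):
        print(n, dissimilarity_matrix_bytes(n), choose_mode(n, card))
```

<p>Under these assumptions, a single 10,000 &#215; 10,000 matrix already occupies 400 million bytes, which leaves little room for intermediate results on a card with less than 1 gigabyte of memory.</p>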
<sec>
<st>
<p>One-shot MDS</p>
</st>
<p>In the one-shot mode, the classical MDS on an <it>n </it>&#215; <it>n </it>dissimilarity matrix <b>D </b>proceeds as follows.</p>
<p>1. <b>D</b>
<sup>(2) </sup>= [<it>d<sub>ij </sub>
</it>
<sup>2</sup>], where <it>d<sub>ij </sub>
</it>denotes the element of <b>D </b>on the <it>i</it>th row and the <it>j</it>th column, i.e., the dissimilarity between the <it>i</it>th and <it>j</it>th points.</p>
<p>2. <b>J </b>= <b>I </b>- <it>n</it>
<sup>-1</sup>
<b>1</b>, where <b>I </b>is the identity matrix and <b>1 </b>denotes the <it>n </it>&#215; <it>n </it>matrix whose elements are all one.</p>
<p>3.<inline-formula>
<m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S17-S23-i1"><m:mrow>
   <m:mstyle class="text">
      <m:mtext class="textsf" mathvariant="sans-serif">B</m:mtext>
   </m:mstyle>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:mspace class="thinspace" width="0.3em"/>
   <m:mo class="MathClass-bin">-</m:mo>
   <m:mspace class="thinspace" width="0.3em"/>
   <m:mfrac>
      <m:mrow>
         <m:mn>1</m:mn>
      </m:mrow>
      <m:mrow>
         <m:mn>2</m:mn>
      </m:mrow>
   </m:mfrac>
   <m:mstyle class="text">
      <m:mtext class="textsf" mathvariant="sans-serif">J</m:mtext>
   </m:mstyle>
   <m:msup>
      <m:mrow>
         <m:mstyle class="text">
            <m:mtext class="textsf" mathvariant="sans-serif">D</m:mtext>
         </m:mstyle>
      </m:mrow>
      <m:mrow>
         <m:mfenced close=")" open="(" separators="">
            <m:mrow>
               <m:mn>2</m:mn>
            </m:mrow>
         </m:mfenced>
      </m:mrow>
   </m:msup>
   <m:mstyle class="text">
      <m:mtext class="textsf" mathvariant="sans-serif">J</m:mtext>
   </m:mstyle>
</m:mrow>
</m:math>
</inline-formula>.</p>
<p>4. Calculate the <it>m </it>eigenvectors with the largest eigenvalues, <b>e</b>
<sub>1</sub>, <b>e</b>
<sub>2</sub>, ..., <b>e</b>
<it>
<sub>m </sub>
</it>and the corresponding eigenvalues &#955;<sub>1</sub>, &#955;<sub>2</sub>, ..., &#955;<it>
<sub>m </sub>
</it>from <b>B</b>.</p>
<p>5. Calculate the <it>m</it>-dimensional coordinates of the <it>n </it>data points by <inline-formula>
<m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2105-13-S17-S23-i2"><m:mrow>
   <m:msup>
      <m:mrow>
         <m:mi mathvariant="bold">X</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mi mathvariant="bold">T</m:mi>
      </m:mrow>
   </m:msup>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:mfenced close="]" open="[" separators="">
      <m:mrow>
         <m:msub>
            <m:mrow>
               <m:mi mathvariant="bold">e</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mn>1</m:mn>
            </m:mrow>
         </m:msub>
         <m:mo class="MathClass-punc">,</m:mo>
         <m:mspace class="thinspace" width="0.3em"/>
         <m:msub>
            <m:mrow>
               <m:mi mathvariant="bold">e</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mn>2</m:mn>
            </m:mrow>
         </m:msub>
         <m:mo class="MathClass-punc">,</m:mo>
         <m:mspace class="thinspace" width="0.3em"/>
         <m:mi>.</m:mi>
         <m:mi>.</m:mi>
         <m:mi>.</m:mi>
         <m:mo class="MathClass-punc">,</m:mo>
         <m:mspace class="thinspace" width="0.3em"/>
         <m:msub>
            <m:mrow>
               <m:mi mathvariant="bold">e</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>m</m:mi>
            </m:mrow>
         </m:msub>
      </m:mrow>
   </m:mfenced>
   <m:msubsup>
      <m:mrow>
         <m:mi mathvariant="bold">&#923;</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mi>m</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mfrac>
            <m:mrow>
               <m:mn>1</m:mn>
            </m:mrow>
            <m:mrow>
               <m:mn>2</m:mn>
            </m:mrow>
         </m:mfrac>
      </m:mrow>
   </m:msubsup>
   <m:mo class="MathClass-punc">,</m:mo>
   <m:mspace class="thinspace" width="0.3em"/>
   <m:mstyle class="text">
      <m:mtext class="textsf" mathvariant="sans-serif">where&#160;</m:mtext>
   </m:mstyle>
   <m:msubsup>
      <m:mrow>
         <m:mo>&#923;</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mi>m</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mn>1</m:mn>
         <m:mo class="MathClass-bin">/</m:mo>
         <m:mn>2</m:mn>
      </m:mrow>
   </m:msubsup>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:mstyle class="text">
      <m:mtext class="textsf" mathvariant="sans-serif">diag</m:mtext>
   </m:mstyle>
   <m:mfenced close=")" open="(" separators="">
      <m:mrow>
         <m:msubsup>
            <m:mrow>
               <m:mo>&#955;</m:mo>
            </m:mrow>
            <m:mrow>
               <m:mn>1</m:mn>
            </m:mrow>
            <m:mrow>
               <m:mn>1</m:mn>
               <m:mo class="MathClass-bin">/</m:mo>
               <m:mn>2</m:mn>
            </m:mrow>
         </m:msubsup>
         <m:mo class="MathClass-punc">,</m:mo>
         <m:mspace class="thinspace" width="0.3em"/>
         <m:msubsup>
            <m:mrow>
               <m:mo>&#955;</m:mo>
            </m:mrow>
            <m:mrow>
               <m:mn>2</m:mn>
            </m:mrow>
            <m:mrow>
               <m:mn>1</m:mn>
               <m:mo class="MathClass-bin">/</m:mo>
               <m:mn>2</m:mn>
            </m:mrow>
         </m:msubsup>
         <m:mo class="MathClass-punc">,</m:mo>
         <m:mspace class="thinspace" width="0.3em"/>
         <m:mi>.</m:mi>
         <m:mi>.</m:mi>
         <m:mi>.</m:mi>
         <m:mo class="MathClass-punc">,</m:mo>
         <m:mspace class="thinspace" width="0.3em"/>
         <m:msubsup>
            <m:mrow>
               <m:mo>&#955;</m:mo>
            </m:mrow>
            <m:mrow>
               <m:mi>m</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mn>1</m:mn>
               <m:mo class="MathClass-bin">/</m:mo>
               <m:mn>2</m:mn>
            </m:mrow>
         </m:msubsup>
      </m:mrow>
   </m:mfenced>
   <m:mi>.</m:mi>
</m:mrow>
</m:math>
</inline-formula>
</p>
<p>Each column of <b>X </b>gives the coordinates of one data point in the reduced (<it>m</it>-dimensional) space. The above procedure was implemented using CUBLAS <abbrgrp>
<abbr bid="B8">8</abbr>
</abbrgrp> and CULA <abbrgrp>
<abbr bid="B9">9</abbr>
</abbrgrp>.</p>
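<p>The five steps above can be sketched on the CPU as follows. This is a minimal NumPy illustration of the classical MDS procedure, not the CUBLAS/CULA implementation used in CFMDS; the function name and the clipping of small negative eigenvalues are assumptions made for this example.</p>

```python
import numpy as np


def classical_mds(D, m):
    """Return an m x n coordinate matrix X from an n x n dissimilarity matrix D."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    D2 = D ** 2                                # step 1: squared dissimilarities
    J = np.eye(n) - np.ones((n, n)) / n        # step 2: centering matrix
    B = -0.5 * J @ D2 @ J                      # step 3: double centering
    eigvals, eigvecs = np.linalg.eigh(B)       # step 4: B is symmetric
    order = np.argsort(eigvals)[::-1][:m]      # indices of the m largest eigenvalues
    lam = np.clip(eigvals[order], 0.0, None)   # assumption: clip tiny negative values
    Xt = eigvecs[:, order] * np.sqrt(lam)      # step 5: X^T = [e_1, ..., e_m] Lambda^(1/2)
    return Xt.T


if __name__ == "__main__":
    pts = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])
    D = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    print(classical_mds(D, 2))
```

<p>For dissimilarities that are exact Euclidean distances between points in <it>m </it>dimensions, the pairwise distances between the columns of the returned matrix reproduce <b>D </b>up to rounding error; the coordinates themselves are determined only up to rotation and reflection.</p>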
</sec>
<sec>
<st>
<p>Divide-and-conquer MDS</p>
</st>
<p>The divide-and-conquer MDS based on <abbrgrp>
<abbr bid="B6">6</abbr>
</abbrgrp> divides a given set of objects into several subsets of manageable size. Then, an additional subset of manageable size is constructed by sampling objects from each of these subsets. The MDS routine of the one-shot mode is applied to the dissimilarity submatrix of each subset. Finally, the results are merged into an approximate MDS solution for the entire set of objects. The precise steps are as follows (see Figure <figr fid="F1">1</figr>).</p>
<fig id="F1"><title><p>Figure 1</p></title><caption><p>Process of divide-and-conquer mode</p></caption><text>
   <p><b>Process of divide-and-conquer mode</b>. First, a dissimilarity matrix is randomly decomposed into <it>p </it>submatrices along the diagonal, <b>D</b><sub>1</sub>, ..., <b>D</b><it><sub>p</sub></it>. Second, <it>s </it>objects are sampled from each of the submatrices. Then, the sampled objects are merged to construct a new dissimilarity submatrix <b>M</b><it><sub>align</sub></it>. The one-shot MDS method is applied to <b>D</b><sub>1</sub>, ..., <b>D</b><it><sub>p </sub></it>as well as <b>M</b><it><sub>align</sub></it>. The resulting coordinates are <b>dMDS</b><sub>1</sub>, ..., <b>dMDS</b><it><sub>p </sub></it>as well as <b>mMDS</b>, respectively. After that, the objects sampled from each of <b>D</b><sub>1</sub>, ..., <b>D</b><it><sub>p </sub></it>are extracted from the resulting coordinate matrices, yielding sub<b>dMDS</b><sub>1</sub>, ..., sub<b>dMDS</b><it><sub>p </sub></it>as well as <b>mMDS</b><sub>1</sub>, ..., <b>mMDS</b><it><sub>p</sub></it>. For each pair, sub<b>dMDS</b><it><sub>i </sub></it>and <b>mMDS</b><it><sub>i </sub></it>(<it>i </it>= 1, 2, ..., <it>p</it>), a linear transformation matrix <b>A</b><it><sub>i </sub></it>is obtained by minimizing ||<b>A</b><it><sub>i</sub></it>sub<b>dMDS</b><it><sub>i </sub></it>- <b>mMDS</b><it><sub>i</sub></it>||, where || &#183; || denotes the <it>L</it><sup>2 </sup>norm. The linearly transformed objects <it>new</it><b>dMDS</b><it><sub>i </sub></it>in the reduced space are obtained by <b>A</b><it><sub>i</sub></it><b>dMDS</b><it><sub>i</sub></it>. Finally, <it>new</it><b>dMDS</b><sub>1</sub>, ..., <it>new</it><b>dMDS</b><it><sub>p </sub></it>are combined to produce the MDS result for the entire set of objects.</p>
</text><graphic file="1471-2105-13-S17-S23-1"/></fig>
<p>1. Randomly decompose an <it>n </it>&#215; <it>n </it>dissimilarity matrix <b>D</b>
<it>
<sub>all </sub>
</it>along the diagonal into <it>p </it>submatrices, i.e., <b>D</b>
<sub>1</sub>, <b>D</b>
<sub>2</sub>, ..., <b>D</b>
<it>
<sub>p</sub>
</it>.</p>
<p>2. Sample <it>s </it>objects from each of the submatrices.</p>
<p>3. Merge the sampled objects and construct a new dissimilarity submatrix <b>M</b>
<it>
<sub>align </sub>
</it>of which size is (<it>sp</it>) &#215; (<it>sp</it>).</p>
<p>4. Apply the one-shot MDS method to <b>D</b>
<sub>1</sub>, <b>D</b>
<sub>2</sub>, ..., <b>D</b>
<it>
<sub>p </sub>
</it>as well as <b>M</b>
<it>
<sub>align</sub>
</it>. Denote the resulting coordinates by <b>dMDS</b>
<sub>1</sub>, <b>dMDS</b>
<sub>2</sub>, ..., <b>dMDS</b>
<it>
<sub>p </sub>
</it>as well as <b>mMDS</b>, respectively.</p>
<p>5. Extract the objects sampled at step 2 from the above results, obtaining sub<b>dMDS</b>
<sub>1</sub>, sub<b>dMDS</b>
<sub>2</sub>, ..., sub<b>dMDS</b>
<it>
<sub>p </sub>
</it>as well as <b>mMDS</b>
<sub>1</sub>, <b>mMDS</b>
<sub>2</sub>, ..., <b>mMDS</b>
<it>
<sub>p</sub>
</it>.</p>
<p>6. For each pair sub<b>dMDS</b>
<it>
<sub>i </sub>
</it>and <b>mMDS</b>
<it>
<sub>i </sub>
</it>(<it>i </it>= 1, 2, ..., <it>p</it>), solve the following linear least squares problem, argmin<sub>
<b>A</b>
<it>i </it>
</sub>||<b>A</b>
<it>
<sub>i</sub>
</it>sub<b>dMDS</b>
<it>
<sub>i </sub>
</it>- <b>mMDS</b>
<it>
<sub>i</sub>
</it>||, where || &#183; || denotes the <it>L</it>
<sup>2 </sup>norm.</p>
<p>7. Linearly transform the objects of <b>D</b>
<it>
<sub>i </sub>
</it>as follows. <b>A</b>
<it>
<sub>i</sub>
</it>
<b>dMDS</b>
<it>
<sub>i </sub>
</it>= <it>new</it>
<b>dMDS</b>
<it>
<sub>i</sub>
</it>.</p>
<p>8. Combine <it>new</it>
<b>dMDS</b>
<sub>1</sub>, <it>new</it>
<b>dMDS</b>
<sub>2</sub>, ..., <it>new</it>
<b>dMDS</b>
<it>
<sub>p </sub>
</it>into an approximate MDS solution for the entire set of objects.</p>
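<p>Steps 5 to 7 can be illustrated with an ordinary least-squares solve. The sketch below assumes that coordinate matrices are stored as <it>m </it>&#215; (number of objects) arrays with one column per object and that the matrix norm is the Frobenius norm; the function names are hypothetical, and the code follows the purely linear transformation described above (no translation term).</p>

```python
import numpy as np


def fit_alignment(sub_dmds, mmds_block):
    """Least-squares A minimizing ||A @ sub_dmds - mmds_block|| (Frobenius norm).

    Both inputs are m x s matrices whose columns are the same s sampled objects.
    """
    # A @ S = M is equivalent to S.T @ A.T = M.T, which lstsq solves directly.
    A_t, *_ = np.linalg.lstsq(sub_dmds.T, mmds_block.T, rcond=None)
    return A_t.T


def align_block(dmds, sample_idx, mmds_block):
    """Map all objects of one block into the coordinate frame of M_align."""
    A = fit_alignment(dmds[:, sample_idx], mmds_block)  # steps 5 and 6
    return A @ dmds                                     # step 7


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dmds = rng.normal(size=(2, 30))               # coordinates of one block
    idx = np.arange(0, 30, 3)                     # objects sampled in step 2
    rotation = np.array([[0.0, -1.0], [1.0, 0.0]])
    new_dmds = align_block(dmds, idx, rotation @ dmds[:, idx])
    print(np.allclose(new_dmds, rotation @ dmds))
```

<p>In this toy example the sampled objects differ from their counterparts in the alignment frame by a known rotation, which the least-squares fit is expected to recover. With these conventions, step 8 would amount to placing the transformed blocks side by side in the original object order.</p>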
<p>Because the submatrix size is limited by the available memory of the graphics card, the number of submatrices <it>p </it>and the number of objects sampled from each submatrix <it>s </it>are determined automatically by our software application. Two sampling methods are available for Step 2 of the algorithm above: "Random" and "MaxMin". <it>Random </it>denotes ordinary random sampling without replacement. In the <it>MaxMin </it>approach, data points are chosen one at a time, and each new point maximizes, over all unused data points, the minimum distance to any of the previously sampled points <abbrgrp>
<abbr bid="B18">18</abbr>
</abbrgrp>. As in the one-shot mode, all the matrix operations have been implemented using CUBLAS and CULA.</p>
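<p>The <it>MaxMin </it>rule can be sketched as a greedy loop over a dissimilarity submatrix. The text does not state how the first point is chosen, so the <it>first </it>argument below (defaulting to index 0) is an assumption, as is the function name.</p>

```python
import numpy as np


def maxmin_sample(D, s, first=0):
    """Pick s row indices of the dissimilarity matrix D by MaxMin selection."""
    D = np.asarray(D, dtype=float)
    s = min(s, D.shape[0])
    chosen = [first]                      # assumption: caller supplies the seed point
    # min_dist[j] = distance from point j to its nearest already-chosen point
    min_dist = D[first].copy()
    min_dist[first] = -np.inf             # chosen points are never selected again
    while s > len(chosen):
        nxt = int(np.argmax(min_dist))    # unused point farthest from the chosen set
        chosen.append(nxt)
        min_dist = np.minimum(min_dist, D[nxt])
        min_dist[chosen] = -np.inf
    return chosen


if __name__ == "__main__":
    pts = np.array([0.0, 1.0, 2.0, 10.0, 5.0])
    D = np.abs(pts[:, None] - pts[None, :])
    print(maxmin_sample(D, 3))
```

<p>Keeping a running vector of nearest-chosen distances means each new sample costs one pass over the points instead of a scan over all pairs.</p>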
</sec>
</sec>
<sec>
<st>
<p>Results</p>
</st>
<p>CFMDS was tested using five benchmark datasets. Table <tblr tid="T1">1</tblr> describes the data source and basic characteristics of each dataset. As shown in Table <tblr tid="T1">1</tblr>, diverse datasets, ranging from a simple dataset with four attributes to complicated microarrays and handwritten digits, were used to demonstrate the performance of CFMDS. Experiments were performed on a commodity PC equipped with an Intel Core2 Quad Processor Q6600 (2.4 GHz), 4 GB of RAM, and a GeForce 8600 GT graphics card. The operating system was Windows XP (32-bit version). CFMDS was run on this PC. For comparison, a general solution for the classical MDS was implemented using C# on the same computer. However, the C#-based implementation was not able to process the <it>S</it>. <it>cerevisiae </it>Microarray and MNIST datasets due to a memory shortage on the PC (4 GB only). For these large datasets, the classical MDS algorithm was implemented using MATLAB on a 64-bit Linux PC server equipped with two Intel Xeon Processors E5506 (2.13 GHz) and 32 GB of RAM. It should be noted that the performance of matrix operations in MATLAB is known to be generally better than that of implementations in other efficient languages such as C++ <abbrgrp>
<abbr bid="B15">15</abbr>
</abbrgrp>. Parameter settings for the experiments are shown in Table <tblr tid="T2">2</tblr>. The size of the dissimilarity matrix is <it>n </it>&#215; <it>n</it>, where <it>n </it>is the number of instances in Table <tblr tid="T1">1</tblr>. For the <it>S. cerevisiae </it>Microarray and MNIST datasets, the number of submatrices (<it>p</it>) and the number of objects sampled from each submatrix (<it>s</it>) were set based on the available memory of the graphics card. For the IRIS, Dermatology, and <it>M</it>. <it>musculus </it>Microarray datasets, these parameters were set arbitrarily because these datasets can be processed by the one-shot mode of CFMDS.</p>
<tbl id="T1"><title><p>Table 1</p></title><caption><p>Benchmark datasets</p></caption><tblbdy cols="6">
      <r>
         <c ca="center">
            <p>
               <b>Dataset</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>Source</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>Number of Attributes</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>Number of Instances</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>Pearson's Median Skewness Coefficient</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>Coefficient of Variation</b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="6">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>IRIS</p>
         </c>
         <c ca="center">
            <p>UCI ML Repository</p>
         </c>
         <c ca="center">
            <p>4</p>
         </c>
         <c ca="center">
            <p>150</p>
         </c>
         <c ca="center">
            <p>0.34</p>
         </c>
         <c ca="center">
            <p>0.64</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>Dermatology</p>
         </c>
         <c ca="center">
            <p>UCI ML Repository</p>
         </c>
         <c ca="center">
            <p>33</p>
         </c>
         <c ca="center">
            <p>366</p>
         </c>
         <c ca="center">
            <p>-0.61</p>
         </c>
         <c ca="center">
            <p>0.42</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p><it>M. musculus </it>Microarray</p>
         </c>
         <c ca="center">
            <p>GEO</p>
         </c>
         <c ca="center">
            <p>4,000</p>
         </c>
         <c ca="center">
            <p>2,000</p>
         </c>
         <c ca="center">
            <p>0.94</p>
         </c>
         <c ca="center">
            <p>1.08</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p><it>S. cerevisiae </it>Microarray</p>
         </c>
         <c ca="center">
            <p>GEO</p>
         </c>
         <c ca="center">
            <p>1,000</p>
         </c>
         <c ca="center">
            <p>9,300</p>
         </c>
         <c ca="center">
            <p>0.73</p>
         </c>
         <c ca="center">
            <p>0.56</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>MNIST</p>
         </c>
         <c ca="center">
            <p>MNIST</p>
         </c>
         <c ca="center">
            <p>784</p>
         </c>
         <c ca="center">
            <p>10,000</p>
         </c>
         <c ca="center">
            <p>-0.13</p>
         </c>
         <c ca="center">
            <p>0.14</p>
         </c>
      </r>
   </tblbdy><tblfn>
      <p>UCI ML Repository is UCI Machine Learning Repository <url>http://archive.ics.uci.edu/ml/datasets.html</url>. GEO is Gene Expression Omnibus <url>http://www.ncbi.nlm.nih.gov/geo/</url>. MNIST is the MNIST Database of handwritten digits <url>http://yann.lecun.com/exdb/mnist/</url>. <it>M</it>. <it>musculus </it>Microarray is a modified dataset from <it>Mus musculus </it>microarrays in GEO and <it>S</it>. <it>cerevisiae </it>Microarray is a modified dataset from <it>Saccharomyces cerevisiae </it>microarrays in GEO. MNIST dataset is from scanned handwritten digit images of 28 &#215; 28 pixels.</p>
   </tblfn></tbl>
<tbl id="T2"><title><p>Table 2</p></title><caption><p>Experimental setting</p></caption><tblbdy cols="4">
      <r>
         <c ca="center">
            <p>
               <b>Dataset</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>Size of Dissimilarity Matrix</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>No. of Submatrices</b>
            </p>
            <p>
               <b>(<it>p</it>)</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <b>No. of Samples</b>
            </p>
            <p>
               <b>in Each Submatrix (<it>s</it>)</b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="4">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>IRIS</p>
         </c>
         <c ca="center">
            <p>150 &#215; 150</p>
         </c>
         <c ca="center">
            <p>3</p>
         </c>
         <c ca="center">
            <p>20</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>Dermatology</p>
         </c>
         <c ca="center">
            <p>366 &#215; 366</p>
         </c>
         <c ca="center">
            <p>3</p>
         </c>
         <c ca="center">
            <p>60</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p><it>M. musculus </it>Microarray</p>
         </c>
         <c ca="center">
            <p>2,000 &#215; 2,000</p>
         </c>
         <c ca="center">
            <p>10</p>
         </c>
         <c ca="center">
            <p>100</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p><it>S. cerevisiae </it>Microarray</p>
         </c>
         <c ca="center">
            <p>9,300 &#215; 9,300</p>
         </c>
         <c ca="center">
            <p>10</p>
         </c>
         <c ca="center">
            <p>150</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>MNIST</p>
         </c>
         <c ca="center">
            <p>10,000 &#215; 10,000</p>
         </c>
         <c ca="center">
            <p>10</p>
         </c>
         <c ca="center">
            <p>150</p>
         </c>
      </r>
   </tblbdy><tblfn>
      <p>These parameters were set for the comparison experiments of the divide-and-conquer mode of CFMDS. In normal use, the CFMDS application automatically detects the available memory size and determines these parameters accordingly. For the IRIS, Dermatology, and <it>M. musculus </it>Microarray datasets, these parameters were set arbitrarily, because these datasets can be processed by the one-shot mode of CFMDS.</p>
   </tblfn></tbl>
<sec>
<st>
<p>Execution time of CFMDS</p>
</st>
<p>Execution times were compared to demonstrate the speed-up of the proposed application. Figure <figr fid="F2">2</figr> shows the execution time of each method: CFMDS with <it>Random </it>sampling, CFMDS with <it>MaxMin </it>sampling, one-shot CFMDS, and conventional solutions for the classical MDS in serial computing environments. In the figure, the y-axis is in log scale. As expected, CFMDS showed a significant improvement in running time for large datasets such as the two microarray datasets and the MNIST dataset. For the most time-consuming dataset, MNIST, the conventional MDS algorithm took almost 6 hours to produce the result, whereas CFMDS with <it>Random </it>or <it>MaxMin </it>sampling processed the same dataset within 3 minutes. CFMDS with <it>Random </it>sampling was more than 100 times faster than the conventional MDS algorithm for the <it>M. musculus </it>and <it>S. cerevisiae </it>datasets. CFMDS with <it>MaxMin </it>sampling was more than 66 times faster than the conventional MDS algorithm for these microarray datasets. CFMDS also achieved a significant speed-up even for small datasets such as IRIS and Dermatology, ranging from 3 to 22 times. These results indicate that the proposed application is useful for fast multidimensional scaling of diverse datasets, not only genome-scale data. We also verified the necessity of our divide-and-conquer strategy for large data. The one-shot and divide-and-conquer modes of CFMDS required similar computational time for small datasets such as IRIS and Dermatology. However, the one-shot mode needed much more computational time than the divide-and-conquer mode for the <it>M</it>. <it>musculus </it>Microarray dataset. Further, the one-shot mode was not able to process the <it>S</it>. <it>cerevisiae </it>Microarray and MNIST datasets because of the limited memory of the graphics card. "0.00" in Figure <figr fid="F2">2</figr> means "not applicable."</p>
<fig id="F2"><title><p>Figure 2</p></title><caption><p>Comparison results of execution time</p></caption><text>
   <p><b>Comparison results of execution time</b>. Average running time in seconds is shown. The y-axis is in log scale. Random (MaxMin) means the divide-and-conquer mode of CFMDS with <it>Random </it>(<it>MaxMin</it>) sampling. One-shot MDS represents CFMDS without divide-and-conquer. Conventional MDS represents the classical MDS implemented using C# or MATLAB in serial computing environments. "0.00" denotes "not applicable." For <it>S. cerevisiae </it>and MNIST datasets, we were not able to apply the one-shot mode of CFMDS due to the memory limitation in our graphics card.</p>
</text><graphic file="1471-2105-13-S17-S23-2"/></fig>
</sec>
<sec>
<st>
<p>Accuracy of CFMDS</p>
</st>
<p>To examine the accuracy of the divide-and-conquer mode of CFMDS, we used Pearson's correlation coefficient between the results of the classical MDS and those of CFMDS. More precisely, for each of the two methods, a vector consisting of the Euclidean distances between all object pairs in the reduced space was generated from its result. Then, Pearson's correlation coefficient between these two vectors was calculated. The closer the correlation coefficient is to 1, the more similar the result of the divide-and-conquer mode of CFMDS is to that of the classical MDS. The accuracy comparison results are shown in Figure <figr fid="F3">3</figr>. The figure depicts average values over 100 independent runs, with error bars representing the standard deviation. As shown in Figure <figr fid="F3">3</figr>, the divide-and-conquer mode of CFMDS produced highly accurate results for all datasets. Pearson's correlation coefficients were larger than 0.9 with <it>Random </it>or <it>MaxMin </it>sampling. For the simplest dataset, IRIS, which has 4 attributes and 150 instances, CFMDS achieved results almost identical to those of the classical MDS (Pearson's correlation coefficient: about 0.99) in both the <it>Random </it>and <it>MaxMin </it>sampling modes. The Dermatology and <it>S. cerevisiae </it>Microarray datasets showed similar trends, with lower accuracy than the IRIS dataset.</p>
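<p>The accuracy measure described above can be sketched as follows. Coordinate matrices are assumed to be stored with one column per object, each unordered pair is assumed to be counted once, and the function names are hypothetical. The intermediate array grows with the square of the number of objects, so this version is meant for small illustrative inputs only.</p>

```python
import numpy as np


def pairwise_distances(X):
    """Vector of Euclidean distances between all column pairs of an m x n matrix X."""
    P = X.T                                    # one row per object
    diff = P[:, None, :] - P[None, :, :]
    full = np.sqrt((diff ** 2).sum(axis=2))
    iu = np.triu_indices(P.shape[0], k=1)      # each unordered pair once
    return full[iu]


def mds_agreement(X_ref, X_approx):
    """Pearson correlation between the two pairwise-distance vectors."""
    r = np.corrcoef(pairwise_distances(X_ref), pairwise_distances(X_approx))
    return float(r[0, 1])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(2, 20))
    rotation = np.array([[0.0, -1.0], [1.0, 0.0]])
    print(mds_agreement(X, rotation @ X))
```

<p>Because only pairwise distances enter the comparison, the measure is insensitive to rotation, reflection, and translation of either configuration, which matters because MDS coordinates are not unique in these respects.</p>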
<fig id="F3"><title><p>Figure 3</p></title><caption><p>Comparison results of accuracy</p></caption><text>
   <p><b>Comparison results of accuracy</b>. Pearson's correlation coefficient was used as the accuracy measure. The mean and standard deviation over 100 independent simulation runs are shown. Random (MaxMin) denotes the divide-and-conquer mode of CFMDS with <it>Random </it>(<it>MaxMin</it>) sampling.</p>
</text><graphic file="1471-2105-13-S17-S23-3"/></fig>
<p>However, CFMDS with <it>Random </it>and <it>MaxMin </it>sampling modes showed different results for the <it>M. musculus </it>Microarray and MNIST datasets. For the <it>M. musculus </it>Microarray dataset, the <it>Random </it>sampling mode showed the worst result among all benchmark datasets, with the largest standard deviation, whereas the <it>MaxMin </it>sampling mode produced results almost identical to those of the classical MDS (Pearson's correlation coefficient: about 0.97). In contrast, the <it>MaxMin </it>mode showed relatively low accuracy with high variance for the MNIST dataset. For the same dataset, the <it>Random </it>sampling mode achieved relatively accurate results (Pearson's correlation coefficient: about 0.93). The difference in performance between the <it>Random </it>and <it>MaxMin </it>sampling methods of CFMDS could be due to the skewness or dispersion of the data. The <it>MaxMin </it>sampling mode is suitable for datasets with high skewness or dispersion, because it can sample data points that are far apart from each other <abbrgrp>
<abbr bid="B18">18</abbr>
</abbrgrp>. We checked the skewness and dispersion of our experimental datasets using Pearson's median skewness coefficient and the coefficient of variation of the distances between data points. Pearson's median skewness coefficient (PMSC) is defined as 3(<it>mean </it>- <it>median</it>)/<it>standard deviation </it>and measures the asymmetry of a distribution. The coefficient of variation (CV) is defined as <it>standard deviation </it>/<it>mean </it>and is a normalized measure of dispersion. Among the five datasets, <it>M. musculus </it>Microarray showed the highest skewness and dispersion (PMSC = 0.94, CV = 1.08). For this dataset, the <it>MaxMin </it>sampling mode of CFMDS generated relatively accurate results. The MNIST dataset showed the lowest skewness and dispersion (PMSC = -0.13, CV = 0.14). For this dataset, the <it>Random </it>sampling mode showed relatively accurate results. In conclusion, we suggest using <it>MaxMin </it>sampling for highly skewed or dispersed data and <it>Random </it>sampling for symmetric and weakly dispersed data.</p>
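<p>As a minimal sketch of the two statistics defined above, the following Python fragment computes PMSC and CV. Three things are assumptions and are not stated in the text: that both statistics are taken over the pairwise Euclidean distances, that the population form of the standard deviation is used, and the made-up toy data. The function name is hypothetical.</p>

```python
import math
import statistics
from itertools import combinations


def distance_skewness_and_dispersion(points):
    """Return (PMSC, CV) of the pairwise Euclidean distances between points."""
    distances = [math.dist(p, q) for p, q in combinations(points, 2)]
    mean = statistics.mean(distances)
    median = statistics.median(distances)
    # Assumption: population standard deviation; the text does not say
    # whether the population or the sample form was used.
    std = statistics.pstdev(distances)
    pmsc = 3 * (mean - median) / std  # 3(mean - median) / standard deviation
    cv = std / mean                   # standard deviation / mean
    return pmsc, cv


# Made-up 1-D example: three nearby points and one distant point.
points = [(0.0,), (1.0,), (2.0,), (10.0,)]
pmsc, cv = distance_skewness_and_dispersion(points)
```

<p>Under the suggestion above, a large PMSC or CV would favour <it>MaxMin </it>sampling, and values near zero would favour <it>Random </it>sampling. The text does not give a numeric cut-off, so none is assumed here.</p>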
</sec>
</sec>
<sec>
<st>
<p>Discussion</p>
</st>
<p>We implemented a software application, CFMDS (CUDA-based Fast MultiDimensional Scaling), for efficient dimensionality reduction of large-scale genomic data. CFMDS adopts the CUDA programming library and a divide-and-conquer strategy to handle several thousands of features in less than several minutes on a commodity PC equipped with a graphics card. CUDA is used as the parallel computing method, and the divide-and-conquer principle is used to circumvent the small memory size of typical graphics cards. By combining these two techniques, CFMDS enables a regular PC with a CUDA-capable graphics card to handle large-scale genomic dimensionality reduction problems that could otherwise be executed efficiently only on high-performance computers. The simulation results confirmed that our approach can perform MDS more than a hundred times faster, with comparable accuracy, for genome-scale data. Therefore, CFMDS is especially useful for visualizing and analyzing data consisting of several thousands of objects in less than several minutes. We implemented two sampling options for the divide-and-conquer mode of CFMDS: <it>Random </it>and <it>MaxMin </it>sampling. As shown in the Results section, CFMDS with <it>Random </it>sampling usually works quite well in practice. <it>MaxMin </it>sampling is especially useful when the data distribution is highly skewed or dispersed. Future work includes optimizing our application with respect to data transfer between graphics cards and host computers.</p>
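<p>The contrast between the two sampling options can be sketched as follows. This is an illustrative sketch only and not the CFMDS implementation: it assumes that <it>MaxMin </it>sampling starts from a randomly chosen point and then repeatedly adds the point whose minimum distance to the points already chosen is largest. The function names, the seed parameter and the toy data are hypothetical.</p>

```python
import math
import random


def maxmin_sample(points, k, seed=0):
    """Pick k indices so that each new pick is farthest from those already chosen."""
    rng = random.Random(seed)
    chosen = [rng.randrange(len(points))]  # the first point is picked at random
    # min_dist[i] is the distance from point i to its nearest chosen point so far.
    min_dist = [math.dist(p, points[chosen[0]]) for p in points]
    while len(chosen) != k:
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        chosen.append(nxt)
        min_dist = [min(d, math.dist(p, points[nxt])) for d, p in zip(min_dist, points)]
    return chosen


def random_sample(points, k, seed=0):
    """Pick k distinct indices uniformly at random."""
    return random.Random(seed).sample(range(len(points)), k)


# Made-up data: a tight cluster near the origin plus two distant outliers.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (10.0, 0.0), (0.0, 10.0)]
landmarks = maxmin_sample(points, 3)
```

<p>On data like this toy example, the sketch always includes both distant points among the three selected, whereas uniform random sampling may draw all three from the tight cluster. This is consistent with the observation that <it>MaxMin </it>sampling suits highly skewed or dispersed data.</p>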
</sec>
<sec>
<st>
<p>Availability and requirements</p>
</st>
<p>Project name: CFMDS</p>
<p>Project home page: <url>http://ml.ssu.ac.kr/CFMDS/CFMDS.html</url>
</p>
<p>Operating system(s): Windows XP or higher (32-bit and 64-bit), Linux (tested on Ubuntu Linux 9.04, Red Hat Enterprise Linux 5.3/4.7, Fedora 11)</p>
<p>Programming language: CUDA</p>
<p>Other requirements: NVIDIA GPU with CUDA support, CUDA toolkit 2.3 (the CUDA 3.0 toolkit is not yet supported), the latest version of the CULA basic libraries</p>
<p>License: GNU GPL v2</p>
<p>Any restrictions to use by non-academics: none</p>
</sec>
<sec>
<st>
<p>Competing interests</p>
</st>
<p>The authors declare that they have no competing interests.</p>
</sec>
<sec>
<st>
<p>Authors' contributions</p>
</st>
<p>S.P. developed the software application and performed the experiments. S.-Y.S. wrote the manuscript and discussed the results. K.-B.H. led the project and wrote the article. All of the authors have read and approved the final manuscript.</p>
</sec>
</bdy><bm>
<ack>
<sec>
<st>
<p>Acknowledgements</p>
</st>
<p>K.-B.H. was supported by the Soongsil University Research Fund and by the Proteogenomic Research Program and the Basic Science Research Program (2012R1A1A2039822) through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology. S.-Y.S. was supported by the Basic Science Research Program (2012R1A1A2002804) through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology.</p>
<p>This article has been published as part of <it>BMC Bioinformatics </it>Volume 13 Supplement 17, 2012: Eleventh International Conference on Bioinformatics (InCoB2012): Bioinformatics. The full contents of the supplement are available online at <url>http://www.biomedcentral.com/bmcbioinformatics/supplements/13/S17</url>.</p>
</sec>
</ack>
<refgrp><bibl id="B1"><title><p>Regional patterns of gene expression in human and chimpanzee brains</p></title><aug><au><snm>Khaitovich</snm><fnm>P</fnm></au><au><snm>Muetzel</snm><fnm>B</fnm></au><au><snm>She</snm><fnm>X</fnm></au><au><snm>Lachmann</snm><fnm>M</fnm></au><au><snm>Hellmann</snm><fnm>I</fnm></au><au><snm>Dietzsch</snm><fnm>J</fnm></au><au><snm>Steigele</snm><fnm>S</fnm></au><au><snm>Do</snm><fnm>HH</fnm></au><au><snm>Weiss</snm><fnm>G</fnm></au><au><snm>Enard</snm><fnm>W</fnm></au><au><snm>Heissig</snm><fnm>F</fnm></au><au><snm>Arendt</snm><fnm>T</fnm></au><au><snm>Nieselt-Struwe</snm><fnm>K</fnm></au><au><snm>Eichler</snm><fnm>EE</fnm></au><au><snm>P&#228;&#228;bo</snm><fnm>S</fnm></au></aug><source>Genome Res</source><pubdate>2004</pubdate><volume>14</volume><fpage>1462</fpage><lpage>1473</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1101/gr.2538704</pubid><pubid idtype="pmcid">509255</pubid><pubid idtype="pmpid" link="fulltext">15289471</pubid></pubidlist></xrefbib></bibl><bibl id="B2"><title><p>Relational patterns of gene expression via non-metric multidimensional scaling analysis</p></title><aug><au><snm>Taguchi</snm><fnm>YH</fnm></au><au><snm>Oono</snm><fnm>Y</fnm></au></aug><source>Bioinformatics</source><pubdate>2005</pubdate><volume>21</volume><issue>6</issue><fpage>730</fpage><lpage>740</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/bioinformatics/bti067</pubid><pubid idtype="pmpid" link="fulltext">15509613</pubid></pubidlist></xrefbib></bibl><bibl id="B3"><title><p>Image-based multivariate profiling of drug responses from single cells</p></title><aug><au><snm>Loo</snm><fnm>LH</fnm></au><au><snm>Wu</snm><fnm>LF</fnm></au><au><snm>Altschuler</snm><fnm>SJ</fnm></au></aug><source>Nat Methods</source><pubdate>2007</pubdate><volume>4</volume><issue>5</issue><fpage>445</fpage><lpage>453</lpage><xrefbib><pubid idtype="pmpid" link="fulltext">17401369</pubid></xrefbib></bibl><bibl id="B4"><title><p>Regression based predictor for p53 
transactivation</p></title><aug><au><snm>Gowrisankar</snm><fnm>S</fnm></au><au><snm>Jegga</snm><fnm>AG</fnm></au></aug><source>BMC Bioinformatics</source><pubdate>2009</pubdate><volume>10</volume><fpage>215</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2105-10-215</pubid><pubid idtype="pmcid">2719629</pubid><pubid idtype="pmpid" link="fulltext">19602281</pubid></pubidlist></xrefbib></bibl><bibl id="B5"><aug><au><snm>Borg</snm><fnm>I</fnm></au><au><snm>Groenen</snm><fnm>PJF</fnm></au></aug><source>Modern Multidimensional Scaling: Theory and Applications</source><publisher>New York, Springer</publisher><edition>2</edition><pubdate>2005</pubdate></bibl><bibl id="B6"><title><p>A fast approximation to multidimensional scaling</p></title><aug><au><snm>Yang</snm><fnm>T</fnm></au><au><snm>Liu</snm><fnm>J</fnm></au><au><snm>McMillan</snm><fnm>L</fnm></au><au><snm>Wang</snm><fnm>W</fnm></au></aug><source>Proceedings of the ECCV 2006 Workshop on Computational Intensive Methods for Computer Vision</source><pubdate>2006</pubdate></bibl><bibl id="B7"><title><p>Mapping computational concepts to GPUs</p></title><aug><au><snm>Harris</snm><fnm>M</fnm></au></aug><source>Proceedings of SIGGRAPH '05 ACM SIGGRAPH 2005 Courses</source><pubdate>2005</pubdate><xrefbib><pubid idtype="pmpid" link="fulltext">21521289</pubid></xrefbib></bibl><bibl id="B8"><title><p>NVIDIA CUDA Zone</p></title><url>http://www.nvidia.com/object/cuda_home_new.html</url></bibl><bibl id="B9"><title><p>CULA tools, EM Photonics</p></title><url>http://www.culatools.com</url></bibl><bibl id="B10"><title><p>CUDA compatible GPU cards as efficient hardware accelerators for Smith-Waterman sequence alignment</p></title><aug><au><snm>Manavski</snm><fnm>SA</fnm></au><au><snm>Valle</snm><fnm>G</fnm></au></aug><source>BMC Bioinformatics</source><pubdate>2008</pubdate><volume>9</volume><issue>Suppl 2</issue><fpage>S10</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2105-9-S2-S10</pubid><pubid 
idtype="pmcid">2638150</pubid><pubid idtype="pmpid" link="fulltext">19091009</pubid></pubidlist></xrefbib></bibl><bibl id="B11"><title><p>CBESW: Sequence alignment on the Playstation 3</p></title><aug><au><snm>Wirawan</snm><fnm>A</fnm></au><au><snm>Kwoh</snm><fnm>CK</fnm></au><au><snm>Hieu</snm><fnm>NT</fnm></au><au><snm>Schmidt</snm><fnm>B</fnm></au></aug><source>BMC Bioinformatics</source><pubdate>2008</pubdate><volume>9</volume><fpage>377</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2105-9-377</pubid><pubid idtype="pmcid">2571991</pubid><pubid idtype="pmpid" link="fulltext">18798993</pubid></pubidlist></xrefbib></bibl><bibl id="B12"><title><p>CUDASW++: optimizing Smith-Waterman sequence database searches for CUDA-enabled graphics processing units</p></title><aug><au><snm>Liu</snm><fnm>Y</fnm></au><au><snm>Maskell</snm><fnm>DL</fnm></au><au><snm>Schmidt</snm><fnm>B</fnm></au></aug><source>BMC Research Notes</source><pubdate>2009</pubdate><volume>2</volume><fpage>73</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1756-0500-2-73</pubid><pubid idtype="pmcid">2694204</pubid><pubid idtype="pmpid" link="fulltext">19416548</pubid></pubidlist></xrefbib></bibl><bibl id="B13"><title><p>Fast and accurate protein substructure searching with simulated annealing and GPUs</p></title><aug><au><snm>Stivala</snm><fnm>AD</fnm></au><au><snm>Stuckey</snm><fnm>PJS</fnm></au><au><snm>Wirth</snm><fnm>AI</fnm></au></aug><source>BMC Bioinformatics</source><pubdate>2010</pubdate><volume>11</volume><fpage>446</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2105-11-446</pubid><pubid idtype="pmcid">2944279</pubid><pubid idtype="pmpid" link="fulltext">20813068</pubid></pubidlist></xrefbib></bibl><bibl id="B14"><title><p>permGPU: Using graphics processing units in RNA microarray association 
studies</p></title><aug><au><snm>Shterev</snm><fnm>ID</fnm></au><au><snm>Jung</snm><fnm>SH</fnm></au><au><snm>George</snm><fnm>SL</fnm></au><au><snm>Owzar</snm><fnm>K</fnm></au></aug><source>BMC Bioinformatics</source><pubdate>2010</pubdate><volume>11</volume><fpage>329</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2105-11-329</pubid><pubid idtype="pmcid">2910023</pubid><pubid idtype="pmpid" link="fulltext">20553619</pubid></pubidlist></xrefbib></bibl><bibl id="B15"><title><p>CUDA-based multi-core implementation of MDS-based bioinformatics algorithms</p></title><aug><au><snm>Fester</snm><fnm>T</fnm></au><au><snm>Schreiber</snm><fnm>F</fnm></au><au><snm>Strickert</snm><fnm>M</fnm></au></aug><source>Proceedings of German Conference on Bioinformatics (GCB 2009)</source><fpage>67</fpage><lpage>79</lpage></bibl><bibl id="B16"><title><p>Multidimensional scaling for large genomic datasets</p></title><aug><au><snm>Tzeng</snm><fnm>J</fnm></au><au><snm>Lu</snm><fnm>HHS</fnm></au><au><snm>Li</snm><fnm>WH</fnm></au></aug><source>BMC Bioinformatics</source><pubdate>2008</pubdate><volume>9</volume><fpage>179</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2105-9-179</pubid><pubid idtype="pmcid">2375126</pubid><pubid idtype="pmpid" link="fulltext">18394154</pubid></pubidlist></xrefbib></bibl><bibl id="B17"><title><p>An efficient multidimensional scaling method based on CUDA and divide-and-conquer</p></title><aug><au><snm>Park</snm><fnm>S</fnm></au><au><snm>Hwang</snm><fnm>KB</fnm></au></aug><source>Journal of the Korean Institute of Information Scientists and Engineers: Computing Practices and Letters</source><pubdate>2010</pubdate><volume>16</volume><issue>4</issue><fpage>427</fpage><lpage>431</lpage></bibl><bibl id="B18"><aug><au><snm>De Silva</snm><fnm>V</fnm></au><au><snm>Tenenbaum</snm><fnm>JB</fnm></au></aug><source>Sparse multidimensional scaling using landmark points</source><publisher>Technical Report, Stanford 
University</publisher><pubdate>2004</pubdate></bibl></refgrp>
</bm></art>