<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
	<ui>1471-2105-7-S4-S7</ui>
	<ji>1471-2105</ji>
	<fm>
		<dochead>Research</dochead>
		<bibl>
			<title>
				<p>FM-test: a fuzzy-set-theory-based approach to differential gene expression data analysis</p>
			</title>
			<aug>
				<au ce="yes" id="A1">
					<snm>Liang</snm>
					<mi>R</mi>
					<fnm>Lily</fnm>
					<insr iid="I1"/>
					<email>lliang@udc.edu</email>
				</au>
				<au id="A2" ca="yes" ce="yes">
					<snm>Lu</snm>
					<fnm>Shiyong</fnm>
					<insr iid="I2"/>
					<email>shiyong@wayne.edu</email>
				</au>
				<au id="A3">
					<snm>Wang</snm>
					<fnm>Xuena</fnm>
					<insr iid="I3"/>
					<email>xuenawang@yahoo.com</email>
				</au>
				<au id="A4">
					<snm>Lu</snm>
					<fnm>Yi</fnm>
					<insr iid="I2"/>
					<email>luyi@wayne.edu</email>
				</au>
				<au id="A5">
					<snm>Mandal</snm>
					<fnm>Vinay</fnm>
					<insr iid="I2"/>
					<email>aw9420@wayne.edu</email>
				</au>
				<au id="A6">
					<snm>Patacsil</snm>
					<fnm>Dorrelyn</fnm>
					<insr iid="I4"/>
					<email>dorrelynmarie@yahoo.com</email>
				</au>
				<au id="A7">
					<snm>Kumar</snm>
					<fnm>Deepak</fnm>
					<insr iid="I4"/>
					<email>dkumar@udc.edu</email>
				</au>
			</aug>
			<insg>
				<ins id="I1">
					<p>Department of Computer Science and Information Technology, University of the District of Columbia, Washington, DC, 20008, USA</p>
				</ins>
				<ins id="I2">
					<p>Department of Computer Science, Wayne State University, Detroit, MI, 48202, USA</p>
				</ins>
				<ins id="I3">
					<p>University of Hawaii, USA</p>
				</ins>
				<ins id="I4">
					<p>Department of Biological and Environmental Sciences, University of the District of Columbia, Washington, DC, 20008, USA</p>
				</ins>
			</insg>
			<source>BMC Bioinformatics</source>
			<supplement>
				<title>
					<p>Symposium of Computations in Bioinformatics and Bioscience (SCBB06)</p>
				</title>
				<editor>Youping Deng, Jun Ni</editor>
				<note>Research</note>
				<url>http://www.biomedcentral.com/content/pdf/1471-2105-7-S4-info.pdf</url>
			</supplement>
			<conference>
				<title>
					<p>Symposium of Computations in Bioinformatics and Bioscience (SCBB06) in conjunction with the International Multi-Symposiums on Computer and Computational Sciences 2006 (IMSCCS'06)</p>
				</title>
				<location>Hangzhou, China</location>
				<date-range>June 20&#8211;24, 2006</date-range>
				<url>http://mfgn.usm.edu/ebl/SCBB06</url>
			</conference>
			<issn>1471-2105</issn>
			<pubdate>2006</pubdate>
			<volume>7</volume>
			<issue>Suppl 4</issue>
			<fpage>S7</fpage>
			<xrefbib>
				<pubidlist><pubid idtype="pmpid">17217525</pubid><pubid idtype="doi">10.1186/1471-2105-7-S4-S7</pubid>
				</pubidlist></xrefbib>
		</bibl>
		<history>
			<pub>
				<date>
					<day>12</day>
					<month>12</month>
					<year>2006</year>
				</date>
			</pub>
		</history>
		<cpyrt>
			<year>2006</year>
			<collab>Liang et al; licensee BioMed Central Ltd</collab>
			<note>This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
		</cpyrt>
		<abs>
			<sec>
				<st>
					<p>Abstract</p>
				</st>
				<sec>
					<st>
						<p>Background</p>
					</st>
					<p>Microarray techniques have revolutionized genomic research by making it possible to monitor the expression of thousands of genes in parallel. As the amount of microarray data being produced is increasing at an exponential rate, there is a great demand for efficient and effective expression data analysis tools. Comparing the gene expression profiles of patients with those of healthy controls will enhance our understanding of a disease and identify leads for therapeutic intervention.</p>
				</sec>
				<sec>
					<st>
						<p>Results</p>
					</st>
					<p>In this paper, we propose an innovative approach, the <it>fuzzy membership test </it>(FM-test), based on fuzzy set theory, to identify disease-associated genes from microarray gene expression profiles. A new concept, the FM d-value, is defined to quantify the divergence of two sets of values. We further analyze the asymptotic property of FM-test, and then establish the relationship between the FM d-value and the p-value. We applied FM-test to a diabetes expression dataset and a lung cancer expression dataset. Of the 10 significant genes identified in the diabetes dataset, six have been confirmed in the literature to be associated with diabetes and one has been suggested by other researchers. Of the 10 significantly overexpressed genes identified in the lung cancer data, eight have been confirmed in the literature to be related to lung cancer.</p>
				</sec>
				<sec>
					<st>
						<p>Conclusion</p>
					</st>
					<p>Our experiments on synthetic datasets show that FM-test is effective and robust. The results on the diabetes and lung cancer datasets validated the effectiveness of FM-test. FM-test is implemented as a Web-based application and is freely available at <url>http://database.cs.wayne.edu/bioinformatics</url>.</p>
				</sec>
			</sec>
		</abs>
	</fm>
	<bdy>
		<sec>
			<st>
				<p>Background</p>
			</st>
			<p>Microarray techniques have revolutionized genomic research by making it possible to monitor the expression of thousands of genes in parallel. As the amount of microarray data being produced is increasing at an exponential rate, there is a great demand for efficient and effective expression data analysis tools. The gene expression profile of a cell determines its phenotype and its responses to the environment, including environmental factors, drugs and therapies. Gene expression patterns can be determined by measuring the quantity of the end product, protein, or of the mRNA template used to synthesize the protein. Comparing the gene expression profiles of patients with those of healthy controls will enhance our understanding of a disease and identify leads for therapeutic intervention. Several important breakthroughs in the gene expression profiling of diseases have been made <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr></abbrgrp>. In addition, researchers have identified many genes that play important roles in the onset, development, and progression of various diseases. Identification of these disease genes offers a route to a better understanding of the molecular mechanisms underlying pathogenesis, a necessary prerequisite for the rational development of improved preventative and therapeutic methods.</p>
			<p>One effective approach to identifying genes that are associated with a disease is to measure the divergence of two sets of gene expression values. A motivating example is shown in Table <tblr tid="T1">1</tblr>, which records the microarray gene expression values of five genes for two groups of people that are related to diabetes <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>: five insulin-sensitive (IS) humans and five insulin-resistant (IR) humans. In order to identify the genes that are associated with diabetes, one needs to determine for each gene whether or not the two sets of expression values are significantly different from each other. The two most popular methods to measure the divergence of two sets of values are the t-test <abbrgrp><abbr bid="B7">7</abbr></abbrgrp> and the Wilcoxon rank sum test <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>. The t-test assesses whether the means of two groups are statistically different from each other. Given two sets <it>S</it><sub><it>1 </it></sub>and <it>S</it><sub><it>2</it></sub>, the t-value is calculated as</p>
			<tbl id="T1">
				<title>
					<p>Table 1</p>
				</title>
				<caption>
					<p>The gene expression values for five genes under two conditions.</p>
				</caption>
				<tblbdy cols="15">
					<r>
						<c ca="left">
							<p>Gene ID</p>
						</c>
						<c cspan="5" ca="center">
							<p>IR</p>
						</c>
						<c cspan="5" ca="center">
							<p>IS</p>
						</c>
						<c ca="center">
							<p>d-value</p>
						</c>
						<c cspan="3" ca="center">
							<p>p-value</p>
						</c>
					</r>
					<r>
						<c>
							<p/>
						</c>
						<c>
							<p/>
						</c>
						<c>
							<p/>
						</c>
						<c>
							<p/>
						</c>
						<c>
							<p/>
						</c>
						<c>
							<p/>
						</c>
						<c>
							<p/>
						</c>
						<c>
							<p/>
						</c>
						<c>
							<p/>
						</c>
						<c>
							<p/>
						</c>
						<c>
							<p/>
						</c>
						<c>
							<p/>
						</c>
						<c cspan="3">
							<hr/>
						</c>
					</r>
					<r>
						<c>
							<p/>
						</c>
						<c>
							<p/>
						</c>
						<c>
							<p/>
						</c>
						<c>
							<p/>
						</c>
						<c>
							<p/>
						</c>
						<c>
							<p/>
						</c>
						<c>
							<p/>
						</c>
						<c>
							<p/>
						</c>
						<c>
							<p/>
						</c>
						<c>
							<p/>
						</c>
						<c>
							<p/>
						</c>
						<c>
							<p/>
						</c>
						<c ca="center">
							<p>FM</p>
						</c>
						<c ca="center">
							<p>t-test</p>
						</c>
						<c ca="center">
							<p>rank sum</p>
						</c>
					</r>
					<r>
						<c cspan="15">
							<hr/>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>1</p>
						</c>
						<c ca="left">
							<p>750</p>
						</c>
						<c ca="left">
							<p>559</p>
						</c>
						<c ca="left">
							<p>649</p>
						</c>
						<c ca="left">
							<p>685</p>
						</c>
						<c ca="left">
							<p>636</p>
						</c>
						<c ca="left">
							<p>310</p>
						</c>
						<c ca="left">
							<p>359</p>
						</c>
						<c ca="left">
							<p>135</p>
						</c>
						<c ca="left">
							<p>97</p>
						</c>
						<c ca="left">
							<p>178</p>
						</c>
						<c ca="left">
							<p>0.999</p>
						</c>
						<c ca="left">
							<p>0.001</p>
						</c>
						<c ca="left">
							<p>0.008</p>
						</c>
						<c ca="left">
							<p>0.000</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>2</p>
						</c>
						<c ca="left">
							<p>123</p>
						</c>
						<c ca="left">
							<p>142</p>
						</c>
						<c ca="left">
							<p>11</p>
						</c>
						<c ca="left">
							<p>406</p>
						</c>
						<c ca="left">
							<p>220</p>
						</c>
						<c ca="left">
							<p>305</p>
						</c>
						<c ca="left">
							<p>398</p>
						</c>
						<c ca="left">
							<p>707</p>
						</c>
						<c ca="left">
							<p>905</p>
						</c>
						<c ca="left">
							<p>688</p>
						</c>
						<c ca="left">
							<p>0.756</p>
						</c>
						<c ca="left">
							<p>0.012</p>
						</c>
						<c ca="left">
							<p>0.011</p>
						</c>
						<c ca="left">
							<p>0.031</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>3</p>
						</c>
						<c ca="left">
							<p>246</p>
						</c>
						<c ca="left">
							<p>213</p>
						</c>
						<c ca="left">
							<p>232</p>
						</c>
						<c ca="left">
							<p>134</p>
						</c>
						<c ca="left">
							<p>67</p>
						</c>
						<c ca="left">
							<p>86</p>
						</c>
						<c ca="left">
							<p>79</p>
						</c>
						<c ca="left">
							<p>77</p>
						</c>
						<c ca="left">
							<p>94</p>
						</c>
						<c ca="left">
							<p>61</p>
						</c>
						<c ca="left">
							<p>0.725</p>
						</c>
						<c ca="left">
							<p>0.017</p>
						</c>
						<c ca="left">
							<p>0.021</p>
						</c>
						<c ca="left">
							<p>0.098</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>4</p>
						</c>
						<c ca="left">
							<p>200</p>
						</c>
						<c ca="left">
							<p>191</p>
						</c>
						<c ca="left">
							<p>220</p>
						</c>
						<c ca="left">
							<p>83</p>
						</c>
						<c ca="left">
							<p>197</p>
						</c>
						<c ca="left">
							<p>49</p>
						</c>
						<c ca="left">
							<p>81</p>
						</c>
						<c ca="left">
							<p>116</p>
						</c>
						<c ca="left">
							<p>111</p>
						</c>
						<c ca="left">
							<p>135</p>
						</c>
						<c ca="left">
							<p>0.708</p>
						</c>
						<c ca="left">
							<p>0.019</p>
						</c>
						<c ca="left">
							<p>0.024</p>
						</c>
						<c ca="left">
							<p>0.058</p>
						</c>
					</r>
					<r>
						<c ca="left">
							<p>5</p>
						</c>
						<c ca="left">
							<p>598</p>
						</c>
						<c ca="left">
							<p>424</p>
						</c>
						<c ca="left">
							<p>695</p>
						</c>
						<c ca="left">
							<p>451</p>
						</c>
						<c ca="left">
							<p>141</p>
						</c>
						<c ca="left">
							<p>342</p>
						</c>
						<c ca="left">
							<p>260</p>
						</c>
						<c ca="left">
							<p>266</p>
						</c>
						<c ca="left">
							<p>229</p>
						</c>
						<c ca="left">
							<p>234</p>
						</c>
						<c ca="left">
							<p>0.674</p>
						</c>
						<c ca="left">
							<p>0.025</p>
						</c>
						<c ca="left">
							<p>0.077</p>
						</c>
						<c ca="left">
							<p>0.152</p>
						</c>
					</r>
				</tblbdy>
				<tblfn>
					<p>Each of the five sample genes has two sets of expression values, one for each group of people: five insulin-sensitive (IS) humans and five insulin-resistant (IR) humans. Each set contains five expression values. Four values are calculated for each gene: the FM d-value, and the p-values for FM-test, t-test, and rank sum test.</p>
				</tblfn>
			</tbl>
			<p>
				<graphic file="1471-2105-7-S4-S7-i1.gif"/>
			</p>
			<p>where &#956;<sub><it>S </it></sub>and &#963;<sub><it>S </it></sub>are the sample mean and standard deviation of <it>S</it>, respectively.</p>
			<p>One limitation of the t-test is that it cannot distinguish two sets with close means, even when the two sets are significantly different from each other. Another limitation is that it is very sensitive to extreme values.</p>
			<p>Another popular statistical method is the Wilcoxon rank sum test, which can be used to test the null hypothesis that two sets <it>S</it><sub><it>1 </it></sub>and <it>S</it><sub><it>2 </it></sub>have the same distribution. We first merge the data from these two sets and rank the values from the lowest to the highest, with all sequences of ties being assigned an average rank. The Wilcoxon test statistic <it>W </it>is the sum of the ranks from set <it>S</it><sub><it>1</it></sub>. Assuming that the two sets have the same continuous distribution (and no ties occur), <it>W </it>has a mean and standard deviation given by</p>
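			<p>The t-value computation can be illustrated with a short Python sketch. It is an assumption that the formula takes the unpooled (Welch) form, in which the difference of the sample means is divided by the square root of the sum of each sample variance over its set size; the function name t_value is hypothetical, and the example inputs are the two groups of gene 5 as listed in Table 1.</p>
			<p><![CDATA[
```python
import math
from statistics import mean, variance


def t_value(s1, s2):
    """Two-sample t statistic with unpooled variances (assumed Welch form)."""
    # variance() is the sample variance (n - 1 denominator), i.e. sigma_S squared.
    standard_error = math.sqrt(variance(s1) / len(s1) + variance(s2) / len(s2))
    return (mean(s1) - mean(s2)) / standard_error


# Gene 5 of Table 1: the first and second groups of five values, as listed.
first_group = [598, 424, 695, 451, 141]
second_group = [342, 260, 266, 229, 234]
print(round(t_value(first_group, second_group), 3))
```
]]></p>
			<p>The single low value 141 in the first group inflates that group's variance and shrinks the statistic, which illustrates the sensitivity to extreme values noted above.</p>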
			<p>
				<graphic file="1471-2105-7-S4-S7-i2.gif"/>
			</p>
			<p>
				<graphic file="1471-2105-7-S4-S7-i3.gif"/>
			</p>
			<p>where <it>m </it>= |<it>S</it><sub><it>1</it></sub>| and <it>n </it>= |<it>S</it><sub><it>2</it></sub>|.</p>
			<p>We test the null hypothesis <it>H</it><sub><it>o</it></sub>: no difference in distributions. A one-sided alternative is <it>H</it><sub><it>a</it></sub>: <it>S</it><sub><it>1 </it></sub>yields lower measurements. We use this alternative if we expect or see that <it>W </it>is unusually lower than its expected value &#956;. In this case, the p-value is given by a normal approximation. We let <it>N</it>~<it>N</it>(&#956;,&#963;) and compute the left-tail <it>Pr(N &#8804; W) </it>(using continuity correction if <it>W </it>is an integer).</p>
			<p>If we expect or see that <it>W </it>is much higher than its expected value, then we should use the alternative <it>H</it><sub><it>a</it></sub>: <it>S</it><sub><it>1 </it></sub>yields higher measurements. In this case, the p-value is given by the right-tail <it>Pr(N &#8805; W)</it>. If the two sums of ranks from each set are close, then we could use a two-sided alternative <it>H</it><sub><it>a</it></sub>: there is a difference in distributions. In this case, the p-value is given by twice the smaller tail value: 2*<it>Pr(N &#8804; W)</it> if <it>W </it>&lt; &#956;, or 2*<it>Pr(N &#8805; W)</it> if <it>W </it>&gt; &#956;.</p>
			<p>Although the rank sum test overcomes the t-test's sensitivity to extreme values, it is insensitive to absolute values because it uses only ranks. This might be advantageous for some applications but not for others.</p>
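			<p>The rank sum procedure described above can be sketched in Python as follows. The sketch assumes the standard expressions for the mean and standard deviation of <it>W</it>, namely m(m+n+1)/2 and the square root of mn(m+n+1)/12; the function names are hypothetical, and the continuity correction is applied only when <it>W </it>is an integer, as stated in the text.</p>
			<p><![CDATA[
```python
import math


def average_ranks(values):
    """Rank values from lowest to highest; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def rank_sum(s1, s2):
    """Return (W, mu, sigma, two-sided p-value) using the normal approximation."""
    m, n = len(s1), len(s2)
    ranks = average_ranks(list(s1) + list(s2))
    w = sum(ranks[:m])  # sum of the ranks that belong to S1
    mu = m * (m + n + 1) / 2
    sigma = math.sqrt(m * n * (m + n + 1) / 12)
    correction = 0.5 if float(w).is_integer() else 0.0
    if w < mu:
        tail = normal_cdf(w + correction, mu, sigma)        # left tail
    elif w > mu:
        tail = 1.0 - normal_cdf(w - correction, mu, sigma)  # right tail
    else:
        tail = 0.5
    return w, mu, sigma, min(1.0, 2.0 * tail)
```
]]></p>
			<p>For a one-sided alternative, the corresponding single tail would be used instead of twice the smaller tail.</p>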
		</sec>
		<sec>
			<st>
				<p>Results</p>
			</st>
			<p>To validate our approach, first, we investigated the distribution of FM d-value on a set of synthetic datasets. Second, we conducted experiments on a synthetic dataset to study the relationship between FM-test d-value and its empirical p-value. Third, on another synthetic dataset, we studied the relationship between FM d-value and the mean difference of distributions.</p>
			<sec>
				<st>
					<p>The probability distribution of FM d-value</p>
				</st>
				<p>Suppose two sets <it>S</it><sub><it>1 </it></sub>and <it>S</it><sub><it>2 </it></sub>are randomly drawn from the same normal distribution. What is the probability distribution of the FM d-value? To answer this question, we conducted the following simulation:</p>
				<p>1. We generated <it>N </it>= 64000 pairs of sets of values, with each set containing 5 values. As shown in Figure <figr fid="F1">1(a)</figr>, each value in the two data sets is randomly generated from the same normal distribution <it>N</it>(0,1).</p>
				<fig id="F1">
					<title>
						<p>Figure 1</p>
					</title>
					<caption>
						<p>Random generation of d-value from normal distribution</p>
					</caption>
					<text>
						<p><b>Random generation of d-value from normal distribution</b>. (a) shows the random generation of two sets of values from the same normal distribution and the calculation of the FM d-value of these two sets. (b) shows the random generation of two sets of values from two different normal distributions and the calculation of FM d-value of these two sets.</p>
					</text>
					<graphic file="1471-2105-7-S4-S7-1"/>
				</fig>
				<p>2. We calculated the d-value for each pair of sets.</p>
				<p>3. We then estimated the probability density value <graphic file="1471-2105-7-S4-S7-i4.gif"/> where <it>&#948; </it>= 0.005. The value is essentially the fraction of the FM d-values falling in the region [d-<it>&#948;</it>, d+<it>&#948;</it>] divided by the region length 2<it>&#948;</it>. The resulting probability density function of the d-distribution is plotted in Figure <figr fid="F2">2</figr>.</p>
				<fig id="F2">
					<title>
						<p>Figure 2</p>
					</title>
					<caption>
						<p>The probability density function of FM d-value</p>
					</caption>
					<text>
						<p><b>The probability density function of FM d-value</b>. The probability density function of FM d-value shows that most d-values fall into the middle region and only 5% of d-values are greater than 0.6056; these d-values are considered significant.</p>
					</text>
					<graphic file="1471-2105-7-S4-S7-2"/>
				</fig>
				<p>4. Finally, in order to understand the effect of the number of pairs used for simulation, i.e., the size of the dataset, on the approximation error of the d-distribution, we generated datasets of different sizes. For each data size, we generated 10 datasets, and thus derived 10 probability density functions. The maximum standard deviation for all d-values is recorded as the <it>error rate </it>for that data size. As shown in Figure <figr fid="F3">3</figr>, as expected, the error rate decreases as the size of the dataset increases.</p>
				<fig id="F3">
					<title>
						<p>Figure 3</p>
					</title>
					<caption>
						<p>The impact of dataset size on error rate of PDF of FM d-value</p>
					</caption>
					<text>
						<p><b>The impact of dataset size on error rate of PDF of FM d-value</b>. We show the error rate for different data sizes from 500 to 32000. For each data size, we generated 10 datasets, and thus derived 10 probability density functions. The maximum standard deviation for all d-values is recorded as the <it>error rate </it>for that data size. The error rate decreases as the size of the dataset increases.</p>
					</text>
					<graphic file="1471-2105-7-S4-S7-3"/>
				</fig>
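				<p>Steps 3 and 4 above can be sketched in Python as follows. The sketch takes a list of already computed d-values as input, since the d-value itself is defined elsewhere in the paper. The function names are hypothetical, and the reading of the error rate as the largest sample standard deviation, over grid points, of the repeated density estimates is an assumption.</p>
				<p><![CDATA[
```python
from statistics import stdev


def density_estimate(d_values, d, delta=0.005):
    """Fraction of d-values inside [d - delta, d + delta], divided by 2 * delta."""
    inside = sum(1 for v in d_values if abs(v - d) <= delta)
    return inside / len(d_values) / (2 * delta)


def density_curve(d_values, grid, delta=0.005):
    """Density estimates at every point of a grid of d-values."""
    return [density_estimate(d_values, d, delta) for d in grid]


def error_rate(curves):
    """Largest per-grid-point standard deviation across repeated density curves.

    Assumption: curves holds one density curve per simulated dataset
    (10 in the text), all evaluated on the same grid.
    """
    return max(stdev(point) for point in zip(*curves))
```
]]></p>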
				<p>From Figure <figr fid="F2">2</figr>, we can see that most FM d-values fall into the range from 0.2 to 0.5, and very few are greater than 0.6 or less than 0.2. In particular, when <it>d </it>&#8805; 0.6056, p-value &#8804; 0.05. This is reflected in the red-shaded area in Figure <figr fid="F2">2</figr> with <graphic file="1471-2105-7-S4-S7-i5.gif"/><it>f</it>(<it>x</it>)<it>dx </it>= 0.05. Therefore, given two sets <it>S</it><sub><it>1 </it></sub>and <it>S</it><sub><it>2 </it></sub>drawn from the same unit normal distribution, the chance that the pair has an FM d-value equal to or greater than 0.6056 is very low. Conversely, if we observe that two sets have a d-value equal to or greater than 0.6056, then this is strong evidence that the two sets are drawn from two different distributions, and they should be considered significantly divergent.</p>
				<p>Figure <figr fid="F3">3</figr> shows the effect of data size on the error rate of the derived probability density function. As the data size increases, the error rate decreases. We can see from Figure <figr fid="F3">3</figr> that, after the number of pairs of sets in a dataset is greater than 8000, the trend of the error rate becomes stable. Thus, to obtain a reliable empirical p-value for FM-test, the data size should be greater than 8000.</p>
			</sec>
			<sec>
				<st>
					<p>Relationship between FM d-value and its empirical p-value</p>
				</st>
				<p>Suppose two sets <it>S</it><sub><it>1 </it></sub>and <it>S</it><sub><it>2 </it></sub>are drawn from the same normal distribution. What is the probability that they have an FM d-value equal to or greater than a particular <it>D</it>? If <it>D </it>increases, will this probability decrease? To answer these questions, we studied the relationship between FM d-value and empirical p-value as follows:</p>
				<p>1. Based on the above experimental result, we know that we need at least 8000 pairs of sets to obtain a reliable empirical p-value. Therefore, in this experiment, we generated 10000 pairs of sets of values, with each set containing 5 values. Each value is randomly generated from the unit normal distribution <it>N</it>(0,1).</p>
				<p>2. We calculated the d-value for each pair of sets.</p>
				<p>3. For each pair of sets <it>S</it><sub><it>1 </it></sub>and <it>S</it><sub><it>2 </it></sub>with d-value <it>D</it>, we calculated its empirical p-value as (<it>n </it>+ 1)/10001, where <it>n </it>is the number of pairs among these 10000 pairs that have a d-value equal to or greater than <it>D</it>.</p>
				<p>4. We plotted the relationship between d-value and empirical p-value in Figure <figr fid="F4">4</figr>.</p>
				<fig id="F4">
					<title>
						<p>Figure 4</p>
					</title>
					<caption>
						<p>The relationship between FM d-value and its empirical p-value</p>
					</caption>
					<text>
						<p><b>The relationship between FM d-value and its empirical p-value</b>. As the d-value increases, the p-value decreases. In particular, when <it>d </it>&#8805; <it>0.6056</it>, we have p-value &#8804; 0.05.</p>
					</text>
					<graphic file="1471-2105-7-S4-S7-4"/>
				</fig>
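				<p>The empirical p-value of step 3 can be written as a small Python function. This is a minimal sketch: the function name is hypothetical, and the list of null d-values stands for the d-values of the 10000 simulated pairs, so that the denominator becomes 10001.</p>
				<p><![CDATA[
```python
def empirical_p_value(null_d_values, d_observed):
    """(n + 1) / (N + 1), where n counts the null d-values >= d_observed."""
    n = sum(1 for v in null_d_values if v >= d_observed)
    return (n + 1) / (len(null_d_values) + 1)
```
]]></p>
				<p>Adding one to both the count and the denominator keeps the estimate strictly positive, so an observed d-value larger than every simulated one still receives the smallest attainable p-value of 1/10001 rather than zero.</p>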
				<p>From Figure <figr fid="F4">4</figr>, we can see that as d-value increases, the p-value decreases. In particular, when <it>d </it>&#8805; 0.6056, we have p-value &#8804; 0.05.</p>
			</sec>
			<sec>
				<st>
					<p>Relationship between FM d-value and the mean difference of distributions</p>
				</st>
				<p>Suppose two sets <it>S</it><sub><it>1 </it></sub>and <it>S</it><sub><it>2 </it></sub>are drawn from two different distributions. A good divergence measurement should then satisfy the following property: the less overlap between these two distributions, the greater the d-value. We validated that our FM-test has this property as follows:</p>
				<p>1. As shown in Figure <figr fid="F1">1(b)</figr>, two data sets are generated from two distributions. Let <it>N</it>(0,1) and <it>N</it>(x, 1) be two normal distributions, where <it>x </it>is the mean difference between these two distributions. In this experiment, we consider <it>x </it>= 0, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, respectively.</p>
				<p>2. We generated 1000 pairs of sets of values, with the first set containing 5 values that are randomly generated from <it>N</it>(0,1), and the second set containing 5 values that are randomly generated from <it>N</it>(x, 1).</p>
				<p>3. We calculated the d-value for each pair. Let the average of these 1000 d-values be <it>d</it>. We then plotted (<it>x</it>, <it>d</it>) in Figure <figr fid="F5">5</figr>.</p>
				<fig id="F5">
					<title>
						<p>Figure 5</p>
					</title>
					<caption>
						<p>Relationship between the mean difference of distributions and d-value</p>
					</caption>
					<text>
						<p><b>Relationship between the mean difference of distributions and d-value</b>. Two datasets are generated from two distributions. Let <it>N</it>(0,1) and <it>N</it>(x, 1) be two normal distributions, where <it>x </it>is the mean difference between these two distributions. In this experiment, we consider <it>x </it>= 0, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, respectively. The d-value between two sets increases when the mean difference of two data sets increases.</p>
					</text>
					<graphic file="1471-2105-7-S4-S7-5"/>
				</fig>
				<p>4. We repeated steps 2 and 3 for each value of <it>x</it>. Finally, the curve was drawn in Figure <figr fid="F5">5</figr>.</p>
				<p>Figure <figr fid="F5">5</figr> confirms the desirable property of FM-test: the larger the mean difference between the two distributions, the greater the d-value.</p>
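				<p>The simulation loop of steps 1 to 4 can be sketched in Python as follows. Because the FM d-value is defined elsewhere in the paper, the sketch accepts any divergence function as a parameter and, purely for illustration, plugs in the absolute difference of sample means; this stand-in is not the FM d-value. The function names and the fixed seed are assumptions.</p>
				<p><![CDATA[
```python
import random
from statistics import mean


def average_divergence(divergence, x, pairs=1000, size=5, seed=0):
    """Average divergence between samples of N(0, 1) and N(x, 1) over many pairs."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(pairs):
        s1 = [rng.gauss(0.0, 1.0) for _ in range(size)]
        s2 = [rng.gauss(x, 1.0) for _ in range(size)]
        total += divergence(s1, s2)
    return total / pairs


def mean_gap(s1, s2):
    """Stand-in statistic (NOT the FM d-value): absolute difference of means."""
    return abs(mean(s1) - mean(s2))


shifts = [0.5 * k for k in range(10)]  # x = 0, 0.5, ..., 4.5
curve = [(x, average_divergence(mean_gap, x)) for x in shifts]
```
]]></p>
				<p>Replacing mean_gap with an implementation of the FM d-value would produce the points (<it>x</it>, <it>d</it>) that are plotted in Figure <figr fid="F5">5</figr>.</p>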
			</sec>
		</sec>
		<sec>
			<st>
				<p>Discussion</p>
			</st>
			<sec>
				<st>
					<p>Analyzing diabetes data with FM-test</p>
				</st>
				<p>A diabetes dataset of microarray gene expression for a total of 10831 genes downloadable from <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> is used for analysis. For each gene, there are ten expression values, five from a group of insulin-sensitive (IS) people and five from a group of insulin-resistant (IR) people. Only the genes that have no null expression values are included in this analysis. We also require that, for a gene to be included, at least five out of its ten expression values are greater than 100. This eliminates the genes whose expression values are noisy and not reliable.</p>
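				<p>The two inclusion criteria above can be expressed as a short Python filter. This is a sketch under stated assumptions: a null expression value is represented as None, each gene is given as a list of its ten expression values, and the function name keep_gene is hypothetical.</p>
				<p><![CDATA[
```python
def keep_gene(values, min_count=5, threshold=100):
    """True if no value is null and at least min_count values exceed threshold."""
    if any(v is None for v in values):
        return False
    return sum(1 for v in values if v > threshold) >= min_count


# Gene 1 of Table 1: nine of its ten values exceed 100, so it is kept.
gene_1 = [750, 559, 649, 685, 636, 310, 359, 135, 97, 178]
print(keep_gene(gene_1))
```
]]></p>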
				<p>The results of FM-test are compared with the results of t-test and rank sum test. As can be seen in Table <tblr tid="T2">2</tblr>, although the rankings differ among the methods, all three methods identify the ten best-ranked genes as significantly differentially expressed between the IS and IR groups. Furthermore, the ten worst-ranked genes in FM-test, also shown in Table <tblr tid="T2">2</tblr>, are consistent with the results of the other two methods. However, gene <it>U49835 </it>is identified by FM-test as the 21st-ranked significant gene, with p-value 0.0258, whereas neither t-test (p-value 0.0768) nor rank sum test (p-value 0.1522) identifies this gene as significant.</p>
				<tbl id="T2">
					<title>
						<p>Table 2</p>
					</title>
					<caption>
						<p>Ten best-ranked and worst-ranked genes of diabetes identified by FM-test.</p>
					</caption>
					<tblbdy cols="6">
						<r>
							<c ca="center">
								<p>
									<b>Probe Set</b>
								</p>
							</c>
							<c ca="left">
								<p>
									<b>Gene Description</b>
								</p>
							</c>
							<c ca="center">
								<p>
									<b>d-value</b>
								</p>
							</c>
							<c ca="center">
								<p>
									<b>Empirical p-value</b>
								</p>
							</c>
							<c ca="center">
								<p>
									<b>t-test p-value</b>
								</p>
							</c>
							<c ca="center">
								<p>
									<b>rank sum p-value</b>
								</p>
							</c>
						</r>
						<r>
							<c cspan="6">
								<hr/>
							</c>
						</r>
						<r>
							<c ca="center">
								<p>U45973</p>
							</c>
							<c ca="left">
								<p>Human phosphatidylinositol (4,5) bisphosphate</p>
							</c>
							<c ca="center">
								<p>0.999</p>
							</c>
							<c ca="center">
								<p>0.0003</p>
							</c>
							<c ca="center">
								<p>0.0001</p>
							</c>
							<c ca="center">
								<p>0.0076</p>
							</c>
						</r>
						<r>
							<c ca="center">
								<p>M60858</p>
							</c>
							<c ca="left">
								<p>Human nucleolin gene</p>
							</c>
							<c ca="center">
								<p>0.935</p>
							</c>
							<c ca="center">
								<p>0.0016</p>
							</c>
							<c ca="center">
								<p>0.0017</p>
							</c>
							<c ca="center">
								<p>0.0076</p>
							</c>
						</r>
						<r>
							<c ca="center">
								<p>
									<b>D85181</b>
								</p>
							</c>
							<c ca="left">
								<p>Homo sapiens mRNA for fungal sterol-C5-desaturase homolog</p>
							</c>
							<c ca="center">
								<p>0.892</p>
							</c>
							<c ca="center">
								<p>0.0028</p>
							</c>
							<c ca="center">
								<p>0.0029</p>
							</c>
							<c ca="center">
								<p>0.0147</p>
							</c>
						</r>
						<r>
							<c ca="center">
								<p>
									<b>M95610</b>
								</p>
							</c>
							<c ca="left">
								<p>Human alpha 2 type IX collagen (COL9A2) mRNA</p>
							</c>
							<c ca="center">
								<p>0.872</p>
							</c>
							<c ca="center">
								<p>0.0038</p>
							</c>
							<c ca="center">
								<p>0.0066</p>
							</c>
							<c ca="center">
								<p>0.0076</p>
							</c>
						</r>
						<r>
							<c ca="center">
								<p>L07648</p>
							</c>
							<c ca="left">
								<p>Human MXI1 mRNA</p>
							</c>
							<c ca="center">
								<p>0.858</p>
							</c>
							<c ca="center">
								<p>0.0043</p>
							</c>
							<c ca="center">
								<p>0.0052</p>
							</c>
							<c ca="center">
								<p>0.0076</p>
							</c>
						</r>
						<r>
							<c ca="center">
								<p>L07033</p>
							</c>
							<c ca="left">
								<p>Human hydroxymethylglutaryl-CoA lyase mRNA</p>
							</c>
							<c ca="center">
								<p>0.855</p>
							</c>
							<c ca="center">
								<p>0.0046</p>
							</c>
							<c ca="center">
								<p>0.0054</p>
							</c>
							<c ca="center">
								<p>0.0076</p>
							</c>
						</r>
						<r>
							<c ca="center">
								<p>X53586</p>
							</c>
							<c ca="left">
								<p>Human mRNA for integrin alpha 6</p>
							</c>
							<c ca="center">
								<p>0.851</p>
							</c>
							<c ca="center">
								<p>0.0047</p>
							</c>
							<c ca="center">
								<p>0.0075</p>
							</c>
							<c ca="center">
								<p>0.0076</p>
							</c>
						</r>
						<r>
							<c ca="center">
								<p>X81003</p>
							</c>
							<c ca="left">
								<p>Homo sapiens HCG V mRNA</p>
							</c>
							<c ca="center">
								<p>0.791</p>
							</c>
							<c ca="center">
								<p>0.0089</p>
							</c>
							<c ca="center">
								<p>0.0077</p>
							</c>
							<c ca="center">
								<p>0.0076</p>
							</c>
						</r>
						<r>
							<c ca="center">
								<p>
									<b>X57959</b>
								</p>
							</c>
							<c ca="left">
								<p>ribosomal protein L7</p>
							</c>
							<c ca="center">
								<p>0.767</p>
							</c>
							<c ca="center">
								<p>0.0108</p>
							</c>
							<c ca="center">
								<p>0.0109</p>
							</c>
							<c ca="center">
								<p>0.0313</p>
							</c>
						</r>
						<r>
							<c ca="center">
								<p>
									<b>U06452</b>
								</p>
							</c>
							<c ca="left">
								<p>melan-A</p>
							</c>
							<c ca="center">
								<p>0.756</p>
							</c>
							<c ca="center">
								<p>0.0126</p>
							</c>
							<c ca="center">
								<p>0.0118</p>
							</c>
							<c ca="center">
								<p>0.0311</p>
							</c>
						</r>
						<r>
							<c cspan="6">
								<hr/>
							</c>
						</r>
						<r>
							<c ca="center">
								<p>X82324</p>
							</c>
							<c ca="left">
								<p>POU domain, class 3, transcription factor 4</p>
							</c>
							<c ca="center">
								<p>0.206</p>
							</c>
							<c ca="center">
								<p>0.9987</p>
							</c>
							<c ca="center">
								<p>0.407</p>
							</c>
							<c ca="center">
								<p>1</p>
							</c>
						</r>
						<r>
							<c ca="center">
								<p>M14764</p>
							</c>
							<c ca="left">
								<p>nerve growth factor receptor (TNFR superfamily, member 16)</p>
							</c>
							<c ca="center">
								<p>0.204</p>
							</c>
							<c ca="center">
								<p>0.9989</p>
							</c>
							<c ca="center">
								<p>0.652</p>
							</c>
							<c ca="center">
								<p>1</p>
							</c>
						</r>
						<r>
							<c ca="center">
								<p>M64673</p>
							</c>
							<c ca="left">
								<p>heat shock transcription factor 1</p>
							</c>
							<c ca="center">
								<p>0.204</p>
							</c>
							<c ca="center">
								<p>0.9990</p>
							</c>
							<c ca="center">
								<p>0.652</p>
							</c>
							<c ca="center">
								<p>0.844</p>
							</c>
						</r>
						<r>
							<c ca="center">
								<p>U20657</p>
							</c>
							<c ca="left">
								<p>ubiquitin specific peptidase 4 (proto-oncogene)</p>
							</c>
							<c ca="center">
								<p>0.197</p>
							</c>
							<c ca="center">
								<p>0.9993</p>
							</c>
							<c ca="center">
								<p>0.642</p>
							</c>
							<c ca="center">
								<p>0.844</p>
							</c>
						</r>
						<r>
							<c ca="center">
								<p>D17793</p>
							</c>
							<c ca="left">
								<p>aldo-keto reductase family 1, member C3</p>
							</c>
							<c ca="center">
								<p>0.196</p>
							</c>
							<c ca="center">
								<p>0.9999</p>
							</c>
							<c ca="center">
								<p>0.471</p>
							</c>
							<c ca="center">
								<p>0.839</p>
							</c>
						</r>
						<r>
							<c ca="center">
								<p>D78014</p>
							</c>
							<c ca="left">
								<p>dihydropyrimidinase-like 3</p>
							</c>
							<c ca="center">
								<p>0.194</p>
							</c>
							<c ca="center">
								<p>1</p>
							</c>
							<c ca="center">
								<p>0.620</p>
							</c>
							<c ca="center">
								<p>0.548</p>
							</c>
						</r>
						<r>
							<c ca="center">
								<p>AB002314</p>
							</c>
							<c ca="left">
								<p>PDZ domain containing 10</p>
							</c>
							<c ca="center">
								<p>0.191</p>
							</c>
							<c ca="center">
								<p>1</p>
							</c>
							<c ca="center">
								<p>0.367</p>
							</c>
							<c ca="center">
								<p>0.545</p>
							</c>
						</r>
						<r>
							<c ca="center">
								<p>L20348</p>
							</c>
							<c ca="left">
								<p>oncomodulin</p>
							</c>
							<c ca="center">
								<p>0.181</p>
							</c>
							<c ca="center">
								<p>1</p>
							</c>
							<c ca="center">
								<p>0.405</p>
							</c>
							<c ca="center">
								<p>0.544</p>
							</c>
						</r>
						<r>
							<c ca="center">
								<p>D50063</p>
							</c>
							<c ca="left">
								<p>proteasome (prosome, macropain) 26S subunit</p>
							</c>
							<c ca="center">
								<p>0.179</p>
							</c>
							<c ca="center">
								<p>1</p>
							</c>
							<c ca="center">
								<p>0.544</p>
							</c>
							<c ca="center">
								<p>0.421</p>
							</c>
						</r>
						<r>
							<c ca="center">
								<p>Z79581</p>
							</c>
							<c ca="left">
								<p>H.sapiens LAZ3/BCL6 gene, first non coding exon</p>
							</c>
							<c ca="center">
								<p>0.179</p>
							</c>
							<c ca="center">
								<p>1</p>
							</c>
							<c ca="center">
								<p>0.545</p>
							</c>
							<c ca="center">
								<p>0.407</p>
							</c>
						</r>
					</tblbdy>
				</tbl>
				<p>To study the relevance of genes in insulin metabolism and diabetes, we searched the published literature for the 10 best-ranked differentially regulated genes shown in Table <tblr tid="T2">2</tblr>. Human phosphatidylinositol(4,5) bisphosphate 5-phosphatase homolog (gene <it>U45973</it>) was found to be differentially expressed in insulin resistance cases. Over-expression of inositol polyphosphate 5-phosphatase-2 SHIP2 has been shown to inhibit insulin-stimulated phosphoinositide 3-kinase (PI3K) dependent signaling events. Analysis of diabetic human subjects has revealed an association between SHIP2 gene polymorphism and type 2 diabetes mellitus. Knockout mouse studies have also shown that SHIP2 is a significant therapeutic target for the treatment of type 2 diabetes as well as obesity <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>. Csermely et al. reported that insulin mediates phosphorylation/dephosphorylation of the nucleolar protein nucleolin (gene <it>M60858</it>) by stimulating casein kinase II, and this may play a role in the simultaneous enhancement of RNA efflux from isolated, intact cell nuclei <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. c-myc is an oncogene that codes for the transcription factor Myc, which, along with binding partners such as MAX, plays an important and widely studied role in various physiological processes, including tumor growth in different cancers. Myc modulates the expression of hepatic genes and counteracts the obesity and insulin resistance induced by a high-fat diet in transgenic mice overexpressing c-myc in liver <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>.</p>
				<p>Max interactor protein, MXI1 (gene <it>L07648</it>), competes for MAX and thus negatively regulates MYC function, and may play a role in insulin resistance. In the presence of glucose or glucose and insulin, leucine is utilized more efficiently as a precursor for lipid biosynthesis by adipose tissue. It has been shown that during the differentiation of 3T3-L1 fibroblasts to adipocytes, the rate of lipid biosynthesis from leucine increases at least 30-fold and the specific activity of 3-hydroxy-3-methylglutaryl-CoA lyase (gene <it>L07033</it>), the mitochondrial enzyme catalyzing the terminal reaction in the leucine degradation pathway, increases 4-fold <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>. Schottelndreier et al. <abbrgrp><abbr bid="B12">12</abbr></abbrgrp> have described a regulatory role of integrin alpha 6 (gene <it>X53586</it>) in Ca<sup>2+</sup> signaling, which is known to have a significant role in insulin resistance <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>.</p>
				<p>The HCGV gene product (gene <it>X81003</it>) is known to inhibit the activity of protein phosphatase-1, which is involved in diverse signaling pathways including insulin signaling <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. Human ribosomal protein L7 (gene <it>X57959</it>) plays a regulatory role in the eukaryotic translation apparatus. It has been shown to be an autoantigen in patients with systemic autoimmune diseases, such as systemic lupus erythematosus <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. Identification of this gene in our analysis and by <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> suggests a possible role of this gene in insulin resistance. Published reports on these genes indicate their roles in insulin signaling and warrant further investigation of their functions in insulin resistance cases. We further recommend genes <it>D85181, M95610 </it>and <it>U06452 </it>as candidate genes for future research in this area.</p>
				<p>To compare the fold change of expression levels between the IS and IR groups with the statistical significance given by the p-values, we present all the genes in the diabetes dataset in the volcano plot shown in Figure <figr fid="F6">6</figr>. The volcano plot arranges the genes along dimensions of biological and statistical significance. The X axis is the fold change between the two groups on a log scale, log<sub><it>2</it></sub>(<graphic file="1471-2105-7-S4-S7-i6.gif"/>/<graphic file="1471-2105-7-S4-S7-i7.gif"/>), where <graphic file="1471-2105-7-S4-S7-i6.gif"/> is the mean of the expressions in the IS group, and <graphic file="1471-2105-7-S4-S7-i7.gif"/> is the mean of the expressions in the IR group. In this way, up and down regulation appear symmetric. The Y axis represents the p-value of our FM-test on a negative log scale, -log<sub>10</sub>(<it>p</it>-<it>value</it>), so that smaller p-values appear higher up. The X axis indicates the biological impact of the change; the Y axis indicates the statistical evidence, or reliability, of the change.</p>
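				<p>The two coordinates described above can be sketched in a few lines of Python. This is an illustrative sketch, not the code used to draw Figure <figr fid="F6">6</figr>: the function name volcano_point and the expression values are hypothetical, and both group means are assumed to be positive so that the logarithm is defined.</p>

```python
import math


def volcano_point(is_values, ir_values, p_value):
    """Return (x, y) for one gene: log2 fold change and -log10 of the p-value.

    Assumes both group means are positive and p_value lies in (0, 1].
    """
    mean_is = sum(is_values) / len(is_values)
    mean_ir = sum(ir_values) / len(ir_values)
    x = math.log2(mean_is / mean_ir)  # positive: higher in IS, negative: higher in IR
    y = -math.log10(p_value)          # smaller p-values give larger y
    return x, y


# Hypothetical expression values for a single gene.
x, y = volcano_point([200.0, 220.0, 180.0], [100.0, 90.0, 110.0], 0.01)
print(x, y)
```

				<p>Swapping the two groups changes only the sign of the X coordinate, which is why up and down regulation appear symmetric on the log scale.</p>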
				<fig id="F6">
					<title>
						<p>Figure 6</p>
					</title>
					<caption>
						<p>The volcano plot for the diabetes dataset</p>
					</caption>
					<text>
						<p><b>The volcano plot for the diabetes dataset</b>. We compare the fold change of expression levels between the IS and IR groups with the statistical significance given by the p-values in a volcano plot. The volcano plot arranges the genes along dimensions of biological and statistical significance. The X axis is the fold change between the two groups on a log scale, log<sub>2</sub>(<graphic file="1471-2105-7-S4-S7-i6.gif"/>/<graphic file="1471-2105-7-S4-S7-i7.gif"/>), where <graphic file="1471-2105-7-S4-S7-i6.gif"/> is the mean of the expressions in the IS group, and <graphic file="1471-2105-7-S4-S7-i7.gif"/> is the mean of the expressions in the IR group. The plot shows that only a few genes differ significantly between the two groups.</p>
					</text>
					<graphic file="1471-2105-7-S4-S7-6"/>
				</fig>
				<p>As shown in Figure <figr fid="F6">6</figr>, gene <it>U45973 </it>is identified by FM-test as the most statistically significant gene and is over-expressed in the IR group; gene <it>X53586 </it>is identified by FM-test as the seventh most statistically significant gene and is over-expressed in the IS group. Although genes <it>M60858, D85181, M95610, L07648, L07033</it>, and <it>X81003 </it>are among the top ten significant genes identified by FM-test, they are not over-expressed in either group. Finally, gene <it>U41515 </it>is identified by FM-test as the 11th most significant gene and is over-expressed in the IS group.</p>
				<p>In summary, for six of the top 10 genes identified by FM-test, we found reports in the literature of an association with insulin metabolism and diabetes. Among the remaining four genes, gene <it>X57959 </it>has been recommended by <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> as a candidate gene for diabetes, and we recommend genes <it>D85181, M95610 </it>and <it>U06452 </it>as candidate genes for future research in this area.</p>
			</sec>
			<sec>
				<st>
					<p>Analyzing lung cancer data with FM-test</p>
				</st>
				<p>To study the relevance of significant genes in lung cancer, we analyzed a microarray gene expression dataset with a total of 22283 genes, downloadable from <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>, and then searched the published literature for the top-ranked genes. Most of the genes we found have a validated role in tumor progression. Here we discuss a few of the genes that our method ranked best, as shown in Table <tblr tid="T3">3</tblr>. Multiple identifiers of keratins were ranked significant in the dataset. Cytokeratins are a polygenic family of insoluble proteins and have been proposed as potentially useful markers of differentiation in various malignancies including lung cancers <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. Dystonin (DST/BPAG1) is a member of the plakin protein family of adhesion junction plaque proteins. A recent study showed the expression of BPAG1 in epithelial tumor cells <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>. Maspin (SERPINB5) has been shown to be involved in both tumor growth and metastasis, through processes such as cell invasion, angiogenesis, and more recently apoptosis <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. Tumor protein p73-like (TP73L/P63) is implicated in the activation of cell survival and antiapoptotic genes <abbrgrp><abbr bid="B20">20</abbr></abbrgrp> and has been used as a marker for lung cancer. It has been suggested that p63 genomic amplification has an early role in lung tumorigenesis <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>. CLCA2 belongs to the calcium-sensitive chloride conductance protein family and has been used in a multigene detection assay for non-small cell lung cancer (NSCLC) <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>. Plakophilins (PKPs) are members of the armadillo multigene family that function in cell adhesion and signal transduction, and also play a central role in tumorigenesis <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>. 
Desmoplakin (DSP) is a desmosome protein that anchors intermediate filaments to desmosomal plaques. Microscopic analysis with fluorescence-labeled antibodies for DSP revealed high expression of membrane DSP in squamous cell carcinomas (SCC) <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>. The data analysis also identified cell cycle regulatory proteins such as CDC20 and Cyclin B1. Overexpression of CDC20 has been shown to be associated with premature anaphase promotion, resulting in mitotic abnormalities in oral SCC cell lines <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>. Minichromosome maintenance 2 (MCM2) protein is involved in the initiation of DNA replication and is a marker for proliferating cells <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>. Our analysis also identified GPR87 (NM_023915) and UGT1A9 (NM_019093). The role of G protein-coupled receptors in lung cancer is well documented, and GPR87 could be an important gene in cancer progression. Among the overexpressed genes, we suggest NM_023915 and NM_019093 as potential candidates for biological investigation.</p>
				<tbl id="T3">
					<title>
						<p>Table 3</p>
					</title>
					<caption>
						<p>Ten best-ranked (overexpressed) cancer genes identified by FM-test.</p>
					</caption>
					<tblbdy cols="3">
						<r>
							<c ca="center">
								<p>
									<b>Probe Set</b>
								</p>
							</c>
							<c ca="left">
								<p>
									<b>Gene Description</b>
								</p>
							</c>
							<c ca="center">
								<p>
									<b>p-value</b>
								</p>
							</c>
						</r>
						<r>
							<c cspan="3">
								<hr/>
							</c>
						</r>
						<r>
							<c ca="center">
								<p>NM_173086</p>
							</c>
							<c ca="left">
								<p>KRT6E: Keratin 6E</p>
							</c>
							<c ca="center">
								<p>0.000125</p>
							</c>
						</r>
						<r>
							<c ca="center">
								<p>NM_001723</p>
							</c>
							<c ca="left">
								<p>DST: Dystonin</p>
							</c>
							<c ca="center">
								<p>0.000125</p>
							</c>
						</r>
						<r>
							<c ca="center">
								<p>NM_002639</p>
							</c>
							<c ca="left">
								<p>SERPINB5: Serpin peptidase inhibitor, clade B (ovalbumin), member 5</p>
							</c>
							<c ca="center">
								<p>0.000125</p>
							</c>
						</r>
						<r>
							<c ca="center">
								<p>AB010153</p>
							</c>
							<c ca="left">
								<p>TP73L: Tumor protein p73-like</p>
							</c>
							<c ca="center">
								<p>0.000125</p>
							</c>
						</r>
						<r>
							<c ca="center">
								<p>NM_023915</p>
							</c>
							<c ca="left">
								<p>GPR87: G protein-coupled receptor 87</p>
							</c>
							<c ca="center">
								<p>0.000125</p>
							</c>
						</r>
						<r>
							<c ca="center">
								<p>NM_006536</p>
							</c>
							<c ca="left">
								<p>CLCA2: Chloride channel, calcium activated, family member 2</p>
							</c>
							<c ca="center">
								<p>0.000125</p>
							</c>
						</r>
						<r>
							<c ca="center">
								<p>NM_001005337</p>
							</c>
							<c ca="left">
								<p>PKP1: Plakophilin 1 (ectodermal dysplasia/skin fragility syndrome)</p>
							</c>
							<c ca="center">
								<p>0.000125</p>
							</c>
						</r>
						<r>
							<c ca="center">
								<p>AF043977</p>
							</c>
							<c ca="left">
								<p>CLCA2: Chloride channel, calcium activated, family member 2</p>
							</c>
							<c ca="center">
								<p>0.000125</p>
							</c>
						</r>
						<r>
							<c ca="center">
								<p>NM_004415</p>
							</c>
							<c ca="left">
								<p>DSP: Desmoplakin</p>
							</c>
							<c ca="center">
								<p>0.000125</p>
							</c>
						</r>
						<r>
							<c ca="center">
								<p>NM_019093</p>
							</c>
							<c ca="left">
								<p>UGT1A9: UDP glucuronosyltransferase 1 family, polypeptide A9</p>
							</c>
							<c ca="center">
								<p>0.000125</p>
							</c>
						</r>
					</tblbdy>
				</tbl>
			</sec>
		</sec>
		<sec>
			<st>
				<p>Conclusion</p>
			</st>
			<p>We proposed an innovative approach based on fuzzy set theory, FM-test, that quantifies the divergence of two sets directly. We validated FM-test on synthetic datasets and showed that it is effective and robust. We also applied FM-test to a real diabetes dataset and a cancer dataset. For each dataset, we identified 10 significant genes. Of the 10 significant genes in the diabetes dataset, six have been confirmed in the literature to be associated with insulin signaling and/or diabetes, and one has been recommended by others; the remaining three genes, <it>D85181</it>, <it>M95610 </it>and <it>U06452</it>, are suggested as potential diabetes genes involved in insulin resistance for further biological investigation. Of the 10 significantly overexpressed genes identified in the lung cancer data, eight are confirmed by the literature to be related to lung cancer. The remaining two genes, NM_023915 and NM_019093, are potential candidates for further biological investigation. In addition, we analyzed the asymptotic properties of the distribution of the FM d-value and the equation used to calculate its p-value. The analysis is presented in the appendix. FM-test is implemented as a Web-based application and can be accessed for free at <url>http://database.cs.wayne.edu/bioinformatics</url>.</p>
		</sec>
		<sec>
			<st>
				<p>Methods</p>
			</st>
			<p>In this section, based on fuzzy set theory <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>, we present our innovative approach, the fuzzy-set-theory-based method test (FM-test), to quantify the divergence of two sets of values directly and robustly. In addition, in the appendix, we show the asymptotic property of FM-test and establish the relationship between the FM d-value and the p-value.</p>
			<p>Let <it>S</it><sub><it>1 </it></sub>and <it>S</it><sub><it>2 </it></sub>be two sets of values of a particular feature for two groups of samples under two different conditions. The basic idea is to consider the two sets of values as samples from two different fuzzy sets. We examine the membership value of each element with respect to the other fuzzy set. By calculating the average of membership values, we measure the divergence of the original two sets. In particular, we perform the following steps:</p>
			<p>1. Compute the sample mean and standard deviation of <it>S</it><sub><it>1 </it></sub>and of <it>S</it><sub><it>2 </it></sub>respectively.</p>
			<p>2. Characterize <it>S</it><sub><it>1 </it></sub>and <it>S</it><sub><it>2 </it></sub>as two fuzzy sets <it>FS</it><sub><it>1 </it></sub>and <it>FS</it><sub><it>2 </it></sub>whose fuzzy membership functions, <graphic file="1471-2105-7-S4-S7-i8.gif"/>(<it>x</it>) and <graphic file="1471-2105-7-S4-S7-i9.gif"/>(<it>x</it>), are defined with the sample means and standard deviations. The fuzzy membership function <graphic file="1471-2105-7-S4-S7-i10.gif"/>(<it>x</it>) (<it>i</it> = 1,2) maps each value <it>x </it>to a fuzzy membership value that reflects the degree to which <it>x </it>belongs to <it>FS</it><sub><it>i</it></sub> (<it>i</it> = 1,2).</p>
			<p>3. Using the two fuzzy membership functions, <graphic file="1471-2105-7-S4-S7-i8.gif"/>(<it>x</it>) and <graphic file="1471-2105-7-S4-S7-i9.gif"/>(<it>x</it>), quantify the convergence degree of two sets.</p>
			<p>4. Define the divergence degree (FM d-value) between the two sets based on the convergence degree.</p>
			<sec>
				<st>
					<p>Fuzzy Sets and Membership Functions</p>
				</st>
				<p>The sample mean &#956;<sub>1 </sub>of <it>S</it><sub><it>1 </it></sub>is calculated as</p>
				<p>
					<graphic file="1471-2105-7-S4-S7-i11.gif"/>
				</p>
				<p>where <it>n</it><sub><it>1 </it></sub>is the number of elements in <it>S</it><sub><it>1</it></sub>, and the sample standard deviation &#963;<sub><it>1 </it></sub>of <it>S</it><sub><it>1 </it></sub>is calculated as</p>
				<p>
					<graphic file="1471-2105-7-S4-S7-i12.gif"/>
				</p>
				<p>For gene 5 in Table <tblr tid="T1">1</tblr>, we have &#956;<sub><it>1 </it></sub>= 461.8, &#963;<sub><it>1 </it></sub>= 210.59, &#956;<sub><it>2 </it></sub>= 266.2, and &#963;<sub><it>2 </it></sub>= 45.29. We then characterize set <it>S</it><sub><it>1 </it></sub>by a fuzzy set <it>FS</it><sub><it>1 </it></sub>whose fuzzy membership function is defined as</p>
				<p>
					<graphic file="1471-2105-7-S4-S7-i13.gif"/>
				</p>
				<p>The function <graphic file="1471-2105-7-S4-S7-i8.gif"/>(<it>x</it>) maps each value <it>x </it>in <it>S</it><sub><it>1 </it></sub>to a fuzzy membership value to quantify the degree that <it>x </it>belongs to <it>FS</it><sub><it>1</it></sub>. A value equal to the mean has a membership value of 1 and belongs to fuzzy set <it>FS</it><sub><it>1 </it></sub>to a full degree; a value that deviates from the mean has a smaller membership value and belongs to <it>FS</it><sub><it>1 </it></sub>to a smaller degree. The further the value deviates from the mean, the smaller the fuzzy membership value. Similarly, the fuzzy membership function for <it>S</it><sub><it>2 </it></sub>is defined as</p>
				<p>
					<graphic file="1471-2105-7-S4-S7-i14.gif"/>
				</p>
				<p>where &#956;<sub><it>2 </it></sub>and &#963;<sub><it>2 </it></sub>are the mean and standard deviation of <it>S</it><sub><it>2 </it></sub>respectively.</p>
				<p>For gene 5 in Table <tblr tid="T1">1</tblr>, we have <graphic file="1471-2105-7-S4-S7-i15.gif"/> and <graphic file="1471-2105-7-S4-S7-i16.gif"/>. With these two fuzzy membership functions, the fuzzy membership values for each element with respect to the two sets can be calculated. For example, <graphic file="1471-2105-7-S4-S7-i8.gif"/>(598) = 0.81 and <graphic file="1471-2105-7-S4-S7-i9.gif"/>(598) = 2.2 &#215; 10<sup>-12</sup>.</p>
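				<p>The worked values above can be checked with a short Python sketch. The membership formulas themselves appear only as images here, so the Gaussian form exp(-(x - mean)^2 / (2 sd^2)) used below is an assumption; it is chosen because it gives a membership of 1 at the mean and reproduces both values quoted for gene 5. The function name membership is hypothetical.</p>

```python
import math


def membership(x, mu, sigma):
    """Assumed Gaussian membership: 1 at the mean, decaying with distance from it."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


# Means and standard deviations quoted in the text for gene 5.
f1 = membership(598, 461.8, 210.59)
f2 = membership(598, 266.2, 45.29)
print(round(f1, 2))   # 0.81
print(f"{f2:.1e}")    # 2.2e-12
```

				<p>The value 598 lies well within one standard deviation of the first mean but more than seven standard deviations from the second, which explains the very different membership values.</p>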
			</sec>
			<sec>
				<st>
					<p>Our Proposed Method: FM-test</p>
				</st>
				<p>Since the fuzzy membership functions can overlap, one element can belong to more than one fuzzy set with a respective degree for each. For an element in <it>S</it><sub><it>1</it></sub>, we measure the degree to which it belongs to <it>FS</it><sub><it>1 </it></sub>by applying its value to <graphic file="1471-2105-7-S4-S7-i8.gif"/>. Similarly, we can apply its value to <graphic file="1471-2105-7-S4-S7-i9.gif"/> to measure the degree to which it belongs to <it>FS</it><sub><it>2</it></sub>. The idea of FM-test is to consider the membership value of an element in <it>S</it><sub><it>1 </it></sub>with respect to <it>S</it><sub><it>2 </it></sub>as a bond between <it>S</it><sub><it>1 </it></sub>and <it>S</it><sub><it>2</it></sub>, and vice versa; the aggregation of all these bonds then reflects the overall bond between the two sets. The weaker this overall bond is, the more divergent the two sets are. The strength of the overall bond between two sets is quantified by their c-value, which aggregates the mutual membership values of elements in <it>S</it><sub><it>1 </it></sub>and <it>S</it><sub><it>2 </it></sub>and is defined as follows.</p>
				<p><b>Definition 1 (FM c-value)</b>: Given two sets <it>S</it><sub><it>1 </it></sub>and <it>S</it><sub><it>2</it></sub>, the convergence degree between <it>S</it><sub><it>1 </it></sub>and <it>S</it><sub><it>2 </it></sub>in FM-test is defined as</p>
				<p>
					<graphic file="1471-2105-7-S4-S7-i17.gif"/>
				</p>
				<p>Now we define the divergence value in FM-test (FM d-value) as follows.</p>
				<p><b>Definition 2 (FM d-value)</b>: Given two sets <it>S</it><sub><it>1 </it></sub>and <it>S</it><sub><it>2</it></sub>, the FM d-value between <it>S</it><sub><it>1 </it></sub>and <it>S</it><sub><it>2 </it></sub>is defined as</p>
				<p>
					<graphic file="1471-2105-7-S4-S7-i18.gif"/>
				</p>
				<p>For gene 5 in Table <tblr tid="T1">1</tblr>, <it>c(S</it><sub><it>1</it></sub><it>, S</it><sub><it>2</it></sub><it>) = 0.326</it>, thus the divergence value is 1-<it>c(S</it><sub>1</sub><it>, S</it><sub>2</sub><it>) </it>= 0.674. We calculated the p-values of the five genes in Table <tblr tid="T1">1</tblr> with each of the three methods. One interesting observation is that, while both the t-test and the Wilcoxon rank sum test fail to recognize gene 5 as a significant gene since their p-values are greater than 0.05, our FM-test identifies gene 5 as a significant gene with a p-value of 0.025. The t-test and the Wilcoxon rank sum test fail because they are sensitive to the extreme value 141 in the first set of the gene.</p>
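				<p>The two definitions can be sketched in Python as follows. Since the defining equations are available only as images, this sketch rests on labelled assumptions: the c-value is taken to be the average, over all elements of both sets, of each element's membership in the other set's fuzzy set; the membership function is the Gaussian form consistent with the worked values for gene 5; and the standard deviation uses the n - 1 denominator. The function names and the data are hypothetical.</p>

```python
import math
from statistics import mean, stdev


def membership(x, mu, sigma):
    """Assumed Gaussian membership function (1 at the mean)."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


def fm_c_value(s1, s2):
    """Assumed c-value: average cross-membership of all elements of both sets.

    Each set needs at least two values and a nonzero standard deviation.
    """
    mu1, sd1 = mean(s1), stdev(s1)  # stdev uses the n - 1 denominator (assumption)
    mu2, sd2 = mean(s2), stdev(s2)
    cross = [membership(x, mu2, sd2) for x in s1]   # elements of S1 judged against FS2
    cross += [membership(y, mu1, sd1) for y in s2]  # elements of S2 judged against FS1
    return sum(cross) / len(cross)


def fm_d_value(s1, s2):
    """d-value as defined in the text: one minus the c-value."""
    return 1.0 - fm_c_value(s1, s2)


# Hypothetical data: two overlapping sets and two well separated sets.
print(fm_d_value([1, 2, 3, 4, 5], [2, 3, 4, 5, 6]))
print(fm_d_value([1, 2, 3, 4, 5], [101, 102, 103, 104, 105]))
```

				<p>Under these assumptions the d-value lies between 0 and 1, is symmetric in its two arguments, and approaches 1 as the two sets move apart.</p>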
				<p>Given a calculated FM d-value <it>D </it>for two sets <it>S</it><sub><it>1 </it></sub>and <it>S</it><sub><it>2</it></sub>, to interpret <it>D </it>in terms of "significantly divergent" or not, we need to know the cutoff value &#948; of <it>D</it>, so that when <it>D </it>&#8805; &#948;, the two sets are interpreted as significantly divergent. In the context of FM-test, we test the following null hypothesis <it>H</it><sub><it>0</it></sub>: <it>S</it><sub>1 </sub>and <it>S</it><sub>2 </sub>originate from the same distribution. The p-value is then defined as the probability <it>Pr</it>{<it>d(S</it><sub><it>1</it></sub>, <it>S</it><sub>2</sub><it>) </it>&#8805; <it>D </it>| <it>S</it><sub><it>1 </it></sub>and <it>S</it><sub><it>2 </it></sub>were randomly sampled from the same distribution}. By the usual convention of statistical analysis, a <it>p-value </it>&#8804; 0.05 is taken as strong evidence against the null hypothesis, in which case we reject it and accept that the two sets are significantly divergent, with the p-value reflecting the significance. It is very common to use Monte Carlo procedures to calculate an empirical p-value, which approximates the exact p-value without relying on asymptotic distributional theory or on exhaustive enumeration. Davison and Hinkley <abbrgrp><abbr bid="B28">28</abbr></abbrgrp> present the formula for obtaining an empirical p-value as <it>(n+1)/(N+1)</it>, where <it>N </it>is the number of simulated samples, and <it>n </it>is the number of those samples that produce a statistic greater than or equal to the specified value.</p>
				<p>We perform the following steps to calculate the p-value of two sets <it>S</it><sub><it>1 </it></sub>and <it>S</it><sub><it>2 </it></sub>with their FM d-value <it>D</it>: (1) Model the common distribution from which <it>S</it><sub><it>1 </it></sub>and <it>S</it><sub><it>2 </it></sub>are drawn under the null hypothesis as a normal distribution <it>N</it>(&#956;,&#963;), where &#956; and &#963; are estimated using the sample mean and standard deviation of <it>S</it><sub><it>1 </it></sub>&#8746; <it>S</it><sub><it>2</it></sub>; (2) Randomly draw <it>N </it>pairs of sets from <it>N</it>(&#956;,&#963;), then calculate the FM d-value for each pair; (3) Calculate the empirical p-value as <it>(n+1)/(N+1)</it>, where <it>n </it>is the number of pairs whose FM d-values are greater than or equal to <it>D</it>.</p>
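				<p>The three steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation behind the Web application: the d-value uses the assumed Gaussian membership function and average cross-membership, each simulated pair is given the same sizes as <it>S</it><sub><it>1 </it></sub>and <it>S</it><sub><it>2</it></sub>, and the names fm_p_value, n_pairs and seed, the default of 999 pairs, and the data are all hypothetical.</p>

```python
import math
import random
from statistics import mean, stdev


def membership(x, mu, sigma):
    """Assumed Gaussian membership function (1 at the mean)."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


def fm_d_value(s1, s2):
    """Assumed d-value: one minus the average cross-membership of both sets."""
    mu1, sd1 = mean(s1), stdev(s1)
    mu2, sd2 = mean(s2), stdev(s2)
    cross = [membership(x, mu2, sd2) for x in s1]
    cross += [membership(y, mu1, sd1) for y in s2]
    return 1.0 - sum(cross) / len(cross)


def fm_p_value(s1, s2, n_pairs=999, seed=0):
    """Empirical p-value (n + 1) / (N + 1) from N simulated pairs of sets."""
    rng = random.Random(seed)
    observed = fm_d_value(s1, s2)
    # Step 1: estimate the null normal distribution from the pooled values.
    pooled = list(s1) + list(s2)
    mu, sigma = mean(pooled), stdev(pooled)
    hits = 0
    for _ in range(n_pairs):
        # Step 2: draw a pair of sets with the original sizes (assumption).
        r1 = [rng.gauss(mu, sigma) for _ in s1]
        r2 = [rng.gauss(mu, sigma) for _ in s2]
        if fm_d_value(r1, r2) >= observed:
            hits += 1
    # Step 3: empirical p-value.
    return (hits + 1) / (n_pairs + 1)


# Hypothetical data: well separated sets give a small p-value.
print(fm_p_value([1, 2, 3, 4, 5], [101, 102, 103, 104, 105]))
```

				<p>With <it>N </it>simulated pairs, the smallest p-value this procedure can report is 1/(<it>N</it>+1), so <it>N </it>bounds the resolution of the test.</p>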
			</sec>
		</sec>
		<sec>
			<st>
				<p>Authors' contributions</p>
			</st>
			<p>LRL and SL designed the algorithm and coordinated the project. XW proved the asymptotic property of FM-test and wrote part of the manuscript. YL carried out the study and drafted the manuscript. VM implemented the Web-based application of FM-test. DP and DK analyzed gene functional data and wrote part of the manuscript.</p>
		</sec>
		<sec>
			<st>
				<p>APPENDIX</p>
			</st>
			<sec>
				<st>
					<p>Asymptotic Characteristics of the FM d-value</p>
				</st>
				<p>The FM d-value is defined in the Methods section as follows:</p>
				<p>
					<graphic file="1471-2105-7-S4-S7-i19.gif"/>
				</p>
				<p>Here we establish the asymptotic characteristics of the FM d-value by estimating its mean and variance. To this end, formula (10) is rewritten by defining an indicator variable <graphic file="1471-2105-7-S4-S7-i20.gif"/>(&#183;) as follows:</p>
				<p>
					<graphic file="1471-2105-7-S4-S7-i21.gif"/>
				</p>
				<p>where <it>S </it>= <it>S</it><sub>1</sub>&#8746; <it>S</it><sub>2 </sub>= {<it>x</it><sub><it>i</it></sub>,<it>i </it>= 1,..., <it>n</it><sub>1 </sub>+ <it>n</it><sub>2</sub>}, <it>n</it><sub>1 </sub>= |<it>S</it><sub>1</sub>|, <it>n</it><sub>2 </sub>= |<it>S</it><sub>2</sub>|, and <graphic file="1471-2105-7-S4-S7-i20.gif"/>(<it>x</it>) = 1 if <it>x </it>&#8712; <it>S</it><sub><it>i </it></sub>and 0 otherwise for <it>i </it>= 1,2.</p>
				<p>Let <graphic file="1471-2105-7-S4-S7-i22.gif"/> be defined with respect to a random variable <it>X </it>over the sample space <it>S</it>, with probability <it>p </it>of choosing a sample <it>x </it>from <it>S</it><sub>1</sub>. The d-value is then given by <it>d</it>(<it>S</it><sub>1</sub>,<it>S</it><sub>2</sub>) = 1 -<graphic file="1471-2105-7-S4-S7-i23.gif"/>. Next, the mean and the variance of &#916;(<it>X</it>) are calculated in turn, in preparation for establishing the asymptotic distribution of the d-value.</p>
				<sec>
					<st>
						<p>(1). Calculation of the mean of &#916;(<it>X</it>)</p>
					</st>
					<p>The mean of &#916;(<it>X</it>) is given by</p>
					<p>
						<graphic file="1471-2105-7-S4-S7-i24.gif"/>
					</p>
					<p>
						<graphic file="1471-2105-7-S4-S7-i25.gif"/>
					</p>
					<p>Similarly, <graphic file="1471-2105-7-S4-S7-i26.gif"/></p>
					<p>By (12)&#8211;(14), the mean of &#916;(<it>X</it>) when <it>p </it>= 0.5 is</p>
					<p>
						<graphic file="1471-2105-7-S4-S7-i27.gif"/>
					</p>
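					<p>As a numerical aid to the calculation above, the following Python sketch checks a standard Gaussian integral: if <it>X </it>is normally distributed with mean m and standard deviation s, the expected value of exp(-(<it>X </it>- a)<sup>2</sup>/(2b<sup>2</sup>)) equals b/sqrt(s<sup>2</sup> + b<sup>2</sup>) times exp(-(m - a)<sup>2</sup>/(2(s<sup>2</sup> + b<sup>2</sup>))). Both the Gaussian form of the membership function and the normality of the values are assumptions of this sketch, and the function names and parameter values are hypothetical.</p>

```python
import math
import random


def expected_membership(mu_x, sigma_x, mu_f, sigma_f):
    """Closed form of E[exp(-(X - mu_f)^2 / (2 sigma_f^2))] for X ~ N(mu_x, sigma_x^2)."""
    total_var = sigma_x ** 2 + sigma_f ** 2
    scale = sigma_f / math.sqrt(total_var)
    return scale * math.exp(-((mu_x - mu_f) ** 2) / (2.0 * total_var))


def simulated_membership(mu_x, sigma_x, mu_f, sigma_f, draws=200000, seed=0):
    """Monte Carlo estimate of the same expectation, for comparison."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        x = rng.gauss(mu_x, sigma_x)
        total += math.exp(-((x - mu_f) ** 2) / (2.0 * sigma_f ** 2))
    return total / draws


# Equal means and equal standard deviations: the closed form reduces to 1/sqrt(2).
print(expected_membership(0.0, 1.0, 0.0, 1.0))
# Hypothetical unequal case: the two numbers should agree closely.
print(expected_membership(1.0, 2.0, 0.0, 1.5), simulated_membership(1.0, 2.0, 0.0, 1.5))
```

					<p>In the equal-mean, equal-variance case the expected membership is 1/sqrt(2), so under these assumptions the expected d-value for large samples is about 0.29.</p>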
				</sec>
				<sec>
					<st>
						<p>(2). Calculation of the variance of &#916;(<it>X</it>)</p>
					</st>
					<p>Since <it>S</it><sub>1 </sub>and <it>S</it><sub>2 </sub>are independent, the variance of &#916;(<it>X</it>) is given by</p>
					<p>
						<graphic file="1471-2105-7-S4-S7-i28.gif"/>
					</p>
					<p>Similarly,</p>
					<p>
						<graphic file="1471-2105-7-S4-S7-i29.gif"/>
					</p>
					<p>Therefore, when <it>p </it>= 0.5</p>
					<p>
						<graphic file="1471-2105-7-S4-S7-i30.gif"/>
					</p>
					<p>As noted at the beginning, the d-value is a function of <graphic file="1471-2105-7-S4-S7-i31.gif"/>, given by <it>d</it>(<it>S</it><sub>1</sub>,<it>S</it><sub>2</sub>) = 1 -<graphic file="1471-2105-7-S4-S7-i23.gif"/>. From the mean and the variance of &#916;(<it>X</it>) in formulas (&#916;1) and (&#916;2), the mean and the variance of the d-value follow directly:</p>
					<p>
						<graphic file="1471-2105-7-S4-S7-i32.gif"/>
					</p>
					<p>
						<graphic file="1471-2105-7-S4-S7-i33.gif"/>
					</p>
					<p>For a large sample, by the <b>central limit theorem</b>, the d-value approximately follows a truncated normal distribution: <it>d</it>(<it>S</it><sub>1</sub>,<it>S</it><sub>2</sub>)&#8594; <it>N</it>(<it>E</it>(<it>d</it>),<it>Var</it>(<it>d</it>)), restricted to the domain [0, 1].</p>
					<p>For further illustration, several special cases of the distribution of the d-value under application-specific constraints are given below.</p>
					<p>i. Balanced study: <it>p </it>= 0.5, <it>n</it><sub>1 </sub>= <it>n</it><sub>2 </sub>= <it>n</it>/2</p>
					<p>
						<graphic file="1471-2105-7-S4-S7-i34.gif"/>
					</p>
					<p>
						<graphic file="1471-2105-7-S4-S7-i35.gif"/>
					</p>
					<p>ii. Balanced study with equal mean: <it>p </it>= 0.5, <it>n</it><sub>1 </sub>= <it>n</it><sub>2 </sub>= <it>n</it>/2, &#956;<sub>1 </sub>= &#956;<sub>2</sub></p>
					<p>
						<graphic file="1471-2105-7-S4-S7-i36.gif"/>
					</p>
					<p>
						<graphic file="1471-2105-7-S4-S7-i37.gif"/>
					</p>
					<p>iii. Balanced study with equal variance: <it>p </it>= 0.5, <it>n</it><sub>1 </sub>= <it>n</it><sub>2</sub>, &#963;<sub>1</sub><sup>2 </sup>= &#963;<sub>2 </sub><sup>2 </sup>= &#963;<sup>2</sup></p>
					<p>
						<graphic file="1471-2105-7-S4-S7-i38.gif"/>
					</p>
					<p>iv. Balanced study with equal variance of 1 and large samples: &#963;<sup>2 </sup>= 1, <it>n</it><sub>1 </sub>= <it>n</it><sub>2 </sub>&#8805; 25</p>
					<p>
						<graphic file="1471-2105-7-S4-S7-i39.gif"/>
					</p>
					<p>v. Balanced study with equal variance of 1 and equal mean for large samples: &#963;<sup>2 </sup>= 1, &#956;<sub>1 </sub>= &#956;<sub>2</sub>, <it>n</it><sub>1 </sub>= <it>n</it><sub>2 </sub>&#8805; 25</p>
					<p><it>E</it>(<it>d</it>(<it>s</it><sub>1</sub>,<it>s</it><sub>2</sub>)) = 1-<graphic file="1471-2105-7-S4-S7-i40.gif"/> &#8776; 0.293, <it>var</it>(<it>d</it>(<it>s</it><sub>1</sub>,<it>s</it><sub>2</sub>))=(<graphic file="1471-2105-7-S4-S7-i41.gif"/>,<graphic file="1471-2105-7-S4-S7-i42.gif"/>)/<it>n </it>&#8776; 0.327/<it>n</it></p>
					<p><it>d</it>(<it>S</it><sub>1</sub>,<it>S</it><sub>2</sub>) &#8594; <it>N</it>(0.293,0.327/<it>n</it>), restricted to the domain [0, 1].</p>
					<p>Figure <figr fid="F7">7</figr> shows the density function of the d-value for this special case when n = 50, with mean 0.293 and standard deviation 0.08 (variance 0.327/50 &#8776; 0.0065).</p>
					<fig id="F7">
						<title>
							<p>Figure 7</p>
						</title>
						<caption>
							<p>Asymptotic density function of the d-value for a balanced study with equal variance of one</p>
						</caption>
						<text>
							<p>Asymptotic density function of the d-value for a balanced study with equal variance of one.</p>
						</text>
						<graphic file="1471-2105-7-S4-S7-7"/>
					</fig>
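					<p>As a rough numerical companion to special case (v), the sketch below evaluates the approximating normal N(0.293, 0.327/n) and its renormalisation to [0, 1]. The constants 0.293 and 0.327 are taken from the text; the function names and the use of the error function for the normal CDF are illustrative assumptions, not part of the original method description.</p>

```python
import math

# Constants quoted in the text for special case (v).
MEAN_D = 0.293
VAR_COEFF = 0.327


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def d_value_moments(n):
    """Return (mean, variance, standard deviation) of the approximating normal."""
    variance = VAR_COEFF / n
    return MEAN_D, variance, math.sqrt(variance)


def truncated_density(x, n):
    """Density of N(0.293, 0.327/n) renormalised to the interval [0, 1]."""
    if x > 1.0 or 0.0 > x:
        return 0.0
    mean, _, sd = d_value_moments(n)
    z = (x - mean) / sd
    pdf = math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))
    # Probability mass of the untruncated normal that falls inside [0, 1].
    mass = normal_cdf((1.0 - mean) / sd) - normal_cdf((0.0 - mean) / sd)
    return pdf / mass


if __name__ == "__main__":
    mean, variance, sd = d_value_moments(50)
    print(f"mean={mean:.3f} variance={variance:.5f} sd={sd:.3f}")
    print(f"density at the mean: {truncated_density(mean, 50):.3f}")
```

					<p>For n = 50 the truncation removes very little mass, because the lower bound 0 lies more than three standard deviations below the mean.</p>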
				</sec>
			</sec>
			<sec>
				<st>
					<p>Calculation of p-value</p>
				</st>
				<p>The p-value, also called the observed level of significance, is commonly used to report the smallest &#945;-level at which the observed test result is significant. In this section, we derive the parametric calculation of the p-value for the FM test based on the asymptotic distribution obtained in section <it>I</it>.</p>
				<p>The null hypothesis of the test is <it>H</it><sub>0</sub>:&#956;<sub>1 </sub>= &#956;<sub>2</sub>, where &#956;<sub>1 </sub>and &#956;<sub>2 </sub>are the mean gene expression levels of the two studied groups. According to the asymptotic distribution of the d-value, following its special case (ii) (balanced study with equal mean), a test statistic under the null hypothesis for a large sample size (n &#8805; 25) is given by</p>
				<p>
					<graphic file="1471-2105-7-S4-S7-i43.gif"/>
				</p>
				<p>where <graphic file="1471-2105-7-S4-S7-i44.gif"/> and <graphic file="1471-2105-7-S4-S7-i45.gif"/>.</p>
				<p>Suppose <it>d</it><sub><it>obs </it></sub>is an observed d-value for a given study based on two independent samples <it>S</it><sub>1 </sub>= {<it>x</it><sub><it>i</it></sub>,<it>i </it>= 1,...,<it>n</it><sub>1</sub>} and <it>S</it><sub>2 </sub>= {<it>y</it><sub><it>i</it></sub>,<it>i </it>= 1,...,<it>n</it><sub>2</sub>}. The population variances &#963;<sub>1</sub><sup>2 </sup>and &#963;<sub>2</sub><sup>2 </sup>are estimated by the corresponding sample variances <graphic file="1471-2105-7-S4-S7-i46.gif"/> and <graphic file="1471-2105-7-S4-S7-i47.gif"/>. Thus the mean and variance of the d-value are estimated by</p>
				<p><graphic file="1471-2105-7-S4-S7-i48.gif"/> and <graphic file="1471-2105-7-S4-S7-i49.gif"/></p>
				<p>The p-value is therefore derived as follows:</p>
				<p>
					<graphic file="1471-2105-7-S4-S7-i50.gif"/>
				</p>
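				<p>A minimal sketch of a normal-approximation p-value is given here for orientation only. The exact form of the published formula is contained in an image and is not reproduced; the sketch assumes a one-sided test in which large d-values count as evidence against the null hypothesis, and it takes the estimated mean and variance of the d-value as inputs. The function name and the example inputs are hypothetical.</p>

```python
import math


def upper_tail_p_value(d_obs, mean_d, var_d):
    """Upper-tail normal probability P(D >= d_obs) for D ~ N(mean_d, var_d).

    Assumption: the test is one-sided; the published formula may differ.
    """
    if not var_d > 0.0:
        raise ValueError("var_d must be positive")
    z = (d_obs - mean_d) / math.sqrt(var_d)
    # erfc is the complementary error function, so this equals 1 - Phi(z).
    return 0.5 * math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    # Hypothetical inputs, not values from the study example.
    print(upper_tail_p_value(0.60, 0.293, 0.327 / 50))
```

				<p>An observed d-value equal to the estimated mean gives a p-value of 0.5 under this assumption, and the p-value decreases as the observed d-value grows.</p>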
			</sec>
			<sec>
				<st>
					<p>Application in Gene Expression Analysis</p>
				</st>
				<p>Table <tblr tid="T4">4</tblr> shows the calculated p-values for the study example. The p-values calculated by (&#916;3) are consistent with the empirical p-values listed in Table <tblr tid="T1">1</tblr>, except for Gene 5, whose p-value by (&#916;3) is 0.062 and thus above 0.05. Note that formula (&#916;3) relies on the CLT, so a large sample size (n &#8805; 25) is desirable for a robust estimate.</p>
				<tbl id="T4">
					<title>
						<p>Table 4</p>
					</title>
					<caption>
						<p>P-values given by FM-test for five genes from the study example.</p>
					</caption>
					<tblbdy cols="16">
						<r>
							<c ca="center">
								<p>Gene ID</p>
							</c>
							<c cspan="5" ca="center">
								<p>IR</p>
							</c>
							<c cspan="5" ca="center">
								<p>IS</p>
							</c>
							<c ca="center">
								<p>d-value</p>
							</c>
							<c cspan="4" ca="center">
								<p>p-value</p>
							</c>
						</r>
						<r>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c cspan="4">
								<hr/>
							</c>
						</r>
						<r>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c>
								<p/>
							</c>
							<c ca="center">
								<p>FM-test by (&#916;3)</p>
							</c>
							<c ca="center">
								<p>FM</p>
							</c>
							<c ca="center">
								<p>t-test</p>
							</c>
							<c ca="center">
								<p>rank sum</p>
							</c>
						</r>
						<r>
							<c cspan="16">
								<hr/>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>1</p>
							</c>
							<c ca="left">
								<p>750</p>
							</c>
							<c ca="left">
								<p>559</p>
							</c>
							<c ca="left">
								<p>649</p>
							</c>
							<c ca="left">
								<p>685</p>
							</c>
							<c ca="left">
								<p>636</p>
							</c>
							<c ca="left">
								<p>310</p>
							</c>
							<c ca="left">
								<p>359</p>
							</c>
							<c ca="left">
								<p>135</p>
							</c>
							<c ca="left">
								<p>97</p>
							</c>
							<c ca="left">
								<p>178</p>
							</c>
							<c ca="left">
								<p>0.999</p>
							</c>
							<c ca="left">
								<p>0.000</p>
							</c>
							<c ca="left">
								<p>0.001</p>
							</c>
							<c ca="left">
								<p>0.008</p>
							</c>
							<c ca="left">
								<p>0.000</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>2</p>
							</c>
							<c ca="left">
								<p>123</p>
							</c>
							<c ca="left">
								<p>142</p>
							</c>
							<c ca="left">
								<p>11</p>
							</c>
							<c ca="left">
								<p>406</p>
							</c>
							<c ca="left">
								<p>220</p>
							</c>
							<c ca="left">
								<p>305</p>
							</c>
							<c ca="left">
								<p>398</p>
							</c>
							<c ca="left">
								<p>707</p>
							</c>
							<c ca="left">
								<p>905</p>
							</c>
							<c ca="left">
								<p>688</p>
							</c>
							<c ca="left">
								<p>0.756</p>
							</c>
							<c ca="left">
								<p>0.007</p>
							</c>
							<c ca="left">
								<p>0.012</p>
							</c>
							<c ca="left">
								<p>0.011</p>
							</c>
							<c ca="left">
								<p>0.031</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>3</p>
							</c>
							<c ca="left">
								<p>246</p>
							</c>
							<c ca="left">
								<p>213</p>
							</c>
							<c ca="left">
								<p>232</p>
							</c>
							<c ca="left">
								<p>134</p>
							</c>
							<c ca="left">
								<p>67</p>
							</c>
							<c ca="left">
								<p>86</p>
							</c>
							<c ca="left">
								<p>79</p>
							</c>
							<c ca="left">
								<p>77</p>
							</c>
							<c ca="left">
								<p>94</p>
							</c>
							<c ca="left">
								<p>61</p>
							</c>
							<c ca="left">
								<p>0.725</p>
							</c>
							<c ca="left">
								<p>0.041</p>
							</c>
							<c ca="left">
								<p>0.017</p>
							</c>
							<c ca="left">
								<p>0.021</p>
							</c>
							<c ca="left">
								<p>0.098</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>4</p>
							</c>
							<c ca="left">
								<p>200</p>
							</c>
							<c ca="left">
								<p>191</p>
							</c>
							<c ca="left">
								<p>220</p>
							</c>
							<c ca="left">
								<p>83</p>
							</c>
							<c ca="left">
								<p>197</p>
							</c>
							<c ca="left">
								<p>49</p>
							</c>
							<c ca="left">
								<p>81</p>
							</c>
							<c ca="left">
								<p>116</p>
							</c>
							<c ca="left">
								<p>111</p>
							</c>
							<c ca="left">
								<p>135</p>
							</c>
							<c ca="left">
								<p>0.708</p>
							</c>
							<c ca="left">
								<p>0.014</p>
							</c>
							<c ca="left">
								<p>0.019</p>
							</c>
							<c ca="left">
								<p>0.024</p>
							</c>
							<c ca="left">
								<p>0.058</p>
							</c>
						</r>
						<r>
							<c ca="left">
								<p>5</p>
							</c>
							<c ca="left">
								<p>598</p>
							</c>
							<c ca="left">
								<p>424</p>
							</c>
							<c ca="left">
								<p>695</p>
							</c>
							<c ca="left">
								<p>451</p>
							</c>
							<c ca="left">
								<p>141</p>
							</c>
							<c ca="left">
								<p>342</p>
							</c>
							<c ca="left">
								<p>260</p>
							</c>
							<c ca="left">
								<p>266</p>
							</c>
							<c ca="left">
								<p>229</p>
							</c>
							<c ca="left">
								<p>234</p>
							</c>
							<c ca="left">
								<p>0.674</p>
							</c>
							<c ca="left">
								<p>0.062</p>
							</c>
							<c ca="left">
								<p>0.025</p>
							</c>
							<c ca="left">
								<p>0.077</p>
							</c>
							<c ca="left">
								<p>0.152</p>
							</c>
						</r>
					</tblbdy>
				</tbl>
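				<p>The comparison between the two FM p-value columns of Table 4 can be checked with a few lines of code. The values below are transcribed from the table; the column headed FM is assumed to hold the empirical FM p-values, and the helper name is illustrative.</p>

```python
# p-values transcribed from Table 4, keyed by gene ID.
P_BY_DELTA3 = {1: 0.000, 2: 0.007, 3: 0.041, 4: 0.014, 5: 0.062}
# Assumption: the column headed "FM" holds the empirical FM p-values.
P_EMPIRICAL_FM = {1: 0.001, 2: 0.012, 3: 0.017, 4: 0.019, 5: 0.025}
ALPHA = 0.05


def disagreeing_genes(p_a, p_b, alpha=ALPHA):
    """Genes whose significance call at level alpha differs between p_a and p_b."""
    return [g for g in sorted(p_a) if (alpha > p_a[g]) != (alpha > p_b[g])]


if __name__ == "__main__":
    print(disagreeing_genes(P_BY_DELTA3, P_EMPIRICAL_FM))  # prints [5]
```

				<p>At the 0.05 level the two columns give the same call for Genes 1 to 4 and differ only for Gene 5.</p>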
			</sec>
		</sec>
	</bdy>
	<bm>
		<ack>
			<sec>
				<st>
					<p>Acknowledgements</p>
				</st>
				<p>We would like to thank the anonymous reviewers for their helpful comments. This work was supported by the Agricultural Experiment Station at the University of the District of Columbia (Project No.: DC-0LIANG; Accession No.: 0203877).</p>
				<p>This article has been published as part of <it>BMC Bioinformatics </it>Volume 7, Supplement 4, 2006: Symposium of Computations in Bioinformatics and Bioscience (SCBB06). The full contents of the supplement are available online at <url>http://www.biomedcentral.com/1471-2105/7?issue=S4</url>.</p>
			</sec>
		</ack>
		<refgrp>
			<bibl id="B1">
				<title>
					<p>Microarray profiling of human skeletal muscle reveals that insulin regulates approximately 800 genes during a hyperinsulinemic clamp</p>
				</title>
				<aug>
					<au>
						<snm>Rome</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Clement</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Rabasa-Lhoret</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Loizon</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Poitou</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Barsh</snm>
						<fnm>GS</fnm>
					</au>
					<au>
						<snm>Riou</snm>
						<fnm>JP</fnm>
					</au>
					<au>
						<snm>Laville</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Vidal</snm>
						<fnm>H</fnm>
					</au>
				</aug>
				<source>J Biol Chem</source>
				<pubdate>2003</pubdate>
				<volume>278</volume>
				<issue>20</issue>
				<fpage>18063</fpage>
				<lpage>18068</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1074/jbc.M300293200</pubid>
						<pubid idtype="pmpid" link="fulltext">12621037</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B2">
				<title>
					<p>Oligonucleotide microarray analysis of intact human pancreatic islets: identification of glucose-responsive genes and a highly regulated TGFbeta signaling pathway</p>
				</title>
				<aug>
					<au>
						<snm>Shalev</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Pise-Masison</snm>
						<fnm>CA</fnm>
					</au>
					<au>
						<snm>Radonovich</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Hoffmann</snm>
						<fnm>SC</fnm>
					</au>
					<au>
						<snm>Hirshberg</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Brady</snm>
						<fnm>JN</fnm>
					</au>
					<au>
						<snm>Harlan</snm>
						<fnm>DM</fnm>
					</au>
				</aug>
				<source>Endocrinology</source>
				<pubdate>2002</pubdate>
				<volume>143</volume>
				<issue>9</issue>
				<fpage>3695</fpage>
				<lpage>3698</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1210/en.2002-220564</pubid>
						<pubid idtype="pmpid" link="fulltext">12193586</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B3">
				<title>
					<p>Gene expression profile in skeletal muscle of type 2 diabetes and the effect of insulin treatment</p>
				</title>
				<aug>
					<au>
						<snm>Sreekumar</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Halvatsiotis</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Schimke</snm>
						<fnm>JC</fnm>
					</au>
					<au>
						<snm>Nair</snm>
						<fnm>KS</fnm>
					</au>
				</aug>
				<source>Diabetes</source>
				<pubdate>2002</pubdate>
				<volume>51</volume>
				<issue>6</issue>
				<fpage>1913</fpage>
				<lpage>1920</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">12031981</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B4">
				<title>
					<p>Molecular pathways altered by insulin b9-23 immunization</p>
				</title>
				<aug>
					<au>
						<snm>Eckenrode</snm>
						<fnm>SE</fnm>
					</au>
					<au>
						<snm>Ruan</snm>
						<fnm>QG</fnm>
					</au>
					<au>
						<snm>Collins</snm>
						<fnm>CD</fnm>
					</au>
					<au>
						<snm>Yang</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>McIndoe</snm>
						<fnm>RA</fnm>
					</au>
					<au>
						<snm>Muir</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>She</snm>
						<fnm>JX</fnm>
					</au>
				</aug>
				<source>Ann N Y Acad Sci</source>
				<pubdate>2004</pubdate>
				<volume>1037</volume>
				<fpage>175</fpage>
				<lpage>185</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1196/annals.1337.029</pubid>
						<pubid idtype="pmpid" link="fulltext">15699514</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B5">
				<title>
					<p>Differences in gene expression profiles of diabetic and nondiabetic patients undergoing cardiopulmonary bypass and cardioplegic arrest</p>
				</title>
				<aug>
					<au>
						<snm>Voisine</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Ruel</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Khan</snm>
						<fnm>TA</fnm>
					</au>
					<au>
						<snm>Bianchi</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Xu</snm>
						<fnm>SH</fnm>
					</au>
					<au>
						<snm>Kohane</snm>
						<fnm>I</fnm>
					</au>
					<au>
						<snm>Libermann</snm>
						<fnm>TA</fnm>
					</au>
					<au>
						<snm>Otu</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Saltiel</snm>
						<fnm>AR</fnm>
					</au>
					<au>
						<snm>Sellke</snm>
						<fnm>FW</fnm>
					</au>
				</aug>
				<source>Circulation</source>
				<pubdate>2004</pubdate>
				<volume>110</volume>
				<issue>11 Suppl 1</issue>
				<fpage>II280</fpage>
				<lpage>286</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">15364876</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B6">
				<title>
					<p>Microarray profiling of skeletal muscle tissues from equally obese, non-diabetic insulin-sensitive and insulin-resistant Pima Indians</p>
				</title>
				<aug>
					<au>
						<snm>Yang</snm>
						<fnm>X</fnm>
					</au>
					<au>
						<snm>Pratley</snm>
						<fnm>RE</fnm>
					</au>
					<au>
						<snm>Tokraks</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Bogardus</snm>
						<fnm>C</fnm>
					</au>
					<au>
						<snm>Permana</snm>
						<fnm>PA</fnm>
					</au>
				</aug>
				<source>Diabetologia</source>
				<pubdate>2002</pubdate>
				<volume>45</volume>
				<issue>11</issue>
				<fpage>1584</fpage>
				<lpage>1593</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1007/s00125-002-0905-7</pubid>
						<pubid idtype="pmpid" link="fulltext">12436343</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B7">
				<title>
					<p>Fundamentals of Biostatistics</p>
				</title>
				<aug>
					<au>
						<snm>Rosner</snm>
						<fnm>B</fnm>
					</au>
				</aug>
				<source>Pacific Grove</source>
				<publisher>CA: Duxbury Press</publisher>
				<edition>5</edition>
				<pubdate>2000</pubdate>
			</bibl>
			<bibl id="B8">
				<title>
					<p>The SH2 domain containing inositol polyphosphate 5-phosphatase-2: SHIP2</p>
				</title>
				<aug>
					<au>
						<snm>Dyson</snm>
						<fnm>JM</fnm>
					</au>
					<au>
						<snm>Kong</snm>
						<fnm>AM</fnm>
					</au>
					<au>
						<snm>Wiradjaja</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Astle</snm>
						<fnm>MV</fnm>
					</au>
					<au>
						<snm>Gurung</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Mitchell</snm>
						<fnm>CA</fnm>
					</au>
				</aug>
				<source>Int J Biochem Cell Biol</source>
				<pubdate>2005</pubdate>
				<volume>37</volume>
				<issue>11</issue>
				<fpage>2260</fpage>
				<lpage>2265</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/j.biocel.2005.05.003</pubid>
						<pubid idtype="pmpid" link="fulltext">15964236</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B9">
				<title>
					<p>Insulin induces the phosphorylation of nucleolin. A possible mechanism of insulin-induced RNA efflux from nuclei</p>
				</title>
				<aug>
					<au>
						<snm>Csermely</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Schnaider</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Cheatham</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Olson</snm>
						<fnm>MO</fnm>
					</au>
					<au>
						<snm>Kahn</snm>
						<fnm>CR</fnm>
					</au>
				</aug>
				<source>J Biol Chem</source>
				<pubdate>1993</pubdate>
				<volume>268</volume>
				<issue>13</issue>
				<fpage>9747</fpage>
				<lpage>9752</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">7683660</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B10">
				<title>
					<p>Overexpression of c-myc in the liver prevents obesity and insulin resistance</p>
				</title>
				<aug>
					<au>
						<snm>Riu</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Ferre</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Hidalgo</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Mas</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Franckhauser</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Otaegui</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Bosch</snm>
						<fnm>F</fnm>
					</au>
				</aug>
				<source>FASEB J</source>
				<pubdate>2003</pubdate>
				<volume>17</volume>
				<issue>12</issue>
				<fpage>1715</fpage>
				<lpage>1717</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">12958186</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B11">
				<title>
					<p>Leucine catabolism during the differentiation of 3T3-L1 cells. Expression of a mitochondrial enzyme system</p>
				</title>
				<aug>
					<au>
						<snm>Frerman</snm>
						<fnm>FE</fnm>
					</au>
					<au>
						<snm>Sabran</snm>
						<fnm>JL</fnm>
					</au>
					<au>
						<snm>Taylor</snm>
						<fnm>JL</fnm>
					</au>
					<au>
						<snm>Grossberg</snm>
						<fnm>SE</fnm>
					</au>
				</aug>
				<source>J Biol Chem</source>
				<pubdate>1983</pubdate>
				<volume>258</volume>
				<issue>11</issue>
				<fpage>7087</fpage>
				<lpage>7093</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">6304077</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B12">
				<title>
					<p>Mechanisms involved in alpha6beta1-integrin-mediated Ca(2+) signalling</p>
				</title>
				<aug>
					<au>
						<snm>Schottelndreier</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Potter</snm>
						<fnm>BV</fnm>
					</au>
					<au>
						<snm>Mayr</snm>
						<fnm>GW</fnm>
					</au>
					<au>
						<snm>Guse</snm>
						<fnm>AH</fnm>
					</au>
				</aug>
				<source>Cell Signal</source>
				<pubdate>2001</pubdate>
				<volume>13</volume>
				<issue>12</issue>
				<fpage>895</fpage>
				<lpage>899</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0898-6568(01)00225-X</pubid>
						<pubid idtype="pmpid" link="fulltext">11728829</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B13">
				<title>
					<p>Islet secretory defect in insulin receptor substrate 1 null mice is linked with reduced calcium signaling and expression of sarco(endo)plasmic reticulum Ca2+-ATPase (SERCA)-2b and -3</p>
				</title>
				<aug>
					<au>
						<snm>Kulkarni</snm>
						<fnm>RN</fnm>
					</au>
					<au>
						<snm>Roper</snm>
						<fnm>MG</fnm>
					</au>
					<au>
						<snm>Dahlgren</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Shih</snm>
						<fnm>DQ</fnm>
					</au>
					<au>
						<snm>Kauri</snm>
						<fnm>LM</fnm>
					</au>
					<au>
						<snm>Peters</snm>
						<fnm>JL</fnm>
					</au>
					<au>
						<snm>Stoffel</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Kennedy</snm>
						<fnm>RT</fnm>
					</au>
				</aug>
				<source>Diabetes</source>
				<pubdate>2004</pubdate>
				<volume>53</volume>
				<issue>6</issue>
				<fpage>1517</fpage>
				<lpage>1525</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">15161756</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B14">
				<title>
					<p>Identification and characterization of the human HCG V gene product as a novel inhibitor of protein phosphatase-1</p>
				</title>
				<aug>
					<au>
						<snm>Zhang</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Zhang</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Zhao</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Lee</snm>
						<fnm>EY</fnm>
					</au>
				</aug>
				<source>Biochemistry</source>
				<pubdate>1998</pubdate>
				<volume>37</volume>
				<issue>47</issue>
				<fpage>16728</fpage>
				<lpage>16734</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1021/bi981169g</pubid>
						<pubid idtype="pmpid" link="fulltext">9843442</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B15">
				<title>
					<p>Characterization of eukaryotic protein L7 as a novel autoantigen which frequently elicits an immune response in patients suffering from systemic autoimmune disease</p>
				</title>
				<aug>
					<au>
						<snm>von Mikecz</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Hemmerich</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Peter</snm>
						<fnm>HH</fnm>
					</au>
					<au>
						<snm>Krawinkel</snm>
						<fnm>U</fnm>
					</au>
				</aug>
				<source>Immunobiology</source>
				<pubdate>1994</pubdate>
				<volume>192</volume>
				<issue>1&#8211;2</issue>
				<fpage>137</fpage>
				<lpage>154</lpage>
				<xrefbib>
					<pubid idtype="pmpid">7750987</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B16">
				<title>
					<p>Interactome-transcriptome analysis reveals the high centrality of genes differentially expressed in lung cancer tissues</p>
				</title>
				<aug>
					<au>
						<snm>Wachi</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Yoneda</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Wu</snm>
						<fnm>R</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>2005</pubdate>
				<volume>21</volume>
				<issue>23</issue>
				<fpage>4205</fpage>
				<lpage>4208</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/bti688</pubid>
						<pubid idtype="pmpid" link="fulltext">16188928</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B17">
				<title>
					<p>Expression of p63, keratin 5/6, keratin 7, and surfactant-A in non-small cell lung carcinomas</p>
				</title>
				<aug>
					<au>
						<snm>Camilo</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Capelozzi</snm>
						<fnm>VL</fnm>
					</au>
					<au>
						<snm>Siqueira</snm>
						<fnm>SA</fnm>
					</au>
					<au>
						<snm>Del Carlo Bernardi</snm>
						<fnm>F</fnm>
					</au>
				</aug>
				<source>Hum Pathol</source>
				<pubdate>2006</pubdate>
				<volume>37</volume>
				<issue>5</issue>
				<fpage>542</fpage>
				<lpage>546</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/j.humpath.2005.12.019</pubid>
						<pubid idtype="pmpid" link="fulltext">16647951</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B18">
				<title>
					<p>Progression-specific genes identified by expression profiling of matched ductal carcinomas in situ and invasive breast tumors, combining laser capture microdissection and oligonucleotide microarray analysis</p>
				</title>
				<aug>
					<au>
						<snm>Schuetz</snm>
						<fnm>CS</fnm>
					</au>
					<au>
						<snm>Bonin</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Clare</snm>
						<fnm>SE</fnm>
					</au>
					<au>
						<snm>Nieselt</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Sotlar</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Walter</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Fehm</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Solomayer</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Riess</snm>
						<fnm>O</fnm>
					</au>
					<au>
						<snm>Wallwiener</snm>
						<fnm>D</fnm>
					</au>
					<etal/>
				</aug>
				<source>Cancer Res</source>
				<pubdate>2006</pubdate>
				<volume>66</volume>
				<issue>10</issue>
				<fpage>5278</fpage>
				<lpage>5286</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1158/0008-5472.CAN-05-4610</pubid>
						<pubid idtype="pmpid" link="fulltext">16707453</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B19">
				<title>
					<p>Maspin and tumor metastasis</p>
				</title>
				<aug>
					<au>
						<snm>Chen</snm>
						<fnm>EI</fnm>
					</au>
					<au>
						<snm>Yates</snm>
						<fnm>JR</fnm>
					</au>
				</aug>
				<source>IUBMB Life</source>
				<pubdate>2006</pubdate>
				<volume>58</volume>
				<issue>1</issue>
				<fpage>25</fpage>
				<lpage>29</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">16540429</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B20">
				<title>
					<p>Connecting p63 to cellular proliferation: the example of the adenosine deaminase target gene</p>
				</title>
				<aug>
					<au>
						<snm>Sbisa</snm>
						<fnm>E</fnm>
					</au>
					<au>
						<snm>Mastropasqua</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Lefkimmiatis</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Caratozzolo</snm>
						<fnm>MF</fnm>
					</au>
					<au>
						<snm>D'Erchia</snm>
						<fnm>AM</fnm>
					</au>
					<au>
						<snm>Tullo</snm>
						<fnm>A</fnm>
					</au>
				</aug>
				<source>Cell Cycle</source>
				<pubdate>2006</pubdate>
				<volume>5</volume>
				<issue>2</issue>
				<fpage>205</fpage>
				<lpage>212</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">16410722</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B21">
				<title>
					<p>Significance of p63 amplification and overexpression in lung cancer development and prognosis</p>
				</title>
				<aug>
					<au>
						<snm>Massion</snm>
						<fnm>PP</fnm>
					</au>
					<au>
						<snm>Taflan</snm>
						<fnm>PM</fnm>
					</au>
					<au>
						<snm>Jamshedur Rahman</snm>
						<fnm>SM</fnm>
					</au>
					<au>
						<snm>Yildiz</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Shyr</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Edgerton</snm>
						<fnm>ME</fnm>
					</au>
					<au>
						<snm>Westfall</snm>
						<fnm>MD</fnm>
					</au>
					<au>
						<snm>Roberts</snm>
						<fnm>JR</fnm>
					</au>
					<au>
						<snm>Pietenpol</snm>
						<fnm>JA</fnm>
					</au>
					<au>
						<snm>Carbone</snm>
						<fnm>DP</fnm>
					</au>
					<etal/>
				</aug>
				<source>Cancer Res</source>
				<pubdate>2003</pubdate>
				<volume>63</volume>
				<issue>21</issue>
				<fpage>7113</fpage>
				<lpage>7121</lpage>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">14612504</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B22">
				<title>
					<p>Multigene real-time PCR detection of circulating tumor cells in peripheral blood of lung cancer patients</p>
				</title>
				<aug>
					<au>
						<snm>Hayes</snm>
						<fnm>DC</fnm>
					</au>
					<au>
						<snm>Secrist</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Bangur</snm>
						<fnm>CS</fnm>
					</au>
					<au>
						<snm>Wang</snm>
						<fnm>T</fnm>
					</au>
					<au>
						<snm>Zhang</snm>
						<fnm>X</fnm>
					</au>
					<au>
						<snm>Harlan</snm>
						<fnm>D</fnm>
					</au>
					<au>
						<snm>Goodman</snm>
						<fnm>GE</fnm>
					</au>
					<au>
						<snm>Houghton</snm>
						<fnm>RL</fnm>
					</au>
					<au>
						<snm>Persing</snm>
						<fnm>DH</fnm>
					</au>
					<au>
						<snm>Zehentner</snm>
						<fnm>BK</fnm>
					</au>
				</aug>
				<source>Anticancer Res</source>
				<pubdate>2006</pubdate>
				<volume>26</volume>
				<issue>2B</issue>
				<fpage>1567</fpage>
				<lpage>1575</lpage>
				<xrefbib>
					<pubid idtype="pmpid">16619573</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B23">
				<title>
					<p>Differential expression of desmosomal plakophilins in various types of carcinomas: correlation with cell type and differentiation</p>
				</title>
				<aug>
					<au>
						<snm>Schwarz</snm>
						<fnm>J</fnm>
					</au>
					<au>
						<snm>Ayim</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Schmidt</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Jager</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Koch</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Baumann</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Dunne</snm>
						<fnm>AA</fnm>
					</au>
					<au>
						<snm>Moll</snm>
						<fnm>R</fnm>
					</au>
				</aug>
				<source>Hum Pathol</source>
				<pubdate>2006</pubdate>
				<volume>37</volume>
				<issue>5</issue>
				<fpage>613</fpage>
				<lpage>622</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/j.humpath.2006.01.013</pubid>
						<pubid idtype="pmpid" link="fulltext">16647960</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B24">
				<title>
					<p>Differential expression and biodistribution of cytokeratin 18 and desmoplakins in non-small cell lung carcinoma subtypes</p>
				</title>
				<aug>
					<au>
						<snm>Young</snm>
						<fnm>GD</fnm>
					</au>
					<au>
						<snm>Winokur</snm>
						<fnm>TS</fnm>
					</au>
					<au>
						<snm>Cerfolio</snm>
						<fnm>RJ</fnm>
					</au>
					<au>
						<snm>Van Tine</snm>
						<fnm>BA</fnm>
					</au>
					<au>
						<snm>Chow</snm>
						<fnm>LT</fnm>
					</au>
					<au>
						<snm>Okoh</snm>
						<fnm>V</fnm>
					</au>
					<au>
						<snm>Garver</snm>
						<fnm>RI</fnm>
						<suf>Jr</suf>
					</au>
				</aug>
				<source>Lung Cancer</source>
				<pubdate>2002</pubdate>
				<volume>36</volume>
				<issue>2</issue>
				<fpage>133</fpage>
				<lpage>141</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1016/S0169-5002(01)00486-X</pubid>
						<pubid idtype="pmpid" link="fulltext">11955647</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B25">
				<title>
					<p>Overexpression of Cdc20 leads to impairment of the spindle assembly checkpoint and aneuploidization in oral cancer</p>
				</title>
				<aug>
					<au>
						<snm>Mondal</snm>
						<fnm>G</fnm>
					</au>
					<au>
						<snm>Sengupta</snm>
						<fnm>S</fnm>
					</au>
					<au>
						<snm>Panda</snm>
						<fnm>CK</fnm>
					</au>
					<au>
						<snm>Gollin</snm>
						<fnm>SM</fnm>
					</au>
					<au>
						<snm>Saunders</snm>
						<fnm>WS</fnm>
					</au>
					<au>
						<snm>Roychoudhury</snm>
						<fnm>S</fnm>
					</au>
				</aug>
				<source>Carcinogenesis</source>
				<pubdate>2006</pubdate>
				<xrefbib>
					<pubid idtype="pmpid" link="fulltext">16777988</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B26">
				<title>
					<p>Aberrant expression of minichromosome maintenance protein-2 and Ki67 in laryngeal squamous epithelial lesions</p>
				</title>
				<aug>
					<au>
						<snm>Chatrath</snm>
						<fnm>P</fnm>
					</au>
					<au>
						<snm>Scott</snm>
						<fnm>IS</fnm>
					</au>
					<au>
						<snm>Morris</snm>
						<fnm>LS</fnm>
					</au>
					<au>
						<snm>Davies</snm>
						<fnm>RJ</fnm>
					</au>
					<au>
						<snm>Rushbrook</snm>
						<fnm>SM</fnm>
					</au>
					<au>
						<snm>Bird</snm>
						<fnm>K</fnm>
					</au>
					<au>
						<snm>Vowler</snm>
						<fnm>SL</fnm>
					</au>
					<au>
						<snm>Grant</snm>
						<fnm>JW</fnm>
					</au>
					<au>
						<snm>Saeed</snm>
						<fnm>IT</fnm>
					</au>
					<au>
						<snm>Howard</snm>
						<fnm>D</fnm>
					</au>
					<etal/>
				</aug>
				<source>Br J Cancer</source>
				<pubdate>2003</pubdate>
				<volume>89</volume>
				<issue>6</issue>
				<fpage>1048</fpage>
				<lpage>1054</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1038/sj.bjc.6601234</pubid>
						<pubid idtype="pmpid" link="fulltext">12966424</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B27">
				<title>
					<p>Fuzzy Sets and Fuzzy Logic: Theory and Applications</p>
				</title>
				<aug>
					<au>
						<snm>Klir</snm>
						<fnm>GJ</fnm>
					</au>
					<au>
						<snm>Yuan</snm>
						<fnm>B</fnm>
					</au>
				</aug>
				<publisher>Prentice-Hall</publisher>
				<pubdate>1995</pubdate>
			</bibl>
			<bibl id="B28">
				<title>
					<p>Bootstrap methods and their application</p>
				</title>
				<aug>
					<au>
						<snm>Davison</snm>
						<fnm>AC</fnm>
					</au>
					<au>
						<snm>Hinkley</snm>
						<fnm>DV</fnm>
					</au>
				</aug>
				<publisher>Cambridge: Cambridge University Press</publisher>
				<pubdate>1997</pubdate>
			</bibl>
		</refgrp>
	</bm>
</art>

