<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2105-6-286</ui>
   <ji>1471-2105</ji>
   <fm>
      <dochead>Research article</dochead>
      <bibl>
         <title>
            <p>Identification of significant periodic genes in microarray gene expression data</p>
         </title>
         <aug>
            <au id="A1" ca="yes">
               <snm>Chen</snm>
               <fnm>Jie</fnm>
               <insr iid="I1"/>
               <email>chenj@umkc.edu</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Department of Mathematics and Statistics, University of Missouri-Kansas City, 5100 Rockhill Road, Kansas City, MO, USA</p>
            </ins>
         </insg>
         <source>BMC Bioinformatics</source>
         <issn>1471-2105</issn>
         <pubdate>2005</pubdate>
         <volume>6</volume>
         <issue>1</issue>
         <fpage>286</fpage>
         <url>http://www.biomedcentral.com/1471-2105/6/286</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">16318631</pubid>
               <pubid idtype="doi">10.1186/1471-2105-6-286</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>28</day>
               <month>3</month>
               <year>2005</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>30</day>
               <month>11</month>
               <year>2005</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>30</day>
               <month>11</month>
               <year>2005</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2005</year>
         <collab>Chen; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>One frequent application of microarray experiments is monitoring gene activity in a cell during the cell cycle or cell division. A new challenge in analyzing such experiments is to identify genes whose periodic expression during the cell cycle is statistically significant. The challenge arises from the large number of genes measured simultaneously, the small to moderate number of measurements per gene taken at different time points, and the high levels of non-normal random noise inherent in the data.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>Based on two statistical hypothesis testing methods for identifying periodic time series, a novel statistical inference approach, the <it>C&amp;G </it>procedure, is proposed to screen effectively for genes whose periodic expression is statistically significant. The approach is then applied to yeast and bacterial cell cycle gene expression data sets, as well as to human fibroblast and human cancer cell line data sets, and significantly periodically expressed genes are successfully identified.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>The proposed <it>C&amp;G </it>procedure is an effective method for identifying statistically significant periodic genes in microarray time series gene expression data.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Microarray experiments are widely used for gene profiling in different cell lines, tissues, and conditions (normal versus cancerous). High-throughput microarray technologies have made it possible to study problems that range from gene regulation and mRNA stability to pathways for genetic diseases and the discovery of target subpopulations for drug or other therapies. One frequent application of microarray experiments is monitoring gene activity in a cell during the cell cycle or cell division. A new challenge for statisticians analyzing such experiments is to identify genes whose periodic expression during the cell cycle is statistically significant. The challenge arises from the large number of genes measured simultaneously, the small to moderate number of measurements per gene taken at different time points, and the high levels of non-normal random noise inherent in the data (Wichert <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>). Several authors, including Spellman <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>, Cho <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>, Shedden and Cooper <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr></abbrgrp>, and Whitfield <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>, have noticed the presence of cyclicity or periodicity of genes in their microarray data sets and used a number of methods to identify periodically expressed genes in the yeast and human cell cycle data sets they obtained. There is some debate concerning the methods those authors used to find the cyclic genes and how statistically significant those cyclic genes are. Whitfield <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> established a catalog of genes periodically expressed in the human cell cycle via a series of large-scale microarray experiments. They introduced a statistic (periodicity score) for testing whether a gene is periodically expressed. 
The method introduced in Whitfield <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>, however, may not be effective in identifying multiple periodically expressed genes, as it does not address the multiple comparison issue and hence may inflate the false discovery rate (FDR). Recently, Wichert <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> proposed using a graphical device, the average periodogram, as an exploratory method to signal the presence of possible periodic genes. They showed through extensive simulations that plotting the average periodogram against frequency reveals the presence of periodic genes in the data set, if there are any. They also applied Fisher's exact <it>G</it>-test statistic, together with FDR control, to the periodogram to screen for genes whose periodic expression is statistically significant.</p>
         <p>In this paper, another test statistic, Bartlett's exact <it>C</it>-test statistic, is introduced for inference on periodic time series. By combining the <it>G</it>-statistic and the <it>C</it>-statistic, a novel statistical inference approach, the <it>C&amp;G </it>procedure, is proposed to screen effectively for genes whose periodic expression is statistically significant. The approach is then applied to yeast and bacterial cell cycle gene expression data sets, as well as to human fibroblast and human cancer cell line data sets, and significantly periodically expressed genes are successfully identified.</p>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <p>To test the null hypothesis that a signal is normal white noise against the alternative hypothesis that it is periodic (see Methods section), one can use the periodogram of the signal (see Methods section for details) to form a test statistic and calculate its p-value. A p-value smaller than a predetermined significance level indicates that the signal is significantly periodic rather than white noise. Fisher <abbrgrp><abbr bid="B7">7</abbr></abbrgrp> proposed such a test statistic, the <it>G</it>-statistic, and derived its null distribution. In the context of microarray gene expression data, the observed significance value or p-value for testing the periodicity of a fixed gene <it>g</it>, using the <it>G</it>-statistic as the test statistic, denoted by <m:math name="1471-2105-6-286-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>p</m:mi><m:mi>g</m:mi><m:mi>G</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciGacaGaaeqabaqabeGadaaakeaacqWGWbaCdaqhaaWcbaGaem4zaCgabaGaem4raCeaaaaa@30B2@</m:annotation></m:semantics></m:math>, can be obtained by</p>
         <p>
            <graphic file="1471-2105-6-286-i2.gif"/>
         </p>
         <p>where <it>&#958;</it><sub><it>g </it></sub>is the observed value of Fisher's <it>G</it>-statistic (see equation (7) in Methods section) divided by <it>m</it>, and <it>L</it>(<it>&#958;</it><sub><it>g</it></sub>) is the largest integer less than 1/<it>&#958;</it><sub><it>g</it></sub>.</p>
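         <p>As an illustration only, the tail probability described above can be sketched in Python. The sketch assumes that equation (1) has the standard closed form of Fisher's exact test, that is, an alternating sum over <it>j </it>= 1, ..., <it>L</it>(<it>&#958;</it><sub><it>g</it></sub>) of binomial coefficients times powers of (1 - <it>j&#958;</it><sub><it>g</it></sub>). The function name and its arguments are hypothetical and are not part of the original analysis.</p>

```python
import math


def fisher_g_pvalue(xi, m):
    """Tail probability P(g > xi) of Fisher's exact test under normal white noise.

    Assumed closed form (standard Fisher result, not copied from the paper):
        p = sum_{j=1..L} (-1)^(j-1) * C(m, j) * (1 - j*xi)^(m-1)
    where L is the largest integer strictly below 1/xi.

    xi : largest periodogram ordinate divided by the sum of all m ordinates
    m  : number of periodogram ordinates (Fourier frequencies)
    """
    if not (xi > 0.0 and 1.0 >= xi):
        raise ValueError("xi must lie in the interval (0, 1]")
    if not m >= 1:
        raise ValueError("m must be a positive integer")

    # Largest integer strictly below 1/xi, capped at m.
    upper = min(m, math.ceil(1.0 / xi) - 1)

    total = 0.0
    for j in range(1, upper + 1):
        base = max(0.0, 1.0 - j * xi)  # guard against tiny negative round-off
        total += (-1) ** (j - 1) * math.comb(m, j) * base ** (m - 1)

    # Clip round-off so the result is a valid probability.
    return min(1.0, max(0.0, total))
```

         <p>With two ordinates, for example, a normalized maximum of 0.75 gives a p-value of 0.5 under this form. For large <it>m </it>the alternating sum can lose precision in floating point, so a production implementation would need more careful arithmetic.</p>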
         <p>A more general setting is to test whether or not a signal is normal white noise. Bartlett <abbrgrp><abbr bid="B8">8</abbr></abbrgrp> proposed a test statistic, the <it>C</it>-statistic (see Methods section), for this hypothesis. According to Durbin <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>, the p-value for testing the periodicity of a fixed gene <it>g </it>using Bartlett's <it>C</it>-statistic as the test statistic, denoted by <m:math name="1471-2105-6-286-i3" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>p</m:mi><m:mi>g</m:mi><m:mi>C</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciGacaGaaeqabaqabeGadaaakeaacqWGWbaCdaqhaaWcbaGaem4zaCgabaGaem4qameaaaaa@30AA@</m:annotation></m:semantics></m:math>, can be found by</p>
         <p>
            <graphic file="1471-2105-6-286-i4.gif"/>
         </p>
         <p>where <it>a</it><sub><it>g </it></sub>= <it>mC</it><sub><it>g</it></sub>, <it>C</it><sub><it>g </it></sub>is given in equation (10) of the Methods section, [<it>a</it><sub><it>g</it></sub>] = <it>INT</it>{<it>a</it><sub><it>g</it></sub>}, and <it>n </it>= <it>m </it>- 1. Suppose that a large number <m:math name="1471-2105-6-286-i5" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>G</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBamXvP5wqSXMqHnxAJn0BKvguHDwzZbqegm0B1jxALjhiov2DaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaWaaeGaeaaakeaaimaacaWFhbaaaa@3962@</m:annotation></m:semantics></m:math> of genes are simultaneously observed through a microarray experiment, and each gene is measured over a relatively short period, or at sparse intervals of time (say at <it>N </it>time points). The researcher is interested in whether any genes are expressed in a periodic pattern of some kind. As high levels of non-normal random noise may be present in the data, apparent visual evidence of a periodic gene may simply be due to random noise; and as there are usually a large number of genes (<m:math name="1471-2105-6-286-i5" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>G</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBamXvP5wqSXMqHnxAJn0BKvguHDwzZbqegm0B1jxALjhiov2DaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaWaaeGaeaaakeaaimaacaWFhbaaaa@3962@</m:annotation></m:semantics></m:math> is often between several thousand and several hundred thousand), there is a serious concern about the false discovery rate (FDR). Therefore, a multiple comparison approach must be employed to control the FDR level. Recently, Benjamini and Hochberg <abbrgrp><abbr bid="B10">10</abbr></abbrgrp> introduced a practical and powerful approach to multiple testing that controls the FDR. This approach is especially useful for multiple hypothesis testing in microarray experiments. It is a sequential, Bonferroni-type step-up multiple testing procedure. In light of the p-value, <m:math name="1471-2105-6-286-i3" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>p</m:mi><m:mi>g</m:mi><m:mi>C</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciGacaGaaeqabaqabeGadaaakeaacqWGWbaCdaqhaaWcbaGaem4zaCgabaGaem4qameaaaaa@30AA@</m:annotation></m:semantics></m:math>, obtained using the <it>C</it>-statistic, the p-value <m:math name="1471-2105-6-286-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>p</m:mi><m:mi>g</m:mi><m:mi>G</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciGacaGaaeqabaqabeGadaaakeaacqWGWbaCdaqhaaWcbaGaem4zaCgabaGaem4raCeaaaaa@30B2@</m:annotation></m:semantics></m:math>, calculated using the <it>G</it>-statistic, and the multiple testing procedure controlling FDR, the following method (called "<it>C&amp;G </it>Procedure") is proposed for the selection of periodic gene expressions of the same period:</p>
         <p>Step 1: Calculate <m:math name="1471-2105-6-286-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>p</m:mi><m:mi>g</m:mi><m:mi>G</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciGacaGaaeqabaqabeGadaaakeaacqWGWbaCdaqhaaWcbaGaem4zaCgabaGaem4raCeaaaaa@30B2@</m:annotation></m:semantics></m:math>, and <m:math name="1471-2105-6-286-i3" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>p</m:mi><m:mi>g</m:mi><m:mi>C</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciGacaGaaeqabaqabeGadaaakeaacqWGWbaCdaqhaaWcbaGaem4zaCgabaGaem4qameaaaaa@30AA@</m:annotation></m:semantics></m:math> according to equations (1) and (2), respectively, for <it>g </it>= 1, ..., <m:math name="1471-2105-6-286-i5" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>G</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBamXvP5wqSXMqHnxAJn0BKvguHDwzZbqegm0B1jxALjhiov2DaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaWaaeGaeaaakeaaimaacaWFhbaaaa@3962@</m:annotation></m:semantics></m:math>.</p>
         <p>Step 2: Let the ordered <m:math name="1471-2105-6-286-i3" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>p</m:mi><m:mi>g</m:mi><m:mi>C</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciGacaGaaeqabaqabeGadaaakeaacqWGWbaCdaqhaaWcbaGaem4zaCgabaGaem4qameaaaaa@30AA@</m:annotation></m:semantics></m:math> values be <m:math name="1471-2105-6-286-i6" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>p</m:mi><m:mrow><m:mo stretchy="false">(</m:mo><m:mn>1</m:mn><m:mo stretchy="false">)</m:mo></m:mrow><m:mi>C</m:mi></m:msubsup><m:mo>,</m:mo><m:msubsup><m:mi>p</m:mi><m:mrow><m:mo stretchy="false">(</m:mo><m:mn>2</m:mn><m:mo stretchy="false">)</m:mo></m:mrow><m:mi>C</m:mi></m:msubsup><m:mo>,</m:mo><m:mo>&#8230;</m:mo><m:mo>,</m:mo><m:msubsup><m:mi>p</m:mi><m:mrow><m:mo stretchy="false">(</m:mo><m:mi>G</m:mi><m:mo stretchy="false">)</m:mo></m:mrow><m:mi>C</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBamXvP5wqSXMqHnxAJn0BKvguHDwzZbqegm0B1jxALjhiov2DaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaWaaeGaeaaakeaacqWGWbaCdaqhaaWcbaGaeiikaGIaeGymaeJaeiykaKcabaGaem4qameaaOGaeiilaWIaemiCaa3aa0baaSqaaiabcIcaOiabikdaYiabcMcaPaqaaiabdoeadbaakiabcYcaSiablAciljabcYcaSiabdchaWnaaDaaaleaacqGGOaakimaacaWFhbGaeiykaKcabaGaem4qameaaaaa@4C1F@</m:annotation></m:semantics></m:math> with corresponding genes <m:math name="1471-2105-6-286-i7" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>g</m:mi><m:mrow><m:mo stretchy="false">(</m:mo><m:mn>1</m:mn><m:mo stretchy="false">)</m:mo></m:mrow><m:mi>C</m:mi></m:msubsup><m:mo>,</m:mo><m:msubsup><m:mi>g</m:mi><m:mrow><m:mo stretchy="false">(</m:mo><m:mn>2</m:mn><m:mo stretchy="false">)</m:mo></m:mrow><m:mi>C</m:mi></m:msubsup><m:mo>,</m:mo><m:mo>&#8230;</m:mo><m:msubsup><m:mi>g</m:mi><m:mrow><m:mo stretchy="false">(</m:mo><m:mi>G</m:mi><m:mo stretchy="false">)</m:mo></m:mrow><m:mi>C</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBamXvP5wqSXMqHnxAJn0BKvguHDwzZbqegm0B1jxALjhiov2DaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaWaaeGaeaaakeaacqWGNbWzdaqhaaWcbaGaeiikaGIaeGymaeJaeiykaKcabaGaem4qameaaOGaeiilaWIaem4zaC2aa0baaSqaaiabcIcaOiabikdaYiabcMcaPaqaaiabdoeadbaakiabcYcaSiablAciljabdEgaNnaaDaaaleaacqGGOaakimaacaWFhbGaeiykaKcabaGaem4qameaaaaa@4B09@</m:annotation></m:semantics></m:math>; and let the ordered <m:math name="1471-2105-6-286-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>p</m:mi><m:mi>g</m:mi><m:mi>G</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciGacaGaaeqabaqabeGadaaakeaacqWGWbaCdaqhaaWcbaGaem4zaCgabaGaem4raCeaaaaa@30B2@</m:annotation></m:semantics></m:math> be <m:math name="1471-2105-6-286-i8" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>p</m:mi><m:mrow><m:mo stretchy="false">(</m:mo><m:mn>1</m:mn><m:mo stretchy="false">)</m:mo></m:mrow><m:mi>G</m:mi></m:msubsup><m:mo>,</m:mo><m:msubsup><m:mi>p</m:mi><m:mrow><m:mo stretchy="false">(</m:mo><m:mn>2</m:mn><m:mo stretchy="false">)</m:mo></m:mrow><m:mi>G</m:mi></m:msubsup><m:mo>,</m:mo><m:mo>&#8230;</m:mo><m:mo>,</m:mo><m:msubsup><m:mi>p</m:mi><m:mrow><m:mo stretchy="false">(</m:mo><m:mi>G</m:mi><m:mo stretchy="false">)</m:mo></m:mrow><m:mi>G</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBamXvP5wqSXMqHnxAJn0BKvguHDwzZbqegm0B1jxALjhiov2DaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaWaaeGaeaaakeaacqWGWbaCdaqhaaWcbaGaeiikaGIaeGymaeJaeiykaKcabaGaem4raCeaaOGaeiilaWIaemiCaa3aa0baaSqaaiabcIcaOiabikdaYiabcMcaPaqaaiabdEeahbaakiabcYcaSiablAciljabcYcaSiabdchaWnaaDaaaleaacqGGOaakimaacaWFhbGaeiykaKcabaGaem4raCeaaaaa@4C37@</m:annotation></m:semantics></m:math> with corresponding genes <m:math name="1471-2105-6-286-i9" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>g</m:mi><m:mrow><m:mo stretchy="false">(</m:mo><m:mn>1</m:mn><m:mo stretchy="false">)</m:mo></m:mrow><m:mi>G</m:mi></m:msubsup><m:mo>,</m:mo><m:msubsup><m:mi>g</m:mi><m:mrow><m:mo stretchy="false">(</m:mo><m:mn>2</m:mn><m:mo stretchy="false">)</m:mo></m:mrow><m:mi>G</m:mi></m:msubsup><m:mo>,</m:mo><m:mo>&#8230;</m:mo><m:msubsup><m:mi>g</m:mi><m:mrow><m:mo stretchy="false">(</m:mo><m:mi>G</m:mi><m:mo stretchy="false">)</m:mo></m:mrow><m:mi>G</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBamXvP5wqSXMqHnxAJn0BKvguHDwzZbqegm0B1jxALjhiov2DaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaWaaeGaeaaakeaacqWGNbWzdaqhaaWcbaGaeiikaGIaeGymaeJaeiykaKcabaGaem4raCeaaOGaeiilaWIaem4zaC2aa0baaSqaaiabcIcaOiabikdaYiabcMcaPaqaaiabdEeahbaakiabcYcaSiablAciljabdEgaNnaaDaaaleaacqGGOaakimaacaWFhbGaeiykaKcabaGaem4raCeaaaaa@4B21@</m:annotation></m:semantics></m:math>.</p>
         <p>Step 3: For a given FDR level of <it>q</it>, let <it>i</it><sub><it>q </it></sub>be the largest <it>i </it>for which <m:math name="1471-2105-6-286-i10" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>p</m:mi><m:mrow><m:mo stretchy="false">(</m:mo><m:mi>i</m:mi><m:mo stretchy="false">)</m:mo></m:mrow><m:mi>C</m:mi></m:msubsup><m:mo>&#8804;</m:mo><m:mfrac><m:mn>1</m:mn><m:mi>G</m:mi></m:mfrac><m:mi>q</m:mi></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBamXvP5wqSXMqHnxAJn0BKvguHDwzZbqegm0B1jxALjhiov2DaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaWaaeGaeaaakeaacqWGWbaCdaqhaaWcbaGaeiikaGIaemyAaKMaeiykaKcabaGaem4qameaaOGaeyizIm6aaSaaaeaacqaIXaqmaeaaimaacaWFhbaaaiabdghaXbaa@433E@</m:annotation></m:semantics></m:math>, and let <it>j</it><sub><it>q </it></sub>be the largest <it>j </it>for which <m:math name="1471-2105-6-286-i11" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>p</m:mi><m:mrow><m:mo stretchy="false">(</m:mo><m:mi>j</m:mi><m:mo stretchy="false">)</m:mo></m:mrow><m:mi>G</m:mi></m:msubsup><m:mo>&#8804;</m:mo><m:mfrac><m:mi>j</m:mi><m:mi>G</m:mi></m:mfrac><m:mi>q</m:mi></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBamXvP5wqSXMqHnxAJn0BKvguHDwzZbqegm0B1jxALjhiov2DaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaWaaeGaeaaakeaacqWGWbaCdaqhaaWcbaGaeiikaGIaemOAaOMaeiykaKcabaGaem4raCeaaOGaeyizIm6aaSaaaeaacqWGQbGAaeaaimaacaWFhbaaaiabdghaXbaa@43B5@</m:annotation></m:semantics></m:math>.</p>
         <p>Step 4: The intersection set <m:math name="1471-2105-6-286-i12" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mi>K</m:mi><m:mo>=</m:mo><m:mo>{</m:mo><m:msubsup><m:mi>g</m:mi><m:mrow><m:mo stretchy="false">(</m:mo><m:mn>1</m:mn><m:mo stretchy="false">)</m:mo></m:mrow><m:mi>C</m:mi></m:msubsup><m:mo>,</m:mo><m:msubsup><m:mi>g</m:mi><m:mrow><m:mo stretchy="false">(</m:mo><m:mn>2</m:mn><m:mo stretchy="false">)</m:mo></m:mrow><m:mi>C</m:mi></m:msubsup><m:mo>,</m:mo><m:mo>&#8230;</m:mo><m:msubsup><m:mi>g</m:mi><m:mrow><m:mo stretchy="false">(</m:mo><m:msub><m:mi>i</m:mi><m:mi>q</m:mi></m:msub><m:mo stretchy="false">)</m:mo></m:mrow><m:mi>C</m:mi></m:msubsup><m:mo>}</m:mo><m:mo>&#8745;</m:mo><m:mo>{</m:mo><m:msubsup><m:mi>g</m:mi><m:mrow><m:mo stretchy="false">(</m:mo><m:mn>1</m:mn><m:mo stretchy="false">)</m:mo></m:mrow><m:mi>G</m:mi></m:msubsup><m:mo>,</m:mo><m:msubsup><m:mi>g</m:mi><m:mrow><m:mo stretchy="false">(</m:mo><m:mn>2</m:mn><m:mo stretchy="false">)</m:mo></m:mrow><m:mi>G</m:mi></m:msubsup><m:mo>,</m:mo><m:mo>&#8230;</m:mo><m:msubsup><m:mi>g</m:mi><m:mrow><m:mo stretchy="false">(</m:mo><m:msub><m:mi>j</m:mi><m:mi>q</m:mi></m:msub><m:mo stretchy="false">)</m:mo></m:mrow><m:mi>G</m:mi></m:msubsup><m:mo>}</m:mo></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciGacaGaaeqabaqabeGadaaakeaacqWGlbWscqGH9aqpcqGG7bWEcqWGNbWzdaqhaaWcbaGaeiikaGIaeGymaeJaeiykaKcabaGaem4qameaaOGaeiilaWIaem4zaC2aa0baaSqaaiabcIcaOiabikdaYiabcMcaPaqaaiabdoeadbaakiabcYcaSiablAciljabdEgaNnaaDaaaleaacqGGOaakcqWGPbqAdaWgaaadbaGaemyCaehabeaaliabcMcaPaqaaiabdoeadbaakiabc2ha9jabgMIihlabcUha7jabdEgaNnaaDaaaleaacqGGOaakcqaIXaqmcqGGPaqkaeaacqWGhbWraaGccqGGSaalcqWGNbWzdaqhaaWcbaGaeiikaGIaeGOmaiJaeiykaKcabaGaem4raCeaaOGaeiilaWIaeSOjGSKaem4zaC2aa0baaSqaaiabcIcaOiabdQgaQnaaBaaameaacqWGXbqCaeqaaSGaeiykaKcabaGaem4raCeaaOGaeiyFa0haaa@5FE9@</m:annotation></m:semantics></m:math> then contains all the statistically significantly periodically expressed genes (of the same period). The difference set <m:math name="1471-2105-6-286-i13" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:mi>D</m:mi><m:mo>=</m:mo><m:mo>{</m:mo><m:msubsup><m:mi>g</m:mi><m:mrow><m:mo stretchy="false">(</m:mo><m:mn>1</m:mn><m:mo stretchy="false">)</m:mo></m:mrow><m:mi>C</m:mi></m:msubsup><m:mo>,</m:mo><m:msubsup><m:mi>g</m:mi><m:mrow><m:mo stretchy="false">(</m:mo><m:mn>2</m:mn><m:mo stretchy="false">)</m:mo></m:mrow><m:mi>C</m:mi></m:msubsup><m:mo>,</m:mo><m:mo>&#8230;</m:mo><m:msubsup><m:mi>g</m:mi><m:mrow><m:mo stretchy="false">(</m:mo><m:msub><m:mi>i</m:mi><m:mi>q</m:mi></m:msub><m:mo stretchy="false">)</m:mo></m:mrow><m:mi>C</m:mi></m:msubsup><m:mo>}</m:mo><m:mo>\</m:mo><m:mi>K</m:mi></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciGacaGaaeqabaqabeGadaaakeaacqWGebarcqGH9aqpcqGG7bWEcqWGNbWzdaqhaaWcbaGaeiikaGIaeGymaeJaeiykaKcabaGaem4qameaaOGaeiilaWIaem4zaC2aa0baaSqaaiabcIcaOiabikdaYiabcMcaPaqaaiabdoeadbaakiabcYcaSiablAciljabdEgaNnaaDaaaleaacqGGOaakcqWGPbqAdaWgaaadbaGaemyCaehabeaaliabcMcaPaqaaiabdoeadbaakiabc2ha9jabcYfaCjabdUealbaa@48D3@</m:annotation></m:semantics></m:math> then contains genes that are possibly periodic with different periods, or that show patterns other than periodic.</p>
         <p>A natural question is: what is the FDR level of the identified periodic genes contained in set <it>K</it>? A straightforward proof shows that the FDR level of the identified periodic genes contained in set <it>K </it>of Step 4 in the <it>C&amp;G </it>Procedure is at most <it>q</it>. In other words, this procedure does not inflate the FDR level. The application of the <it>C&amp;G </it>Procedure is illustrated in the following four examples.</p>
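         <p>Steps 2 to 4 above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used for the analyses in this paper: it assumes the per-gene p-values from Step 1 are already available as two lists of equal length, and it applies the usual step-up threshold (rank times <it>q </it>divided by the number of genes) to both statistics. The function names <it>step_up_selection </it>and <it>c_and_g </it>are hypothetical.</p>

```python
def step_up_selection(pvalues, q):
    """Return the set of gene indices selected by a step-up FDR rule at level q.

    Assumption: the threshold for the p-value of rank r (1-based) is r * q / G,
    where G is the total number of genes.
    """
    G = len(pvalues)
    order = sorted(range(G), key=lambda g: pvalues[g])  # gene indices by ascending p-value
    last = 0
    for rank, g in enumerate(order, start=1):
        if rank * q / G >= pvalues[g]:
            last = rank  # keep the largest rank that meets its threshold
    return set(order[:last])


def c_and_g(p_c, p_g, q=0.05):
    """Sketch of Steps 2 to 4: return (K, D) as sets of gene indices.

    p_c : p-values from the C-statistic, one per gene
    p_g : p-values from the G-statistic, one per gene
    K   : genes selected by both statistics (intersection set)
    D   : genes selected by the C-statistic only (difference set)
    """
    if len(p_c) != len(p_g):
        raise ValueError("both p-value lists must cover the same genes")
    selected_c = step_up_selection(p_c, q)
    selected_g = step_up_selection(p_g, q)
    K = selected_c.intersection(selected_g)
    D = selected_c.difference(K)
    return K, D
```

         <p>For instance, with four genes, C-statistic p-values (0.001, 0.002, 0.5, 0.004) and G-statistic p-values (0.001, 0.9, 0.8, 0.7) at <it>q </it>= 0.05, the sketch places the first gene in <it>K </it>and the second and fourth genes in <it>D</it>.</p>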
         <sec>
            <st>
               <p>Analysis of the bacterial cell cycle data</p>
            </st>
            <p>The gene expression data from synchronized cells of the bacterium <it>Caulobacter crescentus </it>(Laub <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>) are analyzed for possible periodically expressed genes using the procedure proposed in this paper. The data can be downloaded from the Bacterial cell cycle data website <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>. The data set contains information on 1474 genes over 11 equally spaced time points (with a time interval of 15 minutes). Laub <abbrgrp><abbr bid="B11">11</abbr></abbrgrp> identified 533 genes as cell-cycle regulated, while for the same data Wichert <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> claims that only 44 genes are cyclic at an FDR level of 0.05. Using the <it>C&amp;G </it>Procedure of this paper, we find that the <it>C</it>-statistic identifies 166 genes as significant non-white noise expressions (including possible cell-cycle regulated genes) and the <it>G</it>-statistic identifies 44 such genes; their intersection set contains 43 significant cell-cycle regulated genes. Therefore, we claim that there are 43 significant periodic genes (of the same period) at an FDR level of 0.05. This conclusion matches that of Wichert <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> very well. The one gene that is considered periodic in Wichert <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> but not in this paper is ORF00082 (ABC transporter, ATP-binding protein), whose expression is plotted against time in Figure <figr fid="F1">1</figr>. Clearly, Figure <figr fid="F1">1</figr> shows a fluctuation pattern rather than a periodic pattern. Only a general function prediction is available for this ATP-binding protein, and its biological function is poorly categorized according to the archive information provided on the National Center for Biotechnology Information (NCBI) website <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>.</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>The gene (ORF00082) in Laub data that is not considered periodic in this paper</p>
               </caption>
               <text>
                  <p>The gene (ORF00082) in Laub data that is not considered periodic in this paper.</p>
               </text>
               <graphic file="1471-2105-6-286-1" hint_layout="single"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Analysis of the yeast cell cycle data</p>
            </st>
            <p>In the second example, the gene expression data sets from the well-known yeast <it>Saccharomyces cerevisiae </it>microarray experiments of Spellman <abbrgrp><abbr bid="B2">2</abbr></abbrgrp> are analyzed to identify significantly periodically expressed genes. The data sets can be downloaded from the Yeast cell cycle data website <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. These four data sets were produced by three different cell cycle synchronization techniques: alpha factor arrest (producing the "alpha" gene expression data), temperature arrest (producing the "cdc15" and "cdc28" gene expression data sets), and elutriation synchronization (producing the "elution" data set). The alpha data set contains complete information on 4489 genes over 18 equally spaced time points (with a time interval of 7 minutes). Using the <it>C&amp;G </it>Procedure, we find that the <it>C</it>-statistic identifies 1188 genes as significant non-white noise expressions (including possible cell-cycle regulated genes) and the <it>G</it>-statistic identifies 473 such genes; their intersection set contains 471 significant cell-cycle regulated genes. Therefore, we claim that there are at least 471 significant periodic genes (of the same period) at an FDR level of 0.05 in the alpha experiment data, and that there are an additional 717 genes in set <it>D </it>that are possibly periodic with different periods, or that show patterns other than periodic.</p>
            <p>The same procedure is applied to the cdc15, cdc28, and elution data sets, and the genes identified by both statistics, their intersection set K, and the difference set D are summarized in Table <tblr tid="T1">1</tblr>. Spellman <abbrgrp><abbr bid="B2">2</abbr></abbrgrp> originally identified 800 cell-cycle genes across all four experiments (alpha, cdc15, cdc28, and elution), while Wichert <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> claimed 468 cyclic genes in alpha, 766 in cdc15, 105 in cdc28, and 193 in elution. The periodic genes found by the <it>C&amp;G </it>procedure agree to some extent with the findings of Spellman <abbrgrp><abbr bid="B2">2</abbr></abbrgrp> and more closely, although not completely, with those of Wichert <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. The genes identified in the difference set <it>D </it>are worth further investigation by biologists, as they may lead to new discoveries. Furthermore, we regard the results found in this paper as improvements over these earlier lists of periodic genes. The nine most significant periodic genes in the elution data are graphed in Figure <figr fid="F2">2</figr> for illustration. The nine most significant genes (YDL034W, YDL055C, YNR020C, YOR362C, YER137C, YIL070C, YDR388W, YFL042C, and YGL004C) in set D of the elution data are graphed in Figure <figr fid="F3">3</figr>. The patterns of these nine genes in Figure <figr fid="F3">3</figr> represent a mixture of expressions that are periodic, periodic with different period(s), or follow patterns other than periodic. The genetic footprinting of YDL034W, YDL055C, YNR020C, YIL070C, YDR388W, YFL042C, and YGL004C reveals an apparent moderate growth defect on YPD after 20 generations, according to the archive information provided on the yeast genome website <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. This suggests that the expressions of these seven genes show gradual decay patterns rather than random patterns. Hence, our findings in set D make biological sense. Gene YOR362C participates in endopeptidase activity, and the molecular function of gene YER137C is still unknown.</p>
            <tbl id="T1" hint_layout="double">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Number of Significant Periodic Genes Identified by C-statistic, G-statistic, Intersection Set K, and Difference Set D</p>
               </caption>
               <tblbdy cols="8">
                  <r>
                     <c ca="left">
                        <p>Cell type</p>
                     </c>
                     <c ca="center">
                        <p>Experiment</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>N</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>G</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>N</it>
                           <sub>
                              <it>C</it>
                           </sub>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>N</it>
                           <sub>
                              <it>G</it>
                           </sub>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>N</it>
                           <sub>
                              <it>K</it>
                           </sub>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>N</it>
                           <sub>
                              <it>D</it>
                           </sub>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="8">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>C. crescentus</it>
                        </p>
                     </c>
                     <c ca="center">
                        <p>bacteria</p>
                     </c>
                     <c ca="center">
                        <p>11</p>
                     </c>
                     <c ca="center">
                        <p>1474</p>
                     </c>
                     <c ca="center">
                        <p>166</p>
                     </c>
                     <c ca="center">
                        <p>44</p>
                     </c>
                     <c ca="center">
                        <p>43</p>
                     </c>
                     <c ca="center">
                        <p>123</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="8">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Yeast</p>
                     </c>
                     <c ca="center">
                        <p>alpha</p>
                     </c>
                     <c ca="center">
                        <p>18</p>
                     </c>
                     <c ca="center">
                        <p>4489</p>
                     </c>
                     <c ca="center">
                        <p>1188</p>
                     </c>
                     <c ca="center">
                        <p>473</p>
                     </c>
                     <c ca="center">
                        <p>471</p>
                     </c>
                     <c ca="center">
                        <p>717</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Yeast</p>
                     </c>
                     <c ca="center">
                        <p>cdc15</p>
                     </c>
                     <c ca="center">
                        <p>24</p>
                     </c>
                     <c ca="center">
                        <p>4381</p>
                     </c>
                     <c ca="center">
                        <p>1636</p>
                     </c>
                     <c ca="center">
                        <p>788</p>
                     </c>
                     <c ca="center">
                        <p>779</p>
                     </c>
                     <c ca="center">
                        <p>857</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Yeast</p>
                     </c>
                     <c ca="center">
                        <p>cdc28</p>
                     </c>
                     <c ca="center">
                        <p>17</p>
                     </c>
                     <c ca="center">
                        <p>1383</p>
                     </c>
                     <c ca="center">
                        <p>292</p>
                     </c>
                     <c ca="center">
                        <p>27</p>
                     </c>
                     <c ca="center">
                        <p>27</p>
                     </c>
                     <c ca="center">
                        <p>265</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Yeast</p>
                     </c>
                     <c ca="center">
                        <p>Elution</p>
                     </c>
                     <c ca="center">
                        <p>14</p>
                     </c>
                     <c ca="center">
                        <p>5766</p>
                     </c>
                     <c ca="center">
                        <p>1056</p>
                     </c>
                     <c ca="center">
                        <p>769</p>
                     </c>
                     <c ca="center">
                        <p>695</p>
                     </c>
                     <c ca="center">
                        <p>361</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="8">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Human fibroblasts</p>
                     </c>
                     <c ca="center">
                        <p>N2</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="center">
                        <p>7077</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Human fibroblasts</p>
                     </c>
                     <c ca="center">
                        <p>N3</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="center">
                        <p>7077</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="8">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Human HeLa</p>
                     </c>
                     <c ca="center">
                        <p>Score1</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="center">
                        <p>15536</p>
                     </c>
                     <c ca="center">
                        <p>44</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>38</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Human HeLa</p>
                     </c>
                     <c ca="center">
                        <p>Score2</p>
                     </c>
                     <c ca="center">
                        <p>26</p>
                     </c>
                     <c ca="center">
                        <p>16287</p>
                     </c>
                     <c ca="center">
                        <p>1351</p>
                     </c>
                     <c ca="center">
                        <p>154</p>
                     </c>
                     <c ca="center">
                        <p>153</p>
                     </c>
                     <c ca="center">
                        <p>1198</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Human HeLa</p>
                     </c>
                     <c ca="center">
                        <p>Score3</p>
                     </c>
                     <c ca="center">
                        <p>48</p>
                     </c>
                     <c ca="center">
                        <p>41508</p>
                     </c>
                     <c ca="center">
                        <p>9702</p>
                     </c>
                     <c ca="center">
                        <p>6117</p>
                     </c>
                     <c ca="center">
                        <p>5770</p>
                     </c>
                     <c ca="center">
                        <p>3932</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Human HeLa</p>
                     </c>
                     <c ca="center">
                        <p>Score4</p>
                     </c>
                     <c ca="center">
                        <p>19</p>
                     </c>
                     <c ca="center">
                        <p>40815</p>
                     </c>
                     <c ca="center">
                        <p>52</p>
                     </c>
                     <c ca="center">
                        <p>52</p>
                     </c>
                     <c ca="center">
                        <p>17</p>
                     </c>
                     <c ca="center">
                        <p>35</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Human HeLa</p>
                     </c>
                     <c ca="center">
                        <p>Score5</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>35871</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p><it>N</it>: number of time points; <it>G</it>: number of probe sets; <it>N</it><sub><it>C</it></sub>: number of significant genes picked up by the C-statistic; <it>N</it><sub><it>G</it></sub>: number of significant genes picked up by the G-statistic; <it>N</it><sub><it>K</it></sub>: number of significant periodic genes in the intersection set K; <it>N</it><sub><it>D</it></sub>: number of significant genes in the difference set D, which are periodic with other periods or follow other patterns.</p>
               </tblfn>
            </tbl>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>The nine most significant periodic genes in Elution data</p>
               </caption>
               <text>
                  <p>The nine most significant periodic genes in Elution data.</p>
               </text>
               <graphic file="1471-2105-6-286-2" hint_layout="single"/>
            </fig>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>The nine most significant genes in set D for Elution data</p>
               </caption>
               <text>
                  <p>The nine most significant genes in set D for Elution data.</p>
               </text>
               <graphic file="1471-2105-6-286-3" hint_layout="single"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Analysis of human fibroblasts data</p>
            </st>
            <p>In this example, the microarray data on the transcriptional profiling of the cell cycle in human fibroblasts are analyzed. The experiments and data sets are reported in Cho <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. The data are available at the Human fibroblasts data website <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. There are two data sets, resulting from experiments N2 and N3, each with 12 time points and 7077 probe sets. Approximately 700 genes were claimed to be periodic in Cho <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>, based on clustering and pattern matching. Shedden and Cooper <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> questioned the biological grounds of the data analysis results that were claimed to be statistically significant in Cho <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. Wichert <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> found no significant periodic genes in these two data sets. Applying the C&amp;G Procedure of this paper to the N2 data set, we find that the <it>C</it>-statistic identifies 1 gene with significant non-white-noise expression (including possible cell-cycle regulated genes) and the <it>G</it>-statistic identifies 2 such genes; their intersection set contains 1 significant cell-cycle regulated gene. Therefore, we claim that there is one significant periodic gene at an FDR level of 0.05 in the N2 data set. Similarly, for the N3 data set, the <it>C</it>-statistic identifies 2 genes with significant non-white-noise expression (including possible cell-cycle regulated genes), and the <it>G</it>-statistic identifies no such genes; their intersection set is empty. Therefore, we claim that there is no significant periodic gene in the N3 data set. This conclusion matches that of Wichert <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> very well. More interestingly, the two genes M19645_at (HSPA5) and U09117_at (PLCD1) identified in set D of the N3 data (expressions shown in Figure <figr fid="F4">4</figr>) show patterns that warrant further biological investigation. Gene HSPA5 belongs to the heat shock protein 70 family and probably plays a role in facilitating the assembly of multimeric protein complexes inside the ER, while gene PLCD1 participates in a protein coding process in the organism <it>Bos taurus </it>(information available at NCBI).</p>
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>The two genes in set D of N3 data</p>
               </caption>
               <text>
                  <p>The two genes in set D of N3 data.</p>
               </text>
               <graphic file="1471-2105-6-286-4" hint_layout="single"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>Analysis of human cancer cell line data</p>
            </st>
            <p>In this last example, the human cancer cell line profiling data sets resulting from the large-scale microarray experiments given in Whitfield <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> are analyzed using the C&amp;G Procedure. The data sets can be downloaded from the Human cancer cell line data website <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. Five experiments were conducted using three different cell cycle synchronization methods: a double thymidine block method (resulting in three data sets, Score1, Score2, and Score3); thymidine followed by arrest in mitosis with nocodazole (resulting in data set Score4); and mitotic shake-off using an automated cell shaker (resulting in data set Score5). The C&amp;G procedure is applied to these five data sets, and the findings are also given in Table <tblr tid="T1">1</tblr>. In particular, the six significant periodic genes identified in set K of the Score1 data are graphed in Figure <figr fid="F5">5</figr>, and their periodic patterns are quite evident. These six genes have the gene symbols H2AFX, CKS1, BIRC3, STK9, FLJ11259, and VAV3, respectively. According to the NCBI website, H2AFX encodes a member of the histone H2A family and generates two transcripts through the use of the conserved stem-loop termination motif and the polyA addition motif. Gene CKS1 is a protein coding gene in a human cell division control protein family. The BIRC3 protein coding gene is the inhibitor of apoptosis protein 1. STK9 is also a protein coding gene, but its biological process is still unknown. FLJ11259 is a protein coding gene for a hypothetical protein. VAV3 regulates the B cell responses by promoting the sustained production of PIP3 and thereby calcium flux. Therefore, closer biological research on these six genes should be worthwhile, given the patterns detected by the C&amp;G procedure. The data sets analyzed by the C-statistic, the G-statistic, and the C&amp;G Procedure in all of the above examples are summarized in Table <tblr tid="T1">1</tblr>. The genes in set K of Table <tblr tid="T1">1</tblr> are claimed as periodic genes (of the same period) by the C&amp;G procedure. The difference set D contains genes that are periodic, periodic with different period(s), or follow patterns other than periodic. Genes in set D are worth further study by biologists. Table <tblr tid="T2">2</tblr> compares the results obtained by the C&amp;G Procedure with the results obtained by the researchers who originally conducted those experiments and with the results obtained by Wichert <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>.</p>
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>The six significant periodic genes in set K of Score1 data</p>
               </caption>
               <text>
                  <p>The six significant periodic genes in set K of Score1 data.</p>
               </text>
               <graphic file="1471-2105-6-286-5" hint_layout="single"/>
            </fig>
            <tbl id="T2" hint_layout="double">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Number of Periodic Genes Identified by the Original Experimenters, Wichert et al. (2004), and Chen</p>
               </caption>
               <tblbdy cols="5">
                  <r>
                     <c ca="left">
                        <p>Cell type</p>
                     </c>
                     <c ca="left">
                        <p>Experiment</p>
                     </c>
                     <c ca="left">
                        <p>Experimenter</p>
                     </c>
                     <c ca="right">
                        <p>Wichert <it>et al</it>. (2004)</p>
                     </c>
                     <c ca="right">
                        <p>Chen</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>
                           <it>C. crescentus</it>
                        </p>
                     </c>
                     <c ca="left">
                        <p>bacteria</p>
                     </c>
                     <c ca="left">
                        <p>Laub <it>et al</it>. (2000), identified 553 periodic genes</p>
                     </c>
                     <c ca="right">
                        <p>44</p>
                     </c>
                     <c ca="right">
                        <p>43</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Yeast</p>
                     </c>
                     <c ca="left">
                        <p>alpha</p>
                     </c>
                     <c ca="left">
                        <p>Spellman <it>et al</it>. (1998),</p>
                     </c>
                     <c ca="right">
                        <p>468</p>
                     </c>
                     <c ca="right">
                        <p>471</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Yeast</p>
                     </c>
                     <c ca="left">
                        <p>cdc15</p>
                     </c>
                     <c ca="left">
                        <p>total of 800 periodic genes</p>
                     </c>
                     <c ca="right">
                        <p>766</p>
                     </c>
                     <c ca="right">
                        <p>779</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Yeast</p>
                     </c>
                     <c ca="left">
                        <p>cdc28</p>
                     </c>
                     <c ca="left">
                        <p>identified in all of these four</p>
                     </c>
                     <c ca="right">
                        <p>105</p>
                     </c>
                     <c ca="right">
                        <p>27</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Yeast</p>
                     </c>
                     <c ca="left">
                        <p>Elution</p>
                     </c>
                     <c ca="left">
                        <p>yeast cell cycle experiments</p>
                     </c>
                     <c ca="right">
                        <p>193</p>
                     </c>
                     <c ca="right">
                        <p>695</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Human fibroblasts</p>
                     </c>
                     <c ca="left">
                        <p>N2</p>
                     </c>
                     <c ca="left">
                        <p>Cho <it>et al</it>. (2001), 700 periodic</p>
                     </c>
                     <c ca="right">
                        <p>0</p>
                     </c>
                     <c ca="right">
                        <p>1</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Human fibroblasts</p>
                     </c>
                     <c ca="left">
                        <p>N3</p>
                     </c>
                     <c ca="left">
                        <p>genes identified in N2 and N3</p>
                     </c>
                     <c ca="right">
                        <p>0</p>
                     </c>
                     <c ca="right">
                        <p>0</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Human HeLa</p>
                     </c>
                     <c ca="left">
                        <p>Score1</p>
                     </c>
                     <c ca="left">
                        <p>Whitfield <it>et al</it>. (2002),</p>
                     </c>
                     <c ca="right">
                        <p>0</p>
                     </c>
                     <c ca="right">
                        <p>6</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Human HeLa</p>
                     </c>
                     <c ca="left">
                        <p>Score2</p>
                     </c>
                     <c ca="left">
                        <p>total of 800+ periodic genes</p>
                     </c>
                     <c ca="right">
                        <p>134</p>
                     </c>
                     <c ca="right">
                        <p>153</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Human HeLa</p>
                     </c>
                     <c ca="left">
                        <p>Score3</p>
                     </c>
                     <c ca="left">
                        <p>identified in these five</p>
                     </c>
                     <c ca="right">
                        <p>6043</p>
                     </c>
                     <c ca="right">
                        <p>5770</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Human HeLa</p>
                     </c>
                     <c ca="left">
                        <p>Score4</p>
                     </c>
                     <c ca="left">
                        <p>Human Cancer cell line</p>
                     </c>
                     <c ca="right">
                        <p>56</p>
                     </c>
                     <c ca="right">
                        <p>17</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Human HeLa</p>
                     </c>
                     <c ca="left">
                        <p>Score5</p>
                     </c>
                     <c ca="left">
                        <p>experiments</p>
                     </c>
                     <c ca="right">
                        <p>0</p>
                     </c>
                     <c ca="right">
                        <p>0</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <p>Regarding both of the test statistics, several points need to be addressed.</p>
         <p>First of all, the <it>G</it>-statistic tests the significance of the maximum periodogram. When the result is significant, the message conveyed is that the maximum periodogram is significant, possibly because the underlying model is periodic. On the other hand, the <it>C</it>-statistic uses a standardized cumulative periodogram and thus considers the contributions of all periodograms towards the periodicity of the underlying model. Therefore, these two statistics are not exactly the same. Secondly, although both the <it>G</it>-statistic and the <it>C</it>-statistic can be used as test statistics for detecting periodicity in a time series, the <it>G</it>-statistic method is more specific and the <it>C</it>-statistic method is broader, in the sense that its alternative hypothesis is rather vague. In other words, for a fixed gene <it>g</it>, when the p-value <m:math name="1471-2105-6-286-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>p</m:mi><m:mi>g</m:mi><m:mi>G</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciGacaGaaeqabaqabeGadaaakeaacqWGWbaCdaqhaaWcbaGaem4zaCgabaGaem4raCeaaaaa@30B2@</m:annotation></m:semantics></m:math> is small compared with a predetermined significance level, the conclusion that this gene is a significant periodic gene according to the <it>G</it>-statistic can be reached; however, when the p-value <m:math name="1471-2105-6-286-i3" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>p</m:mi><m:mi>g</m:mi><m:mi>C</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciGacaGaaeqabaqabeGadaaakeaacqWGWbaCdaqhaaWcbaGaem4zaCgabaGaem4qameaaaaa@30AA@</m:annotation></m:semantics></m:math> is small, the <it>C</it>-statistic supports only the claim that this gene is not white noise (it might be periodic, periodic with a different period, or follow some non-periodic pattern). Hence, one can anticipate that the <it>C</it>-statistic will pick up more significant genes than the <it>G</it>-statistic. This is valuable, especially in expensive microarray experiments, because biologists can use the information to possibly discover genes with different periods, or with other patterns that they have not encountered before. Thirdly, from the definitions of the two statistics (see Methods), we can easily establish that</p>
         <p><it>G<sub>g </sub></it>&#8805; 1, 0 &#8804; <it>C<sub>g </sub></it>&#8804; 1, and <it>G<sub>g </sub></it>&#8805; <it>C<sub>g</sub></it>. &#160;&#160;&#160; (3)</p>
         <p>Thus, the fact that <it>G</it><sub><it>g </it></sub>is greater than its threshold value does not necessarily imply that <it>C</it><sub><it>g </it></sub>is greater than its threshold value, and vice versa. In other words, it is clear from (3) that these two statistics are not equivalent in general; there are times, however, when the two tests agree. This is not surprising because the <it>G</it>-statistic is constructed for testing normal white noise against a periodic function, whereas the <it>C</it>-statistic method is broader in the sense that its alternative hypothesis is rather vague. One might think that the set of periodic signals identified by the <it>G</it>-statistic is contained in the set of genes identified by the <it>C</it>-statistic. This is not necessarily true, for the reasons given in this section.</p>
         <p>Furthermore, the <it>G</it>-statistic method is sensitive to departures from normality, as pointed out in Davis <abbrgrp><abbr bid="B18">18</abbr></abbrgrp> and Wilks <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>. Hence, when the normality assumption on the random errors is violated, the null distribution of the <it>G</it>-statistic will not hold in general and the p-value in (1) could be seriously inaccurate. The <it>C</it>-statistic method is insensitive to departures from normality, as pointed out in Durbin <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. The two statistics can therefore serve as constraints on each other in order to search effectively for truly periodic genes.</p>
         <p>Moreover, the behavior of the <it>C</it>-statistic method, the <it>G</it>-statistic method, and the <it>C</it>&amp;<it>G </it>Procedure in identifying periodic signals is studied empirically by means of the following simulation studies. To investigate the power of the three methods under different noise conditions, a sine signal mixed with normal white noise (with the ratio of amplitude of signal to noise being 1 : 1) on 20 time points is simulated 10,000 times, and the frequency with which each of the three methods rejects the null hypothesis (at the false positive rate of 0.05), that is, identifies the signal as periodic, is recorded. Similarly, a sine signal mixed with skewed noise (a chi-square distribution with 1 degree of freedom) on 20 time points is simulated 10,000 times, and the frequency with which each of the three methods rejects the null hypothesis is recorded. The empirical power of each method is thus obtained and listed in Table <tblr tid="T3">3</tblr>. From Table <tblr tid="T3">3</tblr>, we conclude that the empirical power of all three methods is higher under normal noise than under skewed noise. Under each noise condition, the <it>C</it>-statistic method has higher power than the other two methods, and the power of the <it>C</it>&amp;<it>G </it>Procedure is about the same as that of the <it>G</it>-statistic method. When the periodic signal is stronger than the normal white noise (with the ratio of amplitude of signal to noise being 9 : 8, 10 : 8, 11 : 8, 12 : 8, respectively), our simulation (10,000 times) of such signals on 20 time points shows that all three methods have high power, at least 99.50% in every case (see Table <tblr tid="T4">4</tblr>). 
Next, to study the effectiveness of the methods in identifying true periodic signals when the data are noisy, weaker sine signals mixed with stronger normal white noise (with the ratio of amplitude of signal to noise being 7 : 8, 6 : 8, 5 : 8, 4 : 8 or 1 : 2, respectively) on 20 time points are simulated 10,000 times each. The empirical powers of the <it>C</it>-statistic, the <it>G</it>-statistic, and the <it>C</it>&amp;<it>G </it>Procedure are given in Table <tblr tid="T5">5</tblr>. We conclude from Table <tblr tid="T5">5</tblr> that the empirical power of the <it>C</it>-statistic method is always higher than that of the other methods, and that the empirical power of the <it>C</it>&amp;<it>G </it>Procedure is about the same as, or comparable to, that of the <it>G</it>-statistic method even when the data are very noisy (signal to noise amplitude ratio being 1 : 2). The power of all methods decreases as the noise increasingly dominates the true periodic signal (see the powers from row 2 to row 5 of Table <tblr tid="T5">5</tblr>). As there usually are both strong and weak signals in a large gene expression dataset, knowing the behavior of all three methods under these situations helps biologists choose a suitable search tool for analyzing their experimental data. Although these simulation studies show that the power of the <it>C</it>-statistic is higher than that of the other two methods, we also need to investigate the empirical type I error rate, or false positive rate, of the three methods. For this purpose, a sequence of 20 normal observations (without any periodic signal) is simulated 10,000 times, and the frequency with which each of the three methods considers the observations to be a periodic signal (at the a priori false positive rate of 0.05) is recorded. 
Similarly, a sequence of 20 chi-square (1 degree of freedom) observations (without any periodic signal) and a sequence of 20 observations (without any periodic signal) from the uniform (0,1) distribution are each simulated 10,000 times, and the corresponding frequencies are recorded. The empirical false positive rates of the three methods are thus obtained and summarized in Table <tblr tid="T6">6</tblr>. It is clear that the false positive rate of the <it>C</it>-statistic is the highest and that of the <it>C</it>&amp;<it>G </it>Procedure is the lowest under each noise scenario. Taken together, the simulation studies indicate that, to maintain a stable and relatively high power while minimizing the false positive rate, the <it>C</it>&amp;<it>G </it>Procedure is a suitable choice. Thus, the advantage of using the proposed <it>C</it>&amp;<it>G </it>Procedure emerges. The simulations, as well as all calculations in previous sections, are carried out using Matlab and Minitab 14.</p>
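         <p>The simulation design described above can be sketched in code. The following minimal Python sketch is an illustration under stated assumptions and is not the Matlab or Minitab code used for this study. It assumes that the normal noise has unit standard deviation, that the sine signal sits at an arbitrarily chosen Fourier frequency (index k = 3), that the statistic is the largest periodogram ordinate divided by the sum of all ordinates (one common form of Fisher's statistic, which may be normalized differently from the statistic defined in Methods), and that the rejection cutoff is estimated by Monte Carlo under normal white noise rather than taken from the exact null distribution. All function names are hypothetical, the number of replications is reduced to keep the run short, and the output is not expected to reproduce the values in Tables 3 to 6.</p>

```python
import numpy as np


def periodogram(y):
    """Periodogram ordinates at the Fourier frequencies w_k = 2*pi*k/N, k = 1..m."""
    n = len(y)
    m = n // 2 if n % 2 == 0 else (n - 1) // 2
    t = np.arange(1, n + 1)
    omega = 2.0 * np.pi * np.arange(1, m + 1) / n
    dft = np.exp(-1j * np.outer(omega, t)) @ y
    return (2.0 / n) * np.abs(dft) ** 2


def max_share(y):
    """Largest periodogram ordinate divided by the sum of all ordinates."""
    ordinates = periodogram(y)
    return ordinates.max() / ordinates.sum()


def simulate_stats(rng, n_points, amplitude, noise, n_rep, k=3):
    """Statistic values for n_rep series: sine signal plus the chosen noise."""
    t = np.arange(1, n_points + 1)
    # Assumption: the signal frequency index k is arbitrary, not from the paper.
    signal = amplitude * np.sin(2.0 * np.pi * k * t / n_points)
    if noise == "normal":
        errors = rng.standard_normal((n_rep, n_points))
    elif noise == "chisq":
        errors = rng.chisquare(1, size=(n_rep, n_points))
    else:
        raise ValueError("noise must be 'normal' or 'chisq'")
    return np.array([max_share(signal + e) for e in errors])


def empirical_rejection_rate(rng, amplitude, noise, n_points=20, n_rep=2000, level=0.05):
    """Share of simulated series whose statistic exceeds a Monte Carlo null cutoff."""
    # Assumption: the cutoff is calibrated on simulated normal white noise,
    # not on the exact null distribution used in the paper.
    null_stats = simulate_stats(rng, n_points, 0.0, "normal", n_rep)
    cutoff = np.quantile(null_stats, 1.0 - level)
    stats = simulate_stats(rng, n_points, amplitude, noise, n_rep)
    return float(np.mean(stats > cutoff))


rng = np.random.default_rng(0)
power_strong = empirical_rejection_rate(rng, amplitude=1.5, noise="normal")
power_weak = empirical_rejection_rate(rng, amplitude=0.5, noise="normal")
false_positive = empirical_rejection_rate(rng, amplitude=0.0, noise="normal")
print(power_strong, power_weak, false_positive)
```

         <p>Calling the hypothetical function empirical_rejection_rate with noise="chisq" gives the skewed-noise case. Because the cutoff in this sketch is calibrated under normal noise, that case also illustrates how a non-normal error distribution can shift the rejection rate of a test built on the normality assumption.</p>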
         <tbl id="T3" hint_layout="single">
            <title>
               <p>Table 3</p>
            </title>
            <caption>
               <p>Empirical power of <it>C</it>, <it>G</it>, and <it>C&amp;G </it>with the ratio of amplitude of signal to noise being 1 : 1</p>
            </caption>
            <tblbdy cols="4">
               <r>
                  <c ca="left">
                     <p>Signal type</p>
                  </c>
                  <c ca="center">
                     <p>
                        <it>C</it>
                     </p>
                  </c>
                  <c ca="center">
                     <p>
                        <it>G</it>
                     </p>
                  </c>
                  <c ca="center">
                     <p>
                        <it>C&amp;G</it>
                     </p>
                  </c>
               </r>
               <r>
                  <c cspan="4">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>sine signal with skewed noise</p>
                  </c>
                  <c ca="center">
                     <p>81.66%</p>
                  </c>
                  <c ca="center">
                     <p>75.25%</p>
                  </c>
                  <c ca="center">
                     <p>75.23%</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>sine signal with normal white noise</p>
                  </c>
                  <c ca="center">
                     <p>99.09%</p>
                  </c>
                  <c ca="center">
                     <p>97.57%</p>
                  </c>
                  <c ca="center">
                     <p>97.57%</p>
                  </c>
               </r>
            </tblbdy>
         </tbl>
         <tbl id="T4" hint_layout="single">
            <title>
               <p>Table 4</p>
            </title>
            <caption>
               <p>Empirical power of <it>C</it>, <it>G</it>, and <it>C&amp;G </it>on stronger signals</p>
            </caption>
            <tblbdy cols="4">
               <r>
                  <c ca="center">
                     <p>The ratio of amplitude of signal to noise</p>
                  </c>
                  <c ca="center">
                     <p>
                        <it>C</it>
                     </p>
                  </c>
                  <c ca="center">
                     <p>
                        <it>G</it>
                     </p>
                  </c>
                  <c ca="center">
                     <p>
                        <it>C&amp;G</it>
                     </p>
                  </c>
               </r>
               <r>
                  <c cspan="4">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>9:8</p>
                  </c>
                  <c ca="center">
                     <p>99.78%</p>
                  </c>
                  <c ca="center">
                     <p>99.50%</p>
                  </c>
                  <c ca="center">
                     <p>99.50%</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>10:8</p>
                  </c>
                  <c ca="center">
                     <p>99.97%</p>
                  </c>
                  <c ca="center">
                     <p>99.93%</p>
                  </c>
                  <c ca="center">
                     <p>99.93%</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>11:8</p>
                  </c>
                  <c ca="center">
                     <p>99.99%</p>
                  </c>
                  <c ca="center">
                     <p>99.99%</p>
                  </c>
                  <c ca="center">
                     <p>99.99%</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>12:8</p>
                  </c>
                  <c ca="center">
                     <p>100%</p>
                  </c>
                  <c ca="center">
                     <p>100%</p>
                  </c>
                  <c ca="center">
                     <p>100%</p>
                  </c>
               </r>
            </tblbdy>
         </tbl>
         <tbl id="T5" hint_layout="single">
            <title>
               <p>Table 5</p>
            </title>
            <caption>
               <p>Empirical power of <it>C</it>, <it>G</it>, and <it>C&amp;G </it>on weaker signals</p>
            </caption>
            <tblbdy cols="4">
               <r>
                  <c ca="center">
                     <p>The ratio of amplitude of signal to noise</p>
                  </c>
                  <c ca="center">
                     <p>
                        <it>C</it>
                     </p>
                  </c>
                  <c ca="center">
                     <p>
                        <it>G</it>
                     </p>
                  </c>
                  <c ca="center">
                     <p>
                        <it>C&amp;G</it>
                     </p>
                  </c>
               </r>
               <r>
                  <c cspan="4">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>7:8</p>
                  </c>
                  <c ca="center">
                     <p>96.00%</p>
                  </c>
                  <c ca="center">
                     <p>91.72%</p>
                  </c>
                  <c ca="center">
                     <p>91.72%</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>6:8</p>
                  </c>
                  <c ca="center">
                     <p>87.39%</p>
                  </c>
                  <c ca="center">
                     <p>78.40%</p>
                  </c>
                  <c ca="center">
                     <p>78.40%</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>5:8</p>
                  </c>
                  <c ca="center">
                     <p>71.95%</p>
                  </c>
                  <c ca="center">
                     <p>56.87%</p>
                  </c>
                  <c ca="center">
                     <p>56.82%</p>
                  </c>
               </r>
               <r>
                  <c ca="center">
                     <p>4:8</p>
                  </c>
                  <c ca="center">
                     <p>51.03%</p>
                  </c>
                  <c ca="center">
                     <p>34.30%</p>
                  </c>
                  <c ca="center">
                     <p>34.01%</p>
                  </c>
               </r>
            </tblbdy>
         </tbl>
         <tbl id="T6" hint_layout="single">
            <title>
               <p>Table 6</p>
            </title>
            <caption>
               <p>Empirical false positive rate of <it>C</it>, <it>G</it>, and <it>C&amp;G</it></p>
            </caption>
            <tblbdy cols="4">
               <r>
                  <c ca="left">
                     <p>Noise type</p>
                  </c>
                  <c ca="center">
                     <p>
                        <it>C</it>
                     </p>
                  </c>
                  <c ca="center">
                     <p>
                        <it>G</it>
                     </p>
                  </c>
                  <c ca="center">
                     <p>
                        <it>C&amp;G</it>
                     </p>
                  </c>
               </r>
               <r>
                  <c cspan="4">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>normal</p>
                  </c>
                  <c ca="center">
                     <p>12.3%</p>
                  </c>
                  <c ca="center">
                     <p>6.45%</p>
                  </c>
                  <c ca="center">
                     <p>4.29%</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>uniform</p>
                  </c>
                  <c ca="center">
                     <p>13.23%</p>
                  </c>
                  <c ca="center">
                     <p>7.60%</p>
                  </c>
                  <c ca="center">
                     <p>4.70%</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Chi-square</p>
                  </c>
                  <c ca="center">
                     <p>7.32%</p>
                  </c>
                  <c ca="center">
                     <p>2.32%</p>
                  </c>
                  <c ca="center">
                     <p>1.36%</p>
                  </c>
               </r>
            </tblbdy>
         </tbl>
         <p>Finally, as the null distributions of both statistics are exact, the tests work well (as long as the underlying assumptions are met) for any sample size, small or large. This characteristic makes both tests very valuable for microarray data sets, as the number of observations obtained for each gene is usually not large in a microarray experiment.</p>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>In this paper, a statistical <it>C</it>&amp;<it>G </it>Procedure is proposed for identifying significantly periodically expressed genes at a desired FDR level <it>q</it>. This approach uses both Bartlett's <it>C</it>-statistic and Fisher's <it>G</it>-statistic to secure the actual periodic genes existing in a microarray data set. As the searching process is also a multiple testing procedure, the FDR level is used to ensure that the overall false discovery rate for the whole procedure is at most <it>q</it>. The <it>G</it>-statistic does assume that the sequence is Gaussian, which may not be the case for every microarray data set. Nevertheless, log-transformed expression data can usually satisfy the Gaussian assumption. The <it>C</it>-statistic is more robust to violations of the Gaussian assumption. The advantage of the <it>C</it>&amp;<it>G </it>Procedure thus emerges. Although the gene expression sequences in a microarray data set are usually correlated, the approach of multiple testing with a controlled FDR level does not rely heavily on the independence assumption, according to Benjamini and Hochberg <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. Therefore, the <it>C</it>&amp;<it>G </it>Procedure is a promising statistical tool for finding significantly periodically expressed genes (of the same period) in a microarray data set. Other issues, such as the analysis of data measured at unevenly spaced time intervals and the size of each sequence needed for valid statistical analysis, will be topics of future investigations in order to search more effectively for significantly periodically expressed genes in a microarray data set.</p>
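         <p>To illustrate how an FDR level <it>q</it> turns a list of per-gene p-values into a set of selected genes, the following minimal Python sketch applies the standard Benjamini and Hochberg step-up rule. It is a sketch under assumptions, not the implementation used for this paper: the function name and the example p-values are hypothetical, and the way the <it>C</it>&amp;<it>G </it>Procedure combines the p-values of the two statistics is not reproduced here.</p>

```python
import numpy as np


def benjamini_hochberg(p_values, q):
    """Boolean mask of hypotheses rejected by the BH step-up rule at FDR level q."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Step-up thresholds q * i / n for the i-th smallest p-value.
    thresholds = q * np.arange(1, n + 1) / n
    passing = np.nonzero(thresholds >= p[order])[0]
    rejected = np.zeros(n, dtype=bool)
    if passing.size > 0:
        # Reject every hypothesis up to the largest rank that meets its threshold.
        rejected[order[: passing[-1] + 1]] = True
    return rejected


# Hypothetical p-values for ten genes, for illustration only.
p_example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.36]
mask = benjamini_hochberg(p_example, q=0.05)
print(mask)
```

         <p>In this example only the two smallest p-values fall below their step-up thresholds (0.005 and 0.01), so only the first two hypothetical genes are selected at <it>q</it> = 0.05.</p>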
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <p>Suppose that a time series is observed and one concern is the possible periodicity of this time series. To be specific in the context of gene expressions observed at time <it>t </it>for any fixed gene <it>g</it>, we denote the time series (or gene expression observed in a time course) by <it>Y</it><sub><it>g</it></sub>(<it>t</it>) for <it>t </it>= 1, ..., <it>N </it>and <it>g </it>= 1, ..., <m:math name="1471-2105-6-286-i5" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>G</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBamXvP5wqSXMqHnxAJn0BKvguHDwzZbqegm0B1jxALjhiov2DaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaWaaeGaeaaakeaaimaacaWFhbaaaa@3962@</m:annotation></m:semantics></m:math>. To model <it>Y</it><sub><it>g</it></sub>(<it>t</it>) with periodicity, we can assume:</p>
         <p><it>Y<sub>g</sub></it>(<it>t</it>) = <it>f<sub>g</sub></it>(<it>t</it>) + <it>&#949;<sub>gt</sub></it>,</p>
         <p>where <it>f</it><sub><it>g</it></sub>(<it>t</it>) is a periodic function with smallest positive period <it>T</it><sub><it>g </it></sub>for gene <it>g</it>, that is, <it>f</it><sub><it>g</it></sub>(<it>t </it>+ <it>T</it><sub><it>g</it></sub>) = <it>f</it><sub><it>g</it></sub>(<it>t</it>) for all <it>t</it>; and <it>&#949;</it><sub><it>gt </it></sub>is a sequence of unobservable random errors with mean 0 and homogeneous variance <it>&#963;</it><sup>2 </sup>for all <it>g </it>and <it>t</it>. For a fixed gene <it>g</it>, we can specifically assume that the time series of gene expression is well represented by</p>
         <p><it>Y<sub>g</sub></it>(<it>t</it>) = <it>&#956; </it>+ <it>A </it>cos (<it>&#969;t</it>) + <it>B </it>sin(<it>&#969;t</it>) + <it>&#949;<sub>gt</sub></it>,</p>
         <p>where <it>A</it>, <it>B</it>, and <it>&#956; </it>(known) are constants, <it>&#969; </it>is of the form 2<it>&#960;k</it>/<it>N</it>, for <it>k </it>= 0,1, ..., <it>m</it>, with <it>m </it>= (<it>N </it>- 1)/2 for <it>N </it>odd and <it>m </it>= <it>N</it>/2 for <it>N </it>even. Given a finite realization of the time series gene expressions <it>y</it><sub><it>g</it></sub>(<it>t</it>) (sample values or microarray expressions obtained from the experiment), we can then view <it>y</it><sub><it>g</it></sub>(<it>t</it>) as represented by</p>
         <p>
            <m:math name="1471-2105-6-286-i14" xmlns:m="http://www.w3.org/1998/Math/MathML">
               <m:semantics>
                  <m:mrow>
                     <m:msub>
                        <m:mi>y</m:mi>
                        <m:mi>g</m:mi>
                     </m:msub>
                     <m:mo stretchy="false">(</m:mo>
                     <m:mi>t</m:mi>
                     <m:mo stretchy="false">)</m:mo>
                     <m:mo>=</m:mo>
                     <m:mn>2</m:mn>
                     <m:msub>
                        <m:mover accent="true">
                           <m:mi>y</m:mi>
                           <m:mo>&#175;</m:mo>
                        </m:mover>
                        <m:mi>g</m:mi>
                     </m:msub>
                     <m:mo>+</m:mo>
                     <m:mstyle displaystyle="true">
                        <m:munderover>
                           <m:mo>&#8721;</m:mo>
                           <m:mrow>
                              <m:mi>k</m:mi>
                              <m:mo>=</m:mo>
                              <m:mn>1</m:mn>
                           </m:mrow>
                           <m:mi>m</m:mi>
                        </m:munderover>
                        <m:mrow>
                           <m:mo stretchy="false">[</m:mo>
                           <m:msub>
                              <m:mi>a</m:mi>
                              <m:mrow>
                                 <m:mi>g</m:mi>
                                 <m:mo>,</m:mo>
                                 <m:mi>k</m:mi>
                              </m:mrow>
                           </m:msub>
                           <m:mi>cos</m:mi>
                           <m:mo>&#8289;</m:mo>
                           <m:mo stretchy="false">(</m:mo>
                           <m:msub>
                              <m:mi>&#969;</m:mi>
                              <m:mi>k</m:mi>
                           </m:msub>
                           <m:mi>t</m:mi>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo>+</m:mo>
                           <m:msub>
                              <m:mi>b</m:mi>
                              <m:mrow>
                                 <m:mi>g</m:mi>
                                 <m:mo>,</m:mo>
                                 <m:mi>k</m:mi>
                              </m:mrow>
                           </m:msub>
                           <m:mi>sin</m:mi>
                           <m:mo>&#8289;</m:mo>
                           <m:mo stretchy="false">(</m:mo>
                           <m:msub>
                              <m:mi>&#969;</m:mi>
                              <m:mi>k</m:mi>
                           </m:msub>
                           <m:mi>t</m:mi>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo stretchy="false">]</m:mo>
                           <m:mo>,</m:mo>
                        </m:mrow>
                     </m:mstyle>
                  </m:mrow>
                  <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciGacaGaaeqabaqabeGadaaakeaacqWG5bqEdaWgaaWcbaGaem4zaCgabeaakiabcIcaOiabdsha0jabcMcaPiabg2da9iabikdaYiqbdMha5zaaraWaaSbaaSqaaiabdEgaNbqabaGccqGHRaWkdaaeWbqaaiabcUfaBjabdggaHnaaBaaaleaacqWGNbWzcqGGSaalcqWGRbWAaeqaaOGagi4yamMaei4Ba8Maei4CamNaeiikaGIaeqyYdC3aaSbaaSqaaiabdUgaRbqabaGccqWG0baDcqGGPaqkcqGHRaWkcqWGIbGydaWgaaWcbaGaem4zaCMaeiilaWIaem4AaSgabeaakiGbcohaZjabcMgaPjabc6gaUjabcIcaOiabeM8a3naaBaaaleaacqWGRbWAaeqaaOGaemiDaqNaeiykaKIaeiyxa0LaeiilaWcaleaacqWGRbWAcqGH9aqpcqaIXaqmaeaacqWGTbqBa0GaeyyeIuoaaaa@639B@</m:annotation>
               </m:semantics>
            </m:math>
         </p>
         <p>where <it>&#969;<sub>k </sub></it>= 2<it>&#960;k</it>/<it>N</it>, for <it>k </it>= 0,1, ..., <it>m</it>, <graphic file="1471-2105-6-286-i15.gif"/>, and</p>
         <p>
            <graphic file="1471-2105-6-286-i16.gif"/>
         </p>
         <p>for <it>k </it>= 1, ..., <it>m </it>and <it>g </it>= 1, ..., <m:math name="1471-2105-6-286-i5" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>G</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBamXvP5wqSXMqHnxAJn0BKvguHDwzZbqegm0B1jxALjhiov2DaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaWaaeGaeaaakeaaimaacaWFhbaaaa@3962@</m:annotation></m:semantics></m:math>. To test periodicity-related hypotheses about a time series, the periodogram of gene <it>g </it>is defined as</p>
         <p>
            <m:math name="1471-2105-6-286-i17" xmlns:m="http://www.w3.org/1998/Math/MathML">
               <m:semantics>
                  <m:mtable>
                     <m:mtr>
                        <m:mtd>
                           <m:msub>
                              <m:mi>I</m:mi>
                              <m:mrow>
                                 <m:mi>g</m:mi>
                                 <m:mo>,</m:mo>
                                 <m:mi>n</m:mi>
                              </m:mrow>
                           </m:msub>
                           <m:mo stretchy="false">(</m:mo>
                           <m:msub>
                              <m:mi>&#969;</m:mi>
                              <m:mi>k</m:mi>
                           </m:msub>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo>=</m:mo>
                           <m:mfrac>
                              <m:mn>2</m:mn>
                              <m:mi>N</m:mi>
                           </m:mfrac>
                           <m:msup>
                              <m:mrow>
                                 <m:mo>|</m:mo>
                                 <m:mrow>
                                    <m:mstyle displaystyle="true">
                                       <m:munderover>
                                          <m:mo>&#8721;</m:mo>
                                          <m:mrow>
                                             <m:mi>t</m:mi>
                                             <m:mo>=</m:mo>
                                             <m:mn>1</m:mn>
                                          </m:mrow>
                                          <m:mi>N</m:mi>
                                       </m:munderover>
                                       <m:mrow>
                                          <m:msub>
                                             <m:mi>Y</m:mi>
                                             <m:mi>g</m:mi>
                                          </m:msub>
                                          <m:mo stretchy="false">(</m:mo>
                                          <m:mi>t</m:mi>
                                          <m:mo stretchy="false">)</m:mo>
                                          <m:msup>
                                             <m:mi>e</m:mi>
                                             <m:mrow>
                                                <m:mo>&#8722;</m:mo>
                                                <m:mi>i</m:mi>
                                                <m:msub>
                                                   <m:mi>&#969;</m:mi>
                                                   <m:mi>k</m:mi>
                                                </m:msub>
                                                <m:mi>t</m:mi>
                                             </m:mrow>
                                          </m:msup>
                                       </m:mrow>
                                    </m:mstyle>
                                 </m:mrow>
                                 <m:mo>|</m:mo>
                              </m:mrow>
                              <m:mn>2</m:mn>
                           </m:msup>
                        </m:mtd>
                     </m:mtr>
                     <m:mtr>
                        <m:mtd>
                           <m:mo>=</m:mo>
                           <m:mfrac>
                              <m:mi>N</m:mi>
                              <m:mn>2</m:mn>
                           </m:mfrac>
                           <m:mo stretchy="false">(</m:mo>
                           <m:msubsup>
                              <m:mi>a</m:mi>
                              <m:mrow>
                                 <m:mi>g</m:mi>
                                 <m:mo>,</m:mo>
                                 <m:mi>k</m:mi>
                              </m:mrow>
                              <m:mn>2</m:mn>
                           </m:msubsup>
                           <m:mo>+</m:mo>
                           <m:msubsup>
                              <m:mi>b</m:mi>
                              <m:mrow>
                                 <m:mi>g</m:mi>
                                 <m:mo>,</m:mo>
                                 <m:mi>k</m:mi>
                              </m:mrow>
                              <m:mn>2</m:mn>
                           </m:msubsup>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo>,</m:mo>
                           <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                           <m:mrow>
                              <m:mo>(</m:mo>
                              <m:mn>4</m:mn>
                              <m:mo>)</m:mo>
                           </m:mrow>
                        </m:mtd>
                     </m:mtr>
                  </m:mtable>
                  <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciGacaGaaeqabaqabeGadaaakqaaeeqaaiabdMeajnaaBaaaleaacqWGNbWzcqGGSaalcqWGUbGBaeqaaOGaeiikaGIaeqyYdC3aaSbaaSqaaiabdUgaRbqabaGccqGGPaqkcqGH9aqpdaWcaaqaaiabikdaYaqaaiabd6eaobaadaabdaqaamaaqahabaGaemywaK1aaSbaaSqaaiabdEgaNbqabaGccqGGOaakcqWG0baDcqGGPaqkcqWGLbqzdaahaaWcbeqaaiabgkHiTiabdMgaPjabeM8a3naaBaaameaacqWGRbWAaeqaaSGaemiDaqhaaaqaaiabdsha0jabg2da9iabigdaXaqaaiabd6eaobqdcqGHris5aaGccaGLhWUaayjcSdWaaWbaaSqabeaacqaIYaGmaaaakeaacqGH9aqpdaWcaaqaaiabd6eaobqaaiabikdaYaaacqGGOaakcqWGHbqydaqhaaWcbaGaem4zaCMaeiilaWIaem4AaSgabaGaeGOmaidaaOGaey4kaSIaemOyai2aa0baaSqaaiabdEgaNjabcYcaSiabdUgaRbqaaiabikdaYaaakiabcMcaPiabcYcaSiaaxMaacaWLjaWaaeWaaeaacqaI0aanaiaawIcacaGLPaaaaaaa@6A00@</m:annotation>
               </m:semantics>
            </m:math>
         </p>
         <p>for <it>k </it>= 1, ..., <it>m </it>and <it>g </it>= 1, ..., <m:math name="1471-2105-6-286-i5" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>G</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBamXvP5wqSXMqHnxAJn0BKvguHDwzZbqegm0B1jxALjhiov2DaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaWaaeGaeaaakeaaimaacaWFhbaaaa@3962@</m:annotation></m:semantics></m:math>. Under the assumption that the <it>&#949;</it><sub><it>gt</it></sub> are independent and identically distributed normal random errors with mean 0 and homogeneous variance <it>&#963;</it><sup>2 </sup>(that is, <it>Y</it><sub><it>g</it></sub>(<it>t</it>) is normal white noise), Fisher <abbrgrp><abbr bid="B7">7</abbr></abbrgrp> proposed a <it>G</it>-statistic and derived its exact null distribution. Suppose we wish to test</p>
         <p><it>H</it><sub>0</sub>: <it>Y<sub>g</sub></it>(<it>t</it>) = <it>&#956; </it>+ <it>&#949;<sub>gt</sub></it>, &#160;&#160;&#160; (5)</p>
         <p>versus</p>
         <p><it>H</it><sub>1</sub>: <it>Y<sub>g</sub></it>(<it>t</it>) = <it>&#956; </it>+ <it>A </it>cos(<it>&#969;t</it>) + <it>B </it>sin (<it>&#969;t</it>) + <it>&#949;<sub>gt</sub></it>, &#160;&#160;&#160; (6)</p>
         <p>then, for a fixed gene <it>g</it>, Fisher's <it>G</it>-statistic is given by</p>
         <p>
            <graphic file="1471-2105-6-286-i18.gif"/>
         </p>
         <p>For details on the <it>G </it>test statistic, its null distribution, and its percentage points, see Fisher <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>, Davis <abbrgrp><abbr bid="B18">18</abbr></abbrgrp>, Wilks <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>, and Priestley <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>.</p>
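         <p>As a rough illustration of the periodogram in (4) and of a Fisher-type statistic, the following Python sketch computes the ordinates for one gene and the ratio of the largest ordinate to their sum. It assumes the Fourier frequencies <it>&#969;</it><sub><it>k</it></sub> = 2<it>&#960;k</it>/<it>N</it> with <it>m</it> = [(<it>N</it> - 1)/2]; the function names are hypothetical, and the maximum-over-sum normalization is the textbook form, which may differ from (7) by a constant factor.</p>

```python
import cmath
import math


def periodogram(y):
    """Ordinates I(w_k) = (2/N) * |sum_t y_t * exp(-i * w_k * t)|^2 for k = 1..m.

    Assumption: w_k = 2*pi*k/N and m = (N - 1) // 2 (Fourier frequencies).
    """
    n = len(y)
    m = (n - 1) // 2
    ordinates = []
    for k in range(1, m + 1):
        w = 2.0 * math.pi * k / n
        # Time index t runs from 1 to N, as in equation (4).
        s = sum(y[t - 1] * cmath.exp(-1j * w * t) for t in range(1, n + 1))
        ordinates.append(2.0 / n * abs(s) ** 2)
    return ordinates


def fisher_g(y):
    """Largest periodogram ordinate divided by the sum of all ordinates."""
    ordinates = periodogram(y)
    return max(ordinates) / sum(ordinates)


# Hypothetical example: a noise-free cosine at the third Fourier frequency.
series = [math.cos(2.0 * math.pi * 3 * t / 16) for t in range(1, 17)]
g_value = fisher_g(series)
```

         <p>The ratio always lies between 1/<it>m</it> and 1; a noise-free cosine at a Fourier frequency, as in the example series, gives a value of 1 up to rounding error.</p>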
         <p>Other test statistics for detecting "hidden periodicity" in a time series have been proposed in the spectral analysis literature (Fuller <abbrgrp><abbr bid="B21">21</abbr></abbrgrp>). For the more general hypothesis testing problem</p>
         <p><it>H</it><sub>0</sub>: <it>Y</it><sub><it>g</it></sub>(<it>t</it>) is a normal white noise, &#160;&#160;&#160; (8)</p>
         <p>versus</p>
         <p><it>H</it><sub>1</sub>: <it>Y</it><sub><it>g</it></sub>(<it>t</it>) is not a normal white noise, &#160;&#160;&#160; (9)</p>
         <p>for a fixed gene <it>g</it>, Bartlett <abbrgrp><abbr bid="B8">8</abbr></abbrgrp> proposed a <it>C</it>-statistic as the test statistic. For a fixed gene <it>g</it>, the <it>C</it>-statistic is</p>
         <p>
            <m:math name="1471-2105-6-286-i19" xmlns:m="http://www.w3.org/1998/Math/MathML">
               <m:semantics>
                  <m:mrow>
                     <m:msub>
                        <m:mi>C</m:mi>
                        <m:mi>g</m:mi>
                     </m:msub>
                     <m:mo>=</m:mo>
                     <m:munder>
                        <m:mrow>
                           <m:mi>max</m:mi>
                           <m:mo>&#8289;</m:mo>
                        </m:mrow>
                        <m:mrow>
                           <m:mn>1</m:mn>
                           <m:mo>&#8804;</m:mo>
                           <m:mi>k</m:mi>
                           <m:mo>&#8804;</m:mo>
                           <m:mi>m</m:mi>
                           <m:mo>&#8722;</m:mo>
                           <m:mn>1</m:mn>
                        </m:mrow>
                     </m:munder>
                     <m:mrow>
                        <m:mo>|</m:mo>
                        <m:mrow>
                           <m:msub>
                              <m:mi>C</m:mi>
                              <m:mrow>
                                 <m:mi>g</m:mi>
                                 <m:mo>,</m:mo>
                                 <m:mi>k</m:mi>
                              </m:mrow>
                           </m:msub>
                           <m:mo>&#8722;</m:mo>
                           <m:mi>k</m:mi>
                           <m:mo>/</m:mo>
                           <m:mi>m</m:mi>
                        </m:mrow>
                        <m:mo>|</m:mo>
                     </m:mrow>
                     <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                     <m:mrow>
                        <m:mo>(</m:mo>
                        <m:mrow>
                           <m:mn>10</m:mn>
                        </m:mrow>
                        <m:mo>)</m:mo>
                     </m:mrow>
                  </m:mrow>
                  <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciGacaGaaeqabaqabeGadaaakeaacqWGdbWqdaWgaaWcbaGaem4zaCgabeaakiabg2da9maaxababaGagiyBa0MaeiyyaeMaeiiEaGhaleaacqaIXaqmcqGHKjYOcqWGRbWAcqGHKjYOcqWGTbqBcqGHsislcqaIXaqmaeqaaOWaaqWaaeaacqWGdbWqdaWgaaWcbaGaem4zaCMaeiilaWIaem4AaSgabeaakiabgkHiTiabdUgaRjabc+caViabd2gaTbGaay5bSlaawIa7aiaaxMaacaWLjaWaaeWaaeaacqaIXaqmcqaIWaamaiaawIcacaGLPaaaaaa@4EEF@</m:annotation>
               </m:semantics>
            </m:math>
         </p>
         <p>with</p>
         <p>
            <graphic file="1471-2105-6-286-i20.gif"/>
         </p>
         <p>for <it>g </it>= 1, ..., <m:math name="1471-2105-6-286-i5" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mi>G</m:mi><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBamXvP5wqSXMqHnxAJn0BKvguHDwzZbqegm0B1jxALjhiov2DaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaWaaeGaeaaakeaaimaacaWFhbaaaa@3962@</m:annotation></m:semantics></m:math>. Durbin (<abbrgrp><abbr bid="B9">9</abbr><abbr bid="B22">22</abbr></abbrgrp>) provided the details of the null distribution of the test statistic <it>C </it>under the normality assumption.</p>
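         <p>The statistic in (10) can be sketched in Python as follows. The block assumes that <it>C</it><sub><it>g</it>,<it>k</it></sub> is the normalized cumulative periodogram, that is, the sum of the first <it>k</it> ordinates divided by the sum of all <it>m</it> ordinates, which is the usual form of Bartlett's statistic; it also assumes Fourier frequencies 2<it>&#960;k</it>/<it>N</it>, and the function names are hypothetical.</p>

```python
import cmath
import math


def periodogram(y):
    """Ordinates (2/N) * |sum_t y_t * exp(-i * w_k * t)|^2 at w_k = 2*pi*k/N, k = 1..m."""
    n = len(y)
    m = (n - 1) // 2  # assumed number of Fourier frequencies
    ordinates = []
    for k in range(1, m + 1):
        w = 2.0 * math.pi * k / n
        s = sum(y[t - 1] * cmath.exp(-1j * w * t) for t in range(1, n + 1))
        ordinates.append(2.0 / n * abs(s) ** 2)
    return ordinates


def bartlett_c(y):
    """max over k = 1..m-1 of |C_k - k/m|, with C_k the normalized cumulative periodogram."""
    ordinates = periodogram(y)
    m = len(ordinates)
    if 2 > m:
        raise ValueError("need at least two periodogram ordinates")
    total = sum(ordinates)
    running = 0.0
    largest_gap = 0.0
    for k in range(1, m):  # k = 1, ..., m - 1 as in equation (10)
        running += ordinates[k - 1]
        largest_gap = max(largest_gap, abs(running / total - k / m))
    return largest_gap


# Hypothetical example: a noise-free cosine at the third Fourier frequency.
series = [math.cos(2.0 * math.pi * 3 * t / 16) for t in range(1, 17)]
c_value = bartlett_c(series)
```

         <p>Under white noise the cumulative periodogram should stay close to the straight line <it>k</it>/<it>m</it>, so a large maximum deviation is evidence against the null hypothesis in (8).</p>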
         <p>According to Fisher <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>, the observed significance value, or p-value <m:math name="1471-2105-6-286-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>p</m:mi><m:mi>g</m:mi><m:mi>G</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciGacaGaaeqabaqabeGadaaakeaacqWGWbaCdaqhaaWcbaGaem4zaCgabaGaem4raCeaaaaa@30B2@</m:annotation></m:semantics></m:math>, for testing the periodicity of a fixed gene <it>g </it>with the <it>G</it>-statistic is given in (1), that is,</p>
         <p>
            <graphic file="1471-2105-6-286-i21.gif"/>
         </p>
         <p>where <it>&#958;</it><sub><it>g </it></sub>is the sample realization of the <it>G</it>-statistic calculated from (7), divided by <it>m</it>, and <it>L</it>(<it>&#958;</it><sub><it>g</it></sub>) is the largest integer less than 1/<it>&#958;</it><sub><it>g</it></sub>. Meanwhile, according to Durbin <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>, the p-value, <m:math name="1471-2105-6-286-i3" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>p</m:mi><m:mi>g</m:mi><m:mi>C</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciGacaGaaeqabaqabeGadaaakeaacqWGWbaCdaqhaaWcbaGaem4zaCgabaGaem4qameaaaaa@30AA@</m:annotation></m:semantics></m:math>, for testing the periodicity of a fixed gene <it>g </it>with the <it>C</it>-statistic is given in (2), or specifically,</p>
         <p>
            <graphic file="1471-2105-6-286-i22.gif"/>
         </p>
         <p>where <it>a</it><sub><it>g </it></sub>= <it>mC</it><sub><it>g</it></sub>, <it>C</it><sub><it>g </it></sub>is given in (10), [<it>a</it><sub><it>g</it></sub>] = <it>INT</it>{<it>a</it><sub><it>g</it></sub>}, and <it>n </it>= <it>m </it>- 1.</p>
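         <p>The p-value of the <it>G</it>-statistic can be sketched in Python under the assumption that (11) is Fisher's classical alternating sum over <it>j</it> = 1, ..., <it>L</it>(<it>&#958;</it><sub><it>g</it></sub>), with terms formed from binomial coefficients of <it>m</it> and powers of 1 - <it>j&#958;</it><sub><it>g</it></sub>. The function name is hypothetical, and the formula in the docstring is the textbook version rather than a transcription of (11).</p>

```python
import math


def fisher_g_pvalue(xi, m):
    """P(g > xi) under Gaussian white noise, where g = largest ordinate / sum of m ordinates.

    Assumed form: sum_{j=1}^{L} (-1)^(j-1) * C(m, j) * (1 - j*xi)^(m-1), L = floor(1/xi).
    When 1/xi is an integer the last term is zero, so this agrees with taking
    L as the largest integer less than 1/xi.
    """
    if not (xi > 0.0 and 1.0 >= xi and m >= 2):
        raise ValueError("need xi in (0, 1] and m of at least 2")
    upper = min(m, math.floor(1.0 / xi))
    total = 0.0
    for j in range(1, upper + 1):
        base = max(0.0, 1.0 - j * xi)  # guard against tiny negative rounding errors
        total += (-1) ** (j - 1) * math.comb(m, j) * base ** (m - 1)
    # Clip to [0, 1] in case of rounding error in the alternating sum.
    return min(1.0, max(0.0, total))


# Hypothetical example: m = 5 ordinates and an observed ratio of 0.5.
p_value = fisher_g_pvalue(0.5, 5)
```

         <p>For large <it>m</it> the alternating sum can lose numerical precision; this sketch makes no attempt to address that.</p>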
         <p>The C&amp;G Procedure uses both test statistics and provides a practical way to identify significant periodic genes in massive microarray data.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>This research was supported in part by NSF grant DMS-0426148. Part of this work was done while the author was a visiting scientist at the Stowers Institute for Medical Research (SIMR), on leave from the University of Missouri-Kansas City. The author thanks two anonymous referees whose comments greatly improved the manuscript.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Identifying periodically expressed transcripts in microarray time series data</p>
            </title>
            <aug>
               <au>
                  <snm>Wichert</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Fokianos</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Strimmer</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>20</volume>
            <fpage>5</fpage>
            <lpage>20</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btg364</pubid>
                  <pubid idtype="pmpid" link="fulltext">14693803</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Comprehensive identification of cell cycle-regulated genes of the yeast <it>Saccharomyces cerevisiae </it>by microarray hybridization</p>
            </title>
            <aug>
               <au>
                  <snm>Spellman</snm>
                  <fnm>PT</fnm>
               </au>
               <au>
                  <snm>Sherlock</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>MQ</fnm>
               </au>
               <au>
                  <snm>Iyer</snm>
                  <fnm>VR</fnm>
               </au>
               <au>
                  <snm>Anders</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Eisen</snm>
                  <fnm>MB</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>PO</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Futcher</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Molecular Biology of the Cell</source>
            <pubdate>1998</pubdate>
            <volume>9</volume>
            <fpage>3273</fpage>
            <lpage>3297</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">25624</pubid>
                  <pubid idtype="pmpid" link="fulltext">9843569</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Transcriptional regulation and function during the human cell cycle</p>
            </title>
            <aug>
               <au>
                  <snm>Cho</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Huang</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Dong</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Steinmetz</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Sapinoso</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Hampton</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Elledge</snm>
                  <fnm>SJ</fnm>
               </au>
               <au>
                  <snm>Davis</snm>
                  <fnm>RW</fnm>
               </au>
               <au>
                  <snm>Lockhart</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Campbell</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>Nature Genetics</source>
            <pubdate>2001</pubdate>
            <volume>27</volume>
            <fpage>48</fpage>
            <lpage>54</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11137997</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Analysis of cell-cycle-specific gene expression in human cells as determined by microarrays and double-thymidine block synchronization</p>
            </title>
            <aug>
               <au>
                  <snm>Shedden</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Cooper</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Proceedings of the National Academy of Sciences USA</source>
            <pubdate>2002</pubdate>
            <volume>99</volume>
            <fpage>4379</fpage>
            <lpage>4384</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1073/pnas.062569899</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Analysis of cell-cycle gene expression in <it>Saccharomyces cerevisiae </it>using microarrays and multiple synchronization methods</p>
            </title>
            <aug>
               <au>
                  <snm>Shedden</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Cooper</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Research</source>
            <pubdate>2002</pubdate>
            <volume>30</volume>
            <fpage>2920</fpage>
            <lpage>2929</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">117069</pubid>
                  <pubid idtype="pmpid" link="fulltext">12087178</pubid>
                  <pubid idtype="doi">10.1093/nar/gkf414</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Identification of genes periodically expressed in the human cell cycle and their expression in tumors</p>
            </title>
            <aug>
               <au>
                  <snm>Whitfield</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Sherlock</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Saldanha</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Murray</snm>
                  <fnm>JI</fnm>
               </au>
               <au>
                  <snm>Ball</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Alexander</snm>
                  <fnm>KE</fnm>
               </au>
               <au>
                  <snm>Matese</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Perou</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Hurt</snm>
                  <fnm>MM</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>PO</fnm>
               </au>
               <au>
                  <snm>Botstein</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Molecular Biology of the Cell</source>
            <pubdate>2002</pubdate>
            <volume>13</volume>
            <fpage>1977</fpage>
            <lpage>2000</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">117619</pubid>
                  <pubid idtype="pmpid" link="fulltext">12058064</pubid>
                  <pubid idtype="doi">10.1091/mbc.02-02-0030</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Tests of significance in harmonic analysis</p>
            </title>
            <aug>
               <au>
                  <snm>Fisher</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>Proceedings of the Royal Society of London, Series A</source>
            <pubdate>1929</pubdate>
            <volume>125</volume>
            <fpage>54</fpage>
            <lpage>59</lpage>
         </bibl>
         <bibl id="B8">
            <aug>
               <au>
                  <snm>Bartlett</snm>
                  <fnm>MS</fnm>
               </au>
            </aug>
            <source>An Introduction to Stochastic Processes with Special Reference to Methods and Applications</source>
            <publisher>Cambridge University Press, Cambridge</publisher>
            <edition>2</edition>
            <pubdate>1966</pubdate>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Tests of serial independence based on the cumulated periodogram</p>
            </title>
            <aug>
               <au>
                  <snm>Durbin</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Bulletin of the International Statistical Institute</source>
            <pubdate>1967</pubdate>
            <volume>42</volume>
            <fpage>1039</fpage>
            <lpage>1049</lpage>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Controlling the false discovery rate: a practical and powerful approach to multiple testing</p>
            </title>
            <aug>
               <au>
                  <snm>Benjamini</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Hochberg</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>Journal of the Royal Statistical Society</source>
            <pubdate>1995</pubdate>
            <volume>B57</volume>
            <fpage>289</fpage>
            <lpage>300</lpage>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Global analysis of the genetic network controlling a bacterial cell cycle</p>
            </title>
            <aug>
               <au>
                  <snm>Laub</snm>
                  <fnm>MT</fnm>
               </au>
               <au>
                  <snm>McAdams</snm>
                  <fnm>HH</fnm>
               </au>
               <au>
                  <snm>Feldblyum</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Fraser</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Shapiro</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2000</pubdate>
            <volume>290</volume>
            <fpage>2144</fpage>
            <lpage>2148</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.290.5499.2144</pubid>
                  <pubid idtype="pmpid" link="fulltext">11118148</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Bacterial cell cycle data</p>
            </title>
            <url>http://caulobacter.stanford.edu/CellCycle/DownloadData.htm</url>
         </bibl>
         <bibl id="B13">
            <title>
               <p>NCBI</p>
            </title>
            <url>http://www.ncbi.nlm.nih.gov</url>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Yeast cell cycle data</p>
            </title>
            <url>http://genome-www.stanford.edu/cellcycle</url>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Yeast genome website</p>
            </title>
            <url>http://www.yeastgenome.org</url>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Human fibroblasts data</p>
            </title>
            <url>http://www-sequence.stanford.edu/human_cell_cycle</url>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Human cancer cell line data</p>
            </title>
            <url>http://genome-www.stanford.edu/Human-CellCycle/Hela</url>
         </bibl>
         <bibl id="B18">
            <aug>
               <au>
                  <snm>Davis</snm>
                  <fnm>HT</fnm>
               </au>
            </aug>
            <source>The Analysis of Economic Time Series</source>
            <publisher>Principia Press, Bloomington, Indiana</publisher>
            <pubdate>1941</pubdate>
         </bibl>
         <bibl id="B19">
            <aug>
               <au>
                  <snm>Wilks</snm>
                  <fnm>SS</fnm>
               </au>
            </aug>
            <source>Mathematical Statistics</source>
            <publisher>Wiley, New York</publisher>
            <pubdate>1962</pubdate>
         </bibl>
         <bibl id="B20">
            <aug>
               <au>
                  <snm>Priestley</snm>
                  <fnm>MB</fnm>
               </au>
            </aug>
            <source>Spectral Analysis and Time Series</source>
            <publisher>Academic Press: San Diego</publisher>
            <pubdate>1981</pubdate>
         </bibl>
         <bibl id="B21">
            <aug>
               <au>
                  <snm>Fuller</snm>
                  <fnm>WA</fnm>
               </au>
            </aug>
            <source>Introduction to Statistical Time Series</source>
            <publisher>Wiley-Interscience: New York</publisher>
            <edition>2</edition>
            <pubdate>1996</pubdate>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Tests for serial correlation in regression analysis based on the periodogram of least-squares residuals</p>
            </title>
            <aug>
               <au>
                  <snm>Durbin</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Biometrika</source>
            <pubdate>1969</pubdate>
            <volume>56</volume>
            <fpage>1</fpage>
            <lpage>15</lpage>
         </bibl>
      </refgrp>
   </bm>
</art>
