<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2105-8-359</ui>
   <ji>1471-2105</ji>
   <fm>
      <dochead>Methodology article</dochead>
      <bibl>
         <title>
            <p>Differential analysis for high density tiling microarray data</p>
         </title>
         <aug>
            <au id="A1" ca="yes">
               <snm>Ghosh</snm>
               <fnm>Srinka</fnm>
               <insr iid="I1"/>
               <email>srinka_ghosh@affymetrix.com</email>
            </au>
            <au id="A2">
               <snm>Hirsch</snm>
               <mi>A</mi>
               <fnm>Heather</fnm>
               <insr iid="I2"/>
               <email>heather_hirsch@hms.harvard.edu</email>
            </au>
            <au id="A3">
               <snm>Sekinger</snm>
               <mi>A</mi>
               <fnm>Edward</fnm>
               <insr iid="I3"/>
               <email>esekinger@asuragen.com</email>
            </au>
            <au id="A4">
               <snm>Kapranov</snm>
               <fnm>Philipp</fnm>
               <insr iid="I1"/>
               <email>philipp_kapranov@affymetrix.com</email>
            </au>
            <au id="A5">
               <snm>Struhl</snm>
               <fnm>Kevin</fnm>
               <insr iid="I2"/>
               <email>kevin@hms.harvard.edu</email>
            </au>
            <au id="A6">
               <snm>Gingeras</snm>
               <mi>R</mi>
               <fnm>Thomas</fnm>
               <insr iid="I1"/>
               <email>tom_gingeras@affymetrix.com</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Affymetrix Inc., Santa Clara, CA 95051, USA</p>
            </ins>
            <ins id="I2">
               <p>Dept. Biological Chemistry &amp; Molecular Pharmacology, Harvard Medical School, Boston, MA 02115, USA</p>
            </ins>
            <ins id="I3">
               <p>Asuragen, Inc., 2150 Woodward, Austin, TX 78744, USA</p>
            </ins>
         </insg>
         <source>BMC Bioinformatics</source>
         <issn>1471-2105</issn>
         <pubdate>2007</pubdate>
         <volume>8</volume>
         <issue>1</issue>
         <fpage>359</fpage>
         <url>http://www.biomedcentral.com/1471-2105/8/359</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">17892592</pubid>
               <pubid idtype="doi">10.1186/1471-2105-8-359</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>24</day>
               <month>1</month>
               <year>2007</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>24</day>
               <month>9</month>
               <year>2007</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>24</day>
               <month>9</month>
               <year>2007</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2007</year>
         <collab>Ghosh et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>High density oligonucleotide tiling arrays are an effective and powerful platform for conducting unbiased genome-wide studies. The <it>ab initio </it>probe selection method employed in tiling arrays is unbiased, and thus ensures consistent sampling across coding and non-coding regions of the genome. These arrays are increasingly used to study the associated processes of transcription, transcription factor binding and chromatin structure. Studies of differential expression and/or regulation provide critical insight into the mechanics of transcription and regulation that occur during the developmental program of a cell. The time-course experiment, which comprises an <it>in-vivo </it>system and the proposed analyses, is used to determine whether annotated and un-annotated portions of the genome manifest a coordinated differential response to the induced developmental program.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>We have proposed a novel approach, based on a piece-wise function, to analyze genome-wide differential response. This enables segmentation of the response based on protein-coding and non-coding regions; for genes, the methodology also partitions the differential response according to its 5', 3' or intra-genic bias.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>The algorithm, built upon the framework of Significance Analysis of Microarrays, uses a generalized logic to define regions/patterns of coordinated differential change. By not adhering to the gene-centric paradigm, discordant differential expression patterns between exons and introns have been identified at an FDR of less than 12 percent. A co-localization of differential binding between RNA Polymerase II and tetra-acetylated histone has been quantified at a p-value &lt; 0.003; it is most significant at the 5' end of genes, at a p-value &lt; 10<sup>-13</sup>. The prototype R code has been made available as supplementary material [see Additional file <supplr sid="S1">1</supplr>].</p>
               <suppl id="S1">
                  <title>
                     <p>Additional file 1</p>
                  </title>
                  <text>
                     <p>gsam_prototypercode.zip. File archive comprising prototype R code for the gSAM implementation, including a readme and examples.</p>
                  </text>
                  <file name="1471-2105-8-359-S1.zip">
                     <p>Click here for file</p>
                  </file>
               </suppl>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Use of DNA microarrays has become commonplace for monitoring the expression levels of thousands of genes simultaneously <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. The gene expression signature represents the steady state level of RNA in cells and can be utilized to detect cellular response to an exogenous stimulation originating from a treatment, disease or other sources <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp>. In understanding the dynamics of transcriptional regulation it is imperative to both identify and quantify the response of the loci manifesting differential changes in a comprehensive, genome-wide manner. This requires an exhaustive probing of both the protein coding and non-coding regions of the genome. Tiling array technology has facilitated unbiased genome-wide interrogation. The subsequent challenge is one of bioinformatics, requiring statistical interpretation of voluminous data with potentially low signal to noise ratio (<it>SNR</it>) to detect, characterize and quantify differential regulation. In response to this challenge we have proposed generalized SAM (<it>gSAM</it>), an extension to the methodology which forms the basis of Significance Analysis of Microarrays (<it>SAM</it>) <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>.</p>
         <sec>
            <st>
               <p>The analytical paradigm</p>
            </st>
            <p>Classically, a 2x fold change (<it>FC</it>) in gene expression level has been a surrogate for establishing differential change. Regions of the genome with reduced coding potential might not exhibit such FCs. In fact, the stringency of the 2x requirement can introduce a strong false negative bias. A more direct approach is to determine whether the observed change is significantly different from zero. Hence the null hypothesis (<it>H</it><sub><it>0</it></sub>) for differential expression/modification is that there is no change in the mean response (<it>&#956;</it>) of a locus due to a change in its condition from <it>A </it>to <it>B </it>(Eqn. 1). The p-value is the probability, under <it>H</it><sub><it>0</it></sub>, of observing a differential response at least as extreme as the one measured. Therefore, a low p-value (&lt;0.05) implies that it is highly unlikely that the measured differential response is a consequence of random chance alone. The Student t-test is a classical parametric test used to assign the significance levels (Eqn. 2).</p>
            <p>
               <display-formula id="M1">
                  <m:math name="1471-2105-8-359-i1" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:msub>
                              <m:mi>H</m:mi>
                              <m:mn>0</m:mn>
                           </m:msub>
                           <m:mo>=</m:mo>
                           <m:mi>E</m:mi>
                           <m:mo stretchy="false">(</m:mo>
                           <m:msub>
                              <m:mover accent="true">
                                 <m:mi>&#956;</m:mi>
                                 <m:mo>&#175;</m:mo>
                              </m:mover>
                              <m:mi>B</m:mi>
                           </m:msub>
                           <m:mo>&#8722;</m:mo>
                           <m:msub>
                              <m:mover accent="true">
                                 <m:mi>&#956;</m:mi>
                                 <m:mo>&#175;</m:mo>
                              </m:mover>
                              <m:mi>A</m:mi>
                           </m:msub>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo>=</m:mo>
                           <m:mn>0</m:mn>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGibasdaWgaaWcbaGaeGimaadabeaakiabg2da9iabdweafjabcIcaOGGaciqb=X7aTzaaraWaaSbaaSqaaiabdkeacbqabaGccqGHsislcuWF8oqBgaqeamaaBaaaleaacqWGbbqqaeqaaOGaeiykaKIaeyypa0JaeGimaadaaa@3BB7@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>
               <display-formula id="M2">
                  <m:math name="1471-2105-8-359-i2" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:mi>t</m:mi>
                           <m:mo>&#8722;</m:mo>
                           <m:mi>s</m:mi>
                           <m:mi>t</m:mi>
                           <m:mi>a</m:mi>
                           <m:mi>t</m:mi>
                           <m:mi>i</m:mi>
                           <m:mi>s</m:mi>
                           <m:mi>t</m:mi>
                           <m:mi>i</m:mi>
                           <m:mi>c</m:mi>
                           <m:mo>=</m:mo>
                           <m:mrow>
                              <m:mo>(</m:mo>
                              <m:mrow>
                                 <m:mfrac>
                                    <m:mrow>
                                       <m:mo stretchy="false">(</m:mo>
                                       <m:msub>
                                          <m:mover accent="true">
                                             <m:mi>&#956;</m:mi>
                                             <m:mo>&#175;</m:mo>
                                          </m:mover>
                                          <m:mi>B</m:mi>
                                       </m:msub>
                                       <m:mo>&#8722;</m:mo>
                                       <m:msub>
                                          <m:mover accent="true">
                                             <m:mi>&#956;</m:mi>
                                             <m:mo>&#175;</m:mo>
                                          </m:mover>
                                          <m:mi>A</m:mi>
                                       </m:msub>
                                       <m:mo stretchy="false">)</m:mo>
                                       <m:mo>&#8722;</m:mo>
                                       <m:mn>0</m:mn>
                                    </m:mrow>
                                    <m:mrow>
                                       <m:mover accent="true">
                                          <m:mi>&#963;</m:mi>
                                          <m:mo>^</m:mo>
                                       </m:mover>
                                       <m:mo stretchy="false">(</m:mo>
                                       <m:msub>
                                          <m:mover accent="true">
                                             <m:mi>&#956;</m:mi>
                                             <m:mo>&#175;</m:mo>
                                          </m:mover>
                                          <m:mi>B</m:mi>
                                       </m:msub>
                                       <m:mo>&#8722;</m:mo>
                                       <m:msub>
                                          <m:mover accent="true">
                                             <m:mi>&#956;</m:mi>
                                             <m:mo>&#175;</m:mo>
                                          </m:mover>
                                          <m:mi>A</m:mi>
                                       </m:msub>
                                       <m:mo stretchy="false">)</m:mo>
                                    </m:mrow>
                                 </m:mfrac>
                              </m:mrow>
                              <m:mo>)</m:mo>
                           </m:mrow>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWG0baDcqGHsislcqWGZbWCcqWG0baDcqWGHbqycqWG0baDcqWGPbqAcqWGZbWCcqWG0baDcqWGPbqAcqWGJbWycqGH9aqpdaqadaqaamaalaaabaGaeiikaGccciGaf8hVd0MbaebadaWgaaWcbaGaemOqaieabeaakiabgkHiTiqb=X7aTzaaraWaaSbaaSqaaiabdgeabbqabaGccqGGPaqkcqGHsislcqaIWaamaeaacuWFdpWCgaqcaiabcIcaOiqb=X7aTzaaraWaaSbaaSqaaiabdkeacbqabaGccqGHsislcuWF8oqBgaqeamaaBaaaleaacqWGbbqqaeqaaOGaeiykaKcaaaGaayjkaiaawMcaaaaa@5349@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
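            <p>As an illustration of Eqn. 2, the following minimal Python sketch computes a two-sample t-statistic for a single locus. The function name t_statistic, the pooled-variance form of the standard error and the replicate values in the example are assumptions made for illustration only; they are not taken from the gSAM prototype code.</p>

```python
import math
from statistics import mean, variance


def t_statistic(a, b):
    """Two-sample t-statistic for H0: E(mean(b) - mean(a)) = 0.

    a, b: replicate measurements of one locus under conditions A and B.
    Assumes equal variances in both conditions (pooled estimate).
    """
    na, nb = len(a), len(b)
    diff = mean(b) - mean(a)
    # Pooled sample variance across the two conditions
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    # Estimated standard error of the difference of means
    se = math.sqrt(pooled * (1.0 / na + 1.0 / nb))
    return diff / se


# Hypothetical replicate values for one locus
print(round(t_statistic([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]), 4))  # prints 1.2247
```

            <p>With only three replicates per condition, as in this example, the estimate of the standard error is itself noisy, which is one practical face of the limitations discussed next.</p>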
            <p>There are obvious deficiencies in this analytical paradigm; the primary one arises from the fact that microarray data follows a non-normal distribution <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. It can be argued that the t-test results remain asymptotically correct for any distribution, but only as the number of replicates tends to infinity. Such an experiment would be logistically difficult and cost-prohibitive. Thus, in a global sense, due to the inaccurate definition of H<sub>0</sub>, the classical approach does not verify whether the genes are truly differentially regulated or are false positives of a stochastic origin.</p>
            <p>Multiple hypothesis testing is the other element that needs to be addressed. Table <tblr tid="T1">1</tblr> recounts its fundamental principles and the error rates as summarized in Benjamini and Hochberg <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>; the following summary of error rates utilizes the symbols defined in the table. Fundamentally, there are two types of error rates <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp>: type I or false positive (M<sub>0</sub>-F) and type II or false negative (T); the former is associated with rejection of a true null hypothesis and the latter with the failure to reject the false null hypothesis. For microarray experiments, control of the type I error under any combination of the true and false hypotheses is critical <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>. Briefly, the type I error rates are:</p>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Multiple hypothesis testing matrix</p>
               </caption>
               <tblbdy cols="4">
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Hypotheses: Accepted</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Hypotheses: Rejected</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Total</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Null: True</p>
                        <p>(Null: no differential change)</p>
                     </c>
                     <c ca="left">
                        <p>F</p>
                     </c>
                     <c ca="left">
                        <p>M<sub>0</sub> &#8722; F</p>
                     </c>
                     <c ca="left">
                        <p>M<sub>0</sub></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Alternative: True (Null: False)</p>
                     </c>
                     <c ca="left">
                        <p>T</p>
                     </c>
                     <c ca="left">
                        <p>M<sub>1</sub> &#8722; T</p>
                     </c>
                     <c ca="left">
                        <p>M<sub>1</sub></p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Total</p>
                     </c>
                     <c ca="left">
                        <p>S</p>
                     </c>
                     <c ca="left">
                        <p>M &#8722; S</p>
                     </c>
                     <c ca="left">
                        <p>M</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <p>i) <it>Per family error rate </it>(<it>PFER</it>): refers to the expected number of false positives (Eqn. 3);</p>
            <p>ii) <it>Per comparison error rate </it>(<it>PCER</it>): refers to the expected number of false positives divided by the total number of hypotheses (Eqn. 4);</p>
            <p>iii) <it>Family-wise error rate </it>(<it>FWER</it>): refers to the probability of at least one false positive <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr></abbrgrp> (Eqn. 5);</p>
            <p>iv) <it>False Discovery Rate </it>(<it>FDR</it>): refers to the expected proportion of false positives among rejected hypotheses <abbrgrp><abbr bid="B7">7</abbr><abbr bid="B12">12</abbr><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr></abbrgrp> (Eqn. 6);</p>
            <p>
               <display-formula id="M3"><it>PFER </it>= <it>E</it>(<it>M</it><sub>0 </sub>- <it>F</it>)</display-formula>
            </p>
            <p>
               <display-formula id="M4">
                  <m:math name="1471-2105-8-359-i3" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:mi>P</m:mi>
                           <m:mi>C</m:mi>
                           <m:mi>E</m:mi>
                           <m:mi>R</m:mi>
                           <m:mo>=</m:mo>
                           <m:mfrac>
                              <m:mrow>
                                 <m:mi>E</m:mi>
                                 <m:mo stretchy="false">(</m:mo>
                                 <m:msub>
                                    <m:mi>M</m:mi>
                                    <m:mn>0</m:mn>
                                 </m:msub>
                                 <m:mo>&#8722;</m:mo>
                                 <m:mi>F</m:mi>
                                 <m:mo stretchy="false">)</m:mo>
                              </m:mrow>
                              <m:mi>M</m:mi>
                           </m:mfrac>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGqbaucqWGdbWqcqWGfbqrcqWGsbGucqGH9aqpdaWcaaqaaiabdweafjabcIcaOiabd2eannaaBaaaleaacqaIWaamaeqaaOGaeyOeI0IaemOrayKaeiykaKcabaGaemyta0eaaaaa@3A6B@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
            <p>
               <display-formula id="M5"><it>FWER </it>= <it>p</it>((<it>M</it><sub>0 </sub>- <it>F</it>) > 0)</display-formula>
            </p>
            <p>
               <display-formula id="M6">
                  <m:math name="1471-2105-8-359-i4" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:mi>F</m:mi>
                           <m:mi>D</m:mi>
                           <m:mi>R</m:mi>
                           <m:mo>=</m:mo>
                           <m:mi>E</m:mi>
                           <m:mrow>
                              <m:mo>(</m:mo>
                              <m:mrow>
                                 <m:mfrac>
                                    <m:mrow>
                                       <m:mo stretchy="false">(</m:mo>
                                       <m:msub>
                                          <m:mi>M</m:mi>
                                          <m:mn>0</m:mn>
                                       </m:msub>
                                       <m:mo>&#8722;</m:mo>
                                       <m:mi>F</m:mi>
                                       <m:mo stretchy="false">)</m:mo>
                                    </m:mrow>
                                    <m:mrow>
                                       <m:mo stretchy="false">(</m:mo>
                                       <m:mi>M</m:mi>
                                       <m:mo>&#8722;</m:mo>
                                       <m:mi>S</m:mi>
                                       <m:mo stretchy="false">)</m:mo>
                                    </m:mrow>
                                 </m:mfrac>
                              </m:mrow>
                              <m:mo>)</m:mo>
                           </m:mrow>
                           <m:mtext>&#160;</m:mtext>
                           <m:mi>i</m:mi>
                           <m:mi>f</m:mi>
                           <m:mtext>&#160;</m:mtext>
                           <m:mo stretchy="false">(</m:mo>
                           <m:mi>M</m:mi>
                           <m:mo>&#8722;</m:mo>
                           <m:mi>S</m:mi>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo>></m:mo>
                           <m:mn>0</m:mn>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGgbGrcqWGebarcqWGsbGucqGH9aqpcqWGfbqrdaqadaqaamaalaaabaGaeiikaGIaemyta00aaSbaaSqaaiabicdaWaqabaGccqGHsislcqWGgbGrcqGGPaqkaeaacqGGOaakcqWGnbqtcqGHsislcqWGtbWucqGGPaqkaaaacaGLOaGaayzkaaGaeeiiaaIaemyAaKMaemOzayMaeeiiaaIaeiikaGIaemyta0KaeyOeI0Iaem4uamLaeiykaKIaeyOpa4JaeGimaadaaa@49C2@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
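            <p>The four error rates in Eqns. 3 to 6 can be made concrete with a small Python sketch that estimates them empirically from repeated experiments in which the truth of each null hypothesis is known (for example, simulations). The function name type_one_error_rates, the input layout and the convention of counting the false-positive proportion as zero when no hypothesis is rejected are assumptions made for illustration.</p>

```python
def type_one_error_rates(null_true, rejected):
    """Empirical PFER, PCER, FWER and FDR over repeated experiments.

    null_true, rejected: lists of rows, one row per experiment and one
    boolean per hypothesis. In the notation of Table 1, the number of
    false positives in a row is (M0 - F) and the number of rejected
    hypotheses is (M - S).
    """
    n_exp = len(null_true)
    m = len(null_true[0])
    false_pos_counts = []
    proportions = []
    for truth_row, reject_row in zip(null_true, rejected):
        false_pos = sum(1 for t, r in zip(truth_row, reject_row) if t and r)
        n_rejected = sum(1 for r in reject_row if r)
        false_pos_counts.append(false_pos)
        # Assumed convention: the proportion is 0 when nothing is rejected
        proportions.append(false_pos / n_rejected if n_rejected > 0 else 0.0)
    mean_fp = sum(false_pos_counts) / n_exp
    return {
        "PFER": mean_fp,                  # expected number of false positives
        "PCER": mean_fp / m,              # false positives per hypothesis
        "FWER": sum(1 for v in false_pos_counts if v > 0) / n_exp,
        "FDR": sum(proportions) / n_exp,  # expected false-positive proportion
    }


# Two hypothetical experiments with four hypotheses each
truth = [[True, True, False, False], [True, True, False, False]]
calls = [[True, False, True, False], [False, False, True, True]]
print(type_one_error_rates(truth, calls))
# prints {'PFER': 0.5, 'PCER': 0.125, 'FWER': 0.5, 'FDR': 0.25}
```

            <p>In this toy example the FWER (0.5) exceeds the FDR (0.25), so a procedure that holds the FWER below a given level is necessarily at least as strict as one that holds the FDR below the same level.</p>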
            <p>In general, the procedures controlling the FWER are more conservative than the ones controlling the PCER or FDR. Hence the classical Bonferroni correction (FWER) is much too stringent for array-based differential regulation studies, especially those encompassing partially coding and non-coding regions. The SAM algorithm, built on a re-sampling framework, virtually increases the number of replicates via random permutation of the sample labels; this formalizes a refinement of the multiple-testing corrected p-value and false positive rate (<it>FPR</it>), referred to as the q-value and FDR respectively <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr></abbrgrp>. Fundamentally, the test statistic in SAM (Eqn. 7) is a t-statistic variant in which a constant (<it>s</it><sub><it>0</it></sub>) is added to the variance term, <it>s</it>(<it>I</it>), in the denominator. The constant s<sub>0</sub>, computed empirically, controls for a reduction in SNR with decreasing differential change. Traditionally, the d-statistic is defined as a function of a <it>gene </it>under two conditions <it>A </it>and <it>B</it>, but in gSAM this has been generalized to a genomic interval, <it>I</it>.</p>
            <p>
               <display-formula id="M7">
                  <m:math name="1471-2105-8-359-i5" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:mi>d</m:mi>
                           <m:mo>&#8722;</m:mo>
                           <m:mi>s</m:mi>
                           <m:mi>t</m:mi>
                           <m:mi>a</m:mi>
                           <m:mi>t</m:mi>
                           <m:mi>i</m:mi>
                           <m:mi>s</m:mi>
                           <m:mi>t</m:mi>
                           <m:mi>i</m:mi>
                           <m:mi>c</m:mi>
                           <m:mo stretchy="false">(</m:mo>
                           <m:mi>I</m:mi>
                           <m:mo stretchy="false">)</m:mo>
                           <m:mo>=</m:mo>
                           <m:mrow>
                              <m:mo>(</m:mo>
                              <m:mrow>
                                 <m:mfrac>
                                    <m:mrow>
                                       <m:mo stretchy="false">(</m:mo>
                                       <m:msub>
                                          <m:mover accent="true">
                                             <m:mi>&#956;</m:mi>
                                             <m:mo>&#175;</m:mo>
                                          </m:mover>
                                          <m:mi>B</m:mi>
                                       </m:msub>
                                       <m:mo stretchy="false">(</m:mo>
                                       <m:mi>I</m:mi>
                                       <m:mo stretchy="false">)</m:mo>
                                       <m:mo>&#8722;</m:mo>
                                       <m:msub>
                                          <m:mover accent="true">
                                             <m:mi>&#956;</m:mi>
                                             <m:mo>&#175;</m:mo>
                                          </m:mover>
                                          <m:mi>A</m:mi>
                                       </m:msub>
                                       <m:mo stretchy="false">(</m:mo>
                                       <m:mi>I</m:mi>
                                       <m:mo stretchy="false">)</m:mo>
                                       <m:mo stretchy="false">)</m:mo>
                                    </m:mrow>
                                    <m:mrow>
                                       <m:mi>s</m:mi>
                                       <m:mo stretchy="false">(</m:mo>
                                       <m:mi>I</m:mi>
                                       <m:mo stretchy="false">)</m:mo>
                                       <m:mo>+</m:mo>
                                       <m:msub>
                                          <m:mi>s</m:mi>
                                          <m:mn>0</m:mn>
                                       </m:msub>
                                    </m:mrow>
                                 </m:mfrac>
                              </m:mrow>
                              <m:mo>)</m:mo>
                           </m:mrow>
                        </m:mrow>
                        <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGKbazcqGHsislcqWGZbWCcqWG0baDcqWGHbqycqWG0baDcqWGPbqAcqWGZbWCcqWG0baDcqWGPbqAcqWGJbWycqGGOaakcqWGjbqscqGGPaqkcqGH9aqpdaqadaqaamaalaaabaGaeiikaGccciGaf8hVd0MbaebadaWgaaWcbaGaemOqaieabeaakiabcIcaOiabdMeajjabcMcaPiabgkHiTiqb=X7aTzaaraWaaSbaaSqaaiabdgeabbqabaGccqGGOaakcqWGjbqscqGGPaqkcqGGPaqkaeaacqWGZbWCcqGGOaakcqWGjbqscqGGPaqkcqGHsislcqWGZbWCdaWgaaWcbaGaeGimaadabeaaaaaakiaawIcacaGLPaaaaaa@56EE@</m:annotation>
                     </m:semantics>
                  </m:math>
               </display-formula>
            </p>
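            <p>The d-statistic and the permutation-based error estimate described above can be sketched in Python as follows. This is a simplified illustration, not the gSAM or SAM implementation: the function names, the use of a fixed user-supplied s0 (SAM derives it empirically), the pooled standard error and the use of the mean number of permuted calls as the expected number of false positives are all assumptions.</p>

```python
import math
import random
from statistics import mean, variance


def d_statistic(a, b, s0):
    """SAM-style statistic for one interval I: (mean_B - mean_A) / (s + s0)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    s = math.sqrt(pooled * (1.0 / na + 1.0 / nb))
    return (mean(b) - mean(a)) / (s + s0)


def permutation_fdr(intervals, s0, threshold, n_perm=200, seed=0):
    """Rough FDR estimate for calling intervals with |d| >= threshold.

    intervals: list of (a, b) replicate lists, with the same number of
    replicates per condition in every interval. Assumes s0 > 0.
    """
    rng = random.Random(seed)
    observed = [abs(d_statistic(a, b, s0)) for a, b in intervals]
    n_called = sum(1 for d in observed if d >= threshold)
    if n_called == 0:
        return 0.0
    n_a = len(intervals[0][0])
    n_total = n_a + len(intervals[0][1])
    false_calls = 0
    for _ in range(n_perm):
        # One random relabelling of the samples, shared by all intervals
        order = list(range(n_total))
        rng.shuffle(order)
        for a, b in intervals:
            values = list(a) + list(b)
            perm = [values[i] for i in order]
            if abs(d_statistic(perm[:n_a], perm[n_a:], s0)) >= threshold:
                false_calls += 1
    expected_false = false_calls / n_perm
    return min(1.0, expected_false / n_called)


# Hypothetical data: one changed interval, one unchanged interval
data = [([1.0, 1.2, 0.9], [3.0, 3.1, 2.9]), ([1.0, 1.1, 0.9], [1.0, 0.9, 1.1])]
print(permutation_fdr(data, s0=0.1, threshold=2.0, n_perm=50))
```

            <p>Because s0 is added to the denominator, intervals with a very small s(I) no longer produce inflated scores from small differences in means.</p>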
         </sec>
         <sec>
            <st>
               <p>Basics of gSAM</p>
            </st>
            <p>The purpose of gSAM is to transform genomic intervals of enrichment originating from changes in RNA levels, binding/occupancy of transcriptional regulators, modified histones, levels of chromatin modification, among others, to a temporal/spatial differential signature for these elements. Unlike gene-centric expression arrays, which have a 3' end bias, or exon arrays, which specifically interrogate the exons, in tiling arrays multiple probes interrogate a single locus in an unbiased manner. Here a locus can encompass multiple transcripts and/or interaction sites of multiple regulatory elements and can include exons, introns and un-translated regions (<it>UTRs</it>). Therefore, instead of computing a gene-level (with 3' bias) differential measure, in gSAM the differential measurement follows a piece-wise response model. This is described in Eqn. 8 where <it>ig, ex, in, UTR </it>correspond to the inter-genic, exon, intron and un-translated region respectively. Under this model, the time-series, for example, is subdivided into a number of <it>logical </it>segments &#8211; in this case the underlying logic is governed by enrichment &#8211; and differential change is summarized over each segment. Fundamentally, the definition of the segments is completely independent of annotations. This enables extension of the methodology beyond the framework of annotations and hence to genomes other than human, where the annotation is less complete. However, the availability of annotation facilitates visualization of the outcome from a protein-coding perspective.</p>
            <p>The piece-wise system model in gSAM supports two inherent characteristics of transcriptome data &#8211; heterogeneity and superposition of states. This is demonstrated in Eqn. 9 where, for example, the inter-genic component is a superposition of states with <it>n </it>variable enrichment patterns. According to current knowledge, SAM assumes a homogenous and static one-gene, one-locus model; the implicit assumption being that differential response is not a complex, superposition of responses but is a homogenous/uniform response across all nucleotides comprising a gene. Consideration of a gene as an atomic entity does not enable discrimination of the differential response of alternative isoforms in a developmental transcriptome or even exons versus introns versus UTRs for a transcript. The <it>system </it>definition which is the primary point of differentiation between SAM and gSAM consequently impacts the interpretation of the differential changes at a cellular level. The following sections elucidate the rationale underlying gSAM and discuss its impact on transcriptome-level differential data analysis.</p>
            <p>
               <display-formula id="M8"><it>f</it>(&#916;)<sub><it>A</it>,<it>B </it></sub>&#8594; (<it>f</it>(<it>ig</it>) + <it>f</it>(<it>ex</it>) + <it>f</it>(<it>in</it>) + <it>f</it>(<it>UTR</it>))<sub><it>A,B</it></sub></display-formula>
            </p>
            <p>
               <display-formula id="M9"><it>f</it>(<it>ig</it>) &#8594; <it>&#967;</it>(<it>ig</it>)<sub>1 </sub>+ ... + <it>&#967;</it>(<it>ig</it>)<sub><it>n</it></sub></display-formula>
            </p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>Time course experimental design</p>
            </st>
            <p>The development and application of gSAM are presented here in the context of a differential time-course study conducted in the HL60 cell line, performed as part of the Encyclopedia of DNA elements (<it>ENCODE</it>) consortium project <abbrgrp><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr><abbr bid="B22">22</abbr></abbrgrp>. The cells are stimulated by all-trans retinoic acid (<it>ATRA</it>) for distinct time periods &#8211; 0, 2, 8 and 32 hours &#8211; to induce differentiation along the granulocytic lineage. The biological motivation of the experiments is to study the associated processes of RNA transcription and the binding of transcriptional regulators, and to identify regions of histone modification. The differential RNA transcription study <abbrgrp><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr><abbr bid="B25">25</abbr></abbrgrp> comprises a single-sample experiment where the level of RNA is monitored with respect to a baseline quantified via negative control probes based on bacterial sequences. The differential modification study involves a two-sample chromatin immunoprecipitation on array/chip (<it>ChIP on chip</it>) experiment <abbrgrp><abbr bid="B26">26</abbr><abbr bid="B27">27</abbr><abbr bid="B28">28</abbr><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr><abbr bid="B31">31</abbr><abbr bid="B32">32</abbr><abbr bid="B33">33</abbr><abbr bid="B34">34</abbr></abbrgrp> comprising a control and a treatment. The control is amplified genomic DNA (without immunoprecipitation), and the treatment is the chromatin immunoprecipitated sample. The assay protocol used in these experiments is not strand specific: the sample preparation does not preserve information about the strand of the nucleic acids, so it cannot be determined conclusively which strand the observed effects originate from. An example of such a method is the conversion of RNA into double-stranded cDNA (used in these experiments) for measuring RNA abundance. 
Details regarding the specific assays have been described in the literature <abbrgrp><abbr bid="B24">24</abbr><abbr bid="B34">34</abbr></abbrgrp>. The example biological datasets used to demonstrate the application of gSAM include RNA (whole-cell poly A+), a trio of modified histones: <it>H4Kac4</it>-Histone H4 tetra-acetylated lysine (<it>HisH4</it>), <it>H3K9K14ac2 </it>-Histone H3 K9 K14 di-acetylated (<it>H3K9K14D</it>), <it>H3K27me3</it>-Histone H3 tri-methylated lysine 27 (<it>H3K27T</it>) and <it>RNA Polymerase II-</it>8WG16 antibody against pre-initiation complex form (<it>RNA PolII</it>). For each factor investigated, the experiment comprises three to five biological replicates per time-point, with duplicate hybridizations performed for each.</p>
         </sec>
         <sec>
            <st>
               <p>Tiling arrays &#8211; the Affymetrix platform</p>
            </st>
            <p>These arrays employ short oligonucleotide probe-pairs (<it>pp</it>), of length 25 bases (25 mers), to interrogate a specified genomic region <abbrgrp><abbr bid="B35">35</abbr><abbr bid="B36">36</abbr><abbr bid="B37">37</abbr></abbrgrp>. Each pp includes a perfect match (<it>PM</it>) and a mismatch (<it>MM</it>). The MM sequence is identical to its corresponding PM sequence, except for the central (13<sup>th</sup>) base. The objective of pairing a PM with a MM is to estimate the degree of cross-hybridization. A variety of tiling arrays with different probe and feature resolution are used for genome-wide transcription regulation studies <abbrgrp><abbr bid="B38">38</abbr><abbr bid="B39">39</abbr><abbr bid="B40">40</abbr></abbrgrp>. The probe resolution defines the center-to-center distance between two adjacent probes in genomic space. A 22 base-pair (<it>bp</it>) probe resolution for 25 mers implies a 3 bp overlap (on average) between two adjacent probes. Currently, the probe resolution of the arrays ranges from 5 bp to 35 bp, with probe synthesis areas of 5<it>&#956; </it>and 10<it>&#956;</it>.</p>
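            <p>The relationship between probe length, probe resolution and overlap stated above can be checked with a few lines of arithmetic. The sketch below is illustrative only; the function name <it>probe_overlap</it> is hypothetical and not part of any array software, and a negative result is interpreted here as an (average) gap between adjacent probes.</p>

```python
def probe_overlap(probe_length, resolution):
    """Average overlap in bp between two adjacent probes.

    resolution is the center-to-center distance in bp. A negative
    return value means the probes do not overlap and are separated
    by a gap of that many bp (an interpretation assumed here).
    """
    return probe_length - resolution


# 25-mers at the resolutions mentioned in the text (5, 22 and 35 bp).
for resolution in (5, 22, 35):
    print(resolution, probe_overlap(25, resolution))
```

            <p>For 25 mers tiled at 22 bp this gives the 3 bp overlap quoted in the text, while a 35 bp resolution leaves adjacent probes separated rather than overlapping.</p>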
         </sec>
         <sec>
            <st>
               <p>Application of gSAM for detection of differential change</p>
            </st>
            <p>gSAM operates on enrichment site-level data and estimates the temporal differential regulation signature. The null hypothesis (H<sub>0</sub>) in this study is that there is no difference in RNA levels, histone modification or binding of regulators due to stimulation by ATRA over a designated time-course. Although the methodology encompasses both PM and MM probes, it can be applied to PM-only arrays or adapted to exclude MM probes. The following sections detail the algorithmic steps:</p>
            <p>I. Preliminary data analysis</p>
            <p>II. Definition of the pair-wise system</p>
            <p>III. Modeling the input to gSAM</p>
            <p>IV. Probe-level signal intensity/enrichment summarization</p>
            <p>V. Summarization of differential response</p>
            <sec>
               <st>
                  <p>I. Preliminary data analysis</p>
               </st>
               <p>This section summarizes the steps for the generation of sites corresponding to RNA or modified histone and/or RNA PolII binding.</p>
               <p>i) Probe-level normalization: This includes median scaling and quantile normalization <abbrgrp><abbr bid="B41">41</abbr><abbr bid="B42">42</abbr></abbrgrp> of all PM and MM probes. The former is a linear operation, where fluorescence data from the arrays are scaled relative to the median intensity distributions of all arrays. The quantile normalization accounts for linear and non-linear effects.</p>
               <p>ii) RNA profiling experiments: The pp signal intensity (<it>SI</it>) distribution is computed based on PM-MM intensity; regions of detected RNA referred to as transfrags (transcribed fragments) are then estimated against a baseline transcription signal derived from both positive and negative bacterial controls on the same microarray. For the data presented here, the intensity threshold for transcriptionally positive probes is set based on a 5 percent FPR <abbrgrp><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr><abbr bid="B25">25</abbr></abbrgrp>.</p>
               <p>iii) ChIP on chip experiments: The probe-level signal enrichment (<it>SE</it>) profiles are generated based on a comparison of the signal intensity of the treatment and control probe pairs (Eqn. 10). Putative transcriptional regulatory elements (<it>TREs</it>) are generated per factor on a per time point basis using the Rank Statistics based site prediction algorithm <abbrgrp><abbr bid="B43">43</abbr></abbrgrp>. In general, the enriched fragments exhibit the following types of bias <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>:</p>
               <p>a) Canonical regulatory sites have a 5' end bias;</p>
               <p>b) Non-canonical sites are distal to the annotated 5' ends <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B31">31</abbr><abbr bid="B44">44</abbr></abbrgrp>.</p>
               <p>
                  <display-formula id="M10">
                     <m:math name="1471-2105-8-359-i6" xmlns:m="http://www.w3.org/1998/Math/MathML">
                        <m:semantics>
                           <m:mrow>
                              <m:mi>S</m:mi>
                              <m:msub>
                                 <m:mi>E</m:mi>
                                 <m:mrow>
                                    <m:mi>p</m:mi>
                                    <m:mi>p</m:mi>
                                 </m:mrow>
                              </m:msub>
                              <m:mo>=</m:mo>
                              <m:mfrac>
                                 <m:mrow>
                                    <m:mi>max</m:mi>
                                    <m:mo>&#8289;</m:mo>
                                    <m:msub>
                                       <m:mrow>
                                          <m:mo stretchy="false">(</m:mo>
                                          <m:mn>1</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>log</m:mi>
                                          <m:mo>&#8289;</m:mo>
                                          <m:msub>
                                             <m:mrow>
                                                <m:mo stretchy="false">(</m:mo>
                                                <m:mi>P</m:mi>
                                                <m:mi>M</m:mi>
                                                <m:mo>&#8722;</m:mo>
                                                <m:mi>M</m:mi>
                                                <m:mi>M</m:mi>
                                                <m:mo stretchy="false">)</m:mo>
                                             </m:mrow>
                                             <m:mrow>
                                                <m:mi>T</m:mi>
                                                <m:mi>r</m:mi>
                                                <m:mi>e</m:mi>
                                                <m:mi>a</m:mi>
                                                <m:mi>t</m:mi>
                                                <m:mi>m</m:mi>
                                                <m:mi>e</m:mi>
                                                <m:mi>n</m:mi>
                                                <m:mi>t</m:mi>
                                             </m:mrow>
                                          </m:msub>
                                          <m:mo stretchy="false">)</m:mo>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>p</m:mi>
                                          <m:mi>p</m:mi>
                                       </m:mrow>
                                    </m:msub>
                                 </m:mrow>
                                 <m:mrow>
                                    <m:mi>max</m:mi>
                                    <m:mo>&#8289;</m:mo>
                                    <m:msub>
                                       <m:mrow>
                                          <m:mo stretchy="false">(</m:mo>
                                          <m:mn>1</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:mi>log</m:mi>
                                          <m:mo>&#8289;</m:mo>
                                          <m:msub>
                                             <m:mrow>
                                                <m:mo stretchy="false">(</m:mo>
                                                <m:mi>P</m:mi>
                                                <m:mi>M</m:mi>
                                                <m:mo>&#8722;</m:mo>
                                                <m:mi>M</m:mi>
                                                <m:mi>M</m:mi>
                                                <m:mo stretchy="false">)</m:mo>
                                             </m:mrow>
                                             <m:mrow>
                                                <m:mi>C</m:mi>
                                                <m:mi>o</m:mi>
                                                <m:mi>n</m:mi>
                                                <m:mi>t</m:mi>
                                                <m:mi>r</m:mi>
                                                <m:mi>o</m:mi>
                                                <m:mi>l</m:mi>
                                             </m:mrow>
                                          </m:msub>
                                          <m:mo stretchy="false">)</m:mo>
                                       </m:mrow>
                                       <m:mrow>
                                          <m:mi>p</m:mi>
                                          <m:mi>p</m:mi>
                                       </m:mrow>
                                    </m:msub>
                                 </m:mrow>
                              </m:mfrac>
                           </m:mrow>
                           <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWGtbWucqWGfbqrdaWgaaWcbaGaemiCaaNaemiCaahabeaakiabg2da9maalaaabaGagiyBa0MaeiyyaeMaeiiEaGNaeiikaGIaeGymaeJaeiilaWIagiiBaWMaei4Ba8Maei4zaCMaeiikaGIaemiuaaLaemyta0KaeyOeI0Iaemyta0Kaemyta0KaeiykaKYaaSbaaSqaaiabdsfaujabdkhaYjabdwgaLjabdggaHjabdsha0jabd2gaTjabdwgaLjabd6gaUjabdsha0bqabaGccqGGPaqkdaWgaaWcbaGaemiCaaNaemiCaahabeaaaOqaaiGbc2gaTjabcggaHjabcIha4jabcIcaOiabigdaXiabcYcaSiGbcYgaSjabc+gaVjabcEgaNjabcIcaOiabdcfaqjabd2eanjabgkHiTiabd2eanjabd2eanjabcMcaPmaaBaaaleaacqWGdbWqcqWGVbWBcqWGUbGBcqWG0baDcqWGYbGCcqWGVbWBcqWGSbaBaeqaaOGaeiykaKYaaSbaaSqaaiabdchaWjabdchaWbqabaaaaaaa@7526@</m:annotation>
                        </m:semantics>
                     </m:math>
                  </display-formula>
               </p>
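               <p>As an illustration of Eqn. 10, the following minimal sketch computes the signal enrichment for a single probe pair. It is not the authors' implementation: the function names are hypothetical, the intensities are invented, and the base of the logarithm, which is not stated in the text, is assumed to be 2 and exposed as a parameter.</p>

```python
import math


def floored_log(pm, mm, base=2.0):
    """max(1, log(PM - MM)) for one probe pair, as in Eqn. 10.

    Differences of 1 or less (including negative ones) have a
    non-positive or undefined logarithm, so clamping the difference
    to 1 before taking the log yields the same floor value of 1.
    The log base is an assumption; the source does not specify it.
    """
    diff = max(pm - mm, 1.0)
    return max(1.0, math.log(diff, base))


def signal_enrichment(treatment, control, base=2.0):
    """SE for one probe pair; each argument is a (PM, MM) tuple."""
    return floored_log(*treatment, base=base) / floored_log(*control, base=base)


# Invented intensities: the ChIP sample (treatment) versus amplified
# genomic DNA (control) for the same probe pair.
print(signal_enrichment(treatment=(1100.0, 76.0), control=(80.0, 16.0)))
```

               <p>Flooring both the numerator and the denominator at 1 keeps the ratio finite and positive even when MM exceeds PM in either sample.</p>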
            </sec>
            <sec>
               <st>
                  <p>II. Definition of the pair-wise system</p>
               </st>
               <p>This section provides a rationale for the choice of pair-wise conditions at which the cellular responses are profiled and analyzed.</p>
               <p>Cellular response to an exogenous stimulus is not necessarily synchronized; however, the reaction occurs on a very short time-scale &#8211; essentially continuous. In capturing events over time-points separated on the order of hours, a discrete time-differential response is generated by sampling a continuous time-signal. The sampling process is analogous to an <it>accumulator system</it> <abbrgrp><abbr bid="B45">45</abbr></abbrgrp> where the output state of the system (<it>y</it>) at any given time <it>n </it>is essentially a summation/accumulation of the response of all its states (<it>x</it>) up to the present state <it>x</it>[<it>n</it>] (Eqn. 11). Because of this accumulator characteristic, the superimposed cellular states measured by the experiment cannot be de-convoluted, and information is lost when the states are profiled at large time intervals. Temporal resolution is therefore a critical component of the experimental design. The optimal resolution varies for different responding functional elements, conditions of cell growth and cell/tissue/organism type, with a likelihood of non-linear increments in the time-series. In this particular study, the choice of 0-2-8-32 hours represents the undifferentiated state, an early time point (2 hours), a midway time point (8 hours) and a moderately late time point (32 hours), based on the previously published profiles of HL60 differentiation <abbrgrp><abbr bid="B46">46</abbr><abbr bid="B47">47</abbr></abbrgrp>.</p>
               <p>The associated property that needs to be appreciated is that the differential response follows a <it>cascade connection </it>model <abbrgrp><abbr bid="B45">45</abbr></abbrgrp>. Here the un-stimulated (baseline) state at 0 hours serves as the original input to the system; the output (response) at 2 hours serves as the input to the 8 hour state, with the output at 32 hours (the latest) being the overall output. Thus a measurement performed at any state other than the baseline carries a memory of the system prior to its current state.</p>
               <p>
                  <display-formula id="M11">
                     <m:math name="1471-2105-8-359-i7" xmlns:m="http://www.w3.org/1998/Math/MathML">
                        <m:semantics>
                           <m:mrow>
                              <m:mi>y</m:mi>
                              <m:mo stretchy="false">[</m:mo>
                              <m:mi>n</m:mi>
                              <m:mo stretchy="false">]</m:mo>
                              <m:mo>=</m:mo>
                              <m:mstyle displaystyle="true">
                                 <m:munderover>
                                    <m:mo>&#8721;</m:mo>
                                    <m:mrow>
                                       <m:mi>t</m:mi>
                                       <m:mo>=</m:mo>
                                       <m:mn>0</m:mn>
                                    </m:mrow>
                                    <m:mi>n</m:mi>
                                 </m:munderover>
                                 <m:mrow>
                                    <m:mi>x</m:mi>
                                    <m:mo stretchy="false">[</m:mo>
                                    <m:mi>t</m:mi>
                                    <m:mo stretchy="false">]</m:mo>
                                 </m:mrow>
                              </m:mstyle>
                           </m:mrow>
                           <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqWG5bqEcqGGBbWwcqWGUbGBcqGGDbqxcqGH9aqpdaaeWbqaaiabdIha4jabcUfaBjabdsha0jabc2faDbWcbaGaemiDaqNaeyypa0JaeGimaadabaGaemOBa4ganiabggHiLdaaaa@3F88@</m:annotation>
                        </m:semantics>
                     </m:math>
                  </display-formula>
               </p>
               <p>
                  <display-formula id="M12">
                     <m:math name="1471-2105-8-359-i8" xmlns:m="http://www.w3.org/1998/Math/MathML">
                        <m:semantics>
                           <m:mrow>
                              <m:mtable columnalign="left">
                                 <m:mtr columnalign="left">
                                    <m:mtd columnalign="left">
                                       <m:mrow>
                                          <m:mi>&#916;</m:mi>
                                          <m:mi>y</m:mi>
                                          <m:mo>=</m:mo>
                                          <m:mi>y</m:mi>
                                          <m:mo stretchy="false">[</m:mo>
                                          <m:mi>T</m:mi>
                                          <m:mo stretchy="false">]</m:mo>
                                          <m:mo>&#8722;</m:mo>
                                          <m:mi>y</m:mi>
                                          <m:mo stretchy="false">[</m:mo>
                                          <m:mi>T</m:mi>
                                          <m:mo>&#8722;</m:mo>
                                          <m:mi>n</m:mi>
                                          <m:mo stretchy="false">]</m:mo>
                                          <m:mo>=</m:mo>
                                          <m:mstyle displaystyle="true">
                                             <m:munderover>
                                                <m:mo>&#8721;</m:mo>
                                                <m:mrow>
                                                   <m:mi>t</m:mi>
                                                   <m:mo>=</m:mo>
                                                   <m:mn>0</m:mn>
                                                </m:mrow>
                                                <m:mi>T</m:mi>
                                             </m:munderover>
                                             <m:mrow>
                                                <m:mi>x</m:mi>
                                                <m:mo stretchy="false">[</m:mo>
                                                <m:mi>t</m:mi>
                                                <m:mo stretchy="false">]</m:mo>
                                             </m:mrow>
                                          </m:mstyle>
                                          <m:mo>&#8722;</m:mo>
                                          <m:mstyle displaystyle="true">
                                             <m:munderover>
                                                <m:mo>&#8721;</m:mo>
                                                <m:mrow>
                                                   <m:mi>t</m:mi>
                                                   <m:mo>=</m:mo>
                                                   <m:mn>0</m:mn>
                                                </m:mrow>
                                                <m:mrow>
                                                   <m:mi>n</m:mi>
                                                   <m:mo>&lt;</m:mo>
                                                   <m:mi>T</m:mi>
                                                </m:mrow>
                                             </m:munderover>
                                             <m:mrow>
                                                <m:mi>x</m:mi>
                                                <m:mo stretchy="false">[</m:mo>
                                                <m:mi>t</m:mi>
                                                <m:mo stretchy="false">]</m:mo>
                                             </m:mrow>
                                          </m:mstyle>
                                          <m:mo>=</m:mo>
                                          <m:mstyle displaystyle="true">
                                             <m:munderover>
                                                <m:mo>&#8721;</m:mo>
                                                <m:mrow>
                                                   <m:mi>t</m:mi>
                                                   <m:mo>=</m:mo>
                                                   <m:mn>0</m:mn>
                                                </m:mrow>
                                                <m:mrow>
                                                   <m:mi>n</m:mi>
                                                   <m:mo>&lt;</m:mo>
                                                   <m:mi>T</m:mi>
                                                </m:mrow>
                                             </m:munderover>
                                             <m:mrow>
                                                <m:mi>x</m:mi>
                                                <m:mo stretchy="false">[</m:mo>
                                                <m:mi>t</m:mi>
                                                <m:mo stretchy="false">]</m:mo>
                                             </m:mrow>
                                          </m:mstyle>
                                          <m:mo>+</m:mo>
                                          <m:mstyle displaystyle="true">
                                             <m:munderover>
                                                <m:mo>&#8721;</m:mo>
                                                <m:mrow>
                                                   <m:mi>t</m:mi>
                                                   <m:mo>=</m:mo>
                                                   <m:mi>n</m:mi>
                                                   <m:mo>&lt;</m:mo>
                                                   <m:mi>T</m:mi>
                                                </m:mrow>
                                                <m:mi>T</m:mi>
                                             </m:munderover>
                                             <m:mrow>
                                                <m:mi>x</m:mi>
                                                <m:mo stretchy="false">[</m:mo>
                                                <m:mi>t</m:mi>
                                                <m:mo stretchy="false">]</m:mo>
                                             </m:mrow>
                                          </m:mstyle>
                                          <m:mo>&#8722;</m:mo>
                                          <m:mstyle displaystyle="true">
                                             <m:munderover>
                                                <m:mo>&#8721;</m:mo>
                                                <m:mrow>
                                                   <m:mi>t</m:mi>
                                                   <m:mo>=</m:mo>
                                                   <m:mn>0</m:mn>
                                                </m:mrow>
                                                <m:mrow>
                                                   <m:mi>n</m:mi>
                                                   <m:mo>&lt;</m:mo>
                                                   <m:mi>T</m:mi>
                                                </m:mrow>
                                             </m:munderover>
                                             <m:mrow>
                                                <m:mi>x</m:mi>
                                                <m:mo stretchy="false">[</m:mo>
                                                <m:mi>t</m:mi>
                                                <m:mo stretchy="false">]</m:mo>
                                             </m:mrow>
                                          </m:mstyle>
                                       </m:mrow>
                                    </m:mtd>
                                 </m:mtr>
                                 <m:mtr columnalign="left">
                                    <m:mtd columnalign="left">
                                       <m:mrow>
                                          <m:mi>&#916;</m:mi>
                                          <m:mi>y</m:mi>
                                          <m:mo>=</m:mo>
                                          <m:mi>y</m:mi>
                                          <m:mo stretchy="false">[</m:mo>
                                          <m:mi>T</m:mi>
                                          <m:mo stretchy="false">]</m:mo>
                                          <m:mo>&#8722;</m:mo>
                                          <m:mi>y</m:mi>
                                          <m:mo stretchy="false">[</m:mo>
                                          <m:mi>T</m:mi>
                                          <m:mo>&#8722;</m:mo>
                                          <m:mi>n</m:mi>
                                          <m:mo stretchy="false">]</m:mo>
                                          <m:mo>=</m:mo>
                                          <m:mstyle displaystyle="true">
                                             <m:munderover>
                                                <m:mo>&#8721;</m:mo>
                                                <m:mrow>
                                                   <m:mi>t</m:mi>
                                                   <m:mo>=</m:mo>
                                                   <m:mi>n</m:mi>
                                                   <m:mo>&lt;</m:mo>
                                                   <m:mi>T</m:mi>
                                                </m:mrow>
                                                <m:mi>T</m:mi>
                                             </m:munderover>
                                             <m:mrow>
                                                <m:mi>x</m:mi>
                                                <m:mo stretchy="false">[</m:mo>
                                                <m:mi>t</m:mi>
                                                <m:mo stretchy="false">]</m:mo>
                                             </m:mrow>
                                          </m:mstyle>
                                       </m:mrow>
                                    </m:mtd>
                                 </m:mtr>
                              </m:mtable>
                           </m:mrow>
                           <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaafaqaaeGabaaabaGaeuiLdqKaemyEaKNaeyypa0JaemyEaKNaei4waSLaemivaqLaeiyxa0LaeyOeI0IaemyEaKNaei4waSLaemivaqLaeyOeI0IaemOBa4Maeiyxa0Laeyypa0ZaaabCaeaacqWG4baEcqGGBbWwcqWG0baDcqGGDbqxaSqaaiabdsha0jabg2da9iabicdaWaqaaiabdsfaubqdcqGHris5aOGaeyOeI0YaaabCaeaacqWG4baEcqGGBbWwcqWG0baDcqGGDbqxaSqaaiabdsha0jabg2da9iabicdaWaqaaiabd6gaUjabgYda8iabdsfaubqdcqGHris5aOGaeyypa0ZaaabCaeaacqWG4baEcqGGBbWwcqWG0baDcqGGDbqxaSqaaiabdsha0jabg2da9iabicdaWaqaaiabd6gaUjabgYda8iabdsfaubqdcqGHris5aOGaey4kaSYaaabCaeaacqWG4baEcqGGBbWwcqWG0baDcqGGDbqxaSqaaiabdsha0jabg2da9iabd6gaUjabgYda8iabdsfaubqaaiabdsfaubqdcqGHris5aOGaeyOeI0YaaabCaeaacqWG4baEcqGGBbWwcqWG0baDcqGGDbqxaSqaaiabdsha0jabg2da9iabicdaWaqaaiabd6gaUjabgYda8iabdsfaubqdcqGHris5aaGcbaGaeuiLdqKaemyEaKNaeyypa0JaemyEaKNaei4waSLaemivaqLaeiyxa0LaeyOeI0IaemyEaKNaei4waSLaemivaqLaeyOeI0IaemOBa4Maeiyxa0Laeyypa0ZaaabCaeaacqWG4baEcqGGBbWwcqWG0baDcqGGDbqxaSqaaiabdsha0jabg2da9iabd6gaUjabgYda8iabdsfaubqaaiabdsfaubqdcqGHris5aaaaaaa@ABD3@</m:annotation>
                        </m:semantics>
                     </m:math>
                  </display-formula>
               </p>
               <p>These two properties motivate the quantification of the temporal differential response as a <it>pairwise time-forward </it>system encompassing very specific reference and target time-points; Eqn. 12 generalizes this concept. A time-forward analysis implies that samples obtained at the (T-n)<sup>th </sup>and T<sup>th </sup>(where n&lt;T) time-points comprise the reference and target, respectively. Here the reference precedes the target time-point, and may or may not represent the un-stimulated condition. While measurement of a response between two time-points might seem trivial, given the underlying accumulator and cascade connection properties, the choice of these time-points is critical; a random pairwise combination, without appropriate de-convolution, will result in an erroneous interpretation of the underlying biology. Measurement of first order effects, i.e. the difference between two contiguous profiled time-points, is simpler to interpret than measurement of higher order effects, which involve differential profiling across non-contiguous time-points and potentially include non-linear effects.</p>
               <p>For the described time-series experiment, a measurement of an increased differential response from 0 to 8 hours, without knowledge of the 2 hour time-point, does not uniquely characterize the underlying differential mechanism. Any of the following are equally probable for a given locus:</p>
               <p>i) Between 0 and 8 hours, there is a steady increase in response to ATRA stimulus;</p>
               <p>ii) There is an initial decrease in the response between 0 and 2 hours, with a subsequent increase between 2 and 8 hours;</p>
               <p>iii) There is a rapid increase in response between 0 and 2 hours with a significantly slower decrease in response between 2 and 8 hours.</p>
               <p>Quantification of the first order response slopes significantly reduces the complexity of interpretation. All results presented here are based on the first order differential analysis. Although gSAM is presented in a temporal context, it is equally applicable in a spatial one; for example, it can quantify the differential response across tissue-types derived from normal (reference) and diseased (target) sites.</p>
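               <p>The ambiguity of a single 0 to 8 hour comparison can be illustrated with a minimal sketch. The response levels below are hypothetical values in arbitrary units, chosen only so that the three scenarios listed above share the same net change; the function names are illustrative and are not part of gSAM.</p>

```python
# Hypothetical response levels at the 0, 2 and 8 hour time-points
# (arbitrary units, invented for illustration only).
profiles = {
    "steady increase": [1.0, 2.0, 5.0],
    "decrease then increase": [1.0, 0.5, 5.0],
    "rapid increase then slow decrease": [1.0, 6.0, 5.0],
}


def first_order_differences(levels):
    """Target minus reference for each pair of contiguous time-points."""
    return [later - earlier for earlier, later in zip(levels, levels[1:])]


def net_difference(levels):
    """Last time-point minus first time-point (a higher order comparison)."""
    return levels[-1] - levels[0]


summary = {
    name: (net_difference(levels), first_order_differences(levels))
    for name, levels in profiles.items()
}

for name, (net, steps) in summary.items():
    print(name, "net 0-8h:", net, "first order:", steps)
```

               <p>In this sketch the three profiles have an identical net change between 0 and 8 hours, whereas their first order differences differ in sign or magnitude, which is why the contiguous comparisons are easier to interpret.</p>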
            </sec>
            <sec>
               <st>
                  <p>III. Modeling the input to gSAM</p>
               </st>
               <p>This section summarizes the logical segmentation of the enrichment regions, which constitutes the input model for gSAM.</p>
               <p>Based on published research <abbrgrp><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr><abbr bid="B25">25</abbr><abbr bid="B48">48</abbr><abbr bid="B49">49</abbr><abbr bid="B50">50</abbr></abbrgrp>, presumed non-coding transcripts of as yet unknown function are widespread in the genome. Thus the analysis of differential response should not be biased toward protein-coding genes but should be based on a generalized framework. The generalization in gSAM arises primarily from the piece-wise modeling of the input, which simultaneously accommodates responses from genic and inter-genic regions.</p>
               <p>The gSAM piece-wise model introduced in Eqn. 8&#8211;9 is elaborated in Eqn. 13&#8211;14. Fragmented enrichment sites &#8211; histone/RNA PolII binding sites and transfrags of canonical and/or non-canonical origin, emanating from the coding and/or non-coding regions of the genome &#8211; serve as the input, independent of annotation. Eqn. 13 defines the <it>probe-specific </it>input, where the atomic entity is a probe-pair (pp); the differential response is estimated individually for each pp encompassing an enrichment site (<it>&#949;</it>).</p>
               <p>
                  <display-formula id="M13"><it>Input </it>= {<it>pp </it>| <it>pp </it>&#8712; <it>&#949;</it>}</display-formula>
               </p>
               <p>
                  <display-formula id="M14">
                     <m:math name="1471-2105-8-359-i9" xmlns:m="http://www.w3.org/1998/Math/MathML">
                        <m:semantics>
                           <m:mrow>
                              <m:mtable columnalign="left">
                                 <m:mtr columnalign="left">
                                    <m:mtd columnalign="left">
                                       <m:mrow>
                                          <m:mi>I</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>p</m:mi>
                                          <m:mi>u</m:mi>
                                          <m:mi>t</m:mi>
                                          <m:mo>=</m:mo>
                                          <m:mstyle displaystyle="true">
                                             <m:munder>
                                                <m:mo>&#8721;</m:mo>
                                                <m:mi>g</m:mi>
                                             </m:munder>
                                             <m:mrow>
                                                <m:mi>G</m:mi>
                                                <m:mi>e</m:mi>
                                                <m:mi>n</m:mi>
                                                <m:mi>i</m:mi>
                                                <m:msub>
                                                   <m:mi>c</m:mi>
                                                   <m:mi>g</m:mi>
                                                </m:msub>
                                             </m:mrow>
                                          </m:mstyle>
                                          <m:mo>+</m:mo>
                                          <m:mstyle displaystyle="true">
                                             <m:munder>
                                                <m:mo>&#8721;</m:mo>
                                                <m:mrow>
                                                   <m:mi>i</m:mi>
                                                   <m:mi>g</m:mi>
                                                </m:mrow>
                                             </m:munder>
                                             <m:mrow>
                                                <m:mi>I</m:mi>
                                                <m:mi>n</m:mi>
                                                <m:mi>t</m:mi>
                                                <m:mi>e</m:mi>
                                                <m:mi>r</m:mi>
                                                <m:mi>g</m:mi>
                                                <m:mi>e</m:mi>
                                                <m:mi>n</m:mi>
                                                <m:mi>i</m:mi>
                                                <m:msub>
                                                   <m:mi>c</m:mi>
                                                   <m:mrow>
                                                      <m:mi>i</m:mi>
                                                      <m:mi>g</m:mi>
                                                   </m:mrow>
                                                </m:msub>
                                             </m:mrow>
                                          </m:mstyle>
                                       </m:mrow>
                                    </m:mtd>
                                 </m:mtr>
                                 <m:mtr columnalign="left">
                                    <m:mtd columnalign="left">
                                       <m:mrow>
                                          <m:mi>G</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>i</m:mi>
                                          <m:mi>c</m:mi>
                                          <m:mo stretchy="false">[</m:mo>
                                          <m:mi>g</m:mi>
                                          <m:mo stretchy="false">]</m:mo>
                                          <m:mo>=</m:mo>
                                          <m:mstyle displaystyle="true">
                                             <m:munder>
                                                <m:mo>&#8721;</m:mo>
                                                <m:mi>&#945;</m:mi>
                                             </m:munder>
                                             <m:mrow>
                                                <m:mi>p</m:mi>
                                                <m:msub>
                                                   <m:mi>p</m:mi>
                                                   <m:mi>&#945;</m:mi>
                                                </m:msub>
                                             </m:mrow>
                                          </m:mstyle>
                                          <m:mo>;</m:mo>
                                          <m:mi>I</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>t</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>r</m:mi>
                                          <m:mi>g</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>i</m:mi>
                                          <m:mi>c</m:mi>
                                          <m:mo stretchy="false">[</m:mo>
                                          <m:mi>i</m:mi>
                                          <m:mi>g</m:mi>
                                          <m:mo stretchy="false">]</m:mo>
                                          <m:mo>=</m:mo>
                                          <m:mstyle displaystyle="true">
                                             <m:munder>
                                                <m:mo>&#8721;</m:mo>
                                                <m:mi>&#946;</m:mi>
                                             </m:munder>
                                             <m:mrow>
                                                <m:mi>p</m:mi>
                                                <m:msub>
                                                   <m:mi>p</m:mi>
                                                   <m:mi>&#946;</m:mi>
                                                </m:msub>
                                             </m:mrow>
                                          </m:mstyle>
                                          <m:mo>;</m:mo>
                                          <m:mtext>&#160;</m:mtext>
                                          <m:mi>p</m:mi>
                                          <m:mi>p</m:mi>
                                          <m:mo>&#8712;</m:mo>
                                          <m:mi>&#949;</m:mi>
                                       </m:mrow>
                                    </m:mtd>
                                 </m:mtr>
                                 <m:mtr columnalign="left">
                                    <m:mtd columnalign="left">
                                       <m:mrow>
                                          <m:mi>G</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>i</m:mi>
                                          <m:mi>c</m:mi>
                                          <m:mo>=</m:mo>
                                          <m:mo>{</m:mo>
                                          <m:mi>E</m:mi>
                                          <m:mi>x</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mo>,</m:mo>
                                          <m:mi>I</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>t</m:mi>
                                          <m:mi>r</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mo>,</m:mo>
                                          <m:mi>U</m:mi>
                                          <m:mi>T</m:mi>
                                          <m:mi>R</m:mi>
                                          <m:mo>}</m:mo>
                                       </m:mrow>
                                    </m:mtd>
                                 </m:mtr>
                                 <m:mtr columnalign="left">
                                    <m:mtd columnalign="left">
                                       <m:mrow>
                                          <m:mi>I</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>t</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>r</m:mi>
                                          <m:mi>g</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>i</m:mi>
                                          <m:mi>c</m:mi>
                                          <m:mo>=</m:mo>
                                          <m:mo>{</m:mo>
                                          <m:mi>F</m:mi>
                                          <m:mi>u</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>c</m:mi>
                                          <m:mi>t</m:mi>
                                          <m:mi>i</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>a</m:mi>
                                          <m:mi>l</m:mi>
                                          <m:mi>C</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>m</m:mi>
                                          <m:mi>p</m:mi>
                                          <m:mi>l</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>x</m:mi>
                                          <m:mi>i</m:mi>
                                          <m:mi>t</m:mi>
                                          <m:mi>y</m:mi>
                                          <m:mo>,</m:mo>
                                          <m:mi>S</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>q</m:mi>
                                          <m:mi>u</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>n</m:mi>
                                          <m:mi>c</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>C</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>m</m:mi>
                                          <m:mi>p</m:mi>
                                          <m:mi>l</m:mi>
                                          <m:mi>e</m:mi>
                                          <m:mi>x</m:mi>
                                          <m:mi>i</m:mi>
                                          <m:mi>t</m:mi>
                                          <m:mi>y</m:mi>
                                          <m:mo>,</m:mo>
                                          <m:mn>...</m:mn>
                                          <m:mo>}</m:mo>
                                       </m:mrow>
                                    </m:mtd>
                                 </m:mtr>
                              </m:mtable>
                           </m:mrow>
                           <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaafaqaaeabbaaaaeaacqWGjbqscqWGUbGBcqWGWbaCcqWG1bqDcqWG0baDcqGH9aqpdaaeqbqaaiabdEeahjabdwgaLjabd6gaUjabdMgaPjabdogaJnaaBaaaleaacqWGNbWzaeqaaaqaaiabdEgaNbqab0GaeyyeIuoakiabgUcaRmaaqafabaGaemysaKKaemOBa4MaemiDaqNaemyzauMaemOCaiNaem4zaCMaemyzauMaemOBa4MaemyAaKMaem4yam2aaSbaaSqaaiabdMgaPjabdEgaNbqabaaabaGaemyAaKMaem4zaCgabeqdcqGHris5aaGcbaGaem4raCKaemyzauMaemOBa4MaemyAaKMaem4yamMaei4waSLaem4zaCMaeiyxa0Laeyypa0ZaaabuaeaacqWGWbaCcqWGWbaCdaWgaaWcbaacciGae8xSdegabeaaaeaacqWFXoqyaeqaniabggHiLdGccqGG7aWocqWGjbqscqWGUbGBcqWG0baDcqWGLbqzcqWGYbGCcqWGNbWzcqWGLbqzcqWGUbGBcqWGPbqAcqWGJbWycqGGBbWwcqWGPbqAcqWGNbWzcqGGDbqxcqGH9aqpdaaeqbqaaiabdchaWjabdchaWnaaBaaaleaacqWFYoGyaeqaaaqaaiab=j7aIbqab0GaeyyeIuoakiabcUda7iabbccaGiabdchaWjabdchaWjabg2da9iab=v7aLbqaaiabdEeahjabdwgaLjabd6gaUjabdMgaPjabdogaJjabg2da9iabcUha7jabdweafjabdIha4jabd+gaVjabd6gaUjabcYcaSiabdMeajjabd6gaUjabdsha0jabdkhaYjabd+gaVjabd6gaUjabcYcaSiabdwfavjabdsfaujabdkfasjabc2ha9bqaaiabdMeajjabd6gaUjabdsha0jabdwgaLjabdkhaYjabdEgaNjabdwgaLjabd6gaUjabdMgaPjabdogaJjabg2da9iabcUha7jabdAeagjabdwha1jabd6gaUjabdogaJjabdsha0jabdMgaPjabd+gaVjabd6gaUjabdggaHjabdYgaSjabdoeadjabd+gaVjabd2gaTjabdchaWjabdYgaSjabdwgaLjabdIha4jabdMgaPjabdsha0jabdMha5jabcYcaSiabdofatjabdwgaLjabdghaXjabdwha1jabdwgaLjabd6gaUjabdogaJjabdwgaLjabdoeadjabd+gaVjabd2gaTjabdchaWjabdYgaSjabdwgaLjabdIha4jabdMgaPjabdsha0jabdMha5jabcYcaSiabc6caUiabc6caUiabc6caUiabc2ha9baaaaa@F4C3@</m:annotation>
                        </m:semantics>
                     </m:math>
                  </display-formula>
               </p>
               <p>Eqn. 14 defines a <it>probe-set </it>specific input; here, a probe-set (<it>&#945;</it>/<it>&#946;</it>), which is a cluster of probe-pairs, interrogates a sequence of nucleotides spanning <it>&#949;</it>. This suggests a heterogeneous model which, at the most generalized level, is a superposition of genic (<it>g</it>) and inter-genic (<it>ig</it>) states. The genic state can be further partitioned, based on annotations, into elements such as exons, introns and UTRs. Analysis can be performed independently on each element or on the cumulative elements. The flexibility of selective inclusion of genic elements enhances the localization and specificity of estimation of gene-expression effects in various pathways. For example, this enables the localization of activity in regulatory elements with a 3'UTR bias <abbrgrp><abbr bid="B51">51</abbr></abbrgrp>. A response aggregated over the entirety of the genic components would not elucidate this. The inter-genic state is a mixture model as well, encompassing variants in terms of functional and sequence complexity. It can be partitioned based on regulatory potential, for example the presence of CpG islands associated with gene expression, regions of sequence conservation, or sequence motifs for transcription factor(s). The framework to selectively integrate elements in the model, solely driven by co-regulation effects, highlights the adaptability and power of gSAM.</p>
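               <p>The selective inclusion of elements described above can be sketched as a simple filter over annotated probe-pairs. This is an illustrative data structure only: the record type <it>ProbePair</it>, the function <it>select_input</it>, the label strings and the coordinates are all hypothetical and are not taken from the gSAM implementation.</p>

```python
from collections import namedtuple

# Hypothetical record for one probe-pair inside an enrichment site:
# genomic position, state ("genic" or "intergenic") and annotated element.
ProbePair = namedtuple("ProbePair", ["position", "state", "element"])


def select_input(probe_pairs, states=None, elements=None):
    """Return the probe-pairs matching the requested states and elements.

    A value of None for either argument means "do not filter on this field".
    """
    return [
        pp
        for pp in probe_pairs
        if (states is None or pp.state in states)
        and (elements is None or pp.element in elements)
    ]


# Invented example annotations.
probe_pairs = [
    ProbePair(1000, "genic", "Exon"),
    ProbePair(1035, "genic", "Intron"),
    ProbePair(1070, "genic", "UTR"),
    ProbePair(5000, "intergenic", "CpG island"),
]

utr_only = select_input(probe_pairs, states={"genic"}, elements={"UTR"})
all_genic = select_input(probe_pairs, states={"genic"})
everything = select_input(probe_pairs)
```

               <p>Restricting the input to UTR probe-pairs, as in <it>utr_only</it>, corresponds to analysing a single genic element, whereas <it>all_genic</it> corresponds to the cumulative genic response.</p>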
            </sec>
            <sec>
               <st>
                  <p>IV. Probe-level signal intensity/enrichment summarization</p>
               </st>
               <p>This section details the probe-specific summarization of signal intensity/enrichment for a pair-wise sample (<it>s</it>).</p>
               <p>Subsequent to the setup of the gSAM model, a probe-specific or probe-set specific log transformed SI or SE is computed per replicate (<it>r </it>&#8712; <it>s</it>) and time-point (<it>t </it>&#8712; <it>s</it>). This constitutes the input value in gSAM. For a probe-set based system, the transfrags/binding sites are used to define the <it>domain </it>over which the intensity/enrichment summary is computed. Frequently, the enrichment sites as determined in the reference and target pairs have non-identical spatial bounds. This might be because of biological reasons: the same locus might not be enriched (expressed) at two different time-points; alternatively, this might be due to edge artifacts in the definition of enrichment sites <abbrgrp><abbr bid="B43">43</abbr></abbrgrp>, or a combination of both. This incongruity of bounds requires a formal definition of domains, based on the segmentation of enriched fragments into those unique to one sample and those common to both the reference and target samples. Once the domains are defined for a pair-wise sample, they are held constant across all replicates in the relevant reference and target. The following describes the definition of domains and estimation of domain-specific summaries:</p>
               <p>i) The first step involves outlier removal. The low complexity filter (<it>LCF</it>) estimates the median absolute deviation (<it>MAD</it>) of the SE/SI of all pp belonging to a fragment. All fragments with a MAD value of zero are assumed to represent signal from low complexity repeat probes and are therefore eliminated. This step may introduce a false negative bias by eliminating enriched fragments composed of a shorter run of probes; this is not detrimental, since the statistical confidence obtained from fewer than three contiguous probes (66 nucleotides) is low (data not shown). Independently of the LCF, enriched fragments with a minimum of three probe-sets are retained. Since these filters address tiling design and/or sequence specific properties, their effect is assumed to be equivalent for all replicates and is therefore assessed based on a single replicate. Additionally, the MAD serves as a metric of co-regulation. Based on the cumulative MAD distribution, as estimated across all enriched fragments, a user defined threshold can be determined and only fragments with a MAD below this cutoff retained for analysis. This filter depends on replicate quality and should be used with caution, since it can introduce incongruity of bounds.</p>
               <p>ii) In this step, the enriched fragments are ordered and labeled &#8211; independently in the reference and target &#8211; based on their genomic location. It is probable that the bounds of a single enrichment site in the reference might overlap with multiple sites in the target (or vice-versa). In this case, the single reference site (R) has <it>n </it>associated target labels, where n corresponds to the number of distinct target sites (T<sub>1</sub>...T<sub>n</sub>) it overlaps with. This labeling scheme identifies the membership of fragments and their relationship across the reference and target.</p>
               <p>iii) This step entails identification of genomic segments with overlapping (including partial overlaps) spatial bounds of enrichment between the reference and target. A union of the bounds of the overlapping regions is created. This is referred to as the <it>overlapping enrichment domain </it>(<it>OED</it>) distribution for a given sample(<it>s</it>) (Eqn. 15). The OED therefore comprises a mixture of enriched segments: a common enrichment fraction between reference and target, and a unique fraction with evidence of enrichment in either the reference or target.</p>
               <p>iv) In a given OED, the probes interrogating the intersecting and unique enrichment fractions comprise the <it>FragmentDomainI (FDI) </it>and <it>FragmentDomainU (FDU)</it>, respectively (Eqn. 16&#8211;17).</p>
               <p>v) This step localizes genomic segments with non-overlapping bounds of enrichment between the reference and target. By definition, the probes covering these segments in the counterpart sample are not enriched; they are referred to as <it>null </it>probes. These segments constitute the <it>non-overlapping enrichment domain </it>(<it>NOED</it>) (Eqn. 18).</p>
               <p>vi) Subsequent to the segmentation, there exist three distinct types of enrichment domains: FDI, FDU and NOED. Elements of each domain are denoted by start and stop coordinates to specify their bounds (Eqn. 15) and by their reference and target specific labels. For a pair-wise analysis these domains are uniquely labeled and ordered based on their genomic location. A comparison of the labels across the domains &#8211; FDI and FDU &#8211; potentially identifies differential response from alternative isoforms and/or provides a tool to isolate differential signal from selective genic elements, for example UTRs. Since the enrichment sites are generated in the first place via a multi-replicate analysis, the spatial bounds of the above domains are held constant across all replicates within a given reference or target.</p>
               <p>
                  <display-formula id="M15"><it>OED</it><sub><it>s </it></sub>= (<it>Enrichment</it><sub><it>R </it></sub>&#8899; <it>Enrichment</it><sub><it>T</it></sub>)<sub><it>s </it></sub><it>where Enrichment</it><sub><it>R </it></sub>&#8745; <it>Enrichment</it><sub><it>T </it></sub>> 0</display-formula>
               </p>
               <p>
                  <display-formula id="M16"><it>FDI</it><sub><it>s </it></sub>= (<it>Enrichment</it><sub><it>R </it></sub>&#8898; <it>Enrichment</it><sub><it>T</it></sub>)<sub><it>s </it></sub><it>where FDI</it><sub><it>s </it></sub>&#8834; <it>OED</it><sub><it>s</it></sub></display-formula>
               </p>
               <p>
                  <display-formula id="M17"><it>FDU</it><sub><it>s </it></sub>= <it>OED</it><sub><it>s </it></sub>- <it>FDI</it><sub><it>s</it></sub></display-formula>
               </p>
               <p>
                  <display-formula id="M18"><it>NOED</it><sub><it>s </it></sub>= (<it>Enrichment</it><sub><it>R </it></sub>&#8745; <it>Enrichment</it><sub><it>T </it></sub>= &#934;)<sub><it>s </it></sub><it>where </it>&#934; : <it>nullset</it></display-formula>
               </p>
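               <p>The domain definitions in Eqn. 15&#8211;18 can be sketched with plain interval arithmetic. The sketch below is an illustration under stated assumptions, not the gSAM implementation: enrichment sites are represented as half-open (start, stop) genomic intervals rather than as sets of probes, touching intervals are not merged, NOED segments from the reference and target are pooled without their labels, and all function names and coordinates are hypothetical.</p>

```python
def overlaps(a, b):
    """True if half-open intervals a and b share at least one position."""
    return min(a[1], b[1]) > max(a[0], b[0])


def merge(intervals):
    """Union of intervals; intervals that overlap are fused into one."""
    merged = []
    for start, stop in sorted(intervals):
        if merged and merged[-1][1] > start:
            merged[-1][1] = max(merged[-1][1], stop)
        else:
            merged.append([start, stop])
    return [tuple(interval) for interval in merged]


def subtract(domains, holes):
    """Parts of each domain interval that are not covered by any hole."""
    remainder = []
    for start, stop in domains:
        cursor = start
        for h_start, h_stop in sorted(holes):
            # Only holes that overlap the still-uncovered part matter.
            if h_stop > cursor and stop > h_start:
                if h_start > cursor:
                    remainder.append((cursor, h_start))
                cursor = max(cursor, h_stop)
        if stop > cursor:
            remainder.append((cursor, stop))
    return remainder


def segment_domains(reference, target):
    """Split reference/target enrichment sites into OED, FDI, FDU and NOED."""
    ref_hit = [r for r in reference if any(overlaps(r, t) for t in target)]
    tgt_hit = [t for t in target if any(overlaps(t, r) for r in reference)]
    oed = merge(ref_hit + tgt_hit)                      # Eqn. 15: union
    fdi = merge([(max(r[0], t[0]), min(r[1], t[1]))     # Eqn. 16: intersection
                 for r in ref_hit for t in tgt_hit if overlaps(r, t)])
    fdu = subtract(oed, fdi)                            # Eqn. 17: OED - FDI
    noed = sorted([r for r in reference if r not in ref_hit]
                  + [t for t in target if t not in tgt_hit])  # Eqn. 18
    return {"OED": oed, "FDI": fdi, "FDU": fdu, "NOED": noed}


# Invented coordinates: one reference site overlapping two target sites.
reference = [(100, 200), (500, 600), (900, 950)]
target = [(150, 180), (190, 260), (700, 750)]
domains = segment_domains(reference, target)
for name, intervals in domains.items():
    print(name, intervals)
```

               <p>In this example a single reference site overlaps two target sites, which is the situation described in step ii); the resulting OED is split into two FDI elements and three FDU elements, and the remaining sites fall into the NOED.</p>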
               <p>vii) RNA transcription: The gSAM test-statistic operates on the log transformed probe-pair signal intensity (<it>SI</it><sub><it>pp</it></sub>). For a domain-specific input, a trimmed mean signal intensity (<it>TRSI</it><sub><it>drt</it></sub>) estimate (Eqn. 19) is generated for each of the domain elements on a per replicate (<it>r</it>), per time-point (<it>t</it>) basis. This estimate considers all probes belonging to an element in a given domain and uses an optimal trim factor of <it>&#954; </it>= 0.2.</p>
               <p>viii) ChIP on chip: gSAM operates on the winsorized mean (robust estimator) (Eqn. 20) of the SE of all probe-pairs per element per labeled domain; these estimates are also computed per replicate and per time-point. In Eqn. 20, <it>n </it>refers to the number of probe-pairs in an element of a given domain and <it>k </it>refers to the number of smallest and largest observations that are replaced with the (k+1)<sup>th </sup>smallest and largest observations, respectively.</p>
               <p>ix) The <it>null </it>probes in NOED are not set to zero; instead, their true signal intensities are considered. This obviates the missing data problem in gSAM.</p>
               <p>
                  <display-formula id="M19"><it>TRSI</it><sub><it>drt </it></sub>= <it>TrimmedMean</it>((log(<it>SI</it><sub>1</sub>...<it>SI</it><sub><it>pp</it></sub>)),0.2)<sub><it>drt</it></sub></display-formula>
               </p>
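               <p>The robust summaries used in steps i), vii) and viii) can be sketched as follows. This is a minimal illustration with several assumptions that the text does not fix: the trim factor is taken to be the fraction removed from each tail, the logarithm is taken to base 2, the three-probe minimum is applied as an additional filter alongside the zero-MAD rule, and the function names and intensity values are hypothetical.</p>

```python
import math
from statistics import median


def median_absolute_deviation(values):
    """Median of the absolute deviations from the median."""
    centre = median(values)
    return median([abs(v - centre) for v in values])


def passes_low_complexity_filter(values, min_probes=3):
    """Keep a fragment with enough probes and a non-zero MAD (assumed reading)."""
    return len(values) >= min_probes and median_absolute_deviation(values) > 0


def trimmed_mean(values, trim=0.2):
    """Mean after dropping int(trim * n) observations from each tail."""
    ordered = sorted(values)
    k = int(trim * len(ordered))
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def winsorized_mean(values, k):
    """Mean after replacing the k smallest and k largest observations with
    the (k+1)th smallest and (k+1)th largest observations, respectively."""
    ordered = sorted(values)
    n = len(ordered)
    if 2 * k >= n:
        raise ValueError("k must be smaller than half the number of values")
    core = ordered[k:n - k]
    return (k * core[0] + sum(core) + k * core[-1]) / n


# Invented probe-pair signal intensities for one domain element.
signal = [120.0, 150.0, 160.0, 170.0, 4000.0]
if passes_low_complexity_filter(signal):
    trsi = trimmed_mean([math.log2(v) for v in signal], trim=0.2)
    print("trimmed mean of log2 SI:", trsi)
    print("winsorized mean of SI (k=1):", winsorized_mean(signal, k=1))
```

               <p>Both estimators limit the influence of the single extreme probe in the example; the trimmed mean discards it, whereas the winsorized mean replaces it with its nearest retained neighbour.</p>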
               <p>
                  <display-formula id="M20">
                     <m:math name="1471-2105-8-359-i10" xmlns:m="http://www.w3.org/1998/Math/MathML">
                        <m:semantics>
                           <m:mrow>
                              <m:mtable>
                                 <m:mtr>
                                    <m:mtd>
                                       <m:mrow>
                                          <m:msub>
                                             <m:mover accent="true">
                                                <m:mi>x</m:mi>
                                                <m:mo>&#175;</m:mo>
                                             </m:mover>
                                             <m:mi>k</m:mi>
                                          </m:msub>
                                          <m:mo>=</m:mo>
                                          <m:mfrac>
                                             <m:mn>1</m:mn>
                                             <m:mrow>
                                                <m:mo stretchy="false">(</m:mo>
                                                <m:mi>n</m:mi>
                                                <m:mo>&#8722;</m:mo>
                                                <m:mn>2</m:mn>
                                                <m:mi>k</m:mi>
                                                <m:mo stretchy="false">)</m:mo>
                                             </m:mrow>
                                          </m:mfrac>
                                          <m:mstyle displaystyle="true">
                                             <m:munderover>
                                                <m:mo>&#8721;</m:mo>
                                                <m:mrow>
                                                   <m:mi>i</m:mi>
                                                   <m:mo>=</m:mo>
                                                   <m:mi>k</m:mi>
                                                   <m:mo>+</m:mo>
                                                   <m:mn>1</m:mn>
                                                </m:mrow>
                                                <m:mrow>
                                                   <m:mi>n</m:mi>
                                                   <m:mo>&#8722;</m:mo>
                                                   <m:mi>k</m:mi>
                                                </m:mrow>
                                             </m:munderover>
                                             <m:mrow>
                                                <m:msub>
                                                   <m:mi>x</m:mi>
                                                   <m:mi>i</m:mi>
                                                </m:msub>
                                             </m:mrow>
                                          </m:mstyle>
                                       </m:mrow>
                                    </m:mtd>
                                 </m:mtr>
                                 <m:mtr>
                                    <m:mtd>
                                       <m:mrow>
                                          <m:msub>
                                             <m:mi>x</m:mi>
                                             <m:mi>w</m:mi>
                                          </m:msub>
                                          <m:mo>=</m:mo>
                                          <m:mfrac>
                                             <m:mrow>
                                                <m:mi>k</m:mi>
                                                <m:msub>
                                                   <m:mi>x</m:mi>
                                                   <m:mrow>
                                                      <m:mo stretchy="false">(</m:mo>
                                                      <m:mi>k</m:mi>
                                                      <m:mo>+</m:mo>
                                                      <m:mn>1</m:mn>
                                                      <m:mo stretchy="false">)</m:mo>
                                                   </m:mrow>
                                                </m:msub>
                                                <m:mo>+</m:mo>
                                                <m:msub>
                                                   <m:mover accent="true">
                                                      <m:mi>x</m:mi>
                                                      <m:mo>&#175;</m:mo>
                                                   </m:mover>
                                                   <m:mi>k</m:mi>
                                                </m:msub>
                                                <m:mo stretchy="false">(</m:mo>
                                                <m:mi>n</m:mi>
                                                <m:mo>&#8722;</m:mo>
                                                <m:mn>2</m:mn>
                                                <m:mi>k</m:mi>
                                                <m:mo stretchy="false">)</m:mo>
                                                <m:mo>+</m:mo>
                                                <m:mi>k</m:mi>
                                                <m:msub>
                                                   <m:mi>x</m:mi>
                                                   <m:mrow>
                                                      <m:mo stretchy="false">(</m:mo>
                                                      <m:mi>n</m:mi>
                                                      <m:mo>&#8722;</m:mo>
                                                      <m:mi>k</m:mi>
                                                      <m:mo stretchy="false">)</m:mo>
                                                   </m:mrow>
                                                </m:msub>
                                             </m:mrow>
                                             <m:mi>n</m:mi>
                                          </m:mfrac>
                                       </m:mrow>
                                    </m:mtd>
                                 </m:mtr>
                              </m:mtable>
                           </m:mrow>
                           <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaafaqadeGabaaabaGafmiEaGNbaebadaWgaaWcbaGaem4AaSgabeaakiabg2da9maalaaabaGaeGymaedabaGaeiikaGIaemOBa4MaeyOeI0IaeGOmaiJaem4AaSMaeiykaKcaamaaqahabaGaemiEaG3aaSbaaSqaaiabdMgaPbqabaaabaGaemyAaKMaeyypa0Jaem4AaSMaey4kaSIaeGymaedabaGaemOBa4MaeyOeI0Iaem4AaSganiabggHiLdaakeaacqWG4baEdaWgaaWcbaGaem4DaChabeaakiabg2da9maalaaabaGaem4AaSMaemiEaG3aaSbaaSqaaiabcIcaOiabdUgaRjabgUcaRiabigdaXiabcMcaPaqabaGccqGHRaWkcuWG4baEgaqeamaaBaaaleaacqWGRbWAaeqaaOGaeiikaGIaemOBa4MaeyOeI0IaeGOmaiJaem4AaSMaeiykaKIaey4kaSIaem4AaSMaemiEaG3aaSbaaSqaaiabcIcaOiabd6gaUjabgkHiTiabdUgaRjabcMcaPaqabaaakeaacqWGUbGBaaaaaaaa@6802@</m:annotation>
                        </m:semantics>
                     </m:math>
                  </display-formula>
               </p>
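               <p>As an illustrative sketch (not the authors' implementation), the trimmed mean and the winsorized mean <it>x</it><sub>w</sub> of the formula above can be computed in Python as follows. The function names and the sample values are hypothetical; <it>k</it> is the number of order statistics trimmed from each tail and <it>n</it> is the sample size, as in the formula.</p>
               <p>
```python
def trimmed_mean(values, k):
    """Mean of the sorted values after dropping the k smallest and k largest."""
    x = sorted(values)
    n = len(x)
    if 2 * k >= n:
        raise ValueError("k is too large for the sample size")
    # Python slice x[k:n-k] holds the order statistics x_(k+1) .. x_(n-k).
    core = x[k:n - k]
    return sum(core) / (n - 2 * k)


def winsorized_mean(values, k):
    """Winsorized mean: each trimmed value is replaced by its nearest kept neighbour."""
    x = sorted(values)
    n = len(x)
    if 2 * k >= n:
        raise ValueError("k is too large for the sample size")
    core_mean = trimmed_mean(values, k)
    # x[k] is x_(k+1) and x[n-k-1] is x_(n-k) in 1-based order-statistic notation.
    return (k * x[k] + core_mean * (n - 2 * k) + k * x[n - k - 1]) / n


# Made-up intensities with one outlier (illustration only).
example = [1, 2, 3, 4, 100]
trimmed = trimmed_mean(example, 1)
winsorized = winsorized_mean(example, 1)
```
               </p>
               <p>With these made-up values and <it>k </it>= 1, both estimators return 3.0, whereas the plain mean, which is pulled by the outlying value of 100, is 22.</p>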
               <p>Fig. <figr fid="F1">1(A&#8211;B)</figr> presents the schematics of input bounds to gSAM. In this example, 0 and 2 hours constitute the reference and target, respectively. In panel A the probes and enrichment regions are represented by red and light blue bars, respectively; panel B demonstrates the domains defined in Eqn. 15&#8211;18: FDU: 1, 3, 5, 7; FDI: 2, 6; NOED: 4. Fig. <figr fid="F1">1C</figr> applies the domain definition to biological data, where the top five SE graphs (blue) represent five replicates for the reference and the bottom five graphs (yellow) represent the target; all graphs have been scaled identically. Additionally, there are three levels of annotation between the reference and target graphs: the annotations in blue and yellow represent enrichment fragments unique to the reference and target, respectively; the annotation in red represents the intersecting enrichment fragments. Peaks representative of the binding of putative TREs are evident upstream of and at the 5' end of the <it>HIC </it>gene as well as in the first few exons and in the introns. These data are visualized in the Integrated Genome Browser (<it>IGB</it>) <abbrgrp><abbr bid="B52">52</abbr></abbrgrp>.</p>
               <fig id="F1">
                  <title>
                     <p>Figure 1</p>
                  </title>
                  <caption>
                     <p>(A-B) A schematic defining FragmentDomainI, FragmentDomainU and NOED</p>
                  </caption>
                  <text>
                     <p>(A-B) A schematic defining FragmentDomainI, FragmentDomainU and NOED. In this example, 0 and 2 hours constitute the reference and target, respectively. In panel A the probes and the enrichment regions are represented in red and light blue, respectively; in panel B, the enrichment fragments are FragmentDomainU: 1, 3, 5, 7; FragmentDomainI: 2, 6; NOED: 4. (C) This is an IGB visualization of the fragmented enrichment domains as defined in the reference (blue) and target (yellow). The SE graphs represent biological data from 5 replicates each for reference and target. There are three levels of annotation between the reference and target graphs: the annotations in blue and yellow are representative of enrichment fragments unique to the reference and target, respectively; the annotation in red is representative of the intersecting enrichment fragments. Peaks representative of the binding of putative regulatory elements are evident upstream of and at the 5' end of the HIC gene and in the first few exons and in the introns.</p>
                  </text>
                  <graphic file="1471-2105-8-359-1"/>
               </fig>
               <p>The sample size can be increased by considering probe-specific as opposed to domain-specific input values. This comes at the cost of computational time and a potential increase in noise; it requires a post-differential analysis data clustering, followed by the application of a more conservative FDR-based significance threshold for downstream significance analysis. Alternatively, Gaussian smoothing of the probe-level data can also enhance the SNR. Finally, it has been validated that the differential transcription/regulation outcomes under the probe-specific and domain-specific inputs are consistent with one another (R<sup>2</sup>~0.91). All data presented here, unless specifically noted, are generated using domain-specific input. For either type of input, no probe-specific correction is required, since in a pair-wise analysis, signals from identical probes are summarized for both the reference and target samples.</p>
            </sec>
            <sec>
               <st>
                  <p>V. Summarization of differential response</p>
               </st>
               <p>This section summarizes the differential response of a pair-wise system. This is encapsulated by the four elements in Eqn. 21:</p>
               <p>
                  <display-formula id="M21">&#915; = (<b>d </b>- <it>statistic</it>,<it>&#948;</it>,<it>FDR</it>,<it>FC</it>)</display-formula>
               </p>
               <p>i) D-statistic: A variant on the t-statistic, it is a standardized differential change index. Fig. <figr fid="F2">2A</figr> presents a comparison of the t-statistic (green) and d-statistic (black) distributions. The additional variance term in the latter is responsible for the shrinkage in the tails, consequently boosting the peak centered about zero. Use of the t-statistic as an estimator of differential change potentially increases the FPR; the d-statistic essentially controls it, thereby optimizing the sensitivity and specificity for differential detection. In the original SAM publication <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>, the core bootstrapping step to generate the null d-statistic distribution is carried out across untreated controls and samples treated with ionizing radiation. For ChIP on chip experiments, the labels on the treatment and control replicates are shuffled across the time-pairs in a balanced manner &#8211; with equal numbers of replicates and entries in the reference and target time-pairs &#8211; to generate a null (expected) distribution. For RNA transcription, the signal intensities across the time-points are permuted to generate the null. Fig. <figr fid="F2">2B</figr> presents the observed (y-axis) versus expected (x-axis) d-statistic distribution.</p>
               <fig id="F2">
                  <title>
                     <p>Figure 2</p>
                  </title>
                  <caption>
                     <p>(A) This represents the t-statistic (green) versus d-statistic (black) distribution; the shrinkage in the tails of the latter is due to the additional variance term</p>
                  </caption>
                  <text>
                     <p>(A) This represents the t-statistic (green) versus d-statistic (black) distribution; the shrinkage in the tails of the latter is due to the additional variance term. (B): This is a scatter-plot of the observed (y-axis) versus expected (x-axis) d-statistic distributions, where the open circles represent the data points. The delta (&#177;&#916;) envelope (green) defined about a d-statistic of zero, indicates a null domain &#8211; such that regions above and below the positive and negative &#916; cutoff indicate up-regulation and down-regulation, respectively.</p>
                  </text>
                  <graphic file="1471-2105-8-359-2"/>
               </fig>
               <p>ii) <it>&#948;</it>: The direction of differential change, often referred to as up or down regulation in gene expression terminology, is termed a positive/negative/null differential shift in gSAM.</p>
               <p>iii) FDR: Significance of the differential change is quantified via the FDR or q-value <abbrgrp><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr></abbrgrp>. Analogous to the p-value, a measure of significance in terms of the FPR, the q-value is a measure of significance in terms of the FDR. The q-value is the minimal FDR at which a differential change is deemed significant.</p>
               <p>iv) FC: Microarray-based FC is a commonly used discriminator for differential change. This essentially estimates true biological change over background by comparing signal intensity/enrichment between the reference and target pair.</p>
               <p>Theoretically, any one or any combination of the output metrics (&#915;) can be used for the segmentation of significant versus non-significant differential response. An early method suggested by Tusher <it>et al </it><abbrgrp><abbr bid="B5">5</abbr></abbrgrp> is that of using a delta (&#177;&#916;) envelope about a d-statistic of zero, to define a null domain &#8211; such that regions above and below the positive and negative &#916; cutoff indicate up-regulation and down-regulation, respectively. This is elucidated in Fig. <figr fid="F2">2B</figr>, where the &#916; envelope is shown in green. This approach is symmetric about the d-statistic but does not guarantee symmetric FDR bounds for both up and down regulated regions. Researchers <abbrgrp><abbr bid="B53">53</abbr></abbrgrp> have discussed the dependence of the outcome of SAM, specifically, the variation in the list of significant genes, as a function of the initial threshold. The <it>Results </it>section discusses the inter-relationship amongst the output metrics, and contrasts the biases introduced by each in the context of transcriptome data.</p>
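               <p>The d-statistic and the &#916; envelope described above can be sketched in Python as below. This assumes the standard SAM form of Tusher <it>et al </it><abbrgrp><abbr bid="B5">5</abbr></abbrgrp>, in which a constant (here the hypothetical parameter s0) is added to the pooled standard error in the denominator. The function names, the example values and the choice of comparing the observed with the expected d-statistic are illustrative assumptions and not the gSAM code.</p>
               <p>
```python
import math


def d_statistic(reference, target, s0):
    """SAM-style d-statistic for one region: mean difference over (s + s0)."""
    n_r, n_t = len(reference), len(target)
    mean_r = sum(reference) / n_r
    mean_t = sum(target) / n_t
    # Pooled sum of squared deviations about each group's own mean.
    ss = sum((v - mean_r) ** 2 for v in reference)
    ss += sum((v - mean_t) ** 2 for v in target)
    # Pooled standard error of the difference in means.
    s = math.sqrt((1.0 / n_r + 1.0 / n_t) * ss / (n_r + n_t - 2))
    # s0 is the additional term that shrinks the statistic toward zero.
    return (mean_t - mean_r) / (s + s0)


def call_shift(observed_d, expected_d, delta):
    """Classify a region against a symmetric delta envelope."""
    diff = observed_d - expected_d
    if diff > delta:
        return "positive"
    if -diff > delta:
        return "negative"
    return "null"


# Made-up replicate values for a single region (illustration only).
reference = [1.0, 2.0, 3.0]
target = [4.0, 5.0, 6.0]
d_plain = d_statistic(reference, target, 0.0)   # ordinary t-statistic
d_shrunk = d_statistic(reference, target, 1.0)  # smaller in magnitude
```
               </p>
               <p>Setting s0 to zero recovers the ordinary equal-variance two-sample t-statistic; any positive s0 pulls the statistic toward zero, which corresponds to the tail shrinkage seen in Fig. <figr fid="F2">2A</figr>.</p>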
            </sec>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <p>The results for the following are presented here: RNA transcription, binding of RNA PolII, and modification of histone factors: HisH4, H3K9K14D (both acetylated), H3K27T (methylated). All samples are hybridized to Affymetrix <abbrgrp><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr><abbr bid="B22">22</abbr><abbr bid="B37">37</abbr></abbrgrp> ENCODE tiling arrays of 22 bp (average) probe resolution and 10<it>&#956; </it>feature resolution. The ENCODE array interrogates approximately 1 percent of the human genome &#8211; a coverage of 15 Mb of the non-repeat portions of the 30 Mb &#8211; and does not include regions from chromosomes 3, 17 and Y. Prior to the differential analysis, enriched elements, namely transfrags <abbrgrp><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr><abbr bid="B25">25</abbr></abbrgrp> and putative TREs <abbrgrp><abbr bid="B43">43</abbr></abbrgrp>, are identified. The predictive algorithm, gSAM, is applied to enriched elements of inter-genic, intronic and exonic origin, encompassing the entirety of protein-coding and non-coding regions. The output of gSAM is a ranked list of differentially changing transfrags or TREs per pair-wise time-point.</p>
         <p>The following summarizes the results after application of gSAM:</p>
         <p>I. Segmentation metrics for estimation of differential response;</p>
         <p>II. Differential expression of annotated and un-annotated transcribed RNA;</p>
         <p>III. Differential regulation of putative TREs;</p>
         <p>IV. Mono-phasic versus multi-phasic differential regulation clusters;</p>
         <p>V. Loci specific examples;</p>
         <sec>
            <st>
               <p>I. Segmentation metrics for estimation of differential response</p>
            </st>
            <p>This section elucidates the relationship of the segmentation metrics &#8211; FC, FDR/q-value and d-statistics.</p>
            <p>Fig. <figr fid="F3">3A</figr>, representing HisH4 differential data at the 0&#8211;2 hour interval but generalizable to all samples, shows the relationship of the FDR, d-statistic and logarithmic fold change distributions along the three axes. The data corroborate the following:</p>
            <fig id="F3">
               <title>
                  <p>Figure 3</p>
               </title>
               <caption>
                  <p>(A) Distribution of the FDR versus d-statistic versus log fold change as shown in the representative 0&#8211;2 hour HisH4 data</p>
               </caption>
               <text>
                  <p>(A) Distribution of the FDR versus d-statistic versus log fold change as shown in the representative 0&#8211;2 hour HisH4 data. (B) 0&#8211;2 hour HisH4 data corroborate that no length-based bias is introduced in the estimation of the d-statistic and/or FDR.</p>
               </text>
               <graphic file="1471-2105-8-359-3"/>
            </fig>
            <p>i) There is a strong positive correlation between the computed d-statistic and FC;</p>
            <p>ii) The FDR(q-value) has an inverse relationship with the absolute value of the d-statistic;</p>
            <p>iii) The relationship between FDR and FC is nuanced. Loci (blue oval), exhibiting down regulation (-) at 0.1 percent FDR correspond to a minimum FC of 1.15; in contrast, loci (red oval) exhibiting up-regulation(+) at 0.1 percent FDR correspond to a minimum FC of 4.7. This highlights instances where the arbitrary choice of the 2x FC threshold can result in false positives as well as false negatives.</p>
            <p>In general terms, the results underscore the importance of the choice of segmentation metrics, since this affects the gene-significance ranking.</p>
            <p>Microarrays tend to compress the real FC; hence an observed small change might indicate a more significant underlying differential. The following results corroborate the stringency of a 2x FC threshold in these transcriptome experiments. For RNA mapping data, the median FC as computed exclusively in exons, across all time-intervals, is 1.59. The median, computed across all transfrags of genic and inter-genic origin is reduced to 1.21&#8211;1.35 (across time-intervals). In contrast to the median value (50<sup>th </sup>percentile), a FC of 2x corresponds to 82<sup>nd </sup>&#8211; 96<sup>th </sup>percentile (across time intervals); this indicates the introduction of a potentially significant false negative bias, if 2x is used to estimate significant differential change. Similar observations have been made for the differential modification data. For H3K9K14D the median FC ranges from 1.19&#8211;1.27 over the time intervals; the 99<sup>th </sup>percentile values range from 1.62&#8211;1.79. The other acetylated histone, HisH4, exhibits slightly higher median FC ranging from 1.22&#8211;1.43 with the 99<sup>th </sup>percentile of the distribution ranging from 1.89&#8211;2.14. For RNA PolII, the median range is from 1.28&#8211;1.32 with the 99<sup>th </sup>percentile of the distribution ranging from 1.9&#8211;2.36.</p>
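            <p>The comparison between the median FC and the percentile occupied by a 2x threshold can be reproduced on any list of fold changes with a short Python sketch such as the one below. The function name and the five fold-change values are invented for illustration and are not data from this study.</p>
            <p>
```python
import statistics


def percentile_rank(fold_changes, threshold):
    """Return the percentage of fold changes that do not exceed the threshold."""
    not_above = sum(1 for fc in fold_changes if threshold >= fc)
    return 100.0 * not_above / len(fold_changes)


# Made-up fold changes for five regions (illustration only).
example_fc = [1.1, 1.2, 1.3, 1.5, 2.4]
median_fc = statistics.median(example_fc)
rank_of_2x = percentile_rank(example_fc, 2.0)
```
            </p>
            <p>In this toy list the median is 1.3 while the 2x cutoff sits at the 80th percentile, so four of the five regions would be excluded by the cutoff.</p>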
            <p>Results show an R<sup>2</sup>~0.997 between the t-statistic and the d-statistic. The p-value is, however, not considered for segmentation of differential change, because of the multiple cutoffs associated with p-values, which, as Lee <it>et al </it><abbrgrp><abbr bid="B54">54</abbr></abbrgrp> describe, introduce an artificial binarization of bound-unbound states for each protein interaction. A change in the p-value threshold from 0.001 to 0.05 results in an increase of the regulator-promoter interactions by an order of magnitude. In contrast, the q-value (FDR), which makes use of the bounds on the d-statistic that may be asymmetric (Fig. <figr fid="F3">3A</figr>), is a measure of significance that can be associated with each region.</p>
            <p>Finally, it is important to investigate the impact of fragmentation (introduced via domain creation) on the segmentation metrics. Fig. <figr fid="F3">3B</figr> represents 0&#8211;2 hour HisH4 data, where the y-axes in the left and right figures correspond to the fragment length and the x-axes correspond to the d-statistic and percent FDR respectively. This corroborates that no length-based bias is introduced in the estimation of the d-statistic and/or FDR (q-value). A predominantly negative shift in the d-statistic bias indicates that there is increased de-acetylation in the 2 hour (target) compared to the 0 hour (reference). Hence the d-statistic is not expected to be symmetric about the point of no change or zero. All differential expression/regulation data discussed from this point forward utilize FDR as the segmentation metric.</p>
         </sec>
         <sec>
            <st>
               <p>II. Differential expression of annotated and un-annotated transcribed regions</p>
            </st>
            <p>This section summarizes the RNA transcription data.</p>
            <p>In these experiments each of the four time points is represented by three biological replicates (B<sub>1</sub>-B<sub>3</sub>), with each sample hybridized in duplicate. Thus gSAM utilizes six samples per time-point. The median R<sup>2 </sup>across all replicates and over all time intervals is 0.9 with a median slope of 1.12, attesting to high reproducibility across samples.</p>
            <p>The analyses identify and quantify distinctly different temporal and spatial expression profiles. The highest and lowest fraction of differential expression, when summarized across all transfrags, is observed during the 8&#8211;32 hour and 2&#8211;8 hour time intervals, respectively. Down-regulation dominates the former and up-regulation the latter interval. Complete results are tabulated in Table <tblr tid="T2">2</tblr>. On the whole, down-regulation is statistically more significant and 2.4&#8211;2.8 percent of the down-regulated fraction has a FDR less than 12 percent. The annotated transfrags demonstrate dominant up-regulation throughout the entire time-course, with 16x (16.17 versus 0.52) higher up-regulation, at FDR less than 12 percent, observed between 0&#8211;2 hours. Transfrags in the non-coding regions and introns demonstrate a dominant down-regulation at comparable statistical significance. The piece-wise model of gSAM highlights the observation that not all exons of a transcript demonstrate consistent FC. The observed variance in differential expression across exons of transcripts is reduced from 2.25 to 0.25 upon exclusion of terminal exons. Since the assay is not strand specific, it can be hypothesized that the increased variance may reflect effects of differential expression arising from overlapping transcripts. The hypothesis has to be validated via experiments such as strand-specific Northern blots.</p>
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Temporal differential expression profile observed in RNA transcription study</p>
               </caption>
               <tblbdy cols="4">
                  <r>
                     <c ca="left">
                        <p>
                           <b>Time-Interval</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Source</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Observed Differential Expression</b>
                        </p>
                     </c>
                     <c ca="left">
                        <p>
                           <b>Fraction: Down Regulated</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="4">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>0&#8211;2 hour</p>
                     </c>
                     <c ca="left">
                        <p>All transfrags (genic+intergenic)</p>
                     </c>
                     <c ca="left">
                        <p>34%</p>
                     </c>
                     <c ca="left">
                        <p>62% of 34%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>2&#8211;8 hour</p>
                     </c>
                     <c ca="left">
                        <p>All transfrags</p>
                     </c>
                     <c ca="left">
                        <p>19.75%</p>
                     </c>
                     <c ca="left">
                        <p>28% of 19.75%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>8&#8211;32 hour</p>
                     </c>
                     <c ca="left">
                        <p>All transfrags</p>
                     </c>
                     <c ca="left">
                        <p>53.8%</p>
                     </c>
                     <c ca="left">
                        <p>53% of 53.8%</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <p>The differential responses across the ENCODE regions vary significantly. The changes in the expression level range from approximately 30 percent of the interrogated bases on chromosome 8 to undetectable on chromosome 10. The general trend that is consistent across the chromosomes is an increase in the percent bp that is differentially expressed with the progression of time, as summarized in Fig. <figr fid="F4">4</figr>. This is potentially because, as the time intervals increase, the observed differential response incorporates residual changes from the prior state(s) &#8211; upholding the assumption of an accumulator system in gSAM.</p>
            <fig id="F4">
               <title>
                  <p>Figure 4</p>
               </title>
               <caption>
                  <p>The histogram summarizes the differential expression profiles in each ENCODE region on each chromosome</p>
               </caption>
               <text>
                  <p>The histogram summarizes the differential expression profiles in each ENCODE region on each chromosome. Chromosome region specific differential expression is observed across the time-points &#8211; 30 percent change on chromosome 8 to no detectable change on chromosome 10. Globally, the highest fraction of differential expression when summarized across all transfrags is observed between 8&#8211;32 hours (53.8 percent). The most statistically significant (FDR &#8804;12 percent) changes are also observed between 8&#8211;32 hours.</p>
               </text>
               <graphic file="1471-2105-8-359-4"/>
            </fig>
         </sec>
         <sec>
            <st>
               <p>III. Differential regulation of putative TREs</p>
            </st>
            <p>This section summarizes the ChIP-chip data.</p>
            <p>For modified histones and RNA PolII each of the four time points is represented by five biological replicates (B<sub>1</sub>-B<sub>5</sub>) with each sample hybridized in duplicate. Thus gSAM utilizes ten samples per time-point. The reproducibility in the ChIP on chip samples is more variable compared to the RNA samples. For HisH4 across all time-points, R<sup>2 </sup>ranges from 0.6&#8211;0.71. For H3K27T, R<sup>2 </sup>ranges from 0.54&#8211;0.7 with 32 hours contributing to the low end. For H3K9K14D, R<sup>2 </sup>ranges from 0.6&#8211;0.77 with maximal variation at 32 hours. For RNA PolII, R<sup>2 </sup>is approximately 0.53. In all cases, the ENCODE regions on chromosome 4, which interrogate only un-annotated regions, are significant contributors to the low end of the correlation distribution. While the permutative framework in gSAM provides resilience against the variance, the overall reduced reproducibility does affect the outcome by resulting in an increased FDR. This can introduce a false negative bias in the segmentation of differential sites. This bias can be exacerbated if poor reproducibility is coupled with too few replicates available for permutation. This premise has been tested in a simulation experiment where inter-replicate reproducibility is reduced via artificial introduction of noise such that the R<sup>2 </sup>for HisH4 is reduced to &lt;0.50. This resulted in an average increase of FDR by 6 percent.</p>
            <p>The IGB visualization in Fig. <figr fid="F5">5</figr> shows an example of enrichment fragments within and upstream of the second intron of the <it>HIC </it>gene (pink). The upstream fragment is possibly un-annotated (UA), in so far as no RefSeq annotation is available. The top four tracks represent the HisH4 p-value graphs at 0 (red), 2 (light-blue), 8 (dark-blue) and 32 (green) hours, scaled appropriately for comparison; the subsequent track-pairs represent the d-statistic (top) and FDR (bottom) for the 0&#8211;2 (red), 2&#8211;8 (cyan) and 8&#8211;32 (blue) hour time intervals. The horizontal lines associated with the FDR data demarcate the 5 percent threshold.</p>
            <p>There are four salient observations, in this data:</p>
            <p>i. The putative TREs at the 5'end and upstream of the 5'end of <it>HIC </it>exhibit temporally distinct differential regulation profiles. For the 0&#8211;2 hour interval both manifest down-regulation, followed by up-regulation between 2&#8211;8 hours and subsequent down-regulation between 8&#8211;32 hours. This differentiation would not have been evident if broader time-intervals were selected, attesting to the importance of the temporal resolution in overall experimental design.</p>
            <p>ii. The piece-wise model in gSAM facilitates tracking of the variable levels of differential regulation throughout a putative TRE, as well as the associated modulation in FDR. In the 0&#8211;2 hour interval, the most significant (less than 5 percent FDR) differential change is associated with the peak of the d-statistic in the second intron. This is not afforded by SAM in the current mode.</p>
            <p>iii. Although no annotation is available for the differential regulation observed upstream of <it>HIC</it>, the observed differential activity is also significant at less than 5 percent FDR (0&#8211;2 hours). Due to the underlying permutative framework the FDR estimates of the novel and known regions are on par with one another. This putative and novel TRE constitutes a perfect co-regulation candidate for validation via alternative biochemical means.</p>
            <p>iv. gSAM is a signal enrichment based metric, but as is evident from the figure, there is a strong correlation &#8211; R<sup>2 </sup>> 0.965 &#8211; with the p-value based enrichment peaks <abbrgrp><abbr bid="B42">42</abbr></abbrgrp>.</p>
            <p>While a single example is presented above, the observations can be generalized across the genome. In general, the d-statistic defines the footprint of the putative TRE, and the FDR differential frequently facilitates identification of the peak of the TRE.</p>
            <fig id="F5">
               <title>
                  <p>Figure 5</p>
               </title>
               <caption>
                  <p>D-statistic versus FDR relationship at putative TREs, across the time-series (IGB view)</p>
               </caption>
               <text>
                  <p>D-statistic versus FDR relationship at putative TREs, across the time-series (IGB view). Examples of enrichment fragments are observed within and upstream of the second intron of the HIC gene (pink). The upstream fragment is possibly un-annotated (UA), in so far as no RefSeq annotation is available. The top four tracks represent the HisH4 p-value graphs at 0 (red), 2 (light-blue), 8 (dark-blue) and 32 (green) hours, scaled appropriately for comparison; the subsequent tracks represent the d-statistic (top) and FDR (bottom) pair for the 0&#8211;2 (red), 2&#8211;8 (cyan) and 8&#8211;32 (blue) hour time intervals. The horizontal lines associated with the FDR data refer to the 5 percent threshold in each case.</p>
               </text>
               <graphic file="1471-2105-8-359-5"/>
            </fig>
            <p>Very few of the biological factors show significant differential change at 5 percent FDR. The following summarizes the significant changes observed in interrogated genic annotation, which is inclusive of exons, introns, UTRs and 250 base-pairs upstream and downstream of 5' and 3' ends respectively. For predictions at 5 percent FDR, among the histone factors HisH4 shows maximal change; 22.9 percent of interrogated genic annotation manifests down-regulation between 0&#8211;2 hours, followed by 6 percent exhibiting up-regulation during 2&#8211;8 hours. No significant changes are observed between 8&#8211;32 hours. For RNA PolII maximal changes are observed between 2&#8211;8 hours; at 5 and 7 percent FDR approximately 1 and 5 percent of the interrogated genic annotation exhibit up-regulation, respectively; the observed increase in differentially regulated loci potentially implies that the choice of 5 percent FDR might be too stringent for the case of RNA PolII. On the basis of loci-level coverage, the ranked list of factors undergoing significant differential change is: HisH4 >> RNA PolII > H3K27T, H3K9K14D. A catalog of loci-level gSAM predictions of differential regulation is presented in Section V.</p>
         </sec>
         <sec>
            <st>
               <p>IV. Mono-phasic versus multi-phasic differential regulation clusters</p>
            </st>
            <p>This section discusses the classification of the observed differential modification pattern.</p>
            <p>The differential pattern observed in the pair-wise analysis can be broadly classified into the following three phases (summarized in Eqn. 22):</p>
            <p>i. A positive differential shift(<it>&#948;</it><sub>+</sub>) is indicative of increased activity in the target with respect to the reference;</p>
            <p>ii. A negative shift(<it>&#948;</it><sub>-</sub>) is indicative of the converse;</p>
            <p>iii. A null shift (<it>&#948;</it><sub>null</sub>), is indicative of no change in enrichment response.</p>
            <p>
               <display-formula id="M22">
                  <m:math name="1471-2105-8-359-i11" xmlns:m="http://www.w3.org/1998/Math/MathML">
                     <m:semantics>
                        <m:mrow>
                           <m:mtable columnalign="left">
                              <m:mtr columnalign="left">
                                 <m:mtd columnalign="left">
                                    <m:mrow>
                                       <m:msub>
                                          <m:mi>&#948;</m:mi>
      