<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2105-5-103</ui>
   <ji>1471-2105</ji>
   <fm>
      <dochead>Methodology article</dochead>
      <bibl>
         <title>
            <p>Improving the scaling normalization for high-density oligonucleotide GeneChip expression microarrays</p>
         </title>
         <aug>
            <au id="A1" ca="yes">
               <snm>Lu</snm>
               <fnm>Chao</fnm>
               <insr iid="I1"/>
               <email>chao.lu@utoronto.ca</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Microarray Facility, The Centre for Applied Genomics, The Hospital for Sick Children, 555 University Avenue, Elm Wing Room 10104, Toronto, Ontario M5G 1X8, Canada</p>
            </ins>
         </insg>
         <source>BMC Bioinformatics</source>
         <issn>1471-2105</issn>
         <pubdate>2004</pubdate>
         <volume>5</volume>
         <issue>1</issue>
         <fpage>103</fpage>
         <url>http://www.biomedcentral.com/1471-2105/5/103</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">15283861</pubid>
               <pubid idtype="doi">10.1186/1471-2105-5-103</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>17</day>
               <month>7</month>
               <year>2003</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>29</day>
               <month>7</month>
               <year>2004</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>29</day>
               <month>7</month>
               <year>2004</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2004</year>
         <collab>Lu; licensee BioMed Central Ltd.</collab>
         <note>This is an open-access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <kwdg>
         <kwd>Microarray</kwd>
         <kwd>normalization</kwd>
         <kwd>gene expression</kwd>
         <kwd>DNA</kwd>
         <kwd>RNA</kwd>
         <kwd>oligonucleotide</kwd>
         <kwd>GeneChip</kwd>
         <kwd>scaling</kwd>
      </kwdg>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Normalization is an important step in microarray data analysis that minimizes biological and technical variations. Choosing a suitable approach can be critical. The default method for GeneChip expression microarrays applies a constant factor, the scaling factor (<it>SF</it>), to every gene on an array. The <it>SF </it>is obtained from a trimmed average signal of the array after excluding the 2% of the probe sets with the highest values and the 2% with the lowest values.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>Among the 76 U34A GeneChip experiments, the total signal per array had a coefficient of variation of 25.8%, although all microarrays were hybridized with the same amount of biotin-labeled cRNA. The 2% of the probe sets with the highest signals, which are normally excluded from <it>SF </it>calculation, accounted for 34% to 54% of the total signals (40.7% &#177; 4.4%, mean &#177; sd). In comparison with normalization factors obtained from the median signal or from the mean of the log transformed signal, <it>SF </it>showed the greatest variation. The normalization factors obtained from log transformed signals showed the least variation.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusions</p>
               </st>
               <p>Eliminating 40% of the signal data during <it>SF </it>calculation failed to show any benefit. Normalization factors obtained with log transformed signals performed the best. Thus, the mean of the log transformed data, rather than the arithmetic mean of signals, is suggested for normalization in GeneChip gene expression microarrays.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="bmc" subtype="user_supplied_xml" id="endnote"/>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>The high-density oligonucleotide microarray, also known as GeneChip<sup>&#174;</sup>, made by Affymetrix Inc (Santa Clara, CA), has been widely used in both academic institutions and industrial companies, and is considered the "standard" among several gene expression microarray platforms. A single GeneChip<sup>&#174; </sup>can hold more than 50,000 probe sets, covering every gene in the human genome. A probe set is a collection of probe pairs that interrogates the same sequence, or set of sequences, and typically contains 11 probe pairs of 25-mer oligonucleotides <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr></abbrgrp>. Each pair contains the complementary sequence to the gene of interest, the so-called perfect match (PM), and a specificity control, called the mismatch (MM) <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. Gene expression level is calculated from the hybridization intensities of the probe pairs and is referred to as the "signal" <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr></abbrgrp>. The normalization method used in GeneChip software is called scaling and is defined as an adjustment of the average signal value of all arrays to a common value, the target signal value, in order to make the data from multiple arrays comparable <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B11">11</abbr></abbrgrp>.</p>
         <p>The purpose of data normalization is to minimize the effects of experimental and/or technical variations so that meaningful biological comparisons can be made and true biological changes can be found among multiple experiments. Several approaches have been proposed and shown to be effective and beneficial, mostly in studies on two-color spotted microarrays <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr></abbrgrp>. Some authors proposed normalization of the hybridization intensities, while others preferred to normalize the intensity ratios. Some used global, linear methods, while others used local, non-linear methods. Some suggested using spike-in controls, house-keeping genes, or invariant genes, while others preferred all the genes on the array. For GeneChip data, some have proposed different models to normalize signal values or probe pair values <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr></abbrgrp>. Despite these alternatives, many biologists still use the default scaling method and consider it satisfactory and useful for identifying biological alterations <abbrgrp><abbr bid="B23">23</abbr><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr></abbrgrp>. With the increasing awareness and usage of GeneChip technology, and the willingness of many biologists to continue using GeneChip software, it is worth improving the performance or correcting the problems of the software. This report demonstrates that, in the scaling algorithm, excluding the 2% of the probe sets with the highest values and the 2% with the lowest values provided little benefit. In contrast, logarithmic transformation of signal values prior to scaling gave the least variable normalization factors and is recommended.</p>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <p>The statistical algorithm in current GeneChip software (MAS 5 and GCOS 1) for gene expression microarray data has eliminated the negative gene expression values, a problem present in earlier versions of the software <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B7">7</abbr></abbrgrp>. It uses a robust averaging method based on the Tukey biweight function to calculate the gene expression level from the logarithm transformed hybridization data <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr><abbr bid="B11">11</abbr></abbrgrp>. The reported value for a probe set is the antilog of the Tukey biweight mean multiplied by an <it>SF </it>and/or a normalization factor (<it>NF</it><sub><it>affy</it></sub>). When both the <it>SF </it>and <it>NF</it><sub><it>affy </it></sub>are equal to 1, there is no normalization or manipulation of the original data. Both <it>NF</it><sub><it>affy </it></sub>and <it>SF </it>are computed in virtually the same way. <it>NF</it><sub><it>affy </it></sub>is calculated in comparison analysis, where the array average of one experiment is compared with that of a baseline experiment, while <it>SF </it>is calculated in absolute analysis, where the signal average of one experiment is compared with a common value, the target signal <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr><abbr bid="B11">11</abbr><abbr bid="B22">22</abbr></abbrgrp>. The average value used in GeneChip is a trimmed average. It is not calculated from all probe sets, but from the 96% of the probe sets that remain after the 2% with the highest signals and the 2% with the lowest signals are removed.</p>
         <p>In this report, a total of 76 experiments with rat U34A GeneChip were analyzed. As shown in Table <tblr tid="T1">1</tblr>, the total hybridization signals varied although all arrays were hybridized with the same amount of biotin-labeled cRNA and scanned with the same scanner at identical settings. The array with the highest hybridization intensities had 3.8 times the total signal of the array with the lowest. The average array signals had 25.8% variation in terms of coefficient of variation. The mean signals were significantly greater than the median signals on each array, indicating a non-normal distribution. The density plot showed a long-tailed and skewed distribution (not shown), and the average of such data is known to be sensitive to the larger values in the data set.</p>
         <tbl id="T1">
            <title>
               <p>Table 1</p>
            </title>
            <caption>
               <p>Summary of signal data in 76 rat genome U34A GeneChip microarrays.</p>
            </caption>
            <tblbdy cols="6">
               <r>
                  <c ca="left">
                     <p/>
                  </c>
                  <c ca="center">
                     <p>Lowest</p>
                  </c>
                  <c ca="center">
                     <p>Highest</p>
                  </c>
                  <c ca="center">
                     <p>Mean</p>
                  </c>
                  <c ca="center">
                     <p>SD</p>
                  </c>
                  <c ca="center">
                     <p>CV (%)</p>
                  </c>
               </r>
               <r>
                  <c cspan="6">
                     <hr/>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Total signal</p>
                  </c>
                  <c ca="right">
                     <p>832,561.4</p>
                  </c>
                  <c ca="right">
                     <p>3,161,392.7</p>
                  </c>
                  <c ca="right">
                     <p>2,039,655.7</p>
                  </c>
                  <c ca="right">
                     <p>526,295.0</p>
                  </c>
                  <c ca="right">
                     <p>25.80%</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Sum of signals used for SF</p>
                  </c>
                  <c ca="right">
                     <p>524,513.7</p>
                  </c>
                  <c ca="right">
                     <p>1,986,236.9</p>
                  </c>
                  <c ca="right">
                     <p>1,212,296.5</p>
                  </c>
                  <c ca="right">
                     <p>336,138.0</p>
                  </c>
                  <c ca="right">
                     <p>27.73%</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Trimmed total</p>
                  </c>
                  <c ca="right">
                     <p>308,047.7</p>
                  </c>
                  <c ca="right">
                     <p>1,240,257.3</p>
                  </c>
                  <c ca="right">
                     <p>827,359.1</p>
                  </c>
                  <c ca="right">
                     <p>215,325.1</p>
                  </c>
                  <c ca="right">
                     <p>26.03%</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Mean signal</p>
                  </c>
                  <c ca="right">
                     <p>94.6</p>
                  </c>
                  <c ca="right">
                     <p>359.3</p>
                  </c>
                  <c ca="right">
                     <p>231.0</p>
                  </c>
                  <c ca="right">
                     <p>59.8</p>
                  </c>
                  <c ca="right">
                     <p>25.80%</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Median of signals</p>
                  </c>
                  <c ca="right">
                     <p>17.8</p>
                  </c>
                  <c ca="right">
                     <p>54.8</p>
                  </c>
                  <c ca="right">
                     <p>35.7</p>
                  </c>
                  <c ca="right">
                     <p>8.7</p>
                  </c>
                  <c ca="right">
                     <p>24.41%</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Mean of log signals</p>
                  </c>
                  <c ca="right">
                     <p>4.3</p>
                  </c>
                  <c ca="right">
                     <p>5.8</p>
                  </c>
                  <c ca="right">
                     <p>5.1</p>
                  </c>
                  <c ca="right">
                     <p>0.4</p>
                  </c>
                  <c ca="right">
                     <p>7.17%</p>
                  </c>
               </r>
               <r>
                  <c ca="left">
                     <p>Trimmed percentage</p>
                  </c>
                  <c ca="right">
                     <p>34.4</p>
                  </c>
                  <c ca="right">
                     <p>54.1</p>
                  </c>
                  <c ca="right">
                     <p>40.7</p>
                  </c>
                  <c ca="right">
                     <p>4.4</p>
                  </c>
                  <c ca="right">
                     <p>10.70%</p>
                  </c>
               </r>
            </tblbdy>
            <tblfn>
               <p>"Total signal" is the sum of all the signals on each array. "Sum of signals used for <it>SF</it>" is the sum of signals excluding the trimmed data and used to calculate <it>SF</it>. "Trimmed total" is the sum of the 2% probe sets with the highest signals on the array. "Mean of log signals" is the mean of log<sub>2 </sub>transformed signals. "Trimmed percentage" = (Trimmed total/Total signal) &#215; 100%. See also <it>Methods</it>. "Lowest" and "Highest" give the lowest and highest values in each category among the 76 chips, respectively. The mean, standard deviation (SD) and coefficient of variation (CV) were also calculated.</p>
            </tblfn>
         </tbl>
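         <p>As a quick arithmetic check on Table 1, the CV column can be recomputed from the Mean and SD columns as CV = SD / Mean &#215; 100. The following Python sketch is an illustration only and is not part of the original Perl analysis. The names <it>TABLE_1</it> and <it>cv_percent</it> are hypothetical, and only three rows are included because the remaining rows are reported with too few digits to reproduce their CV to two decimals.</p>
         <p>
```python
# Mean and SD for three rows, copied from Table 1 (76 rat U34A arrays).
TABLE_1 = {
    "Total signal": (2039655.7, 526295.0),
    "Sum of signals used for SF": (1212296.5, 336138.0),
    "Trimmed total": (827359.1, 215325.1),
}


def cv_percent(mean, sd):
    """Coefficient of variation expressed as a percentage."""
    return 100.0 * sd / mean


# Recompute the CV column, rounded to two decimals as in the table.
cv_by_row = {
    name: round(cv_percent(mean, sd), 2)
    for name, (mean, sd) in TABLE_1.items()
}
# cv_by_row["Total signal"]               is 25.8
# cv_by_row["Sum of signals used for SF"] is 27.73
# cv_by_row["Trimmed total"]              is 26.03
```
         </p>
         <p>Under these inputs, the CV of the sum used for <it>SF </it>is higher than the CV of the total signal, which is the comparison the text draws on when it notes that trimming did not reduce the variation.</p>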
         <p>The rat U34A GeneChip contained 8799 probe sets; hence 2% was about 176 probe sets. The sum of the 2% of the probe sets with the lowest signals accounted for less than 0.1% of the total signals (0.05% &#177; 0.01%, mean &#177; SD, n = 76), and its impact on <it>SF </it>calculation can be ignored. However, the sum of the 2% of the probe sets with the highest signals, the <it>TrimTotal </it>as used in this report, was responsible for about 40% of the total signals (from 34% to 54%, Table <tblr tid="T1">1</tblr>). The remaining 96% of the probe sets, which were used for <it>SF </it>calculation, produced only about 60% of the signals. Excluding 4% of the probe sets did not reduce the variation but slightly increased it (CV of 27.73% versus 25.80%), which in turn resulted in a wider range of <it>SF</it>s (Table <tblr tid="T1">1</tblr>). It was also found that the <it>TrimTotal </it>was highly correlated with the total signal (R = 0.928), but less so with the median (R = 0.536) and the mean of log signals (R = 0.643). The trimmed percentage (<it>Tp</it>) was found to be negatively associated with the median (R = 0.558, b = -1.116) and the mean of log signals (R = 0.495, b = -0.968), but not with the total signal of all probe sets.</p>
         <p>Among other approaches to global linear normalization, one can also use the median signal or the mean of logarithm transformed signals to calculate the NF. <it>NFLogMean </it>showed a higher correlation with <it>NFMedian </it>than with <it>SF</it>, and the differences between <it>NFLogMean </it>and <it>SF </it>were larger than those between <it>NFLogMean </it>and <it>NFMedian </it>(Fig. <figr fid="F1">1</figr>). To test whether the larger difference was a result of removing 4% of the probe sets from the calculation, another NF, <it>NFTrimLogMean</it>, was obtained using the same data as for <it>SF</it>, but with a log transformation. <it>NFTrimLogMean </it>and <it>NFLogMean </it>were very highly correlated (R = 0.9998). The 4% of the probe sets removed from the <it>NFTrimLogMean </it>calculation reduced the total data by only 4% after log transformation.</p>
         <fig id="F1">
            <title>
               <p>Figure 1</p>
            </title>
            <caption>
               <p>(A) Comparison among different normalization factors</p>
            </caption>
            <text>
               <p>(A) Comparison among different normalization factors. <it>NFLogMean </it>(x-axis) is plotted against <it>SF </it>(red open triangle) and <it>NFMedian </it>(black closed circle). The correlation between <it>NFLogMean </it>and <it>NFMedian </it>is higher (R = 0.971) than that between <it>NFLogMean </it>and <it>SF </it>(R = 0.918). (B) The NF score, <it>NFscore</it>, for <it>SF </it>(red open triangle), <it>NFMedian </it>(blue open diamond) and <it>NFLogMean </it>(black closed circle) is expressed as a function of respective '<it>true NF</it>'. <it>NFTrimLogMean </it>is not shown here to simplify the graph since it is similar to <it>NFLogMean</it>. See also in<it> Methods</it>.</p>
            </text>
            <graphic file="1471-2105-5-103-1"/>
         </fig>
         <p>Since it is impossible to obtain the true normalization factor, the average of the four global linear <it>NF</it>s mentioned above was used as an estimate of the 'true' NF. To compare each NF with this estimate, a score (<it>NFscore</it>) is introduced: each NF is compared with the respective 'true' NF to obtain its <it>NFscore</it>. The average <it>NFscore </it>(&#177; SD) is 7.01% (&#177; 6.24%), 4.51% (&#177; 3.48%), 2.25% (&#177; 2.33%) and 1.95% (&#177; 1.61%), and the sum of <it>NFscore </it>is 5.33, 3.43, 1.71 and 1.48 for <it>SF</it>, <it>NFMedian</it>, <it>NFTrimLogMean </it>and <it>NFLogMean</it>, respectively (Fig. <figr fid="F1">1</figr>). The sum of <it>NFscore </it>indicates the accumulated deviation from the 'true' NF; the larger the number, the larger the accumulated deviation. When a fifth NF, obtained from the arithmetic mean of all probe sets on the array, was added to the calculation and the <it>NFscore </it>of each NF was compared again, the results led to the same conclusion (data not shown). It is fair to conclude that <it>NFLogMean </it>produced the least variation.</p>
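         <p>The reported sums and averages of <it>NFscore </it>can be related by simple arithmetic if one assumes that each sum runs over all 76 arrays, so that sum = average &#215; 76. The Python sketch below is only a consistency check under that assumption; the variable names are hypothetical and the percentages are copied from the paragraph above.</p>
         <p>
```python
N_ARRAYS = 76  # number of GeneChip experiments in this study

# Average NFscore per method, in percent, as reported in the text.
mean_score_percent = {
    "SF": 7.01,
    "NFMedian": 4.51,
    "NFTrimLogMean": 2.25,
    "NFLogMean": 1.95,
}

# Assumed relation: sum of NFscore = average NFscore * number of arrays.
score_sum = {
    name: round(pct / 100.0 * N_ARRAYS, 2)
    for name, pct in mean_score_percent.items()
}
# score_sum is {"SF": 5.33, "NFMedian": 3.43,
#               "NFTrimLogMean": 1.71, "NFLogMean": 1.48}
```
         </p>
         <p>The recomputed values agree with the reported sums of 5.33, 3.43, 1.71 and 1.48, which is consistent with the sum being taken over all 76 arrays.</p>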
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <p>Logarithmic transformation is a well-accepted approach for stabilizing variance and has become a common choice for data transformation and normalization for spotted microarrays <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B16">16</abbr></abbrgrp>. Much improvement has been made in GeneChip microarray technology and the accompanying software during the past few years. The current version of GeneChip software performs better than the earlier versions that used the Average Difference to express levels of gene expression <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp>. However, the normalization algorithm was inherited and remains the only and default option for gene expression data processing in both MAS 5 and the newly released GeneChip Operating Software (GCOS). Both continue to use the arithmetic mean of signals to obtain the <it>SF </it>in absolute analysis (single array) and the <it>NF </it>in comparison analysis (two arrays) <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr><abbr bid="B7">7</abbr><abbr bid="B11">11</abbr><abbr bid="B22">22</abbr></abbrgrp>. It is clearly shown here that the trimmed average and the resulting <it>SF </it>had a larger variance than the median-based NF or the NF based on the mean of log transformed signals. Similar results were observed in other GeneChip expression arrays, such as mouse U74A and human U133A (data not shown). Elimination of the highest and the lowest 2% of the probe set signals did not stabilize the trimmed means. Because this trimming discards about 40% of the total signal, the approach cannot be considered optimal. The logarithmic transformation of signals stabilized the variation well and made the normalization process much less dependent upon the mean and less affected by the outliers.</p>
         <p>Although simple and popular, global linear normalization has its drawbacks, especially when the relationship among multiple experiments or genes is not linear. To address such problems, several methods have been proposed to conduct local and non-linear normalization <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B14">14</abbr><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr><abbr bid="B20">20</abbr><abbr bid="B22">22</abbr><abbr bid="B27">27</abbr></abbrgrp>. Data normalization is a critical step in the microarray data mining process. The use of different approaches to normalization may have a profound impact on the selection of differentially expressed genes and on conclusions about the underlying biological processes, especially when subtle biological changes are investigated <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B16">16</abbr><abbr bid="B28">28</abbr></abbrgrp>.</p>
      </sec>
      <sec>
         <st>
            <p>Conclusions</p>
         </st>
         <p>Normalization of microarray data allows direct comparison of gene expression levels among experiments. A global linear normalization, called scaling, has been widely used in GeneChip microarray technology for gene expression analysis. The scaling factor (<it>SF</it>) is calculated from a trimmed average of gene expression level after excluding the 2% of the data points with the highest values and the 2% with the lowest values. It is shown here that the 2% of the probe sets with the highest signals contained from 34% to 54% of the total signals. Elimination of the outliers did not reduce, but rather increased, the variation among multiple arrays. Instead, normalization factors obtained from the mean of the log transformed signals had the best performance. Thus, the current scaling method, although widely used, is not optimal and needs further improvement. Using the mean of logarithm transformed signals to calculate the normalization factor is highly recommended.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>GeneChip experiments and data</p>
            </st>
            <p>Total RNA was isolated from rat tissues or cells in Trizol reagent and purified with the Qiagen RNeasy kit. cDNA was synthesized in the presence of oligo(dT)24-T7 (Genset Corp, La Jolla, CA), and biotinylated UTP and CTP were used to generate biotin-labeled cRNA according to the recommended protocols <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>. The rat genome microarray, U34A GeneChip (Affymetrix Inc., Santa Clara, CA), was hybridized with 15 &#956;g of gel-verified fragmented cRNA. Hybridization intensity was scanned with a GeneArray 2500 scanner (Agilent, Palo Alto, CA) using Microarray Suite (MAS) 5.0 software <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. Data from a total of 76 independent GeneChip experiments were used in this study.</p>
         </sec>
         <sec>
            <st>
               <p>Normalization factor (NF)</p>
            </st>
            <p>Gene expression data exported from MAS 5.0 were submitted to a Perl script to calculate the different normalization factors. In the scaling approach, a trimmed average signal is calculated after excluding the 2% of probe sets with the highest signals and the 2% with the lowest signals. The scaling factor (<it>SF</it>) is obtained from equation (1) by comparison with a chosen fixed number, called the target signal (<it>TS</it>), and was verified against the results from MAS 5.0 with the same settings <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr><abbr bid="B11">11</abbr></abbrgrp>.</p>
            <p><it>SF</it><sub><it>j </it></sub>= <it>TS </it>/ <it>S</it><sub><it>TrimMeanj </it></sub>&#160;&#160;&#160; (1)</p>
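            <p>Equation (1) can be sketched in a few lines of code. The original analysis used a Perl script that is not reproduced here, so the Python below is a swapped-in illustration; the function names <it>trimmed_mean</it> and <it>scaling_factor</it>, and the choice to round the number of trimmed probe sets to the nearest integer, are assumptions rather than the documented MAS 5.0 behaviour.</p>
            <p>
```python
def trimmed_mean(signals, trim_fraction=0.02):
    """Mean of the signals after dropping trim_fraction from each tail."""
    ordered = sorted(signals)
    # Probe sets dropped per tail; rounding to nearest is an assumption.
    k = round(len(ordered) * trim_fraction)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def scaling_factor(signals, target_signal=150.0, trim_fraction=0.02):
    """Equation (1): SF_j = TS / S_TrimMean_j for a single array j."""
    return target_signal / trimmed_mean(signals, trim_fraction)


# Toy array of 100 signals: two very low and two very high values
# are trimmed, leaving 96 signals that are all equal to 100.
toy_signals = [1.0, 2.0] + [100.0] * 96 + [5000.0, 9000.0]
toy_sf = scaling_factor(toy_signals)  # 150 / 100 = 1.5
```
            </p>
            <p>The default target signal of 150 matches the <it>TS </it>used for <it>SF </it>in this report; any other positive value can be passed instead.</p>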
            <p>Other normalization factors for comparison were obtained by the following:</p>
            <p><it>NFMedian</it><sub><it>j </it></sub>= <it>TS </it>/ <it>S</it><sub>med<it>j </it></sub>&#160;&#160;&#160; (2)</p>
            <p><it>NFLogMean</it><sub><it>j </it></sub>= 2 <sup><it>nf</it></sup><sub><it>j</it></sub></p>
            <p>
               <graphic file="1471-2105-5-103-i1.gif"/>
            </p>
            <p>where <it>i </it>= 1, ..., n indexes the probe sets, <it>j </it>= 1, ..., J indexes the array experiments, <it>Si </it>is the signal, that is, the antilog of a robust average (Tukey biweight) of log(PM-MM) reported by MAS 5.0 <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>, <it>S</it><sub>med<it>j </it></sub>is the median signal on array <it>j</it>, and <it>S</it><sub><it>TrimMeanj </it></sub>is the trimmed average on array <it>j </it>after excluding the 2% of the probe sets with the highest signals and the 2% with the lowest signals <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr><abbr bid="B11">11</abbr><abbr bid="B22">22</abbr></abbrgrp>. <it>NFMedian</it><sub><it>j </it></sub>is obtained by using the median signal on array <it>j</it>, and <it>NFLogMean</it><sub><it>j </it></sub>is obtained by using the mean of log transformed signals. <it>TS </it>was set to 150, 38 and 38 for <it>SF</it>, <it>NFMedian </it>and <it>NFLogMean</it>, respectively, so that the resulting NFs were of similar magnitude.</p>
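            <p>The two alternative factors can be sketched in the same way. Equation (2) is implemented as written. The exponent <it>nf</it><sub><it>j </it></sub>is given only as an image in this file, so the sketch assumes <it>nf</it><sub><it>j </it></sub>= log<sub>2</sub>(<it>TS</it>) minus the mean of the log<sub>2 </sub>signals on array <it>j</it>; this form, the Python language and the function names are all assumptions made for illustration.</p>
            <p>
```python
import math
import statistics


def nf_median(signals, target_signal=38.0):
    """Equation (2): NFMedian_j = TS / S_med_j for a single array j."""
    return target_signal / statistics.median(signals)


def nf_log_mean(signals, target_signal=38.0):
    """NFLogMean_j = 2 ** nf_j.

    Assumed form: nf_j = log2(TS) - mean(log2(S_ij)). Signals must be
    positive, as MAS 5.0 signals are.
    """
    mean_log = sum(math.log2(s) for s in signals) / len(signals)
    return 2.0 ** (math.log2(target_signal) - mean_log)


# Toy array whose median and geometric mean are both 32.
toy_signals = [16.0, 32.0, 64.0]
toy_nf_median = nf_median(toy_signals, target_signal=32.0)      # 1.0
toy_nf_log_mean = nf_log_mean(toy_signals, target_signal=64.0)  # 2.0
```
            </p>
            <p>Under the assumed form, <it>NFLogMean </it>equals the target signal divided by the geometric mean of the signals, which is why a few very large signals move it far less than they move an arithmetic mean.</p>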
            <p>To compare the different NFs, a score, <it>NFscore</it>, is introduced: <it>NFscore</it><sub><it>j </it></sub>= (<it>NF</it><sub><it>j </it></sub>- <it>TrueNF</it><sub><it>j</it></sub>)/<it>TrueNF</it><sub><it>j</it></sub>, where <it>TrueNF</it><sub><it>j </it></sub>= (<it>SF</it><sub><it>j </it></sub>+ <it>NFMedian</it><sub><it>j </it></sub>+ <it>NFLogMean</it><sub><it>j </it></sub>+ <it>NFTrimLogMean</it><sub><it>j</it></sub>)/4 was used as the 'true' NF, and <it>NFTrimLogMean</it><sub><it>j </it></sub>was calculated from equation (3) after excluding the 2% of the probe sets with the highest signals and the 2% with the lowest signals. Sum of <graphic file="1471-2105-5-103-i2.gif"/>.</p>
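            <p>A minimal sketch of the score for one array follows. It implements the signed formula exactly as written in the text; whether the reported averages and sums use the absolute value of the score is not stated here, so no absolute value is applied. The Python language and the function names <it>true_nf</it> and <it>nf_score</it> are illustrative assumptions.</p>
            <p>
```python
def true_nf(sf, nf_median, nf_log_mean, nf_trim_log_mean):
    """'True' NF for one array: the plain average of the four NFs."""
    return (sf + nf_median + nf_log_mean + nf_trim_log_mean) / 4.0


def nf_score(nf, reference):
    """NFscore_j = (NF_j - TrueNF_j) / TrueNF_j, signed as in the text."""
    return (nf - reference) / reference


# Toy values for a single array (not taken from the study data).
reference = true_nf(1.5, 1.0, 0.75, 0.75)  # (1.5 + 1.0 + 0.75 + 0.75) / 4 = 1.0
score_sf = nf_score(1.5, reference)        # 0.5, i.e. 50% above the reference
score_log = nf_score(0.75, reference)      # -0.25, i.e. 25% below the reference
```
            </p>
            <p>Because the reference is the average of the four factors, the four signed scores for any one array sum to zero; a summary across arrays therefore needs some measure of magnitude, which is left to the reader here.</p>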
         </sec>
         <sec>
            <st>
               <p>Other analysis</p>
            </st>
            <p>Unless otherwise specified, logarithm transformation is carried out with the logarithm base 2. Trimmed total signal <it>TrimTotal </it>is the sum of the signals from the 2% of the probe sets with the highest signal values. Total signal <it>Total </it>is the sum of the signals of all probe sets in the array, and trimmed percentage <it>Tp</it><sub><it>j </it></sub>= (<it>TrimTotal</it><sub><it>j </it></sub>/ <it>Total</it><sub><it>j</it></sub>) &#215; 100%.</p>
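            <p>The trimmed percentage defined above can be sketched as follows. This is an illustrative Python version, not the original Perl script; the function name <it>trimmed_percentage</it> and the rounding of the number of top probe sets to the nearest integer are assumptions.</p>
            <p>
```python
def trimmed_percentage(signals, trim_fraction=0.02):
    """Tp_j = (TrimTotal_j / Total_j) * 100 for a single array j."""
    ordered = sorted(signals, reverse=True)
    # Number of top probe sets; rounding to nearest is an assumption.
    k = round(len(ordered) * trim_fraction)
    trim_total = sum(ordered[:k])  # signals of the top 2% of probe sets
    total = sum(ordered)           # signals of all probe sets
    return 100.0 * trim_total / total


# Toy array of 100 signals: the top two carry 1020 of a total of 2000.
toy_signals = [10.0] * 98 + [510.0, 510.0]
toy_tp = trimmed_percentage(toy_signals)  # 51.0
```
            </p>
            <p>For the 8799 probe sets of the U34A array, the assumed rounding gives 176 probe sets in the top 2%, in line with the figure quoted in the Results.</p>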
         </sec>
      </sec>
      <sec>
         <st>
            <p>Abbreviations</p>
         </st>
         <p>GeneChip<sup>&#174; </sup>is a registered trademark of Affymetrix Inc.</p>
         <p>PM: perfect match; MM: mismatch; SF: scaling factor; NF: normalization factor; TS: target signal</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>I would like to acknowledge the support from Dr. H. D. Lipshitz, Dr. S. Scherer, The Centre for Applied Genomics, and The Hospital for Sick Children. The excellent technical work by Lan He is highly appreciated. I would also like to thank Drs. P. Liu, M. Post, K. Tanswell, G. Fantus and S. Keshavjee for sharing their U34A data. Review and comments from Drs. C. Greenwood, J. Beyene, C.E. M'lan and P. McLoughlin are highly appreciated. Finally, suggestions to improve this paper from the editor and referees are deeply appreciated.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>High density synthetic oligonucleotide arrays</p>
            </title>
            <aug>
               <au>
                  <snm>Lipshutz</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Fodor</snm>
                  <fnm>SP</fnm>
               </au>
               <au>
                  <snm>Gingeras</snm>
                  <fnm>TR</fnm>
               </au>
               <au>
                  <snm>Lockhart</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>1999</pubdate>
            <volume>21</volume>
            <fpage>20</fpage>
            <lpage>24</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/4447</pubid>
                  <pubid idtype="pmpid" link="fulltext">9915496</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Expression monitoring by hybridization to high-density oligonucleotide arrays</p>
            </title>
            <aug>
               <au>
                  <snm>Lockhart</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Dong</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Byrne</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Follettie</snm>
                  <fnm>MT</fnm>
               </au>
               <au>
                  <snm>Gallo</snm>
                  <fnm>MV</fnm>
               </au>
               <au>
                  <snm>Chee</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Mittmann</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Kobayashi</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Horton</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>EL</fnm>
               </au>
            </aug>
            <source>Nat Biotechnol</source>
            <pubdate>1996</pubdate>
            <volume>14</volume>
            <fpage>1675</fpage>
            <lpage>1680</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nbt1296-1675</pubid>
                  <pubid idtype="pmpid">9634850</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>GeneChip Expression Analysis: Data Analysis Fundamentals</p>
            </title>
            <aug>
               <au>
                  <cnm>Affymetrix</cnm>
               </au>
            </aug>
            <url>http://www.affymetrix.com/</url>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Microarray Suite 5.0 User's Guide</p>
            </title>
            <aug>
               <au>
                  <cnm>Affymetrix</cnm>
               </au>
            </aug>
            <publisher>Santa Clara, CA, USA, Affymetrix Inc</publisher>
            <edition>2002</edition>
            <pubdate>2001</pubdate>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Robust estimators for expression analysis</p>
            </title>
            <aug>
               <au>
                  <snm>Hubbell</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>WM</fnm>
               </au>
               <au>
                  <snm>Mei</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2002</pubdate>
            <volume>18</volume>
            <fpage>1585</fpage>
            <lpage>1592</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/18.12.1585</pubid>
                  <pubid idtype="pmpid" link="fulltext">12490442</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Model-based analysis of oligonucleotide arrays: expression index computation and outlier detection</p>
            </title>
            <aug>
               <au>
                  <snm>Li</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Wong</snm>
                  <fnm>WH</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci U S A</source>
            <pubdate>2001</pubdate>
            <volume>98</volume>
            <fpage>31</fpage>
            <lpage>36</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1073/pnas.011404098</pubid>
                  <pubid idtype="pmpid" link="fulltext">11134512</pubid>
                  <pubid idtype="pmcid">14539</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Analysis of high density expression microarrays with signed-rank call algorithms</p>
            </title>
            <aug>
               <au>
                  <snm>Liu</snm>
                  <fnm>WM</fnm>
               </au>
               <au>
                  <snm>Mei</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Di</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Ryder</snm>
                  <fnm>TB</fnm>
               </au>
               <au>
                  <snm>Hubbell</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Dee</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Webster</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Harrington</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Ho</snm>
                  <fnm>MH</fnm>
               </au>
               <au>
                  <snm>Baid</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Smeekens</snm>
                  <fnm>SP</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2002</pubdate>
            <volume>18</volume>
            <fpage>1593</fpage>
            <lpage>1599</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/18.12.1593</pubid>
                  <pubid idtype="pmpid" link="fulltext">12490443</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Statistical analysis of high-density oligonucleotide arrays: a multiplicative noise model</p>
            </title>
            <aug>
               <au>
                  <snm>Sasik</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Calvo</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Corbeil</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2002</pubdate>
            <volume>18</volume>
            <fpage>1633</fpage>
            <lpage>1640</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/18.12.1633</pubid>
                  <pubid idtype="pmpid" link="fulltext">12490448</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Empirical characterization of the expression ratio noise structure in high-density oligonucleotide arrays</p>
            </title>
            <aug>
               <au>
                  <snm>Naef</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Hacker</snm>
                  <fnm>CR</fnm>
               </au>
               <au>
                  <snm>Patil</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Magnasco</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2002</pubdate>
            <volume>3</volume>
            <fpage>RESEARCH0018</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1186/gb-2002-3-4-research0018</pubid>
                  <pubid idtype="pmpid" link="fulltext">11983059</pubid>
                  <pubid idtype="pmcid">115253</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Summaries of Affymetrix GeneChip probe level data</p>
            </title>
            <aug>
               <au>
                  <snm>Irizarry</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Bolstad</snm>
                  <fnm>BM</fnm>
               </au>
               <au>
                  <snm>Collin</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Cope</snm>
                  <fnm>LM</fnm>
               </au>
               <au>
                  <snm>Hobbs</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Speed</snm>
                  <fnm>TP</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <fpage>e15</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/nar/gng015</pubid>
                  <pubid idtype="pmpid" link="fulltext">12582260</pubid>
                  <pubid idtype="pmcid">150247</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>GeneChip Operating Software: User's Guide</p>
            </title>
            <aug>
               <au>
                  <cnm>Affymetrix</cnm>
               </au>
            </aug>
            <source>http://www.affymetrix.com/</source>
            <edition>1.0</edition>
            <url>http://www.affymetrix.com/index.affx</url>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Microarray data normalization and transformation</p>
            </title>
            <aug>
               <au>
                  <snm>Quackenbush</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>2002</pubdate>
            <volume>32</volume>
            <issue>Suppl</issue>
            <fpage>496</fpage>
            <lpage>501</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng1032</pubid>
                  <pubid idtype="pmpid" link="fulltext">12454644</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Between-group analysis of microarray data</p>
            </title>
            <aug>
               <au>
                  <snm>Culhane</snm>
                  <fnm>AC</fnm>
               </au>
               <au>
                  <snm>Perriere</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Considine</snm>
                  <fnm>EC</fnm>
               </au>
               <au>
                  <snm>Cotter</snm>
                  <fnm>TG</fnm>
               </au>
               <au>
                  <snm>Higgins</snm>
                  <fnm>DG</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2002</pubdate>
            <volume>18</volume>
            <fpage>1600</fpage>
            <lpage>1608</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/18.12.1600</pubid>
                  <pubid idtype="pmpid" link="fulltext">12490444</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>A variance-stabilizing transformation for gene-expression microarray data</p>
            </title>
            <aug>
               <au>
                  <snm>Durbin</snm>
                  <fnm>BP</fnm>
               </au>
               <au>
                  <snm>Hardin</snm>
                  <fnm>JS</fnm>
               </au>
               <au>
                  <snm>Hawkins</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Rocke</snm>
                  <fnm>DM</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2002</pubdate>
            <volume>18</volume>
            <issue>Suppl 1</issue>
            <fpage>S105</fpage>
            <lpage>S110</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">12169537</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Normalization and analysis of DNA microarray data by self-consistency and local regression</p>
            </title>
            <aug>
               <au>
                  <snm>Kepler</snm>
                  <fnm>TB</fnm>
               </au>
               <au>
                  <snm>Crosby</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Morgan</snm>
                  <fnm>KT</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2002</pubdate>
            <volume>3</volume>
            <fpage>RESEARCH0037</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1186/gb-2002-3-7-research0037</pubid>
                  <pubid idtype="pmpid" link="fulltext">12184811</pubid>
                  <pubid idtype="pmcid">126242</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation</p>
            </title>
            <aug>
               <au>
                  <snm>Yang</snm>
                  <fnm>YH</fnm>
               </au>
               <au>
                  <snm>Dudoit</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Luu</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Lin</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Peng</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Ngai</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Speed</snm>
                  <fnm>TP</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2002</pubdate>
            <volume>30</volume>
            <fpage>e15</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/nar/30.4.e15</pubid>
                  <pubid idtype="pmpid" link="fulltext">11842121</pubid>
                  <pubid idtype="pmcid">100354</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Feature extraction and normalization algorithms for high-density oligonucleotide gene expression array data</p>
            </title>
            <aug>
               <au>
                  <snm>Schadt</snm>
                  <fnm>EE</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Ellis</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Wong</snm>
                  <fnm>WH</fnm>
               </au>
            </aug>
            <source>J Cell Biochem Suppl</source>
            <pubdate>2001</pubdate>
            <volume>37</volume>
            <issue>Suppl</issue>
            <fpage>120</fpage>
            <lpage>125</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/jcb.10073</pubid>
                  <pubid idtype="pmpid">11842437</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Evaluation of normalization procedures for oligonucleotide array data based on spiked cRNA controls</p>
            </title>
            <aug>
               <au>
                  <snm>Hill</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>EL</fnm>
               </au>
               <au>
                  <snm>Whitley</snm>
                  <fnm>MZ</fnm>
               </au>
               <au>
                  <snm>Tucker-Kellogg</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Hunter</snm>
                  <fnm>CP</fnm>
               </au>
               <au>
                  <snm>Slonim</snm>
                  <fnm>DK</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2001</pubdate>
            <volume>2</volume>
            <fpage>RESEARCH0055</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1186/gb-2001-2-12-research0055</pubid>
                  <pubid idtype="pmpid" link="fulltext">11790258</pubid>
                  <pubid idtype="pmcid">64840</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Within the fold: assessing differential expression measures and reproducibility in microarray assays</p>
            </title>
            <aug>
               <au>
                  <snm>Yang</snm>
                  <fnm>IV</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Hasseman</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Liang</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Frank</snm>
                  <fnm>BC</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Sharov</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Saeed</snm>
                  <fnm>AI</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Li</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>NH</fnm>
               </au>
               <au>
                  <snm>Yeatman</snm>
                  <fnm>TJ</fnm>
               </au>
               <au>
                  <snm>Quackenbush</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2002</pubdate>
            <volume>3</volume>
            <fpage>RESEARCH0062</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">133446</pubid>
                  <pubid idtype="pmpid" link="fulltext">12429061</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Exploration, normalization, and summaries of high density oligonucleotide array probe level data</p>
            </title>
            <aug>
               <au>
                  <snm>Irizarry</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Hobbs</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Collin</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Beazer-Barclay</snm>
                  <fnm>YD</fnm>
               </au>
               <au>
                  <snm>Antonellis</snm>
                  <fnm>KJ</fnm>
               </au>
               <au>
                  <snm>Scherf</snm>
                  <fnm>U</fnm>
               </au>
               <au>
                  <snm>Speed</snm>
                  <fnm>TP</fnm>
               </au>
            </aug>
            <source>Biostatistics</source>
            <pubdate>2003</pubdate>
            <volume>4</volume>
            <fpage>249</fpage>
            <lpage>264</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/biostatistics/4.2.249</pubid>
                  <pubid idtype="pmpid" link="fulltext">12925520</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Model-based analysis of oligonucleotide arrays: model validation, design issues and standard error application</p>
            </title>
            <aug>
               <au>
                  <snm>Li</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Wong</snm>
                  <fnm>WH</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2001</pubdate>
            <volume>2</volume>
            <fpage>RESEARCH0032</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">55329</pubid>
                  <pubid idtype="pmpid" link="fulltext">11532216</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>A comparison of normalization methods for high density oligonucleotide array data based on variance and bias</p>
            </title>
            <aug>
               <au>
                  <snm>Bolstad</snm>
                  <fnm>BM</fnm>
               </au>
               <au>
                  <snm>Irizarry</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Astrand</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Speed</snm>
                  <fnm>TP</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <fpage>185</fpage>
            <lpage>193</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/19.2.185</pubid>
                  <pubid idtype="pmpid" link="fulltext">12538238</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Transformation and normalization of oligonucleotide microarray data</p>
            </title>
            <aug>
               <au>
                  <snm>Geller</snm>
                  <fnm>SC</fnm>
               </au>
               <au>
                  <snm>Gregg</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Hagerman</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Rocke</snm>
                  <fnm>DM</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <fpage>1817</fpage>
            <lpage>1823</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btg245</pubid>
                  <pubid idtype="pmpid" link="fulltext">14512353</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Changes in global gene expression patterns during development and maturation of the rat kidney</p>
            </title>
            <aug>
               <au>
                  <snm>Stuart</snm>
                  <fnm>RO</fnm>
               </au>
               <au>
                  <snm>Bush</snm>
                  <fnm>KT</fnm>
               </au>
               <au>
                  <snm>Nigam</snm>
                  <fnm>SK</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci U S A</source>
            <pubdate>2001</pubdate>
            <volume>98</volume>
            <fpage>5649</fpage>
            <lpage>5654</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1073/pnas.091110798</pubid>
                  <pubid idtype="pmpid" link="fulltext">11331749</pubid>
                  <pubid idtype="pmcid">33267</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Significance analysis of microarrays applied to the ionizing radiation response</p>
            </title>
            <aug>
               <au>
                  <snm>Tusher</snm>
                  <fnm>VG</fnm>
               </au>
               <au>
                  <snm>Tibshirani</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Chu</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci U S A</source>
            <pubdate>2001</pubdate>
            <volume>98</volume>
            <fpage>5116</fpage>
            <lpage>5121</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1073/pnas.091062498</pubid>
                  <pubid idtype="pmpid" link="fulltext">11309499</pubid>
                  <pubid idtype="pmcid">33173</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>A current profile of microarray laboratories: the 2002-2003 ABRF microarray research group survey of laboratories using microarray technologies</p>
            </title>
            <aug>
               <au>
                  <snm>Knudtson</snm>
                  <fnm>KL</fnm>
               </au>
               <au>
                  <snm>Griffin</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Iacobas</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Johnson</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Khitrov</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Levy</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Massimi</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Nowak</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Viale</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Grill</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Brooks</snm>
                  <fnm>AI</fnm>
               </au>
            </aug>
            <source>http://www.abrf.org</source>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Issues in cDNA microarray analysis: quality filtering, channel normalization, models of variations and assessment of gene effects</p>
            </title>
            <aug>
               <au>
                  <snm>Tseng</snm>
                  <fnm>GC</fnm>
               </au>
               <au>
                  <snm>Oh</snm>
                  <fnm>MK</fnm>
               </au>
               <au>
                  <snm>Rohlin</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Liao</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Wong</snm>
                  <fnm>WH</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2001</pubdate>
            <volume>29</volume>
            <fpage>2549</fpage>
            <lpage>2557</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/nar/29.12.2549</pubid>
                  <pubid idtype="pmpid" link="fulltext">11410663</pubid>
                  <pubid idtype="pmcid">55725</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Profound effect of normalization on detection of differentially expressed genes in oligonucleotide microarray data analysis</p>
            </title>
            <aug>
               <au>
                  <snm>Hoffmann</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Seidl</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Dugas</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2002</pubdate>
            <volume>3</volume>
            <fpage>RESEARCH0033</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">126238</pubid>
                  <pubid idtype="pmpid" link="fulltext">12184807</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>GeneChip Expression Analysis: Technical Manual</p>
            </title>
            <aug>
               <au>
                  <cnm>Affymetrix</cnm>
               </au>
            </aug>
            <source>http://www.affymetrix.com/</source>
            <url>http://www.affymetrix.com/</url>
         </bibl>
      </refgrp>
   </bm>
</art>
