<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
	<ui>gb-2006-7-8-404</ui>
	<ji>GBJ</ji>
	<fm>
		<dochead>Correspondence</dochead>
		<bibl>
			<title>
				<p>Feature-level exploration of a published Affymetrix GeneChip control dataset</p>
			</title>
			<aug>
				<au id="A1" ca="yes">
					<snm>Irizarry</snm>
					<mi>A</mi>
					<fnm>Rafael</fnm>
					<insr iid="I1"/>
					<email>rafa@jhu.edu</email>
				</au>
				<au id="A2">
					<snm>Cope</snm>
					<mi>M</mi>
					<fnm>Leslie</fnm>
					<insr iid="I2"/>
				</au>
				<au id="A3">
					<snm>Wu</snm>
					<fnm>Zhijin</fnm>
					<insr iid="I3"/>
				</au>
			</aug>
			<insg>
				<ins id="I1">
					<p>Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, 615 North Wolfe Street, Baltimore, MD 21205-2179, USA</p>
				</ins>
				<ins id="I2">
					<p>Department of Oncology, The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, 550 N. Broadway, Suite 1131, Baltimore, MD 21205, USA</p>
				</ins>
				<ins id="I3">
					<p>Center for Statistical Sciences, Department of Community Health, Brown University, 167 Angell Street, Providence, RI 02912, USA</p>
				</ins>
			</insg>
			<source>Genome Biology</source>
			<issn>1465-6906</issn>
			<pubdate>2006</pubdate>
			<volume>7</volume>
			<issue>8</issue>
			<fpage>404</fpage>
			<url>http://genomebiology.com/2006/7/8/404</url>
			<xrefbib>
				<pubidlist><pubid idtype="pmpid">16953902</pubid><pubid idtype="doi">10.1186/gb-2006-7-8-404</pubid>
				</pubidlist></xrefbib>
		</bibl>
		<history>
			<pub>
				<date>
					<day>1</day>
					<month>9</month>
					<year>2006</year>
				</date>
			</pub>
		</history>
		<cpyrt>
			<year>2005</year>
			<collab>BioMed Central Ltd</collab>
		</cpyrt>
		<shorttitle>
			<p>Feature-level exploration of a published Affymetrix GeneChip control dataset</p>
		</shorttitle>
		<shortabs>
			<p><it>A comment on </it><b>Preferred analysis methods for Affymetrix GeneChips revealed by a wholly defined control dataset </b>by SE Choe, M Boutros, AM Michelson, GM Church and MS Halfon. <it>Genome Biology </it>2005, <b>6:</b>R16</p>
		</shortabs>
		<abs>
			<sec>
				<st>
					<p/>
				</st>
				<p><it>A comment on </it><b>Preferred analysis methods for Affymetrix GeneChips revealed by a wholly defined control dataset </b>by SE Choe, M Boutros, AM Michelson, GM Church and MS Halfon. <it>Genome Biology </it>2005, <b>6:</b>R16.</p>
			</sec>
		</abs>
	</fm>
	<meta>
		<classifications>
			<classification type="BMC" subtype="man_spc_id" id="30010010">Genome studies</classification>
			<classification type="BMC" subtype="man_spc_id" id="30010013">Methods</classification>
		</classifications>
	</meta>
	<bdy>
		<sec>
			<st>
				<p/>
			</st>
			<p>In a recent <it>Genome Biology</it> article, Choe <it>et al</it>. <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> describe a spike-in experiment that they use to compare expression measures for Affymetrix GeneChip technology. In this work, two sets of triplicates were created to represent control (C) and experimental (S) samples. We describe here some properties of the Choe <it>et al</it>. <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> control dataset one should consider before using it to assess GeneChip expression measures. In <abbrgrp><abbr bid="B2">2</abbr></abbrgrp> and <abbrgrp><abbr bid="B3">3</abbr></abbrgrp> we describe a benchmark for such measures based on experiments developed by Affymetrix and GeneLogic. These datasets are described in detail in <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>. A web-based implementation of the benchmark is available at <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. The experiment described in <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> is a worthy contribution to the field as it permits assessments with data that are likely to better emulate the nonspecific binding (NSB) and cross-hybridization seen in typical experiments. However, there are various inconsistencies between the conclusions reached by <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> and <abbrgrp><abbr bid="B3">3</abbr></abbrgrp> that we do not believe are due to NSB and cross-hybridization effects. In this Correspondence we describe certain characteristics of the feature-level data produced by <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> which we believe explain these inconsistencies. These can be divided into characteristics induced by the experimental design and an artifact.</p>
		</sec>
		<sec>
			<st>
				<p>Experimental design</p>
			</st>
			<p>There are three characteristics of the experimental design described by <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> that one should consider before using it for assessments like those carried out by Affycomp. We enumerate them below and explain how they may lead to unfair assessments. Other considerations are described by Dabney and Storey <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>.</p>
			<p>First, the spike-in concentrations are unrealistically high. In <abbrgrp><abbr bid="B3">3</abbr></abbrgrp> we demonstrate that background noise makes it harder to detect differential expression for genes that are present at low concentrations. We point out that in the Affymetrix spike-in experiments <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr></abbrgrp> the concentrations for spiked-in features result in artificially high intensities but that a large range of the nominal concentrations is actually in a usable range (Figure <figr fid="F1">1a</figr> of this Correspondence). Figure <figr fid="F1">1b</figr> demonstrates that in a typical experiment <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>, features related to differentially expressed genes show intensities with a range similar to that of the rest of the genes - in particular, that less than 10% of genes, including the differentially expressed genes, are above intensities of 10. Figure ADF5-3 in the Additional data files for <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> shows that less than 20% of their spiked-in gene intensities are below 10. Additional data file 5 of <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> also contains a reanalysis using only the lower-intensity genes, which provides results that agree a bit better with results from Affycomp. A problem is that for the Affycomp assessment one needs to decide <it>a priori</it> which genes to include in the analysis, for example, setting a cutoff based on nominal spike-in concentration. In the analysis described in Additional data file 5 of <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> one needs to choose genes <it>a posteriori</it>, that is, based on observed intensities. The latter approach can easily lead to problems such as favoring the inclusion of probesets exhibiting low intensities as a result of defective probes. Furthermore, our Figure <figr fid="F1">1c</figr> shows that, despite the use of an experimental design that should induce about 72% of absent genes, we observe intensities for which the higher percentiles (75-95%) are twice as large as what we observe in typical experiments. This suggests that the spike-in concentrations were high enough to make this experiment produce atypical data. We do not expect a preprocessing algorithm that performs well on these data to necessarily perform well in general, and vice versa.</p>
			<fig id="F1">
				<title>
					<p>Figure 1</p>
				</title>
				<caption>
					<p>MA and cumulative distribution function (CDF) plots. MA-plots are log expression in treatment minus (M) log expression in control versus average (A) log expression plots</p>
				</caption>
				<text>
					<p>MA and cumulative distribution function (CDF) plots. MA-plots are log expression in treatment minus (M) log expression in control versus average (A) log expression plots. <b>(a)</b> For two sets of triplicates from the Affymetrix HGU133A spike-in experiment [2,3] we calculated the average log ratio across the three comparisons (M) and the average log intensity (A) across all six arrays for each feature. The figure shows M plotted against A. However, because there are hundreds of thousands of features, instead of plotting each point, we use shades of blue to denote the number of points in each region of the plot. About 90% of the data is contained in the dark-blue regions. Orange points are the 405 features from the 36 genes with nominal fold changes of 2. <b>(b)</b> As in (a) but using two sets of biological triplicates from a study comparing three trisomic human brains to three normal human brains. The orange dots are 385 features representing 35 genes on chromosome 21 for which we expect fold changes of 1.5. <b>(c)</b> Empirical cumulative distribution functions for the median-corrected log (base 2) intensities of 50 randomly chosen arrays from the Gene Expression Omnibus (GEO), three randomly selected arrays from the Affymetrix HGU133A spike-in experiment, and the three S samples from Choe <it>et al</it>. [1]. To facilitate the comparison, the intensities were made to have the same median. The dashed black horizontal lines show the 75% and 95% percentiles. <b>(d)</b> As (a) but showing the two sets of triplicates described by Choe <it>et al</it>. [1]. The orange dots are 375 features randomly sampled from those that were spiked-in to have fold changes greater than 1. The yellow ellipse is used to illustrate an artifact: among the data with nominal fold changes of 1, there appear to be two clusters having different overall observed log ratios.</p>
				</text>
				<graphic file="gb-2006-7-8-404-1"/>
			</fig>
			<p>Second, a large percentage of the genes (about 10%) are spiked-in to be differentially expressed and all of these are expected to be upregulated. This design makes this spike-in data very different from that produced by many experiments where at least one of the following assumptions is expected to hold: a small percentage of genes are differentially expressed, and there is a balance between up- and downregulation. Many preprocessing algorithms (for example, loess normalization, variance stabilizing normalization (VSN), rank-invariant) implement normalization routines motivated by one or both of these assumptions; thus we should not expect many of the existing expression measure methodologies to perform well with the Choe <it>et al</it>. <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> data.</p>
			<p>Third, a careful look at Table 1 in <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> shows that nominal concentrations and fold-change sizes are confounded. This problem will slightly cloud the distinction between the ability to detect small fold changes and the ability to detect differential expression when concentration is low. Why this distinction is important is shown in <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. However, Figure ADF5-1 in Additional data file 1 of Choe <it>et al</it>. <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> demonstrates that this difference in nominal concentrations does not appear to translate into observed intensities. This could, however, be an indication of saturation, which is a common problem when high intensities are observed (see the first point of this argument above). One case of the confounding is seen: genes with nominal fold-changes larger than 1 result in intensities that, on average, are about three times larger than genes with nominal fold-changes of 1.</p>
		</sec>
		<sec>
			<st>
				<p>The artifact</p>
			</st>
			<p>Figure <figr fid="F1">1a-c</figr> of this Correspondence is based on raw feature-level data. No preprocessing or normalization was performed. We randomly selected 100 pairs of arrays from experiments stored in the Gene Expression Omnibus (GEO) and without exception they produced MA-plots similar to those seen in Figure <figr fid="F1">1a,b</figr> (MA-plots are log expression in treatment minus (M) log expression in control versus average (A) log expression plots). These plots have most of the points in the lower range of concentrations and an exponential tapering as concentration increases <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>. However, the Choe <it>et al</it>. <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> data show a second cluster centered at a high concentration and a negative log ratio. Not one of the MA-plots from GEO looked like this. Figure <figr fid="F2">2</figr> in this Correspondence reveals that the feature intensities for genes spiked-in to be at 1:1 ratios behave very differently from the features from non-spiked-in genes which, in a typical experiment, exhibit, on average, log fold changes of 0 (in practice there are shifts, some nonlinear, but standard normalization procedures correct this).</p>
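			<p>As a minimal illustration of how the M and A values in such plots can be obtained, the following Python sketch follows the description in the legend of Figure 1a: M is the average log (base 2) ratio across paired replicates and A is the average log intensity across all arrays. The function name <it>ma_values</it>, the use of NumPy, and the input layout (one row per feature, one column per replicate, replicates paired by column) are assumptions made for this example; it is not the code used to produce the figures.</p>
			<p><![CDATA[
```python
import numpy as np


def ma_values(control, treatment):
    """Return per-feature (M, A) from raw, strictly positive intensities.

    Both inputs have shape (n_features, n_replicates); column j of
    `control` is paired with column j of `treatment` (assumed layout).
    """
    log_c = np.log2(np.asarray(control, dtype=float))
    log_t = np.log2(np.asarray(treatment, dtype=float))
    # M: average log ratio (treatment minus control) across the paired replicates
    m = (log_t - log_c).mean(axis=1)
    # A: average log intensity across all arrays, control and treatment together
    a = np.concatenate([log_t, log_c], axis=1).mean(axis=1)
    return m, a
```
]]></p>
			<p>With three control and three treatment arrays, each feature contributes one point (A, M); features whose intensities are unchanged between conditions are expected to scatter around M = 0.</p>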
			<fig id="F2">
				<title>
					<p>Figure 2</p>
				</title>
				<caption>
					<p>Log-ratio box-plots</p>
				</caption>
				<text>
					<p>Log-ratio box-plots. <b>(a)</b> For the raw probe-level data in [1] we computed log fold changes comparing the control and spike-in arrays for each of the three replicates. The C and S arrays were paired according to their filenames: C1-S1, C2-S2, and C3-S3. Box-plots are shown for five groups of probes: not spiked-in (gray), spiked-in at equal concentrations (purple), spiked-in with nominal fold-changes between 1 and 2, 2 and 3, and 3 and 4 (orange). <b>(b)</b> As (a) but after quantile normalizing the probes.</p>
				</text>
				<graphic file="gb-2006-7-8-404-2"/>
			</fig>
			<p>This problem implies that, unless an <it>ad hoc</it> correction is applied, what Choe <it>et al</it>. <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> define as false positives might in fact be true positives. Figure <figr fid="F2">2</figr> shows that this problem persists even after quantile normalization <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>. In Choe <it>et al</it>. <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> a normalization scheme based on knowledge of which genes have fold-changes of 1 is used to correct this problem. However, preprocessing algorithms are not designed to work with data that has been manipulated in this way, which makes this dataset particularly difficult to use in assessment tools such as Affycomp. Furthermore, Figure <figr fid="F1">1c,d</figr> of this Correspondence shows that the data produced by <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> is quite different from data from typical experiments for which most preprocessing algorithms were developed.</p>
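			<p>For readers unfamiliar with quantile normalization, the following Python sketch shows the basic idea: every array is forced to share one reference distribution, taken here as the mean of the sorted intensities across arrays. The function name <it>quantile_normalize</it> and the input layout (one row per feature, one column per array) are assumptions made for this example, and ties are broken arbitrarily rather than averaged, so this is a simplification and not the implementation used for Figure 2b. Because the procedure only equalizes the overall intensity distribution of each array, a shift confined to one subset of features need not be removed by it.</p>
			<p><![CDATA[
```python
import numpy as np


def quantile_normalize(x):
    """Give every column (array) of x the same empirical distribution.

    x has shape (n_features, n_arrays). Ties within a column are broken
    by sort order, a simplification for illustration.
    """
    x = np.asarray(x, dtype=float)
    # rank of each entry within its own column (0 = smallest)
    ranks = np.argsort(np.argsort(x, axis=0), axis=0)
    # reference distribution: mean of the k-th smallest values across arrays
    reference = np.sort(x, axis=0).mean(axis=1)
    # replace each entry by the reference value at its rank
    return reference[ranks]
```
]]></p>
			<p>After this step, the sorted values of every column are identical, while the ordering of features within each array is preserved.</p>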
			<p>Currently, experiments where the normalization assumptions do not hold seem to be a small minority. However, our experience is that they are becoming more common. For this type of experiment we will need new preprocessing algorithms, and the Choe <it>et al</it>. <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> data may be useful for the development of these new methods.</p>
			<p>
				<it>Sung E Choe, Michael Boutros, Alan M Michelson, George M Church and Marc S Halfon respond:</it>
			</p>
			<p>Irizarry <it>et al</it>. raise a number of interesting points in their Correspondence that highlight the continued need for carefully designed control microarray experiments. They posit that "the spike-in concentrations are unrealistically high" in our experimental design. Although we have estimated that the average per-gene concentration is similar to that in a typical experiment <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>, we do not know individual RNA concentrations and so cannot verify or deny this assertion. Since the majority of probesets in our dataset correspond to non-spiked-in genes, and therefore have a signal range consistent with absent genes, we think it seems reasonable that the spiked-in genes have higher signal than the rest of the chip. Regardless of this, in Additional Data File 5 of <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>, we repeated the receiver-operator characteristics (ROC) analysis using as the "known differentially expressed" probe sets only the subset with low signal levels. The results we obtained for gcrma (robust multi-array average using sequence information) <abbrgrp><abbr bid="B9">9</abbr></abbrgrp> were very similar to the conclusions in <abbrgrp><abbr bid="B3">3</abbr></abbrgrp> and <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>; in addition, the performance of MAS5 <abbrgrp><abbr bid="B11">11</abbr></abbrgrp> was similar between <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> and <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. The inconsistencies between the different studies may therefore be less extreme than they seem. In particular, we think that a large source of the disagreement between <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> and <abbrgrp><abbr bid="B3">3</abbr></abbrgrp> is simply the different choice of metric for the ROC curves.</p>
			<p>There is no question that our analysis of low-signal-intensity probesets as well as the specific selection of non-differentially expressed genes to use for normalization purposes required prior knowledge of the composition of the dataset. This, of course, is one of the great strengths of a wholly-defined dataset such as that from <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> - we can choose idealized conditions for assessing the performance of different aspects of the analysis. Unfortunately, as Irizarry <it>et al</it>. correctly point out, it also makes it difficult to use for certain other types of assessment, such as those provided by Affycomp <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>.</p>
			<p>A more critical consideration lies in the point raised by Irizarry <it>et al</it>. that our dataset violates two main assumptions of most normalization methods: that a small fraction of genes should be differentially expressed; and that there should be roughly equal numbers of up- and downregulated genes. It is important to note that these two assumptions are just that - assumptions - and ones that are extremely difficult to prove or disprove in any given microarray experiment. Thus there is an inherent circularity in the design of analysis algorithms that explicitly rely on these assumptions: they perform well on data assumed to have the properties based on which they are designed to perform well. This is an issue all too often overlooked in the microarray field. The violation of these two core assumptions seen in our dataset may be more common than generally appreciated; certainly we can conceive of many situations in which they are unlikely to hold (for example, when comparing different tissue types, in certain developmental time courses, or in cases of immune challenge). Developing assumption-free normalization methods, and diagnostics to assess the efficacy of the normalization used for a given dataset (see <abbrgrp><abbr bid="B12">12</abbr></abbrgrp> for an example), should thus be important research priorities.</p>
			<p>This discussion underscores the need for more control datasets that specifically address matters of RNA concentration, fractions of differentially expressed genes, direction of changes in gene regulation, and the like. Only then can we truly devise and assess the performance of analysis methods for the large variety of possible scenarios encountered in the course of conducting microarray experiments focused on real biological problems.</p>
			<p>Correspondence should be sent to Marc S Halfon: Department of Biochemistry and Center of Excellence in Bioinformatics and the Life Sciences, State University of New York at Buffalo, Buffalo, NY 14214, USA. Email: mshalfon@buffalo.edu</p>
		</sec>
		<sec>
			<st>
				<p>Additional data files</p>
			</st>
			<p>Additional data file <supplr sid="s1">1</supplr>, which contains MA plots for 100 randomly chosen pairs of arrays from the Gene Expression Omnibus (GEO), is available.</p>
			<suppl id="s1">
				<title>
					<p>Additional data file 1</p>
				</title>
				<caption>
					<p>MA plots for 100 randomly chosen pairs of arrays from the Gene Expression Omnibus (GEO)</p>
				</caption>
				<text>
					<p>MA plots for 100 randomly chosen pairs of arrays from the Gene Expression Omnibus (GEO)</p>
				</text>
				<file name="gb-2006-7-8-404-s1.BIB">
					<p>Click here for additional data file</p>
				</file>
			</suppl>
		</sec>
	</bdy>
	<bm>
		<ack>
			<sec>
				<st>
					<p>Acknowledgements</p>
				</st>
				<p>The work of R.A.I. is partially funded by the National Institutes of Health Specialized Centers of Clinically Oriented Research (SCCOR) translational research funds (212-2492 and 212-2496).</p>
			</sec>
		</ack>
		<refgrp>
			<bibl id="B1">
				<title>
					<p>Preferred analysis methods for Affymetrix GeneChips revealed by a wholly defined control dataset.</p>
				</title>
				<aug>
					<au>
						<snm>Choe</snm>
						<fnm>SE</fnm>
					</au>
					<au>
						<snm>Boutros</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Michelson</snm>
						<fnm>AM</fnm>
					</au>
					<au>
						<snm>Church</snm>
						<fnm>GM</fnm>
					</au>
					<au>
						<snm>Halfon</snm>
						<fnm>MS</fnm>
					</au>
				</aug>
				<source>Genome Biol</source>
				<pubdate>2005</pubdate>
				<volume>6</volume>
				<fpage>R16</fpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">551536</pubid>
						<pubid idtype="pmpid" link="fulltext">15693945</pubid>
						<pubid idtype="doi">10.1186/gb-2005-6-2-r16</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B2">
				<title>
					<p>A benchmark for Affymetrix GeneChip expression measures.</p>
				</title>
				<aug>
					<au>
						<snm>Cope</snm>
						<fnm>L</fnm>
					</au>
					<au>
						<snm>Irizarry</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Jaffee</snm>
						<fnm>H</fnm>
					</au>
					<au>
						<snm>Wu</snm>
						<fnm>Z</fnm>
					</au>
					<au>
						<snm>Speed</snm>
						<fnm>T</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>2004</pubdate>
				<volume>20</volume>
				<fpage>323</fpage>
				<lpage>331</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/btg410</pubid>
						<pubid idtype="pmpid" link="fulltext">14960458</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B3">
				<title>
					<p>Comparison of Affymetrix GeneChip expression measures.</p>
				</title>
				<aug>
					<au>
						<snm>Irizarry</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Wu</snm>
						<fnm>Z</fnm>
					</au>
					<au>
						<snm>Jaffee</snm>
						<fnm>H</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>2006</pubdate>
				<volume>22</volume>
				<fpage>789</fpage>
				<lpage>794</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/btk046</pubid>
						<pubid idtype="pmpid" link="fulltext">16410320</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B4">
				<title>
					<p>Affycomp II: A benchmark for Affymetrix GeneChip expression measures</p>
				</title>
				<url>http://affycomp.biostat.jhsph.edu</url>
			</bibl>
			<bibl id="B5">
				<title>
					<p>A reanalysis of a published Affymetrix GeneChip control dataset.</p>
				</title>
				<aug>
					<au>
						<snm>Dabney</snm>
						<fnm>A</fnm>
					</au>
					<au>
						<snm>Storey</snm>
						<fnm>J</fnm>
					</au>
				</aug>
				<source>Genome Biol</source>
				<pubdate>2006</pubdate>
				<volume>7</volume>
				<fpage>401</fpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">1557755</pubid>
						<pubid idtype="pmpid" link="fulltext">16563185</pubid>
						<pubid idtype="doi">10.1186/gb-2006-7-3-401</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B6">
				<title>
					<p>Global disruption of the cerebellar transcriptome in a Down syndrome mouse model.</p>
				</title>
				<aug>
					<au>
						<snm>Saran</snm>
						<fnm>NG</fnm>
					</au>
					<au>
						<snm>Pletcher</snm>
						<fnm>MT</fnm>
					</au>
					<au>
						<snm>Natale</snm>
						<fnm>JE</fnm>
					</au>
					<au>
						<snm>Cheng</snm>
						<fnm>Y</fnm>
					</au>
					<au>
						<snm>Reeves</snm>
						<fnm>RH</fnm>
					</au>
				</aug>
				<source>Hum Mol Genet</source>
				<pubdate>2003</pubdate>
				<volume>12</volume>
				<fpage>2013</fpage>
				<lpage>2019</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/hmg/ddg217</pubid>
						<pubid idtype="pmpid" link="fulltext">12913072</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B7">
				<title>
					<p>One hundred MA plots from GEO</p>
				</title>
				<url>http://www.biostat.jhsph.edu/~ririzarr/papers/hundredMAs.pdf</url>
			</bibl>
			<bibl id="B8">
				<title>
					<p>A comparison of normalization methods for high density oligonucleotide array data based on variance and bias.</p>
				</title>
				<aug>
					<au>
						<snm>Bolstad</snm>
						<fnm>B</fnm>
					</au>
					<au>
						<snm>Irizarry</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>&#197;strand</snm>
						<fnm>M</fnm>
					</au>
					<au>
						<snm>Speed</snm>
						<fnm>T</fnm>
					</au>
				</aug>
				<source>Bioinformatics</source>
				<pubdate>2003</pubdate>
				<volume>19</volume>
				<fpage>185</fpage>
				<lpage>193</lpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="doi">10.1093/bioinformatics/19.2.185</pubid>
						<pubid idtype="pmpid" link="fulltext">12538238</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B9">
				<title>
					<p>A model-based background adjustment for oligonucleotide expression arrays.</p>
				</title>
				<aug>
					<au>
						<snm>Wu</snm>
						<fnm>Z</fnm>
					</au>
					<au>
						<snm>Irizarry</snm>
						<fnm>R</fnm>
					</au>
					<au>
						<snm>Gentleman</snm>
						<fnm>RC</fnm>
					</au>
					<au>
						<snm>Martinez-Murillo</snm>
						<fnm>F</fnm>
					</au>
					<au>
						<snm>Spencer</snm>
						<fnm>F</fnm>
					</au>
				</aug>
				<source>Journal of the American Statistical Association</source>
				<pubdate>2004</pubdate>
				<volume>99</volume>
				<fpage>909</fpage>
				<lpage>917</lpage>
				<xrefbib>
					<pubid idtype="doi">10.1198/016214504000000683</pubid>
				</xrefbib>
			</bibl>
			<bibl id="B10">
				<title>
					<p>Evaluation of methods for oligonucleotide array data via quantitative real-time PCR.</p>
				</title>
				<aug>
					<au>
						<snm>Qin</snm>
						<fnm>LX</fnm>
					</au>
					<au>
						<snm>Beyer</snm>
						<fnm>RP</fnm>
					</au>
					<au>
						<snm>Hudson</snm>
						<fnm>FNX</fnm>
					</au>
					<au>
						<snm>Linford</snm>
						<fnm>NJ</fnm>
					</au>
					<au>
						<snm>Morris</snm>
						<fnm>DE</fnm>
					</au>
					<au>
						<snm>Kerr</snm>
						<fnm>KF</fnm>
					</au>
				</aug>
				<source>BMC Bioinformatics</source>
				<pubdate>2006</pubdate>
				<volume>7</volume>
				<fpage>23</fpage>
				<xrefbib>
					<pubidlist>
						<pubid idtype="pmcid">1360686</pubid>
						<pubid idtype="pmpid" link="fulltext">16417622</pubid>
						<pubid idtype="doi">10.1186/1471-2105-7-23</pubid>
					</pubidlist>
				</xrefbib>
			</bibl>
			<bibl id="B11">
				<title>
					<p>GeneChip Expression Analysis: data analysis fundamentals</p>
				</title>
				<url>http://www.affymetrix.com/support/downloads/manuals/data_analysis_fundamentals_manual.pdf</url>
			</bibl>
			<bibl id="B12">
				<title>
					<p>Putative null distributions corresponding to tests of differential expression in the Golden Spike dataset are intensity dependent. Technical report 06-01. Buffalo, N.Y.: Department of Biostatistics, State University.</p>
				</title>
				<url>http://sphhp.buffalo.edu/biostat/research/techreports/UB_Biostatistics_TR0601.pdf</url>
			</bibl>
		</refgrp>
	</bm>
</art>
