  • Methodology article
  • Open access

Methodological study of affine transformations of gene expression data with proposed robust non-parametric multi-dimensional normalization method

Abstract

Background

Low-level processing and normalization of microarray data are among the most important steps in microarray analysis and have a profound impact on downstream analysis. Multiple methods have been suggested to date, but it is not clear which is best. It is therefore important to study the different normalization methods in detail, as well as the nature of microarray data in general.

Results

A methodological study of affine models for gene expression data is carried out. Focus is on two-channel comparative studies, but the findings generalize also to single- and multi-channel data. The discussion applies to spotted as well as in-situ synthesized microarray data. Existing normalization methods such as curve-fit ("lowess") normalization, parallel and perpendicular translation normalization, and quantile normalization, but also dye-swap normalization are revisited in the light of the affine model and their strengths and weaknesses are investigated in this context. As a direct result from this study, we propose a robust non-parametric multi-dimensional affine normalization method, which can be applied to any number of microarrays with any number of channels either individually or all at once. A high-quality cDNA microarray data set with spike-in controls is used to demonstrate the power of the affine model and the proposed normalization method.

Conclusion

We find that an affine model can explain non-linear intensity-dependent systematic effects in observed log-ratios. Affine normalization removes such artifacts for non-differentially expressed genes and assures that symmetry between negative and positive log-ratios is obtained, which is fundamental when identifying differentially expressed genes. In addition, affine normalization makes the empirical distributions in different channels more equal, which is the purpose of quantile normalization, and may also explain why dye-swap normalization works or fails. All methods are made available in the aroma package, which is a platform-independent package for R.

Background

The objective of most gene-expression measurements is to assess the expression levels of (all or a subset of) genes in one or several cell populations. Typically, mRNA abundances are measured, although techniques for measuring protein levels also exist. The microarray technique [1] provides a way to measure mRNA transcripts for a large number of genes simultaneously, typically in the order of 10^3 to 10^5 or more. Microarrays have well-defined immobilized regions, each of which consists of clones or synthesized sequences of DNA specific to a unique gene. We refer to these (non-hybridized) regions or spots as probes [2]. A cocktail of cDNA created from the RNA extract from the cell population under study is then hybridized for a few hours to the DNA on the microarray, after which excess cDNA is washed off. The result is that each region of the microarray contains a certain amount of hybridized DNA unique to the corresponding gene. By first labeling the cDNA strands in the sample cocktail with a radioactive or a fluorescent probe, the amount of hybridized DNA can be measured using radioactivity-sensitive film or a color-sensitive scanner, respectively.

By measuring the gene expression for a specific gene, we try to assess how active that gene is (measured on some scale). Because it is hard to identify an absolute scale to measure on, and for various other reasons, a reference is often used to obtain a relative scale. As even genes from the same sample are not directly comparable to each other, each gene gets its own reference, which is typically the same gene from a reference sample. With this approach, we can obtain gene-expression ratios for every gene, which can be used, for instance, to test the hypothesis that a gene (in the test sample) is differentially expressed (compared to the gene in the reference sample). This is the core idea behind the two-channel microarray technology, in which the test and the reference cDNA cocktails are hybridized simultaneously and competitively to the same array. The same idea has been adopted by single-channel hybridization technologies, where the comparison is instead done numerically in the data analysis step. Even if gene-by-gene references are used, the measurements are not perfect and they are likely to contain systematic errors, which may vary from measurement to measurement, so the obtained gene-expression ratios may still be biased and not comparable to each other. What we ultimately would like to do is to measure all control and all reference samples under identical conditions. The aforementioned two-color microarray technology tries, in some sense, to do this by measuring the control/reference pairs for each gene in one hybridization (although it is not clear whether the gain from co-hybridizing two samples with different labels is larger than that from hybridizing twice with identical labels and then scanning the samples separately).

In this paper, we present an affine model that explains many of the systematic effects frequently observed when gene-expression levels from two (or more) samples are compared. The main contributors to such systematic effects are offsets in the individual channel signals, which give non-linear systematic effects in ratios. We will not provide an error model, but only a deterministic model. The main reason for this is that an error-free model makes it easier to understand the impact that channel offsets have on the downstream analysis regardless of the gene-expression technology used. This is especially of interest as these offsets are often implicitly assumed to be small and of no effect, which we believe is too strong an assumption. The impact of channel offsets is much larger than the noise, which is why we allow ourselves to assume zero noise in the discussion. Although some error models have been suggested for microarray data [3], we believe research beyond this article is required before we can understand and correctly model the various error sources introduced in the microarray process.

The outline of this paper is as follows. In the Model section, a general model that incorporates all steps of any gene-expression technology is given. By dissecting the generic model and focusing more on the microarray technologies, an affine model is introduced. The widely adopted and accepted log-ratio log-intensity transform is also formalized there under affine transformations. The Results section consists of three main parts. In the first, we show how the affine transform introduces intensity and fold-change dependent biases in the log-ratios. In the second part, we revisit common normalization methods, to which dye-swap and background correction may also be counted, and discuss them using the affine model. In the third and concluding part, we suggest a novel and multi-purpose robust normalization method to back-transform data to the linear (proportional) space. We end the paper with a Discussion section, where we give similarities to other normalization methods, followed by a Conclusions section. Details on calculations and the data set used are given in the Methods appendix.

Results

General model

Consider an experiment with genes i = 1,..., I from RNA extracts c = 1,..., C. For example, in oligonucleotide microarrays each slide measures the gene-expression levels of exactly one RNA extract, whereas for two-color microarrays each slide measures two RNA extracts, one in each channel. From now on, we refer to the RNA extracts, or replicates thereof, as channels. Let x c,i be the true gene-expression level of gene i in channel c and let y c,i be the corresponding observed gene-expression level. The relationship between the observed and the true expression levels can be written as

y c,i = f c (x c,i ) + ε c,i     (1)

where f c is a channel-specific measurement function, which includes all steps in the gene-expression acquisition process. Most generally, we have that E[ε c,i ] = 0 and V[ε c,i ] = $\sigma_{c,i}^2$, where the variance can take any form. Importantly, the properties of ε c are not well understood and depend on the platform used, but also on which part of the process is studied. For this reason, and because of the many interesting effects that the affine transformation (presented below) generates by itself, we conduct this study under the assumption of noise-free data. Relationship (1) may be specified for subsets of genes or probes, e.g. print tip [4], microtiter plate or clone library [5] groups. Spatial dependencies may also be modeled. However, to simplify the discussion that follows, we avoid such details.

Since inference is ideally based on x c,i , the inverse of f c has to be identified, something that, in theory, is possible if f c is strictly increasing. Violation of this constraint has been observed in, for instance, two-color microarray data. This can be due to too high concentrations of fluorophores, which sometimes quench the signal so much that the signal decreases when the concentration increases [6, 7]. Extreme saturation in the scanner, which is commonly observed when the PMT gain is set too high, results in censored signals, which in turn prevents a unique inverse of the measurement function from being found. This paper does not discuss saturation further, because we believe that saturation can and should be avoided.

Dissection of the overall measurement function

Formally, each step in the microarray process can be seen as a function that takes a set of input objects and outputs another set of objects. The sequential nature of the process makes it possible to think of the measurement function f c as a composite function (function of functions); f c = f c,S ∘ f c,S-1 ∘ ⋯ ∘ f c,1 , where S is the number of steps in the process. For instance, and of course simplified, it could be that f c,1 models the extraction of the RNA from the cell, f c,2 models the reverse transcription of RNA into cDNA and so on. Some of these submeasurement functions are shared by several channels and others are channel specific or even gene specific. Moreover, there may be joining subfunctions too, e.g. the hybridization of labeled cDNA sequences to the probes on the array. In this paper, measurement functions of different channels are treated independently.

A first-order Taylor series expansion of an arbitrary measurement function f c (x c,i ), has the form

f c (x c,i ) = a c + b c x c,i + R c (x c,i ), ∀ c,i.     (2)

From the above dissection of a measurement function, it is easy to argue that some of the subfunctions may introduce offset (bias) and that, for this reason, there ought to be an offset in f c (we will use the terms bias and offset interchangeably). For instance, the offset terms may be due to non-uniformity of the reverse transcription, the labeling [7] or the hybridization, due to dark noise in the PMT [8] or laser scatter light in the scanner, background noise, non-uniformity of the scanned glass slide [9], or threshold effects etc. In [10] it is shown how various background estimates based on different image analysis methods may introduce bias. Similarly, we have shown that different scanners may introduce bias [11].

The affine measurement function

In order to focus on the effects of a c and b c , but also because it results in the simplest parametric measurement function possible, we assume R c (x c,i ) in (2) to be small. The affine measurement function is

f c (x c,i ) = a c + b c x c,i , ∀ c,i,     (3)

with unique inverse

$$x_{c,i} = f_c^{-1}(y_{c,i}) = \frac{y_{c,i} - a_c}{b_c}, \quad \forall c,i, \qquad (4)$$

where a c is the overall offset (bias) and b c > 0 is the overall scale factor in channel c. The a c parameters are commonly positive, but under certain circumstances, for instance, as demonstrated later, when two different measuring techniques are compared, the effective offset may be negative. Modeling microarray data by an affine transform is not novel [3, 12–14], but the reasons for it might have been different in those papers.
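As a minimal sketch of (3) and (4), the following Python snippet applies an affine measurement function to a few true signals and then inverts it. The function names and the numerical values are illustrative assumptions only; they are not estimates from the data set used in this paper.

```python
import numpy as np

def affine_measure(x, a, b):
    """Observed signal y = a + b * x for offset a and scale b > 0, as in (3)."""
    return a + b * np.asarray(x, dtype=float)

def affine_inverse(y, a, b):
    """Back-transformed signal x = (y - a) / b, as in (4)."""
    return (np.asarray(y, dtype=float) - a) / b

# Hypothetical true signals and channel parameters (illustration only).
x_true = np.array([0.0, 10.0, 100.0, 1000.0])
y_obs = affine_measure(x_true, a=20.0, b=0.8)

# With known a and b the true signals are recovered exactly (noise-free model).
x_back = affine_inverse(y_obs, a=20.0, b=0.8)
```

Note that a true signal of zero is observed as the offset itself, which is why the offset sets a floor for the observed signals.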

The log-ratio log-intensity transform

In two-color but also in oligonucleotide microarray experiments, it is convenient to do statistical analysis on the log-ratios and the log-intensities [15] of the gene-expression levels in two channels instead of on the expression levels directly. For gene i we have that

$$M_i = \log_2 \frac{y_{R,i}}{y_{G,i}} = \log_2 \frac{f_R(x_{R,i})}{f_G(x_{G,i})} \qquad (5)$$

$$A_i = \frac{1}{2} \log_2 (y_{R,i} \cdot y_{G,i}) = \frac{1}{2} \log_2 \left( f_R(x_{R,i}) \cdot f_G(x_{G,i}) \right). \qquad (6)$$

For simplicity, we denoted channels 1 and 2 by R and G, which are mnemonics for the red and the green dyes commonly used in two-color microarray data. A rationale for this bijective transform (if the observed signals are positive) is that the main measure of interest, the fold change, is contained in one variable. However, since the transform is based on observed expression levels and not the true ones, M alone does indeed not carry all information about the biological fold change. This can be seen if the true fold change for an arbitrary gene i is considered;

r i = x R,i /x G,i     (7)

where r i > 0. Dropping gene index i in (5) and (6), M and A can be written as functions of x G and r, i.e. M = g r (x G ) and A = h r (x G ). Thus,

$$M = m_r(A) = g_r(h_r^{-1}(A)), \qquad (8)$$

which shows that M is a function of A (and r). Hence, and discussed thoroughly below, commonly observed intensity-dependent effects in the log-ratios may contain valuable information, and consequently, applying normalization methods without care may result in loss of information and introduced bias.
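The transform in (5) and (6) and its inverse can be sketched in a few lines of Python. The helper names `to_ma` and `from_ma` and the example values are assumptions made for illustration. The example also shows that, when the channels have offsets, the observed M differs from the true log2 fold change.

```python
import numpy as np

def to_ma(y_r, y_g):
    """Log-ratio M and log-intensity A of two positive signals, as in (5) and (6)."""
    y_r = np.asarray(y_r, dtype=float)
    y_g = np.asarray(y_g, dtype=float)
    return np.log2(y_r / y_g), 0.5 * np.log2(y_r * y_g)

def from_ma(m, a):
    """Inverse transform: log2(y_R) = A + M/2 and log2(y_G) = A - M/2."""
    return 2.0 ** (a + m / 2.0), 2.0 ** (a - m / 2.0)

# Hypothetical gene with true fold change r = 2 (log2 r = 1).
x_g, r = 100.0, 2.0
y_r = 20.0 + 0.8 * (r * x_g)   # red channel: offset 20, scale 0.8
y_g = 200.0 + 1.4 * x_g        # green channel: offset 200, scale 1.4

m_obs, a_obs = to_ma(y_r, y_g)
y_r_back, y_g_back = from_ma(m_obs, a_obs)
```

Here the observed log-ratio is negative although the true log2 fold change is 1, since the green channel has both the larger offset and the larger scale.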

Log-ratios as a function of log-intensities with affine transformations

Under an affine transformation, the relationship between the observed log-ratios and the observed log-intensities for a fixed fold change r, omitting gene index i, is

$$M = m_r(A) = \log_2 r + \log_2 \beta + \log_2 \frac{\frac{1}{2}\alpha(r) + \sqrt{\frac{1}{4}[\alpha(r)]^2 + r \beta 2^{2A}}}{-\frac{1}{2}\alpha(r) + \sqrt{\frac{1}{4}[\alpha(r)]^2 + r \beta 2^{2A}}} \qquad (9)$$

where α(r) = a R - r β a G quantifies how much M depends on A at the given fold change, and β = b R /b G is the relative scale factor between the two channels compared. See Methods for details. Recall that log2 r is the variable of interest. The derivative of M with respect to A for a fixed fold change r is

$$\left. \frac{dM}{dA} \right|_{x_R = r x_G}(A) = -\frac{\alpha(r)}{\sqrt{\frac{1}{4}[\alpha(r)]^2 + r \beta 2^{2A}}}. \qquad (10)$$

Consider a fixed r and define α = α(r). Then there are only two parameters in (9) and (10) that determine the shape of m r (A), namely α and β. Consequently, when a R , a G ≠ 0, M is independent of A if and only if α = 0, that is, when r = (b G a R )/(b R a G ). For this particular value of r, we have that the observed log-ratio is M = log2 (a R /a G ), which is independent of scale factors. Moreover, for log-ratios of non-differential expressions, that is log2r = 0, to be independent of A, it must be true that b G a R = b R a G or, equivalently, b R /b G = a R /a G . It is also clear from (10) that the scale parameters cannot introduce any curvature themselves, but only enhance or decrease curvature introduced by the offset. In addition to this, relative scale different from one shifts the log-ratios up or down. Moreover, the size of the effect that the offset terms have on the log-ratios decreases as the intensity increases. At high intensities the only observable effect is that from the relative scale between the two channels. The observed log-ratio for non-differentially expressed genes at high intensity is M ≈ log2β. In the case of a linear transform (a R = a G = 0), α is (always) zero and M is therefore independent of A for all r. The remaining log-ratio bias is log2β. If a R , a G > 0, the "weakest" observable data point is (A0, M0) = (1/2·log2(a R a G ), log2(a R /a G )), which is independent of both gene expression and scale parameters. All fold-change curves converge to this point. 
In the left graph of Figure 1 the effect of the affine transform $\mathcal{A}_1$ = {(a G , a R ) = (200, 20), (b G , b R ) = (1.4, 0.8)} at different fold changes is depicted. The different curves plotted are the functions M = m r (A) for different fold changes. Note the asymmetry in curvature between up and down regulation. From the above discussion we know that the observed log-ratios are independent of the log-intensities for log2 r ≈ -2.51 with value M0 ≈ -3.32. The log-ratio for non-differentially expressed genes at high intensities is M ≈ -0.81. A real-world example taken from [11], where the same array was scanned four times at various scanner PMT (sensitivity) settings, is shown in the right plot of Figure 1. Observed within-channel log-ratios $M_c = \log_2(y_c^{(v)} / y_c^{(w)})$ are plotted against the within-channel log-intensities $A_c = \log_2(y_c^{(v)} y_c^{(w)})/2$ for the red channel (c = R), where $y_c^{(v)}$ and $y_c^{(w)}$ are observations at two different scanner PMT settings. In this case it turned out that all scans share the same offset. For more details, see [11]. For another example, see Figure 9.
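The numbers quoted for the example transform can be checked with a short Python sketch of (9). The function name `m_of_a` and the test point (x G = 500, r = 2) are assumptions introduced for illustration; the parameter values are those given for the left graph of Figure 1.

```python
import numpy as np

def m_of_a(A, r, a_r, a_g, b_r, b_g):
    """Observed log-ratio M as a function of observed log-intensity A, as in (9)."""
    beta = b_r / b_g                       # relative scale factor
    alpha = a_r - r * beta * a_g           # alpha(r) from the text
    A = np.asarray(A, dtype=float)
    root = np.sqrt(0.25 * alpha**2 + r * beta * 2.0 ** (2.0 * A))
    return np.log2(r * beta) + np.log2((0.5 * alpha + root) / (-0.5 * alpha + root))

params = dict(a_r=20.0, a_g=200.0, b_r=0.8, b_g=1.4)
beta = params["b_r"] / params["b_g"]

# Fold change at which alpha(r) = 0, so that M does not depend on A.
r_flat = params["a_r"] / (beta * params["a_g"])
log2_r_flat = np.log2(r_flat)                    # about -2.51
m0 = np.log2(params["a_r"] / params["a_g"])      # about -3.32

# Non-differentially expressed genes at very high intensity approach log2(beta).
m_high = m_of_a(30.0, 1.0, **params)             # about -0.81

# Cross-check (9) against a direct simulation of one hypothetical gene.
x_g, r = 500.0, 2.0
y_r = params["a_r"] + params["b_r"] * r * x_g
y_g = params["a_g"] + params["b_g"] * x_g
m_direct = np.log2(y_r / y_g)
a_direct = 0.5 * np.log2(y_r * y_g)
m_formula = m_of_a(a_direct, r, **params)
```

The last lines compare the closed form with M and A computed directly from simulated signals, which is a quick way to catch sign errors in α(r) or in the denominator of (9).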

Figure 1

Affine transformation of the red and the green signals. Left: Affine transformation of the red and the green signals for $\mathcal{A}_1$ = {(a G , a R ) = (200, 20), (b G , b R ) = (1.4, 0.8)}. The observed log-ratios as a function of the observed log-intensities for different fold changes. The blue dot-dash curve corresponds to the non-differentially expressed genes and the thinner curves above and below this curve represent log2 r = ± 1, ± 2,... as labeled to the right of the curves. The lines in the gray grid, which is rotated 45 degrees (in (2A, M)), show the levels where the true signals log2 x R and log2 x G are equal to ..., -1, 0, 1,..., 16. These levels have been labeled to the left of the grid. No observations can lie outside this grid. Right: Real-world example of an affine transformation. The same slide was scanned four times at four different PMT settings. For each of the six scan pairs, the within-channel log-ratios and log-intensities were calculated. Data shown is from the red channel, which was estimated to have an offset of a R = 20.3 for all scans.

Bias in the log-ratios

From (9) we see that the bias in the log-ratios introduced by the affine transform is intensity dependent. This non-linearity can be observed as a propeller-shaped graph in Figure 2, where the log-ratios under the affine transform $\mathcal{A}_1$ are plotted against the true log-ratios at different log-intensity levels. If a regression line is fitted between the affine transformed log-ratios and the true log-ratios, the slope will always be less than one. Moreover, this is true for all normalization methods that do not overcompensate for channel offsets. This may explain why some studies show that cDNA microarrays tend to compress the absolute log-ratios compared to oligoarrays and QRT-PCR [16–18], including a recent study [19]; the channel offsets in cDNA microarrays are probably larger. When [20] compared cDNA microarray log-ratios to Northern blot log-ratios for their background correction method they found similar behavior, which emphasizes the close relationship between offset and background estimates. We will return to this later. The same pattern is seen in an M versus M scatter plot for non-normalized versus (affine) normalized data. See the right scatter plot in Figure 2. To visualize the intensity dependency of the log-ratios, only data points at certain log-intensity levels are plotted. For details on data, see Methods.
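The compression of log-ratios can be illustrated with a small noise-free simulation in Python. This is only a sketch under stated assumptions: the grid of true green signals and true fold changes is hypothetical, the fold changes are taken to be independent of the true green signal (rather than conditioning on the observed log-intensity as in Figure 2), and the affine parameters are those of the example transform with positive offsets.

```python
import numpy as np

# Example affine parameters (a_G, a_R) = (200, 20), (b_G, b_R) = (1.4, 0.8).
a_r, a_g, b_r, b_g = 20.0, 200.0, 0.8, 1.4

# Hypothetical design: every true fold change occurs at every true green signal.
log2_r = np.linspace(-3.0, 3.0, 25)
x_g = 2.0 ** np.linspace(4.0, 14.0, 21)
L, XG = np.meshgrid(log2_r, x_g)
XR = (2.0 ** L) * XG

# Observed log-ratios under the affine transform (noise-free).
M = np.log2((a_r + b_r * XR) / (a_g + b_g * XG))

# Least-squares line of observed versus true log-ratios.
slope, intercept = np.polyfit(L.ravel(), M.ravel(), 1)
```

In this setup the fitted slope falls strictly between zero and one, because for a fixed green signal the derivative of M with respect to log2 r equals b R x R /(a R + b R x R ), which is below one whenever a R > 0.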

Figure 2

Bias in the log-ratios introduced by the affine transform. Left: Bias in the log-ratios introduced by the affine transform $\mathcal{A}_1$. Each line displays the relationship between the observed and the true log-ratios at a certain (observed) log-intensity A. Each curve is marked with the value of A. We have chosen to truncate the curves when the signals become saturated and the labels for those curves are positioned approximately where they have been truncated. For low intensities there is a great bias (deviance from the diagonal line), especially for large fold changes. At higher intensities the bias is smaller. The curves intersect at the one fold-change level that is independent of the intensity. Right: Real-world example of log-ratios for non-normalized versus affine normalized (with 5% negative) signals. The affine parameters are ($\hat{a}_G$, $\hat{a}_R$, log2 $\hat{\beta}$) = (45.7, 27.0, -0.418). To clarify the intensity-dependent effect only data points close to A = 0.0, 0.5,...,16 are shown.

Normalization in general

Depending on the design of the microarray experiment, we expect to observe different types of patterns in data. A typical example is where a subset of the genes studied is expected to be non-differentially expressed in a test sample compared to a reference sample. However, it is common that the patterns of the observed expression levels are not in line with the expected patterns of the true expression levels. Whenever this happens various strategies can be adopted in order to make the normalized data meet the expectations. Normalization of microarray data is about identifying and removing such artifactual variations that are not due to noise or natural variability. An example is the intensity-dependent log-ratio artifact.

In the following section we will, with the affine model in mind, revisit various more or less well known normalization methods that directly or indirectly remove intensity-dependent artifacts. With the gained knowledge, we then propose a generic and robust multi-dimensional normalization method for affine transformed data.

To be more precise in what follows, we will refer to methods that correct for differences in observed and expected data, that is, conform the signals to a standard or a norm, as normalization methods, where normalization has the meaning of conforming to expectations. Sometimes calibration data, also known as control data, which contains true relative or absolute expression levels, is available. Such data can be used to correct for discrepancies between observed and true expression levels. We refer to methods that use calibration (read known) data points to correct for artifacts as calibration methods. To this category we also count methods that are based on models for which we can find the inverse of the measurement function. For precise definitions, see the introduction of [21]. Calibration methods are not discussed further in this paper.

Typically a normalization method is only capable of estimating α = a R - βa G for r = 1 in (9) and not the individual offset terms. This is because the often used assumption that most genes are non-differentially expressed (and/or that there is an equal amount of up and down regulated genes) will only help us identify one fold-change curve, namely log2 r = 0. For a normalization method, like most calibration methods, to be able to estimate both a R and a G more constraints are needed and without known data this can only be done based on more assumptions. As more research is needed, we will not elaborate on such additional assumptions in this paper. Thus, the rest of this paper will only discuss normalization methods based on the commonly accepted assumption that it is possible to identify a set of genes that can be used to normalize the non-differentially expressed genes.
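The identifiability problem can be made concrete with a Python sketch based on (9). The second parameter set below is a hypothetical one, constructed so that α = a R - βa G and β agree with the example transform while the individual offsets differ; the function name `m_of_a` is likewise an assumption.

```python
import numpy as np

def m_of_a(A, r, a_r, a_g, beta):
    """Observed log-ratio as a function of log-intensity A, as in (9)."""
    alpha = a_r - r * beta * a_g
    root = np.sqrt(0.25 * alpha**2 + r * beta * 2.0 ** (2.0 * np.asarray(A, dtype=float)))
    return np.log2(r * beta) + np.log2((0.5 * alpha + root) / (-0.5 * alpha + root))

beta = 0.8 / 1.4
set_1 = dict(a_r=20.0, a_g=200.0, beta=beta)   # example transform
set_2 = dict(a_r=40.0, a_g=235.0, beta=beta)   # hypothetical; same a_R - beta * a_G

A = np.array([8.0, 10.0, 12.0, 14.0])

# Non-differentially expressed genes (r = 1): the two curves coincide.
nonde_1, nonde_2 = m_of_a(A, 1.0, **set_1), m_of_a(A, 1.0, **set_2)

# Two-fold up-regulated genes (r = 2): the curves differ.
up_1, up_2 = m_of_a(A, 2.0, **set_1), m_of_a(A, 2.0, **set_2)
```

Data from non-differentially expressed genes alone therefore cannot distinguish the two sets of offsets, although the sets imply different observed log-ratios for differentially expressed genes.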

Curve-fit normalization revisited

When [4] first observed the intensity-dependent effects on the log-ratios they suggested a curve-fit normalization method that is often referred to as lo(w)ess normalization. The simplest version of this assumes that the majority of the genes are non-differentially expressed regardless of expression level and for this reason the log-ratios are expected to be centered around zero for all intensities. Under the above assumption, curves estimated using robust local regression methods such as lowess [22, 23] or loess [24], or curves modeled by smoothing splines [25], will be good approximations of the $m_{r=1}(A)$ function, which can then be subtracted from the observed log-ratios:

$$
M \leftarrow M - m_{r=1}(A) = m_r(A) - m_{r=1}(A). \tag{11}
$$

Under an affine transform, $m_r(A)$ and $m_{r=1}(A)$ are as in (9), but we do not know of a closed-form expression for (11). An example of a curve-fit normalization under the affine transform is depicted in Figure 3. Note that the asymmetry between up- and down-regulated genes is not corrected for. Moreover, if we look at the overlaid $(\log_2 x_G, \log_2 x_R)$ grid in the left graph of Figure 3, we find that the curve-fit normalization warps it and removes the otherwise orthogonal relationship between $\log_2 x_R$ and $\log_2 x_G$ (if the (2A, M) plane is considered).
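The remaining asymmetry can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes that fold change enters as $x_R = r\,x_G$, uses the offsets and scales $(a_G, a_R) = (200, 20)$ and $(b_G, b_R) = (1.4, 0.8)$ inferred from the figure captions of the worked example, and replaces the lowess fit by the exact $m_{r=1}(A)$ curve, found by bisection. All function names are hypothetical.

```python
import math

# Assumed example parameters (inferred from the figure captions, not stated here).
A_G, A_R = 200.0, 20.0   # channel offsets
B_G, B_R = 1.4, 0.8      # channel scales


def observe(x_g, r):
    """Observed (A, M) for true green signal x_g and fold change r = x_R / x_G."""
    y_r = A_R + B_R * r * x_g
    y_g = A_G + B_G * x_g
    return 0.5 * math.log2(y_r * y_g), math.log2(y_r / y_g)


def m_at_intensity(a_target, r, lo=0.0, hi=1e9):
    """m_r(A) at a given observed log-intensity; A increases with x_g, so bisect."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if observe(mid, r)[0] < a_target:
            lo = mid
        else:
            hi = mid
    return observe(0.5 * (lo + hi), r)[1]


def curve_fit_normalized(a_target, r):
    """Subtract the exact r = 1 curve, as an idealized curve fit would, cf. (11)."""
    return m_at_intensity(a_target, r) - m_at_intensity(a_target, 1.0)


up = curve_fit_normalized(10.0, 4.0)     # true log-ratio +2
down = curve_fit_normalized(10.0, 0.25)  # true log-ratio -2
print(f"normalized log-ratios at A = 10: up = {up:.3f}, down = {down:.3f}")
```

At the same observed intensity the two normalized log-ratios differ in magnitude and both are compressed toward zero, which is the asymmetry noted above.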

Figure 3

Curve-fit normalization of affine transformed data. Curve-fit normalization of $\mathcal{A}_1$ transformed data. Left: Log-ratios as a function of log-intensities for different fold changes. Note that the distance between up- and down-regulated genes at any intensity is the same before and after the normalization. Right: Normalized log-ratios versus true log-ratios. We see that intensity-dependent artifacts have been removed for the observed and true log-ratios where all curves intersect (here at (0, 0)).

Perpendicular translation normalization revisited

The perpendicular (shift-log) normalization method proposed by [13] corrects for differences in the channel offsets. It normalizes log-ratios using a translation transform where a constant, a, is added to the signals in one channel and subtracted from the other:

$$
\begin{aligned}
y_{R,i} &\leftarrow a_R + b_R x_{R,i} + a; \quad \forall i \\
y_{G,i} &\leftarrow a_G + b_G x_{G,i} - a; \quad \forall i.
\end{aligned}
\tag{12}
$$

We refer to this translation normalization transform as the perpendicular translation normalization, because it moves $(x_G, x_R)$ perpendicular to the $x_R = x_G$ line. From (9), we get that the observed log-ratios $m_r(A)$ can be made independent of the intensities if and only if

$$
a = \frac{r\, b_R a_G - b_G a_R}{b_G + r\, b_R}, \quad r > 0. \tag{13}
$$

As this is a function of r, this method can make M independent of A for only a single fold change at a time. The most common choice is r = 1, for which the optimal perpendicular shift is

$$
a = \frac{b_R a_G - b_G a_R}{b_G + b_R}, \tag{14}
$$

which is the weighted difference between $a_R$ and $a_G$ with weights $b_G/(b_G + b_R)$ and $b_R/(b_G + b_R)$, respectively. The distance from the r = 1 curve to the M = 0 curve for the optimal perpendicular shift is $\log_2 \beta$. In other words, the perpendicular shift normalization will not remove an overall bias in the log-ratios (although it is not hard to estimate β afterward). The optimal shift for $\mathcal{A}_1$ is a = 60 with β = 0.57, that is, $\log_2 \beta \approx -0.81$. The result of this normalization is depicted in Figure 4. Note that $m_r(A)$ after normalization is constant for r = 1.
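As a worked check of (13) and (14), the sketch below evaluates the optimal perpendicular shift for one set of affine parameters. The values $(a_G, a_R) = (200, 20)$ and $(b_G, b_R) = (1.4, 0.8)$ are an assumption inferred from the effective offsets quoted in the captions of Figures 4 and 5, and the function names are hypothetical. The sketch also assumes $x_R = r\,x_G$ and $\beta = b_R/b_G$.

```python
import math


def perpendicular_shift(a_g, a_r, b_g, b_r, r=1.0):
    """Shift a from (13); with r = 1 it reduces to (14)."""
    return (r * b_r * a_g - b_g * a_r) / (b_g + r * b_r)


def log_ratio_after_shift(x_g, r, a, a_g, a_r, b_g, b_r):
    """Observed log-ratio after adding a to red and subtracting a from green, cf. (12)."""
    y_r = a_r + b_r * r * x_g + a
    y_g = a_g + b_g * x_g - a
    return math.log2(y_r / y_g)


params = dict(a_g=200.0, a_r=20.0, b_g=1.4, b_r=0.8)  # assumed example values
a_opt = perpendicular_shift(**params)
log2_beta = math.log2(params["b_r"] / params["b_g"])

# For r = 1 the log-ratio is flat in intensity and sits at log2(beta), not at 0.
flat = [log_ratio_after_shift(x, 1.0, a_opt, **params) for x in (10.0, 1e3, 1e5)]
# For r = 2 the same shift leaves an intensity dependence.
bent = [log_ratio_after_shift(x, 2.0, a_opt, **params) for x in (10.0, 1e3, 1e5)]

print(f"a = {a_opt:.1f}, log2(beta) = {log2_beta:.3f}")
print("r = 1:", [round(m, 3) for m in flat])
print("r = 2:", [round(m, 3) for m in bent])
```

The r = 2 row shows why a single shift can only flatten one fold-change curve at a time.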

Figure 4

Perpendicular translation normalization of affine transformed data. Perpendicular translation normalization of $\mathcal{A}_1$ transformed data. The optimal amount of normalization shift in the raw data is a = 60, which corresponds to $a'_R = 80$ and $a'_G = 140$. Left: Log-ratios as a function of log-intensities for certain fold changes. The r = 1 curve (dot-dash blue) is horizontal, that is, for this specific value of r and a the log-ratios are independent of the log-intensities. Right: Normalized log-ratios versus true log-ratios. From this graph it is clear that we obtain the minimum error in log-ratios at zero-fold change. The dotted curves correspond to the minimum and maximum log-intensities possible to observe.

As suggested by [13], one way to find the optimal shift a is to minimize the curvature by minimizing the variation of the log-ratios after applying the shift a. To do this robustly, the median absolute deviation (MAD) can be used as a measure of variation;

$$
\hat{a} = \arg\min_{a} \operatorname{MAD}_{1 \le i \le I}\bigl(M_i(a)\bigr). \tag{15}
$$

We have found that the variance of $\hat{a}$ is unnecessarily large.
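A minimal sketch of estimator (15) follows. It assumes a plain grid search over integer candidate shifts (the optimizer is not specified in the text), noise-free non-differentially expressed signals, and the same inferred example parameters $(a_G, a_R) = (200, 20)$, $(b_G, b_R) = (1.4, 0.8)$. The MAD is left unscaled, which does not change the location of the minimum. Because the data are noise-free, the sketch only shows that the criterion recovers the optimal shift; it says nothing about the variance of the estimate on real data.

```python
import math
import statistics


def mad(values):
    """Unscaled median absolute deviation."""
    values = list(values)
    centre = statistics.median(values)
    return statistics.median([abs(v - centre) for v in values])


def shifted_log_ratios(y_r, y_g, a):
    """Log-ratios M_i(a) after the perpendicular shift in (12)."""
    return [math.log2((r + a) / (g - a)) for r, g in zip(y_r, y_g)]


def estimate_shift(y_r, y_g, candidates):
    """Grid-search version of (15): the candidate with the smallest MAD."""
    return min(candidates, key=lambda a: mad(shifted_log_ratios(y_r, y_g, a)))


# Simulated non-differentially expressed genes (x_R = x_G), assumed parameters.
x = [2.0 ** (k / 4) for k in range(65)]
y_r = [20.0 + 0.8 * xi for xi in x]
y_g = [200.0 + 1.4 * xi for xi in x]

# Candidates are limited so that both shifted channels stay positive.
a_hat = estimate_shift(y_r, y_g, range(0, 151))
print("estimated perpendicular shift:", a_hat)
```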

A problem with the perpendicular translation normalization methods, which is not related to estimator (15), is that the optimal shift can result in non-positive signals making a huge number of expression ratios invalid. The normalization method discussed next does not have this problem, but on the other hand, it will not work or work badly under certain conditions.

Parallel translation normalization revisited

For historical reasons, but also because it contributes to our discussion about background correction, the shift-log method proposed by [26] for stabilizing (read decreasing or shrinking) the variance of the measured log-ratios is of interest. A side effect of this method is that it can correct for intensity-dependent curvature. It is based on a translation transform where the same constant, a, is added to the signals in both channels:

$$
\begin{aligned}
y_{R,i} &\leftarrow a_R + b_R x_{R,i} + a; \quad \forall i \\
y_{G,i} &\leftarrow a_G + b_G x_{G,i} + a; \quad \forall i.
\end{aligned}
\tag{16}
$$

Because (16) moves data $(x_G, x_R)$ parallel to the $x_R = x_G$ line, it is referred to as the parallel translation normalization. Again, as the optimal shift is a function of r, M can only be made independent of A for one unique r at a time, cf. (9). For r = 1 the optimal parallel shift is

$$
a = \frac{b_R a_G - b_G a_R}{b_G - b_R}, \quad b_G \neq b_R, \tag{17}
$$

which may be estimated as in (15). For example, for $\mathcal{A}_1$ the optimal parallel shift is a = 220, with the r = 1 curve at $M = \log_2 \beta \approx -0.81$ (β = 0.57), that is, below the M = 0 line. The result of this normalization is depicted in Figure 5. From the above expression, we also see that an optimal value of a can indeed be negative. For example, if $(a_G, a_R) = (200, 140)$ and $(b_G, b_R) = (1.4, 0.8)$, the optimal parallel shift is a = -60, which corresponds to an effective shift of $(a'_G, a'_R) = (140, 80)$. However, it can also result in non-positive signals and therefore undefined log-ratios. For example, with $(a_G, a_R) = (20, 200)$ and $(b_G, b_R) = (1.4, 0.8)$, the optimal parallel shift is a = -440, which corresponds to an effective shift of $(a'_G, a'_R) = (-420, -240)$. Moreover, from (17) we see that when the scale parameters are equal there is no solution. This is because in such cases data is moved in parallel to the $x_R = x_G$ line, making it impossible to get closer. As in the case of the perpendicular shift normalization, the distance between the r = 1 curve and the M = 0 curve is $\log_2 \beta$. Hence, a parallel shift normalization will not remove an overall bias in the log-ratios either and rescaling is necessary.

Figure 5

Parallel translation normalization of affine transformed data. Parallel translation normalization of $\mathcal{A}_1$ transformed data. The optimal amount of normalization shift in the raw data is a = 220, which corresponds to an effective shift of $(a'_G, a'_R) = (420, 240)$. Left: Log-ratios as a function of log-intensities for certain fold changes. The r = 1 curve (dot-dash blue) is horizontal, that is, for this specific value of r and a the log-ratios are independent of the log-intensities. Right: Normalized log-ratios versus true log-ratios. From this graph it is clear that we obtain the minimum error in log-ratios at zero-fold change.

Single-channel translation normalization

A hybrid of the previous two methods is a normalization method that translates the signals in one of the channels at a time according to

$$
\begin{aligned}
y_{R,i} &\leftarrow a_R + b_R x_{R,i} + a \cdot \mathbb{I}(a \geq 0); \quad \forall i \\
y_{G,i} &\leftarrow a_G + b_G x_{G,i} - a \cdot \mathbb{I}(a < 0); \quad \forall i,
\end{aligned}
\tag{18}
$$

where $\mathbb{I}$ is the indicator function and a is a real-valued shift. This will not generate non-positive signals as only positive translations are applied. Moreover, because only one channel is shifted, an optimal shift will always be found.

Rescale normalization

The above translation normalization methods remove curvature for non-differentially expressed genes by adjusting the offset parameters in $\alpha = a_R - \beta a_G$ while keeping the relative scale β fixed. Similarly, if the offset parameters are kept fixed, curvature can be removed by adjusting the relative scale β. In [11] we show that the scanner may introduce biases to the channels that are insensitive to the scale (PMT) settings (read fixed). Thus, by adjusting the PMT settings such that the curvature of the pre-scanned data is as small as possible one minimizes $|\alpha| = |a_R - \beta a_G|$. Indeed, this strategy may in practice be used by many. However, from above we know that this can equally well be done numerically. It is much more important to adjust the PMT (and laser) settings such that the dynamic range of the signals is as large as possible. Furthermore, as scanner settings are often adjusted for each array separately, there will be a discrepancy between arrays, which in any case has to be normalized for.

Dye-swap normalization revisited

Dye-swap normalization, also known as reverse labeling and paired-slides normalization, is a balanced experimental design for two-color microarrays that can be used whenever two technically replicated hybridizations are available. Consider an experiment with two sets of cell populations, A and B, for which relative gene expressions, $\{r_i\}_i$, are to be investigated. After cDNA is obtained through reverse transcription, the two samples are each split into two identical parts, one of which is labeled with a red fluorescent dye and one of which is labeled with a green fluorescent dye. The red cDNA cocktail from sample A is mixed with the green one from sample B and co-hybridized to the DNA on the first array. After scanning, expression levels $\{(f_{G_1}(x_{B,i}), f_{R_1}(x_{A,i}))\}_i$ are observed. The same is done for the remaining red-green pair, for which $\{(f_{G_2}(x_{A,i}), f_{R_2}(x_{B,i}))\}_i$ are observed. Dropping gene index i, the dye-swap normalization suggested by [27] is

$$
\begin{aligned}
M &= \frac{1}{2}(M_1 + M_2) \\
  &= \frac{1}{2}\left(\log_2 \frac{f_{R_1}(x_A)}{f_{G_1}(x_B)} - \log_2 \frac{f_{R_2}(x_B)}{f_{G_2}(x_A)}\right) \\
  &= \frac{1}{2}\left(\log_2 \frac{f_{R_1}(x_A)}{f_{R_2}(x_B)} + \log_2 \frac{f_{G_2}(x_A)}{f_{G_1}(x_B)}\right) \\
  &= \frac{1}{2}(M'_1 + M'_2)
\end{aligned}
\tag{19}
$$

and similarly for the log-intensities

$$
\begin{aligned}
A &= \frac{1}{2}(A_1 + A_2) \\
  &= \frac{\log_2\bigl(f_{R_1}(x_A)\, f_{G_1}(x_B)\bigr) + \log_2\bigl(f_{R_2}(x_B)\, f_{G_2}(x_A)\bigr)}{4} \\
  &= \frac{\log_2\bigl(f_{R_1}(x_A)\, f_{R_2}(x_B)\bigr) + \log_2\bigl(f_{G_2}(x_A)\, f_{G_1}(x_B)\bigr)}{4} \\
  &= \frac{1}{2}(A'_1 + A'_2).
\end{aligned}
\tag{20}
$$

Thus, the result of a dye-swap can be written as the average of two "virtual" hybridizations $(A'_1, M'_1)$ and $(A'_2, M'_2)$. Moreover, if (and only if) the measurement functions are equal for each array, that is, $f_{R_1} = f_{R_2}$ and $f_{G_1} = f_{G_2}$, then the observed ratios will be identical to the true ratios for non-differentially expressed genes. For this to be true for differentially expressed genes we know that the measurement functions also have to be linear, that is, affine with zero intercept.
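The statement above can be checked with a small numerical sketch. It assumes equal measurement functions on both arrays, with $f_R$ and $f_G$ affine, and it takes the array-2 log-ratio with the sign used in the second line of (19), so that both arrays measure sample A relative to sample B. The parameter values and function names are illustrative, not taken from the authors' software.

```python
import math


def affine(a, b):
    """Measurement function x -> a + b * x with offset a and scale b."""
    return lambda x: a + b * x


def dye_swap_log_ratio(x_a, x_b, f_red, f_green):
    """Dye-swap log-ratio as in the second line of (19), with equal functions per array."""
    m1 = math.log2(f_red(x_a) / f_green(x_b))  # array 1: A in red, B in green
    m2 = math.log2(f_red(x_b) / f_green(x_a))  # array 2: B in red, A in green
    return 0.5 * (m1 - m2)


affine_red, affine_green = affine(20.0, 0.8), affine(200.0, 1.4)
linear_red, linear_green = affine(0.0, 0.8), affine(0.0, 1.4)

# Non-differentially expressed gene: exact even with non-zero offsets.
m_null = dye_swap_log_ratio(500.0, 500.0, affine_red, affine_green)
# Four-fold up-regulated gene (true log-ratio 2): biased unless offsets are zero.
m_affine = dye_swap_log_ratio(400.0, 100.0, affine_red, affine_green)
m_linear = dye_swap_log_ratio(400.0, 100.0, linear_red, linear_green)

print(f"r = 1, affine: {m_null:.3f}")
print(f"r = 4, affine: {m_affine:.3f}")
print(f"r = 4, linear: {m_linear:.3f}")
```

With non-zero offsets the up-regulated gene is estimated well below its true log-ratio of 2 at this intensity, whereas the linear case recovers it.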

Several authors [28, 29] have reported that dye-swap normalization does remove curvature, but less successful results have also been reported [30]. To better understand the reasons why and when dye-swap normalization works or not, we dissect the measurement functions $f_c$ of the four channels $c = R_1, G_1, R_2, G_2$ into $(v_c \circ u_c \circ t_c \circ s_c)$ where $s_c$ models the process of all steps up to the step where the (not yet labeled) cDNA sample is obtained, $t_c$ models the labeling, $u_c$ models the following steps including the hybridization, and $v_c$ models the scanning etc. As channels $R_1$ and $G_2$ are from sample A and the other two are from sample B, we know that $s_{R_1} = s_{G_2} = s_A$ and $s_{R_2} = s_{G_1} = s_B$. Furthermore, if the labeling process is well controlled, we can assume that $t_{R_1} \approx t_{R_2} \approx t_R$ and $t_{G_1} \approx t_{G_2} \approx t_G$. When channels $R_1$ and $G_1$ are hybridized to array 1 and the other two to array 2 we have that $u_{R_1} \approx u_{G_1} \approx u_1$ and $u_{R_2} \approx u_{G_2} \approx u_2$. Moreover, if the same scanner settings are used for both arrays and everything else is equal, we have that $v_{R_1} \approx v_{R_2} \approx v_R$ and $v_{G_1} \approx v_{G_2} \approx v_G$. The overall measurement functions for the channels are then approximately

$$
\begin{aligned}
f_{R_1} &\approx v_R \circ u_1 \circ t_R \circ s_A \\
f_{G_1} &\approx v_G \circ u_1 \circ t_G \circ s_B \\
f_{R_2} &\approx v_R \circ u_2 \circ t_R \circ s_B \\
f_{G_2} &\approx v_G \circ u_2 \circ t_G \circ s_A.
\end{aligned}
\tag{21}
$$

For the dye-swap normalization to be efficient, we conclude that we must control the process of extracting the RNA etc. to an extent such that we can expect $s_A \approx s_B$. Moreover, we must also be able to reproduce hybridizations well enough such that $u_1 \approx u_2$. If these requirements are met, data will be self-normalized. Turning to the affine model, from (19) we have, if $f_{R_1} = f_{R_2}$ and $f_{G_1} = f_{G_2}$, that a dye-swap normalization of affinely transformed data gives

$$
M'_1 = \log_2 \frac{a_R + b_R x_A}{a_R + b_R x_B}, \qquad
M'_2 = \log_2 \frac{a_G + b_G x_A}{a_G + b_G x_B},
\tag{22}
$$

and similarly for $A'_1$ and $A'_2$. For both virtual arrays, the signals in both channels have undergone identical affine transformations. We know from before that identical transformation in both channels does not introduce curvature for the non-differentially expressed genes and that symmetry between up- and down-regulated genes is preserved, cf. perpendicular and parallel shift normalization. If the offsets in any of the two replicated channels are not equal ($a_{R_1} \ne a_{R_2}$ or $a_{G_1} \ne a_{G_2}$), the dye-swap normalization will not work.
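The condition on the offsets can be checked numerically. The sketch below is illustrative only: the affine parameters are made-up values, `dye_swap_log_ratio` is a hypothetical helper, and the dye-swap log-ratio is assumed to be the usual geometric combination $(M_1 - M_2)/2$ of the two array log-ratios.

```python
import numpy as np

def dye_swap_log_ratio(x_a, x_b, a_r1, a_g1, a_r2, a_g2, b_r=0.8, b_g=1.4):
    """Geometric dye-swap log-ratio under an affine model (illustrative).

    Array 1: sample A in the red channel, sample B in the green channel.
    Array 2: dyes swapped, so sample B is red and sample A is green.
    """
    y_r1 = a_r1 + b_r * x_a
    y_g1 = a_g1 + b_g * x_b
    y_r2 = a_r2 + b_r * x_b
    y_g2 = a_g2 + b_g * x_a
    m1 = np.log2(y_r1 / y_g1)
    m2 = np.log2(y_r2 / y_g2)
    return (m1 - m2) / 2

# Non-differentially expressed genes: x_A = x_B over a range of intensities.
x = np.logspace(1, 4, 50)

# Offsets replicated exactly between the two arrays.
m_equal = dye_swap_log_ratio(x, x, a_r1=20, a_g1=200, a_r2=20, a_g2=200)

# Red-channel offset differs between the arrays (a_R1 != a_R2).
m_unequal = dye_swap_log_ratio(x, x, a_r1=20, a_g1=200, a_r2=120, a_g2=200)

print("max |M|, equal offsets:  ", np.max(np.abs(m_equal)))
print("max |M|, unequal offsets:", np.max(np.abs(m_unequal)))
```

With replicated offsets the log-ratios of these genes vanish at every intensity, whereas a mismatch in one channel leaves an intensity-dependent residual.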

The above discussion assumed that the same cell samples have been replicated. If biological replicates are used, an additional source of variability is introduced. However, as long as it is possible to assume that for most genes $x_{A_1} \approx x_{A_2}$ and $x_{B_1} \approx x_{B_2}$, dye-swap normalization should still perform well.

In [11] we observed that scanners can introduce channel-specific offsets that are stable over time, i.e. $a_{R_1} = a_{R_2}$ and $a_{G_1} = a_{G_2}$. Assume that everything else is perfect, but the PMT is adjusted separately for each array, resulting in $b_{R_1}/b_{R_2} \ne b_{R_2}/b_{R_1}$ so that (22) is not obtained. This may be a reason why dye-swap normalization sometimes fails.

Alternative dye-swap normalization

An alternative dye-swap normalization method is to average the observed expression levels before taking the logarithm

$$
\begin{aligned}
M &= \log_2 \frac{\left(f_{R_1}(x_A) + f_{G_2}(x_A)\right)/2}{\left(f_{R_2}(x_B) + f_{G_1}(x_B)\right)/2} \\
  &= \log_2 \frac{f_{R_1}(x_A) + f_{G_2}(x_A)}{f_{R_2}(x_B) + f_{G_1}(x_B)},
\end{aligned}
\tag{23}
$$

and analogously for A. This approach uses the arithmetic mean of the observed signals whereas the previous dye-swap method used the geometric mean. To be able to say more about the difference between the two approaches, we turn to the affine transformation for which we have

$$
\begin{aligned}
M &= \log_2 \frac{a' + b' x_A}{a' + b' x_B} \\
A &= \frac{\log_2 \left[(a' + b' x_A)(a' + b' x_B)\right]}{2}
\end{aligned}
\tag{24}
$$

where $a' = a_R + a_G$ and $b' = b_R + b_G$. Again, we note that the dye-swap method makes the transforms in the resulting two virtual channels equal. Comparing the bias in log-intensities between the geometric and the arithmetic approaches, for the latter we have

$$
A_0 = \log_2 \frac{a_R + a_G}{2}
\tag{25}
$$

whereas for the former we have

$$
A_0 = \log_2 \sqrt{a_R a_G}.
\tag{26}
$$

Because $(a_R + a_G)/2 \ge \sqrt{a_R a_G}$, we conclude that the bias in log-intensities is never smaller for the arithmetic than for the geometric dye-swap. However, there are other differences too. For instance, if each glass array (the $u_c$ functions above) introduces the same offset to both channels and this offset differs between arrays, but everything else is the same, that is, $a_{R_2} = a_{R_1} + a$ and $a_{G_2} = a_{G_1} + a$, then the geometric dye-swap fails whereas the arithmetic dye-swap succeeds in removing curvature.
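The array-specific offset case can be reproduced with a few lines of NumPy. This is a minimal sketch with made-up parameter values; the geometric dye-swap is assumed to be $(M_1 - M_2)/2$, and the arithmetic one follows (23).

```python
import numpy as np

# Hypothetical affine parameters for array 1.
a_r1, a_g1, b_r, b_g = 20.0, 200.0, 0.8, 1.4
# Array 2 adds the same extra offset `extra` to both of its channels.
extra = 100.0
a_r2, a_g2 = a_r1 + extra, a_g1 + extra

# Non-differentially expressed genes: x_A = x_B.
x_a = x_b = np.logspace(1, 4, 50)

# Array 1: A in red, B in green. Array 2 (dye-swapped): B in red, A in green.
y_r1, y_g1 = a_r1 + b_r * x_a, a_g1 + b_g * x_b
y_r2, y_g2 = a_r2 + b_r * x_b, a_g2 + b_g * x_a

# Geometric dye-swap: average the log-ratios of the two arrays.
m_geometric = (np.log2(y_r1 / y_g1) - np.log2(y_r2 / y_g2)) / 2

# Arithmetic dye-swap, as in (23): average the signals before the logarithm.
m_arithmetic = np.log2((y_r1 + y_g2) / (y_r2 + y_g1))

# Bias in log-intensities, (25) versus (26), using the array-1 offsets.
a0_arithmetic = np.log2((a_r1 + a_g1) / 2)
a0_geometric = np.log2(np.sqrt(a_r1 * a_g1))

print("max |M| geometric: ", np.max(np.abs(m_geometric)))
print("max |M| arithmetic:", np.max(np.abs(m_arithmetic)))
print("A0 arithmetic, geometric:", a0_arithmetic, a0_geometric)
```

In the arithmetic version the extra offset enters numerator and denominator equally and cancels, while in the geometric version it does not.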

Two-channel quantile normalization

Two-channel, or in general multi-channel, quantile normalization [31, 32] relies on the assumption that the true gene-expression levels in the two biological samples are approximately equally distributed. If the measurement functions in the two channels, say $f_R$ and $f_G$, are different, then the distributions of the measured signals in the two channels are different even if the underlying distributions of true expression levels are identical. By estimating the distributions of the two channels and making them equal, for instance to an average distribution, the log-ratios for the non-differentially expressed genes will be unbiased and independent of the intensities. Thus, making the density functions of measured data equal for the two channels is the same as making their transformation functions equal, say to $f_{RG}$, which makes $M$ independent of $A$ for non-differentially expressed genes. If $f_{RG}$ could be made linear too, this would be true for all fold changes.

For affine transformations, two-channel quantile normalization removes intensity-dependent effects, because the offsets $a_R$ and $a_G$ are identical after normalization. In addition, the constant log-ratio bias $\log_2 \beta$ is also removed. Hence, two-channel quantile normalization can be considered both a method that corrects for differences in offset between two channels and a method that corrects for biases in the expression ratios. Figure 6 depicts the quantile normalization of $\mathcal{A}_1$-transformed data. The curvature for non-differentially expressed genes is removed.
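As a rough illustration of the idea, the sketch below normalizes two affinely transformed channels toward their average empirical distribution. It is a simple rank-based version written for this example (the function name, the parameter values and the noise-free setting in which all genes are non-differentially expressed are assumptions), not the implementation used by any particular package.

```python
import numpy as np

def quantile_normalize_two_channels(y_r, y_g):
    """Map both channels onto their average empirical distribution."""
    y_r = np.asarray(y_r, dtype=float)
    y_g = np.asarray(y_g, dtype=float)
    # Target quantiles: average of the two sorted signal vectors.
    target = (np.sort(y_r) + np.sort(y_g)) / 2
    # Rank of each observation within its own channel (ties not treated specially).
    rank_r = np.argsort(np.argsort(y_r))
    rank_g = np.argsort(np.argsort(y_g))
    return target[rank_r], target[rank_g]

rng = np.random.default_rng(0)
# True expression levels, identical in both samples.
x = 2.0 ** rng.uniform(4, 14, size=2000)

# Different (hypothetical) affine measurement functions in the two channels.
y_r = 20.0 + 0.8 * x
y_g = 200.0 + 1.4 * x

m_before = np.log2(y_r / y_g)
n_r, n_g = quantile_normalize_two_channels(y_r, y_g)
m_after = np.log2(n_r / n_g)

print("range of M before:", m_before.max() - m_before.min())
print("max |M| after:    ", np.max(np.abs(m_after)))
```

Because both measurement functions are increasing, each gene has the same rank in both channels and is mapped to the same target value, so both the curvature and the constant bias disappear in this idealized setting.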

Figure 6

Equalizing the signal densities of the two channels removes the intensity dependency of the log-ratios for non-differentially expressed genes. Left: Equal gene-expression distributions in both channels will, under the non-channel-balanced affine transform $\mathcal{A}_1$, turn into two different densities for the measured data. The (upside-down and dashed) curve at the bottom shows a hypothetical density function, $\varphi_x(\cdot)$, of the true (log) gene-expression levels expected to be equal in both samples. The distributions of the affine transformed signals are shown in the (rotated and dashed) density functions, $\{\varphi_{y_c}(\cdot)\}_c$, at the left (red and green curves). The average signal density (middle gray curve) to be normalized toward corresponds to a common measurement function (gray function in the main plot). Right: Normalizing the non-equal densities of the two channels makes the log-ratios of the non-differentially expressed genes zero for all intensities.

Background subtraction as a normalization method

We have observed that log-ratios of background signals show the same intensity-dependent effects as log-ratios of foreground signals, which suggests that background signals undergo the same transformation as foreground signals. An example of this is shown in Figure 7, where background and foreground estimates are plotted in the same M versus A scatter plots. A probable reason for this is the existence of scanner biases [11]. A widely adopted rationale for background correction is the assumption that the region that defines the spot is contaminated with the same physical noise that can be observed in the surrounding regions. Background noise is believed to be due to dust particles, DNA-contaminated buffers, failed washing during printing or hybridization, cross hybridization etc. [20, 33]. This type of background noise is often assumed to add to the foreground signal. Thus, in order to obtain true signals, background is subtracted from foreground signal as

Figure 7

Transformation of background signal. Left: An M versus A scatter plot where background signals (blue triangles) and foreground signals (red circles) lie along the same curve, which is evidence that both have been transformed identically. Right: A zoom-in of the left graph. Data is from [50].

$$
y_{c,i} \leftarrow y_{c,i}^{(\mathrm{fg})} - y_{c,i}^{(\mathrm{bg})}
\tag{27}
$$

where $y_{c,i}^{(\mathrm{fg})}$ is the estimated foreground signal and $y_{c,i}^{(\mathrm{bg})}$ is the estimated background signal for channel $c$ and spot $i$. Under a transformation that is dominated by an affine function at lower intensities (of the same level as the background), subtracting the background from the foreground will shift the biases toward zero, and background-subtracted signals will have less curvature in the $(A, M)$ plane than non-background-subtracted signals (not shown). In this sense we can consider background subtraction to be a normalization method. However, just because the log-ratios as a function of the log-intensities become flatter, it does not follow that foreground regions are contaminated by the same noise as background regions; unnecessary noise may be introduced. Instead, it may be that the background estimates from the image analysis happen to be close to a non-image-related offset in the foreground signals. Moreover, different image-analysis software packages estimate the background signal differently, using algorithms such as fixed-size circles, adaptive circles, morphological estimates, and pixel intensity distributions. Although comparative studies have been conducted [10, 34], it is still not clear which background estimate is most correct.
Some methods give higher background estimates than others, which means that they correct for channel biases by different amounts; this is another argument for the existence of channel offsets. One method that makes use of this is [20], which emphasizes that the true signal cannot be negative and uses a Bayesian approach to correct for this.
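The flattening effect can be illustrated under the affine model. In the sketch below all numbers are hypothetical, and the background estimates are simply assumed to land close to, but not exactly on, the channel offsets.

```python
import numpy as np

# Hypothetical affine parameters and non-differentially expressed genes.
a_r, a_g, b_r, b_g = 20.0, 200.0, 0.8, 1.4
x = np.logspace(1, 4, 50)

# Foreground signals under the affine model.
fg_r = a_r + b_r * x
fg_g = a_g + b_g * x

# Assumption: background estimates recover 90% of each channel offset.
bg_r, bg_g = 0.9 * a_r, 0.9 * a_g

m_raw = np.log2(fg_r / fg_g)
m_sub = np.log2((fg_r - bg_r) / (fg_g - bg_g))

# Spread of M across intensities as a crude measure of curvature.
spread_raw = m_raw.max() - m_raw.min()
spread_sub = m_sub.max() - m_sub.min()

print("spread of M without background subtraction:", spread_raw)
print("spread of M with background subtraction:   ", spread_sub)
```

If the background estimates matched the offsets exactly, the log-ratios would collapse to the constant $\log_2(b_R/b_G)$; any mismatch leaves residual curvature.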

Result of a (relative) negative translation

If too much background is subtracted, or a threshold has to be passed before the reverse transcription takes place, one can imagine that $a_G, a_R < 0$. Negative bias also applies if the observed signals are compared, not to the true signals, but to the signals obtained by another measuring technique that has a larger bias. Examples of such comparisons can be two-color microarray data compared to oligonucleotide (Affymetrix) data or two-color microarray data compared to QRT-PCR data. Negative bias may also be observed when control clones, spike-ins, negative and positive controls etc. are compared to the genes/ESTs of interest. The effect of a negative translation is depicted in Figure 8. The fan-out effect in the fold-change curves for the lower intensities is due to the negative translation. Note that this should not be mistaken for the fan-out effect due to decreasing signal-to-noise levels, in the same way as lack of a fan-out effect due to a positive offset should not be mistaken for low noise.
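To isolate the fan-out effect from curvature, the following sketch applies the same offset to both channels with unit scale factors. This is a simplifying assumption made for the illustration (it is not the parameter setting of Figure 8), and `observed_log_ratio` is a hypothetical helper.

```python
import numpy as np

def observed_log_ratio(x_g, true_m, offset):
    """Log-ratio after adding the same offset to both channels (unit scale)."""
    x_r = x_g * 2.0 ** true_m
    return np.log2((x_r + offset) / (x_g + offset))

# Intensity range chosen so that all translated signals stay positive.
x_g = np.logspace(2.5, 4, 30)
true_ms = np.array([-2.0, -1.0, 1.0, 2.0])

# One row per true log-ratio, one column per intensity.
m_neg = np.array([observed_log_ratio(x_g, m, offset=-50.0) for m in true_ms])
m_pos = np.array([observed_log_ratio(x_g, m, offset=+50.0) for m in true_ms])

# Compare magnitudes at the lowest intensity (first column).
print("true |M|:               ", np.abs(true_ms))
print("negative offset, low A: ", np.abs(m_neg[:, 0]))
print("positive offset, low A: ", np.abs(m_pos[:, 0]))
```

With a negative offset the observed log-ratios at low intensities are more extreme than the true ones (fan-out), whereas a positive offset compresses them toward zero.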

Figure 8

Affine transformation with negative translation. Affine transformation of the red and the green signals with negative translation where $(a_G, a_R) = (-87, -24)$ and $(b_G, b_R) = (1.4, 0.8)$. Left: Log-ratios as a function of log-intensities for certain fold changes. Right: Translated log-ratios versus true log-ratios. The slope of a line fitted in the M versus M plot will be larger than one, which is due to the negative translation. The grid and the fold-change curves in the left graph, and the intensity curves in the right graph, have been truncated such that $x_R, x_G \ge 1$.

Figure 9

Log-ratios versus log-intensities before and after a robust affine normalization. Left: Non-normalized data. Spike-ins designed to have $\log_2 r = +2$, $0$, and $-2$ are highlighted in red, yellow and green, respectively. Middle: Affine normalization utilizing constraint (32), resulting in no negative signals. Parameter estimates used in back transformation are $(\hat{a}_G, \hat{a}_R, \log_2 \hat{\beta}) = (39.0, 22.0, -0.418)$.
Right: Affine normalization where 5% (default) negative signals have been allowed. Parameter estimates used in back transformation are $(\hat{a}_G, \hat{a}_R, \log_2 \hat{\beta}) = (45.7, 27.0, -0.418)$. The rotated binning effects of data points at low intensities are due to (unnecessary) rounding of average spot pixel intensity to the nearest integer by the image analysis software.

Robust affine normalization

From the above discussion, it is clear that it is essential to correct for channel offsets when normalizing gene expression data. For two-channel data, we can obtain estimates of $a_R$, $a_G$ and $\beta$ as follows. For non-differentially expressed genes (without noise) we have that

$$
y_{R,i} = \alpha + \beta y_{G,i}; \quad \forall i
\tag{28}
$$

with $\alpha = a_R - \beta a_G$ and $\beta = b_R/b_G$. Define $\mathbf{y} = \{\mathbf{y}_i\}_{i=1}^{I}$ where $\mathbf{y}_i = (y_{G,i}, y_{R,i})$ and let

$$
Q(\alpha, \beta; \mathbf{y}) = \sum_{i=1}^{I} w_i \, d_i(\alpha, \beta; \mathbf{y}_i)^2
\tag{29}
$$

be our objective function where $d_i(\alpha, \beta; \mathbf{y}_i) > 0$ is the orthogonal Euclidean distance between $\mathbf{y}_i$ and the line $L(\alpha, \beta)$ with intercept $\alpha$ and slope $\beta$. The estimates of $\alpha$ and $\beta$ are then

$$
(\hat{\alpha}, \hat{\beta}) = \arg\min_{(\alpha, \beta)} Q(\alpha, \beta; \mathbf{y}).
\tag{30}
$$

With $w_i = 1$ for all observations we obtain standard principal component analysis (PCA), which minimizes the orthogonal distances in the $L_2$ norm [35]. With $w_i \ne 1$, (sample-) weighted PCA (WPCA), a special case of generalized PCA, is obtained [35, 36]. With weights $w_i = 1/(d_i(\hat{\alpha}, \hat{\beta}; \mathbf{y}_i) + \delta)$ we can minimize the distances in the $L_1$ norm, if we let $\delta \to 0^+$. The distance $d_i(\hat{\alpha}, \hat{\beta}; \mathbf{y}_i)$, whose square equals the sum of squares of the values of all but the first principal component, was first suggested by [37]. Thus, our choice of weight function down-weights outliers as defined by [37] in order to obtain a robust estimate of $L(\alpha, \beta)$ corresponding to the first principal component. Our procedure is related to principal component analysis applied to an M-estimator of the covariance (scatter) matrix of data.
The main difference is that we use weights $w = w(d_i) = 1/(\delta + d_i)$ based on the orthogonal distance $d_i$ from $\mathbf{y}_i$ to $L(\alpha, \beta)$, whereas for M-estimation one uses weights $w = w(d_i)$ based on a robustified Mahalanobis distance of $\mathbf{y}_i$, which is computed from an M-estimator of the covariance matrix of data. M-estimation of location and scatter was first defined by [38], and subsequently applied to principal component analysis by [39]. For other more recent papers on robust multivariate analysis, see [40, 41] and the references therein. Alternative robust estimators can be obtained by choosing other weight functions $w(d_i)$, but we choose to optimize in $L_1$. Moreover, if one suspects a non-symmetric distribution of data points around the line, a trimmed version of the weight function may be considered. In practice, the above optimization can be performed by an iteratively reweighted principal component analysis (IWPCA) scheme. For iteration $l = 1, 2, \ldots$, minimize (29) using WPCA where $w_i^{(1)} = 1$ and $w_i^{(l+1)} = 1/(d_i(\alpha^{(l)}, \beta^{(l)}; \mathbf{y}_i) + \delta)$, with $\delta$ being a small positive number to avoid infinite weights.
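The IWPCA scheme can be sketched for the two-channel case as follows. This is a minimal illustration, not the implementation in the aroma package: the function names, the stopping rule, the default `delta` and the simulated data (including the 5% of spots made four-fold brighter in red to act as outliers) are all assumptions, and a non-vertical line is assumed so that the slope is finite.

```python
import numpy as np

def weighted_pca_line(y, w):
    """Weighted orthogonal line fit to 2-D points; returns (alpha, beta, d)."""
    center = np.average(y, axis=0, weights=w)
    yc = y - center
    scatter = (yc * w[:, None]).T @ yc / w.sum()   # weighted scatter matrix
    _, eigvecs = np.linalg.eigh(scatter)           # eigenvalues ascending
    direction = eigvecs[:, -1]                     # first principal component
    normal = eigvecs[:, 0]                         # unit normal to the line
    beta = direction[1] / direction[0]             # assumes a non-vertical line
    alpha = center[1] - beta * center[0]
    d = np.abs(yc @ normal)                        # orthogonal distances
    return alpha, beta, d

def iwpca_line(y_g, y_r, delta=1e-3, max_iter=100, tol=1e-8):
    """Iteratively reweighted PCA with weights 1 / (d + delta)."""
    y = np.column_stack([y_g, y_r]).astype(float)
    w = np.ones(len(y))                            # first iteration: w_i = 1
    alpha, beta = np.inf, np.inf
    for _ in range(max_iter):
        new_alpha, new_beta, d = weighted_pca_line(y, w)
        converged = abs(new_alpha - alpha) + abs(new_beta - beta) < tol
        alpha, beta = new_alpha, new_beta
        if converged:
            break
        w = 1.0 / (d + delta)                      # weights for next iteration
    return alpha, beta

# Simulated two-channel data with hypothetical affine parameters.
rng = np.random.default_rng(1)
x = 2.0 ** rng.uniform(6, 14, size=3000)
a_g, a_r, b_g, b_r = 200.0, 20.0, 1.4, 0.8
y_g = a_g + b_g * x + rng.normal(0, 5, x.size)
y_r = a_r + b_r * x + rng.normal(0, 5, x.size)
y_r[:150] *= 4.0                                   # 5% outlying spots

beta_true = b_r / b_g
alpha_true = a_r - beta_true * a_g

alpha_hat, beta_hat = iwpca_line(y_g, y_r)
_, beta_pca, _ = weighted_pca_line(np.column_stack([y_g, y_r]), np.ones(x.size))

print("true   (alpha, beta):", alpha_true, beta_true)
print("IWPCA  (alpha, beta):", alpha_hat, beta_hat)
print("plain PCA slope:     ", beta_pca)
```

The unweighted fit is pulled toward the outlying spots, while the reweighted fit is expected to stay close to the line followed by the bulk of the data.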

As a last step, in order to get estimates of the four parameters $a_R$, $a_G$, $b_R$, and $b_G$ from the two parameter estimates $\hat{\alpha}$ and $\hat{\beta}$, we introduce additional constraints. Let $y_{c,(1)} = \min_i y_{c,i}$ for $c \in \{R, G\}$ and choose

$$
\hat{b}_G = 1, \qquad \hat{b}_R = \hat{\beta}
\tag{31}
$$

$$\hat{a}_G = \max\{a_G;\ a_G < y_{G,(1)} \wedge \hat{\alpha} + \hat{\beta} a_G < y_{R,(1)}\}, \qquad \hat{a}_R = \hat{\alpha} + \hat{\beta}\hat{a}_G \qquad (32)$$

to be the estimates of the bias and the scale parameters in model (3). Constraint (32) is only correct in the noise-free case. If we allow noise, say

$$y_{c,i} = a_c + b_c x_{c,i} + \varepsilon_{c,i}, \qquad (33)$$

where $E[\varepsilon_{c,i}] = 0$ and $V[\varepsilon_{c,i}] = \sigma_{c,i}^2$ for $c \in \{R, G\}$, it is possible that the bias terms $a_R$ and $a_G$ are larger than the smallest observed value in the respective channel. This is especially important if the distributions of $\varepsilon_{c,i}$ for $c \in \{R, G\}$ have heavy negative tails. An alternative, which introduces negative estimates, is to replace $y_{c,(1)}$ in (32) with $y_{c,(j)}$ for some order index $(j)$ such that $j - 1$ non-positive signals are obtained in channel $c$. Choosing an optimal value of $j$ is currently being investigated by the authors, but is beyond the scope of this article. Furthermore, it has been observed that the noise in each channel is roughly proportional to the signal strength, that is, $\sigma_{c,i} \propto x_{c,i}$. Thus, a positive side effect of the above estimation algorithm is that, in contrast to using equal weights for all spots ($w_i = 1$), more weight is given to low-intensity spots than to high-intensity ones. This makes the method more robust to saturation and other non-linear effects that might occur at high intensities, effects for which classical line fits, which rely on homoscedasticity, would fail.
Finally, with backward transformation (4) based on the estimates $(\hat{a}_G, \hat{a}_R, \hat{b}_G, \hat{b}_R)$, data is translated and rotated such that it falls around the diagonal line that goes through (0, 0) and (1, 1).
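The step from the fitted line to the four parameter estimates, followed by the backward transformation, might look as follows. This is a sketch under assumptions: the function names and the argument `n_negative` are hypothetical, the backward transformation (4) is assumed to have the form x = (y − a)/b, the slope is assumed positive, and the supremum of the admissible set is used in place of the strict maximum in (32).

```python
def affine_parameters(y_g, y_r, alpha, beta, n_negative=0):
    """Turn a fitted line y_R = alpha + beta * y_G into (a_G, a_R, b_G, b_R).

    n_negative = 0 uses the smallest observation in each channel, as in the
    noise-free constraint (32); a larger value uses a higher order statistic
    and therefore lets some signals become negative after normalization.
    """
    g_ref = sorted(y_g)[n_negative]   # order statistic of the green channel
    r_ref = sorted(y_r)[n_negative]   # order statistic of the red channel
    b_g, b_r = 1.0, beta              # constraint (31)
    # Largest a_G with a_G <= g_ref and alpha + beta * a_G <= r_ref (beta > 0).
    a_g = min(g_ref, (r_ref - alpha) / beta)
    a_r = alpha + beta * a_g
    return a_g, a_r, b_g, b_r


def backtransform(y, a, b):
    """Assumed form of the backward transformation: x = (y - a) / b."""
    return [(v - a) / b for v in y]
```

For noise-free data generated with y_G = 5 + x and y_R = 8 + 2x, the fitted line has α = −2 and β = 2. The sketch then returns bias estimates equal to the smallest observations, which differ from the true biases, but the back-transformed channels coincide, so all points fall on the diagonal.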

To illustrate the affine normalization method we have applied it to six two-color microarray data sets, each containing 240 spike-in controls designed to have log2 r = (-2, 0, +2) at various intensities. See also Methods. These controls were not used to estimate the normalization parameters. As shown in Figure 9, which is for one of the arrays, there is a small curvature for non-differentially expressed genes (and spike-ins) before normalization, a curvature that corresponds to $-\hat{\alpha} \approx +7 > 0$ (small positive derivative) at log2 r = 0, cf. (10). More importantly, the intensity-dependent effect is profound for the log2 r = ± 2 controls. Affine normalization allowing no negative signals removes curvature (α ≈ 0) for log2 r = 0, but not for the log2 r = ± 2 controls, which indicates equal affine transformation in both channels, cf. right graph of Figure 6. If 5% negative signals are allowed, the log-ratios of all controls become roughly independent of intensity, which indicates that the observed signals are proportional to the concentrations of the spike-ins. All six arrays in this study show very similar properties.

Generalization to multiple channels and multiple arrays

A multi-dimensional version of the above algorithm can be summarized as follows. Say there are N arrays, each hybridized with K samples (colors), such that there are in total C = NK channels. Let $y_i = (y_{1,i}, \ldots, y_{K,i}, \ldots, y_{(N-1)K+1,i}, \ldots, y_{NK,i})$ be the NK observations for gene i. Thus, $\{y_i\}_i$ spans an NK-dimensional space. Analogously to the above two-dimensional procedure, we can fit a robust line L through data in $\mathbb{R}^{NK}$ and constrain the estimate of $a = (a_1, \ldots, a_{NK})$ by enforcing that $a < y_i$ for all $i$, where < is the component-wise inequality. Backward transformation (4) translates and rotates data such that it lies along the diagonal line. By normalizing all arrays at once, signals from all hybridizations are brought onto the same scale, and no further so-called between-slide scale normalization is needed.
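A rough sketch of the robust line fit in an arbitrary number of channels is given below. It is an illustration only: the function names are hypothetical, the first principal axis is found by plain power iteration (an assumption made to keep the example free of external libraries), and a fixed number of reweighting steps is used instead of a convergence criterion.

```python
import math


def weighted_first_pc(rows, weights, n_power_iter=200):
    """Weighted centre and first principal direction of C-dimensional rows."""
    dim = len(rows[0])
    total = sum(weights)
    center = [sum(w * r[k] for w, r in zip(weights, rows)) / total
              for k in range(dim)]
    # Weighted scatter matrix around the weighted centre.
    scatter = [[sum(w * (r[j] - center[j]) * (r[k] - center[k])
                    for w, r in zip(weights, rows))
                for k in range(dim)] for j in range(dim)]
    # Power iteration for the leading eigenvector.
    u = [1.0 / math.sqrt(dim)] * dim
    for _ in range(n_power_iter):
        v = [sum(scatter[j][k] * u[k] for k in range(dim)) for j in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        u = [x / norm for x in v]
    return center, u


def distance_to_line(row, center, u):
    """Orthogonal distance from a point to the line through center along u."""
    diff = [x - c for x, c in zip(row, center)]
    proj = sum(d * ui for d, ui in zip(diff, u))
    return math.sqrt(max(sum(d * d for d in diff) - proj * proj, 0.0))


def iwpca_line_nd(rows, delta=0.02, n_iter=20):
    """Iteratively reweighted PCA line fit in C dimensions."""
    weights = [1.0] * len(rows)
    for _ in range(n_iter):
        center, u = weighted_first_pc(rows, weights)
        weights = [1.0 / (distance_to_line(r, center, u) + delta) for r in rows]
    return center, u
```

The ratios between the components of the returned direction play the role of the relative scale factors between channels, in the same way that β relates the two channels in the two-dimensional case.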

To apply the multi-dimensional normalization, the assumption that most genes are non-differentially expressed for all possible hybridization/channel pairs must be added. For most experimental setups this is not a problem. For instance, in two-channel microarray experiments it is common to hybridize one test sample and one reference sample, which is selected such that it does not differ too much from the test sample, to the same array. The same reference is then used between arrays (in either channel). Thus, since each test-reference pair is "close" to each other, all test-test pairs should be approximately "close" to each other too. Alternatively, all reference channels can be normalized together. Then, keeping the reference signals fixed, each test channel is normalized toward the corresponding reference channel.

An implementation of the above algorithm is made available in the R [42] package named aroma [43], which is platform independent. In addition, the methods are available as an R plugin [44] for BASE [45]. A typical call is normalizeAffine(rg), which will normalize all arrays and all channels in the microarray object rg at once. The first parameter that has to be specified in the above algorithm is δ. However, its value is not critical and we have found that, for instance, δ = 0.02 works well in general and is therefore the default value. The second parameter to be specified is the number of negative signals allowed after normalization. By default the method allows 5% negative signals, but any fraction (or absolute number) of negative signals can be specified. Moreover, the method can be applied separately to any subset of genes, such as print-tip groups, clone groups and spike-ins. Finally, support for datapoint weights has been implemented so that the influence each spot has in the estimation procedure can be specified (not to be mistaken for the iterative weights above). Such weights may for instance be calculated from spot quality measures obtained by image analysis methods.

Discussion

If we compare the robust affine normalization method with the perpendicular and the parallel translation normalization methods optimized by minimizing the curvature, we find that there are similarities, because minimizing the curvature is identical to finding estimates of the bias parameters along the line L(α, β; y). Assuming a pure affine transformation, there are also similarities to the curve-fit method, which fits approximately the same line (curve) through data. The difference is how data is transformed to meet the assumptions. The affine method translates and rescales data in the original domain whereas the curve-fit method operates in a rotated and log-transformed domain.

Moreover, the translation and the curve-fit methods rely on two-dimensional data (log-ratios) and it is not clear how to generalize them to multi-dimensional data, although re-iterative versions such as the cyclic loess [31] and the (multi-dimensional) contrast based method [46] have been suggested. Our affine normalization method is not limited to two-dimensional data, but can be applied to any number of channels, which means that three- and four-color microarray data can be normalized as easily as two-color data.

It is interesting to note the close relationship between the quantile and the affine normalization method. In quantile normalization data points are shifted such that the sample densities of both channels are made identical. This results in new measurement functions, which may not be linear (or affine), but for which log-ratios for non-differentially expressed genes are zero. The affine normalization method can be thought of as a quantile normalization method with special constraints on the underlying densities. An interesting continuation of the affine method and quantile normalization method is to relax the affine constraint by using other parametric or semi-parametric models. One possibility is to add smoothness constraints to the transformation functions using smoothing splines [25].
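To make the comparison concrete, a minimal sketch of quantile normalization is shown below. It assumes one common variant, in which the shared target distribution is the mean of the sorted channels; the function name is hypothetical and ties are broken arbitrarily.

```python
def quantile_normalize(channels):
    """Give all channels the same empirical distribution.

    The target distribution is the average of the sorted channels; each
    value is replaced by the target value of the same rank.
    """
    n = len(channels[0])
    sorted_channels = [sorted(ch) for ch in channels]
    target = [sum(sc[k] for sc in sorted_channels) / len(channels)
              for k in range(n)]
    normalized = []
    for ch in channels:
        order = sorted(range(n), key=lambda i: ch[i])  # indices by rank
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = target[rank]
        normalized.append(out)
    return normalized
```

If one channel is an increasing affine function of the other, for example red = 3 + 2 · green, the ranks agree in both channels and the normalized channels become identical, so all log-ratios are zero. In that special case the result matches what an affine correction aims for, although the transformation that quantile normalization applies is in general not affine.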

In previous sections, we did not discuss the variance stabilizing methods suggested by [12, 47, 48], which are based on error models that also contain channel-specific bias terms. Thus, those methods do indeed correct for intensity-dependent effects. Because they are based on specific error models and target hypothesis testing of non-differentially expressed genes, but also because they stabilize the log-ratio variances, they do not fit well into the above deterministic discussion. In addition, stabilizing the variance introduces bias for differentially expressed genes, which is not useful if absolute expression levels are of interest. However, we do believe that the directions drawn up by their underlying error models are promising.

Moreover, in the spirit of [20], it would be interesting to incorporate an empirical Bayes component to allow for non-positive signals more naturally.

An interesting study on microarray scanner calibration curves [19] was published while this article was under submission. From their results on under-estimated log-ratios and propeller-shaped log-ratio versus log-ratio scatter plots, we suspect that they observe nothing but affine-transformed signals. It would be of great interest to redo their analysis with affine normalization.

Finally, offset and scale parameters in (3) can be extended to incorporate, say, spatial structures by replacing them with $a_c(u_i)$ and $b_c(u_i)$, where $u_i = (u_{i,x}, u_{i,y})$ is the spatial position of spot $i$.

Conclusion

We have proposed a robust non-parametric normalization method for affine transformed gene-expression data, which centers and symmetrizes log-ratios at all intensities. Symmetric log-ratios are fundamental for statistical tests on non-differentially expressed genes, typically utilizing t-tests or similar. In addition and contrary to other normalization methods (except quantile normalization), which are exclusively for paired channels, the method applies equally well to multi-array and multi-channel data. We believe that normalization based on affine transformations, such as our proposed IWPCA method, is very promising and has the potential of being used for many microarray applications. However, more comparison with other normalization methods is needed to fully understand its advantages and disadvantages.

Methods

Log-ratios as a function of log-intensities

Let $x_g = b_G x_G \geq 0$. Equation (6) for affine transformations (3) can then be written as

$$A = \frac{1}{2} \log_2\left[(a_R + r\beta x_g)(a_G + x_g)\right]$$

with $\beta = b_R/b_G$ and $r = x_R/x_G$. After a few steps, one gets that

$$x_g = (r\beta)^{-1}\left(-\frac{1}{2}(a_R + r\beta a_G) + \sqrt{\frac{1}{4}(a_R - r\beta a_G)^2 + r\beta\, 2^{2A}}\right).$$
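The intermediate steps are not spelled out in the text. One way to reconstruct them, assuming $r\beta > 0$ and $2^{2A} \geq a_R a_G$ (which holds when $x_g \geq 0$ and both bias terms are non-negative), is to treat the expression for $A$ as a quadratic equation in $x_g$ and keep its non-negative root:

```latex
\begin{align*}
2^{2A} &= (a_R + r\beta x_g)(a_G + x_g) \\
0 &= r\beta\, x_g^2 + (a_R + r\beta a_G)\, x_g + a_R a_G - 2^{2A} \\
x_g &= \frac{-(a_R + r\beta a_G) \pm
      \sqrt{(a_R + r\beta a_G)^2 - 4 r\beta\,(a_R a_G - 2^{2A})}}{2 r\beta} \\
(a_R + r\beta a_G)^2 - 4 r\beta\, a_R a_G &= (a_R - r\beta a_G)^2 \\
x_g &= (r\beta)^{-1}\left(-\tfrac{1}{2}(a_R + r\beta a_G)
      + \sqrt{\tfrac{1}{4}(a_R - r\beta a_G)^2 + r\beta\, 2^{2A}}\right)
\end{align*}
```

Under the stated assumptions the product of the two roots, $(a_R a_G - 2^{2A})/(r\beta)$, is non-positive, so the root with the positive sign in front of the square root is the only non-negative one.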

It follows that

$$\begin{aligned}
a_G + b_G x_G &= a_G + x_g \\
&= (r\beta)^{-1}\left(-\frac{1}{2}\alpha(r) + \sqrt{\frac{1}{4}[\alpha(r)]^2 + r\beta\, 2^{2A}}\right) \\
a_R + b_R x_R &= a_R + r\beta x_g \\
&= \frac{1}{2}\alpha(r) + \sqrt{\frac{1}{4}[\alpha(r)]^2 + r\beta\, 2^{2A}}
\end{aligned}$$

with $\alpha(r) = a_R - r\beta a_G$. Equation (9) follows immediately.
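The closed-form expressions in this section can be checked numerically. The sketch below uses hypothetical function names and arbitrary illustrative parameter values. It computes A from a known $x_g$, inverts it, and recovers the two observed signals as functions of A.

```python
import math


def log_intensity(x_g, r, beta, a_r, a_g):
    """A = 0.5 * log2[(a_R + r*beta*x_g) * (a_G + x_g)]."""
    return 0.5 * math.log2((a_r + r * beta * x_g) * (a_g + x_g))


def x_g_from_log_intensity(A, r, beta, a_r, a_g):
    """Invert the relation above for x_g (non-negative root)."""
    rb = r * beta
    root = math.sqrt(0.25 * (a_r - rb * a_g) ** 2 + rb * 2.0 ** (2.0 * A))
    return (-0.5 * (a_r + rb * a_g) + root) / rb


def observed_signals(A, r, beta, a_r, a_g):
    """Return (a_G + x_g, a_R + r*beta*x_g) as functions of A."""
    rb = r * beta
    alpha_r = a_r - rb * a_g          # alpha(r) = a_R - r*beta*a_G
    root = math.sqrt(0.25 * alpha_r ** 2 + rb * 2.0 ** (2.0 * A))
    return (-0.5 * alpha_r + root) / rb, 0.5 * alpha_r + root
```

The observed log-ratio at a given log-intensity A and true ratio r can then be obtained as the base-2 logarithm of the ratio of the two returned signals.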

Data

Arrays and hybridization

Six arrays were used in this study. The arrays contain Operon's Human Array-Ready Oligo Sets™ and 240 Stratagene SpotReport™ (Alien and Alien Oligo) control spots with layout of 12-by-4 print-tip groups each containing 25-by-25 spots. In total there are 30000 spots on each array. The arrays were produced by the SWEGENE DNA Microarray Resource Centre, Department of Oncology at Lund University using a MicroGrid II 600R arrayer fitted with MicroSpot 10 K pins (BioRobotics). Arrays were spotted on UltraGAPS™ coated slides (Corning Incorporated). Printing was performed in a temperature (18–20°C) and humidity (44–49% RH) controlled area. After printing was completed, arrays were left in a desiccator to dry for 48 hours, rehydrated for 1 second over steaming water, snap dried on a hot plate (98°C), UV-cross-linked (800 mJ/cm2) and subsequently hybridized with various test and reference RNA samples. Samples and Stratagene RNA spikes were labeled, purified and hybridized using Pronto!™ Plus System 6 (Corning Incorporated) according to manufacturer's instructions.

Scanning and Image analysis

The arrays were scanned on an Agilent G2505A DNA microarray scanner (Agilent Technologies) at laser power and PMT gain both at 100% and scan resolution 10 μm/pixel. The so-called dark offset intentionally added to all signals by the Agilent scanner [49, p. 18] has been uninstalled. Multiscan calibration [11] was not used for this study.

The scanned images (65536 gray scales) were analyzed using the Axon GenePix Pro v4.1.1.40 software (Axon Instruments). The median spot pixel intensity was used for the foreground signal. Background estimates were not considered in this analysis. No spot signals were discarded.

References

  1. Schena M, Shalon D, Davis RW, Brown PO: Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 1995, 270(5235):467–470.

  2. Duggan DJ, Bittner M, Chen Y, Meltzer P, Trent JM: Expression profiling using cDNA microarrays. Nature Genetics 1999, 21(1 Supplement):10–14. 10.1038/4434

  3. Rocke DM, Durbin B: A Model for Measurement Error for Gene Expression Arrays. Journal of Computational Biology 2001, 8(6):557–569. 10.1089/106652701753307485

  4. Yang YH, Dudoit S, Luu P, Lin DM, Peng V, Ngai J, Speed TP: Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Research 2002, 30(4):e15. 10.1093/nar/30.4.e15

  5. Bengtsson H: Identification and normalization of plate effects in cDNA microarray data. Preprints in Mathematical Sciences 2002:28, Mathematical Statistics, Centre for Mathematical Sciences, Lund University, Sweden; 2002.

  6. Ramdas L, Coombes KR, Baggerly K, Abruzzo L, Highsmith WE, Krogmann T, Hamilton SR, Zhang W: Sources of nonlinearity in cDNA microarray expression measurements. Genome Biology 2001, 2(11):research0047.1–0047.7. 10.1186/gb-2001-2-11-research0047

  7. Li X, Gu W, Mohan S, Baylink DJ: DNA microarrays: their use and misuse. Microcirculation 2002, 9: 13–22. 10.1038/sj.mn.7800118

  8. Burle Industries Inc: Photomultiplier Handbook. Lancaster, PA, U.S.A.; 1980.

  9. Handran S, Wang C, Aziz D: Assessing Slide Flatness. 2001.

  10. Bengtsson A, Bengtsson H: Microarray Image Analysis: Background Estimation using Quantile and Morphological Filters. BMC Bioinformatics 2006, 7(1):96. 10.1186/1471-2105-7-96

  11. Bengtsson H, Jönsson G, Vallon-Christersson J: Calibration and assessment of channel-specific biases in microarray data with extended dynamical range. BMC Bioinformatics 2004, 5:177.

  12. Huber W, von Heydebreck A, Sültmann H, Poustka A, Vingron M: Variance stabilization applied to microarray data calibration and to the quantification of differential expression. Bioinformatics 2002, 18(Suppl 1):S96–104.

  13. Kerr MK, Afshari CA, Bennett L, Bushel P, Martinez J, Walker NJ, Churchill GA: Statistical analysis of a gene expression microarray experiment with replication. In Tech rep. The Jackson Laboratory, Bar Harbor, Maine; 2001.

  14. Cui X, Kerr MK, Churchill GA: Data Transformations for cDNA Microarray Data. In Tech rep. The Jackson Laboratory, USA; 2002.

  15. Callow M, Dudoit S, Gong E, Speed T, Rubin E: Microarray Expression Profiling Identifies Genes with Altered Expression in HDL-Deficient Mice. Genome Research 2000, 10(12):2022–9. 10.1101/gr.10.12.2022

  16. Yue H, Eastman P, Wang B, Minor J, Doctolero M, Nuttall R, Stack R, Becker J, Montgomery J, Vainer M, Johnston R: An evaluation of the performance of cDNA microarrays for detecting changes in global mRNA expression. Nucleic Acids Research 2001, 29(8):E41–1. 10.1093/nar/29.8.e41

  17. Yuen T, Wurmbach E, Pfeffer RL, Ebersole BJ, Sealfon SC: Accuracy and calibration of commercial oligonucleotide and custom cDNA microarrays. Nucleic Acids Research 2002, 30.

  18. Barczak A, Rodriguez MW, Hanspers K, Koth LL, Tai YC, Bolstad BM, Speed TP, Erle DJ: Spotted long oligonucleotide arrays for human gene expression analysis. Genome Research 2003, 13(7):1775–85. 10.1101/gr.1048803

  19. Shi L, Tong W, Su Z, Han T, Han J, Puri RK, Fang H, Frueh FW, Goodsaid FM, Guo L, Branham WS, Chen JJ, Xu ZA, Harris SC, Hong H, Xie Q, Perkins RG, Fuscoe JC: Microarray scanner calibration curves: characteristics and implications. BMC Bioinformatics 2005, 6(Suppl 2):S11. 10.1186/1471-2105-6-S2-S11

  20. Kooperberg C, Fazzio TG, Delrow JJ, Tsukiyama T: Improved background correction for spotted DNA microarrays. Journal of Computational Biology 2002, 9: 55–66. 10.1089/10665270252833190

  21. Bengtsson H: Low-level analysis of microarray data. PhD thesis. Centre for Mathematical Sciences, Division of Mathematical Statistics, Lund University; 2004.

  22. Cleveland W: Robust locally weighted regression and smoothing scatterplots. Journal of the American Statistical Association 1979, 74: 829–836. 10.2307/2286407

  23. Cleveland W: LOWESS: A program for smoothing scatterplots by robust locally weighted regression. The American Statistician 1981, 35: 54. 10.2307/2683591

  24. Cleveland W, Grosse E, Shyu W: Local regression models. MIT Press/McGraw-Hill; 1992.

  25. Green P, Silverman B: Nonparametric Regression and Generalized Linear Models – A roughness penalty approach. Chapman and Hall; 1994.

  26. Newton MA, Kendziorski CM, Richmond CS, Blattner FR, Tsui KW: On differential variability of expression ratios: improving statistical inference about gene expression changes from microarray data. Journal of Computational Biology 2001, 8: 37–52. 10.1089/106652701300099074

  27. Yang YH, Dudoit S, Luu P, Speed TP: Normalization for cDNA microarray data. Technical Report 589, Department of Statistics, University of California at Berkeley; 2000.

  28. Marton MJ, DeRisi JL, Bennett HA, Iyer VR, Meyer MR, Roberts CJ, Stoughton R, Burchard J, Slade D, Dai H, Jr DEB, Hartwell LH, Brown PO, Friend SH: Drug validation and identification of secondary drug target effects using DNA microarrays. Nature Medicine 1998, 4(11):1293–1301. 10.1038/3282

  29. Kerr MK, Martin M, Churchill GA: Analysis of variance for gene expression microarray data. Journal of Computational Biology 2000, 7: 819–837. 10.1089/10665270050514954

  30. Tseng GC, Oh MK, Rohlin L, Liao JC, Wong WH: Issues in cDNA microarray analysis: quality filtering, channel normalization, models of variations and assessment of gene effects. Nucleic Acids Research 2001, 29(12):2549–2557. 10.1093/nar/29.12.2549

  31. Bolstad B, Irizarry R, Astrand M, Speed T: A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics 2003, 19(2):185–93. 10.1093/bioinformatics/19.2.185

  32. Yang YH, Thorne NP: Normalization for Two-color cDNA Microarray Data. In Science and Statistics: A Festschrift for Terry Speed, Monograph Series. Volume 40. Edited by: Goldstein DR. IMS Lecture Notes; 2003:403–418.

  33. Schena M: Microarrays Analysis. Hoboken, New Jersey: John Wiley & Sons, Inc.; 2003.

  34. Yang YH, Buckley M, Dudoit S, Speed T: Comparison of methods for image analysis on cDNA microarray data. Journal of Computational and Graphical Statistics 2002, 11: 108–136. 10.1198/106186002317375640

  35. Jolliffe I: Principal Component Analysis. Springer series in statistics, Springer-Verlag New York Inc.; 1986.

  36. Greenacre M: Theory and Applications of Correspondence Analysis. London and Orlando: Academic Press; 1984.

  37. Rao CR: The use and interpretation of principal component analysis in applied research. Sankhya Series A 1964, 26: 329–358.

  38. Maronna RA: Robust M-Estimators of Multivariate Location and Scatter. The Annals of Statistics 1976, 4: 51–67.

  39. Campbell NA: Robust procedures in multivariate analysis. I. Robust covariance estimation. Applied Statistics 1980, 29(3):231–237. 10.2307/2346896

  40. Croux C, Haesbroeck G: Principal Component Analysis based on Robust Estimators of the Covariance or Correlation Matrix: Influence Functions and Efficiencies. 2000, 87: 603–618.

  41. Pison G, Rousseeuw PJ, Filzmoser P, Croux C: Robust factor analysis. J Multivar Anal 2003, 84: 145–172. 10.1016/S0047-259X(02)00007-6

  42. R Development Core Team: R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria; 2005.

  43. Bengtsson H: aroma – An R Object-oriented Microarray Analysis environment. Preprint in Mathematical Sciences 2004:18, Mathematical Statistics, Centre for Mathematical Sciences, Lund University, Sweden; 2004.

  44. Bengtsson H: aroma.Base – A generic R plugin dispatcher for BASE. online 2005. [http://www.maths.lth.se/bioinformatics/]

  45. Saal LH, Troein C, Vallon-Christersson J, Gruvberger S, Borg Å, Peterson C: BioArray Software Environment (BASE): a platform for comprehensive management and analysis of microarray data. Genome Biology 2002, 3(8):SOFTWARE0003. 10.1186/gb-2002-3-8-software0003

  46. Åstrand M: Contrast Normalization of Oligonucleotide Arrays. Journal of Computational Biology 2003, 10: 95–102. 10.1089/106652703763255697

  47. Durbin B, Hardin J, Hawkins D, Rocke D: A variance-stabilizing transformation for gene-expression microarray data. Bioinformatics 2002, 18: S105-S110.

  48. Rocke DM, Durbin B: Approximate variance-stabilizing transformations for gene-expression microarray data. Bioinformatics 2003, 19(8):966–72. 10.1093/bioinformatics/btg107

  49. Agilent Technologies Inc.: Agilent G2565AA and Agilent G2565BA Microarray Scanner System – User Manual. third, Palo Alto, CA; 2002.

  50. Jögi A, Vallon-Christersson J, Holmquist L, Åke Borg HA, Påhlman S: Human neuroblastoma cells exposed to hypoxia: induction of genes associated with growth, survival, and aggressive behavior. Experimental Cell Research 2004, 295(2):469–87. 10.1016/j.yexcr.2004.01.013

Acknowledgements

This work would not have been achieved without scientific support from Terry Speed at UC Berkeley and Walter and Eliza Hall Institute of Medical Research (WEHI), Patyaksha Wirapati (at the time at WEHI), Gordon Smyth (WEHI), and Halfdan Grage (at the time at Lund University). While at UC Berkeley (2000) and WEHI (2002), HB was financially supported by The Swedish Foundation for International Cooperation in Research and Higher Education (STINT), The Fulbright Commission, The Foundation Blanceflor Boncompagni-Ludovisi née Bildt, The Royal Swedish Academy of Sciences, and The Royal Physiographic Society in Lund. OH was financially supported by the Swedish Research Council. Microarray data was kindly provided by the SWEGENE DNA Microarray Resource Center at the BioMedical Center B10 in Lund, supported by the Knut and Alice Wallenberg foundation through the SWEGENE consortium. We also wish to thank the reviewers for feedback improving this manuscript.

Author information

Corresponding author

Correspondence to Henrik Bengtsson.

Additional information

Authors' contributions

HB drafted the first version of the manuscript. Both authors contributed equally to the study and the final version of the manuscript.

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Bengtsson, H., Hössjer, O. Methodological study of affine transformations of gene expression data with proposed robust non-parametric multi-dimensional normalization method. BMC Bioinformatics 7, 100 (2006). https://doi.org/10.1186/1471-2105-7-100

  • DOI: https://doi.org/10.1186/1471-2105-7-100

Keywords