Abstract
Background
As the use of microarray technology becomes more prevalent it is not unusual to find several laboratories employing the same microarray technology to identify genes related to the same condition in the same species. Although the experimental specifics are similar, typically a different list of statistically significant genes result from each data analysis.
Results
We propose a statisticallybased metaanalytic approach to microarray analysis for the purpose of systematically combining results from the different laboratories. This approach provides a more precise view of genes that are significantly related to the condition of interest while simultaneously allowing for differences between laboratories. Of particular interest is the widely used Affymetrix oligonucleotide array, the results of which are naturally suited to a metaanalysis. A simulation model based on the Affymetrix platform is developed to examine the adaptive nature of the metaanalytic approach and to illustrate the usefulness of such an approach in combining microarray results across laboratories. The approach is then applied to real data involving a mouse model for multiple sclerosis.
Conclusion
The quantitative estimates from the metaanalysis model tend to be closer to the "true" degree of differential expression than any single lab. Metaanalytic methods can systematically combine Affymetrix results from different laboratories to gain a clearer understanding of genes' relationships to specific conditions of interest.
Background
Microarray technology allows simultaneous assessment of transcript abundance for thousands of genes. This exciting research tool permits the identification of genes which are significantly differentially expressed between conditions. With the use of microarrays becoming more commonplace, it is not unusual for several different laboratories to investigate the genetic implications of the same condition(s). Each lab may produce its own list of candidate genes which they believe to be related to the condition of interest. As a result of sound statistical approaches, each lab will also have for each candidate gene some quantitative measure that serves as the basis for the claim of statistical significance.
Of interest in this paper are the methods by which these quantitative measures may be combined across labs to arrive at a more comprehensive understanding of the effects of the different candidate genes. Where the term "analysis" is used to describe the quantitative approaches to draw useful information from raw data, the term "metaanalysis" [1] refers to the approaches used to draw useful information from the results of previous analyses. Metaanalysis has been predominantly used in the medical and social sciences, in situations where several studies may have been conducted to investigate the effect of the same treatment, and the researcher seeks to combine the results of the different studies in a meaningful way in order to arrive at a single estimate of the true effect of the treatment. For the current application, metaanalytic approaches can be employed to combine the results from several different labs without having access to the original raw data that yielded the initial results. Such approaches have particular utility with the results of Affymetrix GeneChip^{® }microarrays and other fabricated arrays, where results are given in a uniform format that readily lends itself to comparison between labs and combination across labs.
A measure of the degree or magnitude of differential expression provides more information regarding a gene's relation to a disease or condition of interest than does a statement regarding its significance or nonsignificance. This information is useful because it allows for greater precision of estimation of the gene's effect with respect to the condition of interest. That is, to arrive at a clearer understanding of a gene's true effect relating to the condition of interest, it is most helpful to have a quantitative measure of the magnitude of differential expression rather than a simple declaration of significance.
Prior applications of metaanalysis to microarray data have either sought to combine Pvalues or to combine results across platforms (i.e., combining Affymetrix and cDNA array results) [26]. Combining only Pvalues, while useful in obtaining more precise estimates of significance, does not provide information that is easily interpretable by a biologist, may not indicate the direction of significance (e.g., up or downregulation), and most importantly, gives no information regarding the magnitude of the estimated expression change. Similarly, while a "votecounting" approach based on Pvalues [6] addresses differences in lists of significant genes from separate experiments, it gives no information regarding the magnitude of the estimated expression change. While an "integrative correlation" approach [5] will help identify genes with reproducible expression patterns, it also does not provide any information regarding the magnitude of the estimated expression change
Previous attempts to combine results across microarray platforms (i.e., technologies) assume that spot intensities or signal values for a given gene can be directly compared even though they represent different segments of the gene. That is, a spot for a given gene on a cDNA array represents the entire gene, while each spot for the same gene on an Affymetrix array represents a specific small section of the gene. Thus, combining results across technologies using only spot intensities is problematic from a biological perspective because the measurements represent different physical quantities. Even if the average spot intensity on an Affymetrix array is used, it is not certain that this average spot intensity value is at all comparable to the spot intensity value of the gene on a cDNA array.
Moreau et al. [4] report that 'after appropriate filtering, ratio and intensity data from different platforms can be compared and are amenable to' be used in a metaanalysis. However, "filtering", or "averaging out" outliers or nonreproducible spots, requires some subjectivity in the method of choice and may force agreement between platforms where no agreement should exist due to fundamental technological differences. Parmigiani et al. [5] attempt to address this problem of crossplatform consistency by identifying a set of genes whose expression patterns are essentially reproducible across platforms. However, even for these "reproducible" genes, there remains the question of how to systematically combine their corresponding results from the several laboratories to arrive at a single quantitative measure of differential expression. At the very least, if results are to be combined across platforms in a metaanalysis, the use of covariates [7] should be employed to account for the underlying differences between oligonucleotide (e.g., Affymetrix) and cDNA platforms. The focus of the current application is restricted to standard Affymetrix microarray results, and a method to combine results across laboratories is proposed and evaluated.
Results
Affymetrix technology
The Affymetrix GeneChip^{® }microarray [8] represents individual genes by 25mer segments (probes) fixed to the chip, and also makes use of mismatch probes differing at position 13. Each gene on the chip is typically represented by the same number of probe pairs on the chip (usually 14–20), although exceptions exist. It is now possible for some organisms' entire genomes to be represented on a single microarray (e.g., Arabidopsis). Appropriately prepared tissue sample is hybridized to the array and the array is scanned, producing raw data consisting of the intensities of the individual spots on the array. These intensities come in pairs, with PM denoting the intensity of a perfectmatch probe and MM denoting the intensity of the corresponding mismatch probe.
Affymetrix algorithms
Affymetrix has developed statistical algorithms [9] that employ these individual spot intensities for the purpose of estimating the true expression levels of individual genes in single samples. Furthermore, the Affymetrix approach compares gene expression levels in two different tissues (samples or treatment conditions) and reports a "signal log ratio" (SLR) with 95 percent confidence bounds. The signal log ratio is the signed Iog_{2 }of the signed fold change (FC) familiar to biologists [9]. That is, FC = 2^{SLR }if SLR ≥ 0 and FC = (1)2^{SLR }if SLR < 0. The algorithm used by Affymetrix to compute the SLR is based on Tukey's biweight algorithm [10] and for each gene takes a weighted average of the log_{2 }of the ratio of PM  MM between treatments or conditions, with weights related to the deviations from the median log_{2 }ratio for the gene, and with adjustments made when PM <MM. The resulting weighted average is the SLR.
Between the two conditions of interest, each gene either changes its level of expression or the level remains the same. A declaration of significant differential expression results from sufficient evidence that the gene is not expressed the same in the two conditions (i.e., that the SLR differs significantly from zero). Tukey's biweight algorithm provides an estimate for the variability of the SLR and an approximate distribution for the SLR estimate. The Affymetrix software (Microarray Suite Version 5.0, or MAS 5.0) reports a 95 percent confidence interval for the SLR [11], from which the estimated standard error can be computed. Individual laboratories can use this information to make a declaration of significant differential expression.
It should be noted that this SLR estimate represents a measure of differential expression between two chips (generically referred to as a base sample chip and an experimental sample chip). In practice, of course, it is recommended that experiments involve more than two chips, but the current MAS 5.0 algorithm is designed to represent differential expression only between two chips at a time. Other approaches exist to measure differential expression (dChip [12] and RMA [13], for example), and future work will evaluate their performance in metaanalyses. However, for the purposes of this current work, the focus of differential expression will rely on the SLR estimate of differential expression between two chips, because these are the estimates provided automatically by the commercial MAS 5.0 software.
If we let denote the estimate of the SLR θ_{i,k }for gene k in lab i, and be the upper bound for the 95 percent confidence interval for the same, then both and are reported by the Affymetrix software. The Affymetrix documentation [9] gives the estimated standard error of as s_{i,k }= (  ) / t_{i,k}(.975), where t_{i,k}(.975) is the upper .025 critical value of the t distribution with df_{i,k }degrees of freedom, where df_{i,k }= max(0.7(n_{i,k } 1), 1), with n_{i,k }representing the number of probe pairs representing gene k on each array in study i. The estimated variance of the SLR estimate is v_{i,k }= .
Accordingly, lab i could then test for significant differential expression of gene k (i.e., test the hypothesis : θ_{i,k }= 0) by use of the test statistic A_{i,k }= /s_{i,k}. Under , A_{i,k }approximately follows the t distribution with df_{i,k }degrees of freedom. The significance Pvalue (P_{i,k}) for gene k in lab i is the value such that A_{i,k} is the upper P_{i,k}/2 critical value of the t distribution with df_{i,k }degrees of freedom, and is rejected at the α_{i,k }level if P_{i,k }<α_{i,k}. That is, if P_{i,k }is sufficiently small, then lab i would declare gene k significantly differentially expressed.
Metaanalysis
The general metaanalytic framework [7] assumes that a measurable relationship exists between certain quantities of interest, and n independent studies have been conducted to examine this relationship. In turn, this relationship can be quantified so that each study produces an estimate of the relationship. If the estimates are appropriately standardized, then each study's estimate can be termed an "effect size" estimate. An effect size is essentially a standardized quantitative expression of the relationship of interest. For example, several different laboratories may investigate which of two drugs are better at treating a particular disease. In this case, the relationship of interest is the difference between the drugs' effects. If each laboratory produces an estimate standardized such that estimates from all laboratories address the same quantity and are on the same scale, then these estimates are effect size estimates.
There are three main classes of effect size estimates [14]. The first and perhaps most common is the standardized difference estimate, such as Hedges's g, similar to the tstatistic in a two sample study: . The second is the standardized relation estimate, such as the sample correlation coefficient r. The third is the measure of significance, such as the Pvalue from a particular hypothesis test. (Although not an effect size in the traditional sense, the measure of significance approach is mentioned here for the sake of completeness.)
In order to be combined across studies, effect size estimates must address the same measure or quantity, be standardized, and (with the exception of Pvalues, which are combined differently [15]) include some measure of variability of the effect size estimate [16]. Once each study i has provided its effect size estimate and its measure of variability v_{i}, a metaanalysis can be performed. Three main metaanalytic approaches exist: fixed effects, random effects, and hierarchical Bayes. The first two approaches are summarized here in order of increasing complexity, and the third is the subject of Choi et al. [3] and a future research interest. The three approaches are discussed more fully in Cooper and Hedges [14] and DuMouchel and Normand [17].
Fixed effects metaanalysis model
Assume that n independent studies have provided effect size estimates and measures of variability v_{i}, i = 1, ..., n. The most general metaanalytic approach assumes that
with sampling error ε_{i }~ N(0, ). That is, each is an estimate of a true fixed underlying effect size θ_{i}, and it is assumed that θ_{1 }= ... = θ_{n }with the common value θ. This is referred to as the homogeneity assumption and can be interpreted as assuming that all studies examined and provided estimates of the same parameter θ, and any differences between the estimates are attributable to sampling error alone. This common value parameter θ is estimated as a weighted average of the effect size estimates:
The weights w_{i }are chosen to minimize the variance of , and this is achieved by , where v_{i }is the estimated variance of . The variance of is .
The underlying assumption of homogeneity : θ_{1 }= ... = θ_{n }can be tested by use of the test statistic
Under , Q is approximately distributed as . Then this χ^{2 }distribution can serve as the basis for an approximate test of homogeneity. If Q is larger than the upper α_{Q }critical value of the distribution, is rejected at the α_{Q }level. Alternatively, the homogeneity Pvalue P_{Q }is the value such that Q is the upper P_{Q }critical value of the distribution, and is rejected at the α_{Q }level if P_{Q }<α_{Q}.
The test of significance : θ = 0 can be considered by use of the test statistic . Under , Z is distributed as N(0, 1), and if Z is larger than the upper α_{Z}/2 critical value of the N(0,1) distribution, is rejected at the α_{Z }level. Alternatively, the significance Pvalue P_{Z }is the value such that Z is the upper P_{Z}/2 critical value of the N(0, 1) distribution, and is rejected at the α_{Z }level if P_{Z }<α_{Z}.
Metaanalysis in the context of a microarray experiment assumes that several laboratories have provided quantitative measurements of differential expression (the effect size) for a number of genes along with variability estimates. For the fixed effects model, the homogeneity assumption () provides that for each laboratory the gene is expressed the same, and differences between laboratories are due to sampling error only. On the other hand, the hypothesis has the biological interpretation that there is no change in gene expression between the conditions of interest. This test of significance identifies genes that are significantly differentially expressed between the two conditions, using information from multiple laboratories.
Random effects metaanalysis model
In practice, the homogeneity assumption (and the resulting fixed effects model) tends to be overly simplistic but is presented in this paper for the sake of completeness. This assumption can be relaxed to make the metaanalysis model more appropriate. The basic random effects model [18] assumes n independent studies have provided effect size estimates and measures of variability v_{i}, i = 1, ..., n. In addition, the model assumes that
In this framework, θ is the population mean effect size, and there are two error components, δ and ε, corresponding to betweenstudy and withinstudy variability, respectively. Each study seeks to make statements regarding this quantity θ, and so takes a sample of individuals from a certain population in order to study the underlying effect size θ. However, due to differences between studies such as time, location, equipment, and other uncontrollable (and possibly unknown) factors, each study will in fact be estimating a slightly different quantity. That is, due to differences between studies, study i is estimating θ_{i}, a random effect size from the population of all possible effect sizes. The error component δ_{i }~ N(0, Δ^{2}) is the random deviation of θ_{i }from θ (representing variability between studies). In this basic model, Δ^{2 }represents the random variation between studies. Within study i, the actual estimate will vary from the "true" effect size θ_{i }based on which random sample is selected. That is, replicates within a study will result in slightly different estimates of the effect size due to sampling error. Here, ε_{i }~ N(0, ) is sampling error (representing variability within study i).
Q is calculated as in the fixed effects model. (Note that the fixed effects model assumes that Δ^{2 }= 0.) The random effects model uses this Q value to calculate new weights , where
Then the metaanalysis estimate for the population mean effect size θ is
The variance of is . The test of significance : θ = 0 can be considered by use of the test statistic . Under , Z_{w }is distributed as N(0, 1), and the significance Pvalue is calculated in the same manner as in the fixed effects model.
When the random effects metaanalysis is applied in the context of a microarray experiment, again it is assumed that several laboratories have provided quantitative measurements of differential expression (the effect size) for a given gene along with variability estimates. The random effects model assumes that there is some true degree of differential expression for the gene, and each lab is actually estimating a slightly different true degree of differential expression. That is, each laboratory has a slightly different "true" degree of differential expression. In addition, the estimate from each laboratory varies randomly about its true degree of differential expression due to sampling error. Then Δ^{2 }is a measure of the amount of variation between the laboratories' true degrees of differential expression, and the test of significance is used to identify differentially expressed genes by using information across multiple laboratories.
Metaanalysis with Affymetrix data
Our motivation for applying metaanalytic techniques to microarray data is threefold. First, standard platforms (e.g., Affymetrix) make combining results across labs straightforward and eliminate the usual criticism of metaanalyses that "apples and oranges" are being mixed [16] because the estimates being combined across labs have each been standardized by the same algorithms [9] in such a way that they are in fact estimates of the same underlying effect. Furthermore, any known differences between laboratories such as sample tissue type can be incorporated into the metaanalysis by use of covariates [7]. Second, combining raw data may provide more information than combining results, but raw data are not always easy to obtain, and it is conceivable that raw data may become unavailable while published (or unpublished) results are available. Third, if it can be shown that metaanalysis produces similar results to the pooling of raw data, then it can be argued that metaanalytic approaches are more efficient in the sense that they only require easily obtainable results rather than the raw data.
The uniformity of chip design and data acquisition from Affymetrix oligonucleotide microarray experiments readily lends itself to a metaanalysis. Given n studies examining the differences in gene expression between two treatments (e.g., healthy vs. diseased), a metaanalysis can combine each study's signal log ratio (SLR) estimates in a meaningful way by taking the SLR as the effect size estimate. The SLR satisfies the criteria for an effect size (i.e., comparability of estimates, standardization to the same scale, and availability of a variance estimate). The SLR for a given gene represents the degree of differential expression between two conditions, and is directly comparable between labs since it estimates the same physical quantity. The SLR from Affymetrix is standardized in the sense that a SLR of zero means no differential expression is observed, and the algorithms used to produce the SLR place all SLR estimates on the same scale. Finally, a variance for the SLR estimate is provided by the Affymetrix algorithms [9,10].
A general fixed effects model can be employed to perform a metaanalysis to estimate the true effect size (signal log ratio, SLR) θ_{k }of gene k. In addition, the test of homogeneity can be evaluated to determine whether the n studies are in fact estimating the same true underlying value of θ_{k}, i.e., whether θ_{1,k }= ... = θ_{n,k}. If this homogeneity assumption is found to be reasonable, then a test of significance can be considered to determine whether the true signal log ratio θ_{k }is significantly different from zero (i.e., whether gene k is significantly differentially expressed between the two conditions). If the homogeneity assumption is deemed unreasonable, then the random effects model can be employed to account for interstudy variability.
Simulation example
In order to evaluate the usefulness of this metaanalytic approach, a simulation study was conducted. The purpose of this simulation study was to illustrate how the results of the metaanalysis compare with the actual ("truth") simulation setting. A simple simulation model was developed with the sole purpose of generating "raw" probelevel data with certain genes "known" to be differentially expressed. While this model may not account for all sources of possible variability, it is nonetheless adequate for the purposes of the current work.
Simulation model
"Raw" probelevel data were generated from a model assuming that mismatch intensities (MM) are random background noise, which is an underlying assumption of the Affymetrix approach [9]. Our investigation of real data indicated that mismatch intensities appear to follow a longtailed Gamma distribution. Based on this, a random mismatch intensity is simulated for each probe l of each gene k such that MM_{kl }~ Gamma(α, β), with mean and variance [19].
In this simulation, larger values of the shape parameter α indicate more signal being detected by mismatch probes, with the peak of the distribution of MM intensities being moved away from zero. Larger values of the scale parameter β make high MM intensities less likely by pulling in the tail of the distribution. For the purposes of this simulation, it was assumed that mismatch intensities did not vary across labs or treatments.
Once the background mismatch intensities were obtained, the perfect match (PM) intensities were generated via the model
Y_{ijkl }= μ + L_{i }+ G_{k }+ P(G)_{(k)l }+ LG_{ik }+ ρ_{k}(T_{j }+ LT_{ij }+ TG_{jk }+ LTG_{ijk }+ TP(G)_{j(k)l}) + ε_{(ijk)l } (7)
where Y_{ijkl }is the log_{2 }of the PM  MM difference for probe l of gene k under treatment j in lab i. N labs were considered with each lab using the same two treatments. The term ρ_{k }~ Bernoulli(p) is 1 if gene k is differentially expressed between conditions j = 1 and j = 2, and is 0 otherwise. The parameter p corresponds to the percentage of genes that are differentially expressed, with higher values resulting in more differentially expressed genes. In this model, L_{i }is the effect of lab i, T_{j }is the effect of treatment j, G_{k }is the effect of gene k, P(G)_{(k)l }is the effect of probe l of gene k, ε_{(ijk)l }is a random error term, and the other terms are interaction effects. To introduce more betweenlab variability, the error variance was allowed to be different in each lab. That is, ε_{(ijk)l }~ N(0, ) for the error terms in lab i. Each term (X) in the model is assumed to be a random effect from a N(0, ) distribution, except for the constant μ, the fixed effect T_{j}, and the ρ_{k }term. The parameters p, μ, T_{j}, σ_{1}, ..., σ_{N}, and σ_{X }for X = L, G, P(G), LG, LT, TG, LTG, and TP(G) can be adjusted to introduce various sources of variability in the "observed" simulated data.
These simulated data can be used to generate "observed" SLR estimates for each gene in each lab. These "observed" SLR estimates can then be combined systematically in a metaanalysis. Note that the "true" SLR value for each gene can be obtained by using the same parameter values as in the simulation model but dropping all lab and error terms. Then the adaptive nature of the metaanalytic approach can be illustrated by comparing the "true" SLR values with the estimates from each lab and from the metaanalysis models.
Simulated data
The simulation was conducted in the R environment [20] with code requiring the use of the affy package [21] from the Bioconductor project [22,23]. While not the purpose of this investigation, the simulation was performed based upon the Affymetrix rat neuro chip RN_U34 with model parameter settings N = 6, α = 0.1, β = 0.0003, p = 0.05, μ = 2.5, σ_{L }= 0, σ_{G }= 0.5, σ_{P(G) }= 0.3, σ_{LG }= 0.1, T_{1 }= 0.2, T_{2 }= 0.2, σ_{LT }= 0.1, σ_{TG }= 1.0, σ_{LTG }= 0.13, σ_{TP(G) }= 0.5, σ_{1 }= .48, σ_{2 }= .60, σ_{3 }= .72, σ_{4 }= .84, σ_{5 }= .96, and σ_{6 }= 1.08. These parameter settings were selected to produce a distribution of MM intensities similar to that observed in real data (Figure 1a,b) and to force the distribution of signal log ratio (SLR) estimates to fall within a reasonable range with some variation between laboratories (Figure 1c,d).
Figure 1. Summary of real and simulated RN_U34 Affymetrix chip data. (a) The real mismatch (MM) intensities are from a RN_U34 Affymetrix chip, and their histogram closely resembles a Gamma distribution with a long tail. (b) The simulated MM intensities are drawn from a Gamma distribution to resemble the real MM intensities. (c) The relationship between the SLR from a simulated lab and the standard error of the SLR for each gene on the RN_U34 Affymetrix chip. Large absolute SLR's with small standard errors indicate significant differential expression. (d) A comparison of SLR estimates from two simulated labs shows general agreement, with some differences between labs.
Most SLR estimates were near zero (Figure 1c), indicating nondifferential expression, while some genes had larger absolute SLR's with smaller standard errors, an indication of significant differential expression. The data were simulated such that there were similar patterns between labs while allowing for lab differences, as evidenced by a comparison of the SLR's from two simulated labs (Figure 1d). While the estimates from the two simulated labs were clearly similar, there were obvious differences between the labs, although not as different as could be observed in real data. As a result, these two labs might produce slightly different lists of significantly differentially expressed genes. The simulation parameters can be adjusted to introduce varying degrees of difference between experiments, and this will affect the final claim made by the metaanalysis regarding statistical significance of differential expression.
Fixed effects metaanalysis results
The results from the six simulated labs were combined using the fixed effects metaanalysis model (Figure 2). The test of homogeneity : θ_{1,k }= ... = θ_{6,k }was performed for each gene k, k = 1, ..., 1322 (the RN_U34 chip has 1322 features or reference sequences), and the Pvalues for each were summarized in a histogram of homogeneity Pvalues (Figure 2a). Clearly there was widespread violation of the homogeneity assumption, as evidenced by the abundance of smaller homogeneity Pvalues.
Figure 2. Fixed effects metaanalysis of simulated results. (a) The abundance of smaller homogeneity Pvalues indicates widespread violation of the homogeneity assumption. (b) The plot of homogeneity Pvalues versus fixed effects SLR estimates shows that this lack of homogeneity exists across a large range of SLR estimates.
When the False Discovery Rate (FDR) [24] was controlled at 0.05, 88 of the 1322 genes failed the homogeneity test. That is, there appeared to be significant interlaboratory differences, such that the laboratories did not appear to provide estimates of the same true degree of differential expression for all genes. This appeared to be true for genes across a wide range of fixed effects metaanalysis SLR estimates, as evidenced by the lack of a clear relationship between fixedeffects SLR estimates and homogeneity Pvalues (Figure 2b). As a result, the random effects metaanalysis model was deemed more appropriate to adjust for the lack of homogeneity.
Random effects metaanalysis results
The same data from the six simulated labs were used in a random effects metaanalysis, and the resulting SLR estimates were similar to those from the fixed effects metaanalysis (Figure 3a). The test of significance : θ_{k }= 0 was performed for k = 1, ..., 1322 (i.e., for all 1322 genes), and the Pvalues for each value of k were summarized in a histogram of significance Pvalues (Figure 3b). As a result of the parameter selections for the simulation, an abundance of small Pvalues was observed, indicating a large number of significantly differentially expressed genes. A comparison of the metaanalysis SLR estimates with the significance Pvalues (Figure 3c) showed a trend of smaller Pvalues for larger absolute SLR. Similar to the results from a single lab (Figure 1c), most metaanalysis SLR estimates were close to zero (Figure 3d), but the standard errors were slightly lower overall for the metaanalysis estimates, after combining the SLR estimates across labs.
Figure 3. Random effects metaanalysis of simulated results. (a) A comparison of the fixed effects and random effects metaanalysis estimates of SLR shows general agreement between the models. (b) The histogram of significance Pvalues shows an abundance of significantly differentially expressed genes, as evidenced by the large number of smaller significance Pvalues. (c) The smallest significance Pvalues tended to occur for genes whose random effects metaanalysis SLR estimates were large in absolute value. The reference line is the Pvalue cutoff used to control the False Discovery Rate at 0.05. Using this cutoff, the random effects metaanalysis declared 72 of the 1322 genes significantly differentially expressed. (d) Demonstration of how the SLR's estimated from the metaanalysis relate to their standard errors.
The differences between the fixed effects and random effects models can be summarized by considering bubble plots for a single gene (Figure 4a,b), with bubble area proportional to weights used in the metaanalysis. For this particular gene, laboratory 2 estimated a SLR considerably smaller than the SLR's from the other labs with very small variance and hence very large weight in the fixed effects metaanalysis. As a result, the fixed effects metaanalysis estimated the true SLR for this gene to be closest to the SLR from lab 2. Such a result would call into question the SLR estimate from lab 2 for this gene. The random effects model took this lack of homogeneity into account and appropriately downweighted the SLR estimates for this gene from all six labs. In particular, the weight for this gene in lab 2 was reduced from 88.5 in the fixed effects model to 1.8 in the random effects model. For comparison, the weights for this gene in the other five labs ranged from 3.0 to 11.9 in the fixed effects model and from 1.2 to 1.6 in the random effects model. Whereas the fixed effects model declared this gene significantly differentially expressed, the random effects model did not (controlling the FDR at 0.05 in both models). Thus, the random effects model was not overly influenced by any single lab's SLR estimate.
Figure 4. Bubble plots from the simulation example. Bubble area is proportional to weights used in the metaanalysis. Dashed lines represent the 95 percent confidence interval for the true value of the SLR, adjusted to control the FDR for all 1322 genes at 0.05. The red dotted line represents the true SLR value. The green bubbles represent labs which claimed significant differential expression for the gene. When zero lies outside the confidence interval, the metaanalysis declares the gene significantly differentially expressed. (a) SLR estimates from the six labs for a particular gene, with fixed effects weights. (b) Plot for the same gene as in (a), but with random effects weights. (c) Plot for one of the twentyone IntegrationDriven Discovery (IDD) genes declared significant by none of the six simulated labs but significant by the random effects metaanalysis. (d) Plot for one of the four IntegrationDriven Revision (IDR) genes declared significant by multiple labs but not significant by the random effects metaanalysis.
Comparing simulated results and "truth"
When the FDR was controlled at 0.05, 72 of the 1322 genes were declared by the random effects metaanalysis to be significantly differentially expressed based on the results of the test of significance . Individually, the six labs identified between 44 and 58 significantly differentially expressed genes (controlling the FDR at 0.05 for each lab) (Table 1). For each lab, most of its significant genes were declared significant by both the fixed effects and random effects metaanalyses.
Table 1. Comparison of results from the simulated data. Comparison of numbers of genes in common declared significant (i.e., significantly differentially expressed) by simulated labs 1 through 6, the SLRbased fixed effexts and random effects metaanalyses, and the previously proposed Pvaluebased metaanalysis [2]. The (i, j)^{th }element of this table is the number of genes declared significant by both lab i and lab j, with F here representing the fixed effects metaanalysis, R the random effects metaanalysis, and T the "truth" behind the simulation model. P represents the metaanalysis based on Pvalues [2]. Each lab (and metaanalysis) had the False Discovery Rate (FDR) controlled at 0.05.
These results demonstrate how a metaanalysis handles discrepancies between labs. A metaanalysis can be useful in finding genes that are statistically significantly differentially expressed and not just declared significant by one or more labs due to random variation between labs. For example, lab 1 declared 46 genes significant and lab 2 declared 49 genes significant, but these two labs declared only 33 of the same genes significant (Table 1). These 33 are not necessarily the most significant in either lab. That is, the 33 are not necessarily the genes with the smallest lab 1 Pvalues or smallest lab 2 Pvalues, but are those genes with the smallest Pvalues from both labs. Alternatively, rather than considering all genes declared significant by any of the labs, the random effects metaanalysis combines information across all six labs in a wellstructured manner and declares 72 genes significantly differentially expressed.
While the numbers of correctly identified differentially expressed genes do not vary drastically between the individual labs (Table 1), the metaanalyses tend to correctly identify a higher number of differentially expressed genes. A comparison of the results from this SLRbased metaanalysis with the results from a previouslyproposed metaanalysis approach based on combining Pvalues [2] is also summarized in Table 1. A slight modification to this Pvalue approach was necessary to account for differences in experimental design. Where the previous approach implicitly required multiple control and experimental sample arrays from each lab, this simulation data (as well as the real data presented subsequently in this work) did not satisfy this requirement in all labs. To modify the previous approach for the current data, probespecific perfect match (PM) intensity differences between the experimental and control conditions were used to obtain a paired t statistic. Then the same permutation approach [2] was used to obtain a significance Pvalue for each gene in each laboratory based on this paired t statistic, and these Pvalues were combined across laboratories as previously proposed. In general, the SLRbased approach presented here tended to result in more genes found significantly differentially expressed by the metaanalysis than this previously proposed Pvalue approach (Table 1). In addition, the Pvalue approach does not provide a final quantitative estimate of the degree of differential expression for each gene, as does the currently proposed SLRbased approach. The metaanalysis SLR estimates tend to be much closer to the true SLR values than do the estimates from individual labs (Figure 5).
Figure 5. Comparison of SLR estimates with true values from simulation example. Green squares represent type I errors, genes incorrectly claimed as differentially expressed. Red triangles represent type II errors, genes incorrectly claimed to be not significantly differentially expressed. The SLR estimates from the random effects metaanalysis tend to approximate the true values much better than does any single lab.
IntegrationDriven Discovery (IDD)
One of the benefits of a metaanalysis is also one of the benefits of pooling raw data, that is the increased power to detect significant differences. It is possible that while a given gene is not declared significantly differentially expressed by any one lab, the combination of results across labs in a metaanalysis provides sufficient evidence to declare significant differential expression. Choi et al. [3] use the term "IntegrationDriven Discovery" (IDD) to refer to a gene identified as differentially expressed by the results of a metaanalysis, but not identified as differentially expressed by any of the individual studies or labs. In this case, the term "integration" is used in the unification sense rather than the mathematical, since the results of several different studies are being integrated into a single metaanalysis.
As shown in Table 2, our particular simulation study produced 21 IDD's (i.e., 21 of the 72 genes declared significant by the random effects metaanalysis were not declared significant by any of the six labs). Of these 21 IDD's, 6 were truly differentially expressed; that is, our simulation study produced 6 true IDD's and 15 false IDD's. An examination of the SLR estimates for these IDD genes (Figure 4c) indicated that IDD's will tend to occur when 'small but consistent' [3] effect size estimates are combined. In addition, high variability of each lab's estimate may cause individual labs to not declare a gene significant while the metaanalysis estimate will have a lower variance, making a declaration of significance more likely.
Table 2. Summary of results for the simulated data. Numbers of genes declared significant (i.e., significantly differentially expressed) by different numbers of labs and the fixed and random effects metaanalyses in the simulation example. The False Discovery Rate (FDR) for the metaanalyses and each lab separately was controlled at 0.05. There were 21 IntegrationDriven Discoveries (IDD's) and 4 IntegrationDriven Revisions (IDR's).
IntegrationDriven Revision (IDR)
In the simulation results presented here, there were 4 genes declared significant by at least two of the simulated labs that were not declared significant by the random effects metaanalysis (Table 2). A closer examination of the SLR estimates for these particular genes (Figure 4d) revealed that while at least two of the labs individually declared the gene significant, the SLR estimates between the six labs differed sufficiently to make the variance of the metaanalysis SLR estimate large. This increased variance of the metaanalysis SLR estimate caused the metaanalysis to declare these genes not significant. In addition, some labs' variability estimates may be artificially low due to chance, thus forcing a false declaration of differential expression at the individual lab level. The random effects metaanalysis is able to account for this possibility by downweighting overly influential results.
We introduce the term "IntegrationDriven Revision" (IDR) to describe a gene identified as differentially expressed by multiple studies or labs, but determined by the results of a metaanalysis to be not differentially expressed. While multiple laboratories might promote such a gene for further study because of its large and significant effect size, the metaanalysis would conclude that, due to the inconsistencies in effect sizes across labs, the gene is not significantly differentially expressed. Whereas IntegrationDriven Discoveries (IDD's) will tend to occur when 'small but consistent' [3] effect size estimates are combined, IntegrationDriven Revisions (IDR's) will tend to occur when large but inconsistent effect size estimates are combined. Of the 4 IDR's made in this simulated study, 3 were not truly differentially expressed; that is, our simulation study made 3 true IDR's and 1 false IDR's.
As noted previously, the simulation parameters can be adjusted to introduce varying degrees of difference between experiments. Increased interlaboratory variability, or greater inconsistency among effect size estimates, will tend to affect the numbers of IDD's and IDR's made by the metaanalysis. Because IDD's occur when effect size estimates are small but consistent and IDR's occur when effect size estimates are large but inconsistent, greater interlaboratory variability will tend to result in fewer IDD's and more IDR's being made.
Real data example
Several laboratories have investigated the genetic basis for EAE (experimental autoimmune encephalomyelitis, the mouse model for human multiple sclerosis) by use of Affymetrix technology, and have reported their findings in published papers [2529]. In each of these published papers, mention is made of appropriate care of the mice following the ethics guidelines at the respective institutions. The three laboratories providing data are Offner [26,29], Carmody [28], and Ibrahim [25,27]. Each laboratory measured gene expression in a base (naive or control) sample and in an experimental (EAEinduced) sample, with some laboratories using multiple experiments. For the current purposes, an "experiment" is a single arraytoarray comparison to study differential expression. Seven total experiments from the three laboratories are summarized in Table 3. While not all labs used the same measure of differential expression in their publications, here the Affymetrix MAS 5.0 algorithm [11] provides the SLR estimates for each lab from the respective raw data sets.
Table 3. Summary of observed data for the EAE example. A summary of the seven observed experiments involving EAE data to be combined in a metaanalysis.
The use of different Affymetrix chip versions presents a nontrivial challenge in comparing and combining results across laboratories. The same gene may be represented on two different chip versions, and yet the names reported by the two chips may differ. Also, different sets of probes may represent the same gene on different chip versions, resulting in different probe set names on different chip versions. For example, the gene 1200011I18Rik on chip Mu11KsubA is identified by Probe Set ID AA000151_at, while on chip MG_U74Av2 it is Probe Set ID 104759_at. Furthermore, different chip versions may have different sets of genes represented on them. In order to combine the results across labs (and consequently, across chip versions), each gene must have a "name" recognized by all chips in the metaanalysis.
Previous metaanalyses of microarray results ([2,6], for example) have relied on Unigene cluster numbers to essentially achieve a uniform gene naming scheme across chip versions and platform types. Other recent work [30] proposed combining raw data from common probes into new probesets based on Unigene clusters. Because the focus of the current work is on combining the results of the Affymetrix algorithms, SLR estimates corresponding to the same Unigene cluster numbers are combined across all experiments. This approach will allow a gene to have multiple SLR estimates (corresponding to different original probe set names) from the same experiment. The Unigene number corresponding to each probe set on an array is available through the NetAffx feature [31] of the Affymetrix website [8].
There were considerable differences in the SLR estimates from the different labs, as represented in Figure 6. Some of these differences may be due to the use of different mouse strains, tissue types, and chip versions in the different laboratories. However, even experiments from the same laboratory tended to show disagreement, highlighting the need for biological replicates to provide more precise estimates of the degree of differential expression for each gene. This interlaboratory variability also illustrates the need for methods to systematically combine results across laboratories. The fixed effects and random effects metaanalysis models were employed to combine these estimates across all laboratories for each Unigene number.
Figure 6. Comparison of SLR estimates from two experiments in the EAE example. This plot illustrates the variation between experiments and the consequent need for a method of systematically combining results from different experiments.
Similar to Table 1 for the simulated example, Table 4 summarizes the overlap in the numbers of genes declared significantly differentially expressed by each experiment in this observed data example. Based on the results of the random effects model, 3,671 genes are identified as statistically significantly differentially expressed. There were 12,775 unique Unigene numbers represented by the genes across all arrays in this metaanalysis, so approximately 28.7% of the genes represented were declared significantly differentially expressed. This may support the prediction made by Carmody et al. [28] that about 28.9% of the genes in the mouse genome 'may be relevant for autoimmune inflammation', or affected by EAE. Similar to the simulated data (Table 1), a comparison of the results from this SLRbased metaanalysis with the results from an implementation of the previously proposed Pvalue approach [2] indicates that even with real data, the SLRbased approach tends to identify more genes as significantly differentially expressed (Table 4).
Table 4. Comparison of results from the EAE example. Comparison of numbers of genes in common declared significant (i.e., significantly differentially expressed) by the observed experiments, the SLRbased fixed effects and random effects metaanalyses, and the previously proposed Pvaluebased metaanalysis [2]. Each experiment (and metaanalysis) had the False Discovery Rate (FDR) controlled at 0.05.
As in the simulated data, the metaanalysis of these observed data produced IntegrationDriven Discoveries (IDD's) and IntegrationDriven Revision (IDR's). Similar to Table 2 for the simulated data, Table 5 summarizes these findings for the observed data. There were 65 IDD's and 5518 IDR's made. Of the IDR's, 32 were made for genes declared significantly differentially expressed by all seven experiments but not by the random effects metaanalysis. Figure 7 presents bubble plots for representative IDD and IDR genes from the observed data, similar to Figure 4 for the simulated data. As in the simulated data, IDD's tend to occur when small but consistent effect sizes are combined, and IDR's tend to occur when large but inconsistent effect sizes are combined. (See 1 : EAE.Random.Effects.Results.csv for the final estimates for all 12,775 genes.)
Table 5. Summary of results from the EAE example. Numbers of genes declared significantly differentially expressed by different numbers of experiments and the fixed and random effects metaanalyses in the observed data example. The False Discovery Rate (FDR) for the metaanalyses and each experiment separately was controlled at 0.05. There were 65 IntegrationDriven Discoveries (IDD's) and 5518 IntegrationDriven Revisions (IDR's). There were 32 IDR's that had been declared significantly differentially expressed by all seven experiments.
Figure 7. Bubble plots from the EAE example. Bubble area is proportional to weights used in the metaanalysis. Dashed lines represent the 95 percent confidence interval for the true value of the SLR, adjusted to control the FDR for all 1322 genes at 0.05. The green bubbles represent experiments which claimed significant differential expression for the gene. When zero lies outside the confidence interval, the metaanalysis declares the gene significantly differentially expressed. (a) Bubble plot for one of the sixtyfive IntegrationDriven Discovery (IDD) genes declared significant by none of the seven observed experiments but significant by the random effects metaanalysis. (b) Bubble plot for one of the thirtytwo IntegrationDriven Revision (IDR) genes declared significant by all seven observed experiments but not significant by the random effects metaanalysis.
Additional File 1. EAE.Random.Effects.Results.csv Summary of random effects metaanalysis model results for all 12,775 genes, including SLR estimates, declaration of statistically significant differential expression, and status as an IntegrationDriven Discovery (IDD) or IntegrationDriven Revision (IDR). Note that the Gene column refers to Unigene cluster number unless the entry has an extension such as _at, in which case it refers to an Affymetrix probe set name for which no Unigene number was available. This file is commadelimited and can be opened in Microsoft Excel.
Format: CSV Size: 699KB Download file
Discussion
Before any clinical decision is made based on the results of a metaanalysis, a biological validation of the results should be performed. Microarray technology is wellsuited for hypothesis generation, and a metaanalysis can be used to effectively combine results across multiple laboratories to refine the list of candidate genes deserving biological validation. This approach will tend to yield more informative results when each lab has used biological and technical replicates in their experimental design [32]. The use of replicates at the laboratory level provides both added power to detect differential expression and more precise estimation of the true degree of differential expression for each gene under consideration.
The model used to generate data for the simulation example can be adjusted to account for various sources of variation and relationships between genes. It is of great interest to investigate how such relationships affect the outcome of a metaanalytic approach. An extension of the fixed effects and random effects models to the hierarchical Bayes approach is also being investigated with the hope of improving the metaanalysis approach as applied to microarrays and to incorporate prior knowledge. Included in this extension is the use of covariates in the metaanalysis framework to account for known differences between labs and the appropriate modeling of possible dependence among effect size estimates. We feel the use of covariates will provide insight into the effects of different labs, tissues, and microarray platforms on the observed differential expressions of genes. Separating out the effects of these covariates will facilitate the identification of those genes which are differentially expressed between two conditions rather than appearing differentially expressed due to external influences such as lab, tissue, or platform. For example, the differences observed between experiments from the same laboratory (Table 4) may be explained by differences in mouse strain or tissue sample, and the inclusion of covariate information in the model would adjust for this.
In the examples presented here, all studies involved used the Affymetrix platform and the data were summarized using the same normalization strategy with the MAS 5.0 algorithm [11]. When multiple studies have employed different platforms (such as cDNA and other oligonucleotide arrays) or normalization strategies, then some adjustments to the approach presented here will be necessary. In particular, a readilyavailable quantitative measure of differential expression common to all platforms involved is needed. In addition, it will be of great interest to consider the effect of a platform covariate in the extended metaanalysis model.
Although we have demonstrated our approach using both simulated rat data and real observed data from essentially genetically homogeneous mice, its utility with human data is of great interest. Along with the increased variability in human data comes an increase in the information about each individual subject and subpopulation. Therefore the incorporation of such covariate information is an important subject of our future work. We anticipate that the use of covariate information with human data will be particularly informative in identifying biologically significant subpopulations  for example, in identifying genes that are related to a disease in one subpopulation but not in another.
Conclusion
The signal log ratio (SLR), automatically reported by MAS 5.0 [11], is naturally suited to serve as an effect size estimate in a metaanalysis of results from multiple laboratories. In order to perform a metaanalysis of microarray results as presented here, the following components are needed for each probe set from each experiment: the corresponding Unigene ID, the SLR estimate, and the estimated variance of the SLR estimate. The random effects metaanalysis model is better suited than the fixed effects model for the analysis of microarray results because of the lack of homogeneity of effects from different laboratories. Genes not declared significantly differentially expressed by any single lab but then declared significantly differentially expressed by the metaanalysis are referred to as IntegrationDriven Discoveries, or IDD's [3]. In addition to the identification of IDD's, our metaanalysis method identified genes declared significantly differentially expressed by multiple (and possibly all) laboratories but not significantly differentially expressed by the metaanalysis. These genes are referred to as IntegrationDriven Revisions, or IDR's. The simulation example demonstrated how the final SLR estimates from the metaanalysis models tend to be much closer to the "true" SLR values than do the SLR estimates from any single lab. These metaanalytic approaches to microarray results provide a systematic method to combine results from different laboratories with the purpose of gaining clearer insight into the true degree of differential expression for each gene.
Authors' contributions
For this research RWD initiated the underlying concept of metaanalysis as applied to microarray technology, and coordinated the focus. RWD is the Ph.D. advisor of JRS. JRS developed and evaluated the simulation model, wrote the R code for the respective analyses and graphical displays, and drafted the manuscript. Both authors read and approved the final manuscript.
Acknowledgements
We thank Drs. Robert Meisel (Purdue University) and Paul Mermelstein (University of Minnesota) for use of their RN_U34 Affymetrix data that provided our simulation parameters. We also thank Drs. Halina Offner (Oregon Health Sciences University), Ruaidhrí J. Carmody (University of Pennsylvania School of Medicine), and Saleh M. Ibrahim (University of Rostock), as well as their colleagues, for providing access to their raw Affymetrix data. We also thank two anonymous reviewers for their helpful suggestions to improve this work.
References

Glass GV: Primary, Secondary, and MetaAnalysis of Research.

Rhodes DR, Barrette TR, Rubin MA, Ghosh D, Chinnaiyan AM: MetaAnalysis of Microarrays: Interstudy Validation of Gene Expression Profiles Reveals Pathway Dysregulation in Prostate Cancer.
Cancer Research 2002, 62:44274433. PubMed Abstract  Publisher Full Text

Choi JK, Yu U, Kim S, Yoo OJ: Combining Multiple Microarray Studies and Modeling Interstudy Variation.
Bioinformatics 2003, 19(Suppl 1):i84i90. PubMed Abstract  Publisher Full Text

Moreau Y, Aerts S, Moor BD, Strooper BD, Dabrowski M: Comparison and MetaAnalysis of Microarray Data: From the Bench to the Computer Desk.
Trends in Genetics 2003, 19(10):570577. PubMed Abstract  Publisher Full Text

Parmigiani G, GarrettMayer ES, Anbazhagan R, Gabrielson E: A CrossStudy Comparison of Gene Expression Studies for the Molecular Classification of Lung Cancer.
Clinical Cancer Research 2004, 10:29222927. PubMed Abstract  Publisher Full Text

Rhodes DR, Yu J, Shanker K, Deshpande N, Varambally R, Ghosh D, Barrette T, Pandey A, Chinnaiyan AM: Largescale metaanalysis of cancer microarray data identifies common transcriptional profiles of neoplastic transformation and progression.
Proceedings of the National Academy of Sciences 2004, 101:93099314. Publisher Full Text

Hedges LV, Olkin I: Statistical Methods for MetaAnalysis. Academic Press, San Diego, CA; 1985.

Affymetrix [http://www.affymetrix.com] webcite

Affymetrix: Statistical Algorithms Description Document. Affymetrix, Santa Clara, CA; 2002.

Hoaglin DC, Mosteller F, Tukey J: Understanding Robust and Exploratory Data Analysis. John Wiley and Sons, New York; 1983.

Affymetrix: Affymetrix Microarray Suite User's Guide Version 5.0. Affymetrix, Santa Clara, CA; 2001.

Li C, Wong WH: DNAChip Analyzer (dChip). In The Analysis of Gene Expression Data: Methods and Software. Edited by Parmigiani G, Garrett ES, Irizarry RA, Zeger SL. Springer, NY; 2003.

Irizarry RA, Bolstad BM, Collin F, Cope LM, Hobbs B, Speed TP: Summaries of Affymetrix GeneChip probe level data.
Nucleic Acids Research 2003, 31(4):el5. Publisher Full Text

Cooper H, Hedges LV: The Handbook of Research Synthesis. Russell Sage Foundation; 1994.

Fisher R: Statistical Methods for Research Workers. 8th edition. Oliver and Boyd, Edinburgh, UK; 1941.

Glass GV: Integrating Findings: The MetaAnalysis of Research.

DuMouchel W, Normand SL: Computermodeling and Graphical Strategies for Metaanalysis.
In MetaAnalysis in Medicine and Health Policy Edited by Stangl DK, Berry DA, Marcel Dekker. 2000, 127178.

DerSimonian R, Laird N: MetaAnalysis in Clinical Trials.
Controlled Clinical Trials 1986, 7:177188. PubMed Abstract  Publisher Full Text

Casella G, Berger RL: Statistical Inference. Duxbury Press, Belmont, CA; 1990.

The Comprehensive R Archive Network [http://cran.rproject.org] webcite

Gautier L, Cope L, Bolstad BM, Irizarry RA: affy – analysis of Affymetrix GeneChip data at the probe level.
Bioinformatics 2004, 20(3):307315. PubMed Abstract  Publisher Full Text

BioConductor: open source software for bioinformatics [http://www.bioconductor.org] webcite

Ihaka R, Gentleman R: A Language for Data Analysis and Graphics.
Journal of Computational and Graphical Statistics 1996, 5(3):299314.

Benjamini Y, Hochberg Y: Controlling the False Discovery Rate: a Practical and Powerful Approach to Multiple Testing.
Journal of the Royal Statistical Society B 1995, 57:289300.

Ibrahim SM, Mix E, Bottcher T, Koczan D, Gold R, Rolfs A, Thiesen HJ: Gene expression profiling of the nervous system in murine experimental autoimmune encephalomyelitis.
Brain 2001, 124:19271938. PubMed Abstract  Publisher Full Text

Matejuk A, Dwyer J, Zamora A, Vandenbark AA, Offner H: Evaluation of the Effects of 17βEstradiol (17βE2) on Gene Expression in Experimental Autoimmune Encephalomyelitis Using DNA Microarray.
Endocrinology 2002, 143:313319. PubMed Abstract  Publisher Full Text

Mix E, Pahnke J, Ibrahim SM: GeneExpression Profiling of Experimental Autoimmune Enchephalomyelitis.
Neurochemical Research 2002, 27(10):11571163. PubMed Abstract  Publisher Full Text

Carmody RJ, Hilliard B, Maguschak K, Chodosh LA, Chen YH: Genomic scale profiling of autoimmune inflammation in the central nervous system: the nervous response to inflammation.
Journal of Neuroimmunology 2002, 133:95107. PubMed Abstract  Publisher Full Text

Matejuk A, Dwyer J, Hopke C, Vandenbark AA, Offner H: 17βEstradiol Treatment Profoundly DownRegulates Gene Expression in Spinal Cord Tissue in Mice Protected from Experimental Autoimmune Encephalomyelitis.
Archivum Immunologiae et Therapiae Experimentalis 2003, 51:185193. PubMed Abstract

Morris JS, Yin G, Baggerly K, Wu C, Zhang L: Identification of Prognostic Genes, Combining Information Across Different Institutions and Oligonucleotide Arrays. [http://www.camda.duke.edu/camda03] webcite
Critical Assessment of Microarray Data Analysis (CAMDA) 2003 Conference Paper 2003.

Liu G, Loraine AE, Shigeta R, Cline M, Cheng J, Valmeekam V, Sun S, Kulp D, SianiRose MA: NetAffx: Affymetrix probesets and annotations.
Nucleic Acids Research 2003, 31:8286. PubMed Abstract  Publisher Full Text  PubMed Central Full Text

Parmigiani G, Garrett ES, Irizarry RA, Zeger SL: The Analysis of Gene Expression Data: An Overview of Methods and Software. In The Analysis of Gene Expression Data: Methods and Software. Edited by Parmigiani G, Garrett ES, Irizarry RA, Zeger SL. Springer, NY; 2003.