Abstract
Background
Wise et al. introduced a rankbased statistical technique for metaanalysis of genome scans, the Genome Scan MetaAnalysis (GSMA) method. Levinson et al. recently described two generalizations of the GSMA statistic: (i) a weighted version of the GSMA statistic, so that different studies could be ascribed different weights for analysis; and (ii) an order statistic approach, reflecting the fact that a GSMA statistic can be computed for each chromosomal region or bin width across the various genome scan studies.
Results
We provide an Edgeworth approximation to the null distribution of the weighted GSMA statistic, and, we examine the limiting distribution of the GSMA statistics under the order statistic formulation, and quantify the relevance of the pairwise correlations of the GSMA statistics across different bins on this limiting distribution. We also remark on aggregate criteria and multiple testing for determining significance of GSMA results.
Conclusion
Theoretical considerations detailed herein can lead to clarification and simplification of testing criteria for generalizations of the GSMA statistic.
Background
Wise, Lanchbury and Lewis [1] introduced a rankbased statistical technique for metaanalysis of genome scans, the Genome Scan MetaAnalysis (GSMA) method, and derived its exact null distribution using a clever inclusion/exclusion argument. Koziol and Feng [2] provided an alternative derivation of the null distribution of the GSMA statistic via a combinatoric approach involving probability generating functions, and suggested an Edgeworth series approximation to its exact null distribution that improves upon the Wise [1] normal approximation.
Levinson [3] described two generalizations to the GSMA statistic: (i) a weighted version of the GSMA statistic, so that different studies could be ascribed different weights for analysis; and (ii) an order statistic approach, reflecting the fact that a GSMA statistic can be computed for each chromosomal region or bin across the various genome scan studies. Wise [1] had suggested that each chromosomal region (bin) be about 30 cM, leading to a total of about n = 120 bins spanning the entire genome, and correspondingly 120 GSMA statistics. Wise [1] and Koziol and Feng [2] had investigated the marginal distribution of any of these (exchangeable) GSMA statistics, whereas under the order statistic formulation of Levinson [3], the joint distribution of the entire set of GSMA statistics is taken into account. In this note, we consider both generalizations in turn. In particular, (i) we provide an Edgeworth approximation to the null distribution of the weighted GSMA statistic, analogous to that in Koziol and Feng [2]; and (ii) we examine the limiting distribution of the GSMA statistics under the order statistic formulation, and quantify the relevance of the pairwise correlations of the GSMA statistics across different bins on this limiting distribution. We conclude with remarks concerning the Levinson [3] aggregate criteria and multiple testing for determining significance of GSMA results.
Results
The GSMA statistics
We first introduce some notation. Let X_{ij}, i = 1, ..., m, j = 1, ..., n, denote the rank of any particular linkage test statistic (e.g., LOD score) in the j^{th }chromosomal region (bin) from the i^{th }study, with each study being ranked separately. Levinson [3] rank the bins from 1 = "best" to n = "worst" on the basis of, say, maximum LOD score or lowest p value observed within each bin, but the reverse ranking from 1 = "worst" to n = "best" is also feasible. In practice, m can be as few as 4 (e.g., [4,5]); and, following Wise [1], n is generally about 120. The GSMA statistics are then S_{1}, ..., S_{n}, where . The exact (marginal) null distribution of each S_{j }was derived in Wise [1]; in the notation of Levinson [3], P_{AvgRnk}, the "pointwise probability" of any S_{j}, is obtained from its marginal null distribution. The normal approximation to the exact distribution of the S_{j }is straightforward: the S_{j }are identically distributed, and each S_{j }has an approximate normal distribution with mean and variance under the null hypothesis that ranks are randomly assigned within each study. Koziol and Feng [2] provided an Edgeworth correction to this approximation, and recommended that the correction be used, at least for m ≤ 12.
The weighted GSMA statistic
Levinson [3] proposed a weighted version of the GSMA statistic, namely, , with the weight w_{i }ascribed to the i^{th }study reflecting the relative linkage information from that study. (We are temporarily omitting the j subscript for clarity.) The normal approximation to the marginal null distribution of S_{w }is straightforward, and depends on the two parameters and . The combinatorial argument utilized by Koziol and Feng [2] to derive the exact distribution of the unweighted GSMA statistic (which relies on probability generating functions) is generally no longer applicable in the weighted setting. Nevertheless, as in Koziol and Feng [2], we here provide an Edgeworth correction that may be applied to the weighted GSMA statistic. To this end, we equivalently consider the linear transform , where . We then have E[R_{w}] = 0, , and
(We have used Koziol and Feng [2], eqn. 11, for .)
The Edgeworth Type A series approximation to the density of up to 4^{th }order terms, is f (z) = φ (z)[1 + c_{4}H_{4 }(z)], where φ(·) is the standard normal density function, , H_{4 }(·) is the 4^{th }degree HermiteChebyshev polynomial, H_{4 }(z) = z^{4 } 6z^{2 }+ 3, and the constant c_{4 }given by (Stuart and Ord [6], eqn. 6.42). Furthermore, the cumulative distribution function of the Edgeworth series is given simply by
where Φ(·) denotes the cumulative distribution function of the standard normal distribution, c_{4 }and φ(·) are as above, and H_{3}(·) is the 3^{rd }degree HermiteChebyshev polynomial, H_{3 }(z) = z^{3 } 3z (Stuart and Ord [6], eqn. 6.43).
In practice, we would expect the Edgeworth series approximation to provide an adequate representation of the exact distribution of the weighted GSMA statistic, in a manner analogous to the unweighted case [2]. Here, we briefly investigate the adequacy of the Edgeworth approximation, using an example from Lewis et al. [7]. They had applied the GSMA methodology to data from m = 20 schizophrenia genome scans, and found strong evidence for linkage on chromosome 2q, as well as suggestive evidence for linkage at several other chromosomal locations. The rank data for each scan are available online at D.F. Levinson's website (accessed July 14, 2004) [8], and we use these data to reconstruct first the unweighted GSMA statistics S_{j}, j = 1, 2, ..., 120, corresponding to the 120 bins spanning the entire genome, then their preferred weighted versions. Lewis [7] had recommended weights for each individual study proportional to the square root of the number of affected cases for that study. From Levinson's website [8], the individual weights w_{i}, i = 1, 2, ..., 20, are 2.32, 1.77, 1.20, 1.17, 1.17, 1.16, 1.15, 1.08, 1.03, 1.01, 0.95, 0.88, 0.80, 0.80, 0.68, 0.67, 0.59, 0.54, 0.53, 0.51, greater than a fourfold range.
We simulated the null distribution of the weighted GSMA statistic , where with m = 20, n = 120, by drawing each X_{i }as an independent random integer from 1 to 120 (that is, a uniform distribution of the integers from 1 to 120), then forming R_{w }with the Levinson [8] weights. We used the random number generator in R [9], to produce 10,000 values for R_{w}. We then formed , and compared the empirical distribution of the 10,000 Z values with the Edgeworth approximation as described above; for comparative purposes, we also computed the normal approximation, which is based on matching the first two moments rather than the first four moments with Edgeworth. Figure 1 shows the resulting quantilequantile plot of the empirical distribution of the weighted GSMA Z values with both a normal approximation, panel A, and the Edgeworth approximation, panel B. Note that, even in this setting of the weighted combination of m = 20 individual GSMA statistics, the normal approximation is particularly illfitting in the tails. Agreement in the tails would be of particular relevance in practical applications, as these represent the areas of potentially significant findings (pvalues). The noticeable disagreement in the tails between the weighted GSMA statistic and its normal approximation is largely ameliorated with the Edgeworth approximation. With attendant computational savings, the Edgeworth approximation provides a practical means of determining significance of weighted GSMA results compared to simulation; tail probabilities derived from the normal approximation should only be used with extreme caution.
Figure 1. Quantilequantile plots of the weighted GSMA statistic vs. a normal approximation and an Edgeworth approximation. A. Quantilequantile plot of the null distribution of the weighted GSMA statistic, xaxis, versus the Edgeworth approximation, yaxis. B. Quantilequantile plot of the empirical null distribution of the weighted GSMA statistic, xaxis, versus the Edgeworth approximation, yaxis. The empirical null distribution of the weighted GSMA statistic was obtained from 10,000 simulations, with m = 20 scans, n = 120 bins per scan, and weights for the 20 scans taken from Lewis [7]. The normal approximation shares the first two moments as the weighted GSMA statistic; the Edgeworth approximation shares the first four moments as the weighted GSMA statistic. Quantiles are depicted from the .0001 percentage point to the .9999 percentage point.
The GSMA order statistics
We turn next to order statistic considerations (and reintroduce the subscript j). The Levinson [3] order statistic approach to inference relating to the GSMA statistics takes into account the inherent ordering of the S_{j}: their P_{ord }refers to the probability of any observed S_{k }given the k^{th }bin's place in the ordering of all of the S_{j}. We here derive approximations to this distribution. Let , j = 1, ..., n, and T_{(1) }<T_{(2) }< ... T_{(n) }denote the order statistics. We note first that the T_{j }have (approximately) a singular symmetric multivariate normal distribution, with means 0, variances 1, and correlations . That the joint distribution is singular follows from the observation that for all i, hence, is identically 0. If we dismiss the correlations as negligible (of absolute magnitude < 0.01 for n > 100), then the T_{j }are (approximately) independent, identically distributed N(0,1) (standard normal) random variates, and the cumulative distribution function (cdf) F_{k }of the k^{th }order statistic T_{(k) }is given by
with Φ(·) as above (David [10], eqn. 2.1.3).
We briefly examine whether correlations can be ignored when determining the distributions of the T_{(j)}. Numerical computation of the distributions of the order statistics from a symmetric multivariate normal distribution is feasible in a number of cases; we here examine perhaps the most relevant case, concerning the extreme T_{(n). }Note that
Prob (T_{(n) }≤ x) = Prob (T_{1 }≤ x,T_{2 }≤ x, ...,T_{n }≤ x); (2)
this latter probability may be calculated in R using the mvtnorm package [9], based on methodology by Genz [11,12]. With n = 120, we depict in Figure 2 a QQ plot of the (approximate) distribution of T_{(n) }under independence, eqn. (1), compared to the distribution from eqn. (2) with pairwise correlations . The independence model tends to agree quite closely to the correlation model in this particular case, especially in the critical right tail, and has the virtue of numerical simplicity. We remark that one might improve slightly on the normal independence model by incorporating the Edgeworth correction into the individual cumulative distribution functions in equation (1).
Figure 2. Quantilequantile Plot of T_{(n) }Distribution. Quantilequantile plot of the approximate normal distribution of T_{(n)}, the largest (order) GSMA statistic, under the correlation model (eqn. 2, with pairwise correlations ), xaxis, versus the independence model (eqn.1), yaxis. Following Wise [1], we chose n = 120. Quantiles are depicted from the .001 percentage point to the .999 percentage point.
Aggregate criteria and multiple testing
Levinson [3] had proposed an aggregate criterion for detecting linkage based on both the marginal distributions and the order statistic distributions of the GSMA statistics. In particular, they argued that bins that have achieved both P_{AvgRnk }< 0.05 and P_{ord }< 0.05 "are the most likely to contain linked loci or to be adjacent to linked bins". Note that their criterion entails both the marginal distribution of the T_{j}, through P_{AvgRnk}, and the (joint) order statistic distribution of the T_{j}, through P_{ord}. We remark that there is some redundancy to the aggregate criterion {P_{AvgRnk }< 0.05 and P_{ord }< 0.05}, as can be seen through consideration of critical values relating to their aggregate criterion. With the normal approximation to the distribution of each normalized GSMA statistic T_{j}, the criterion {P_{AvgRnk }< 0.05} is equivalent to the criterion {T_{j }> 1.645}. The criterion {^{P}ord < 0.05} may be computed from eqn. (1), and depends on the ordering of the individual T_{j}. With n = 120, then for the ten largest order statistics T_{(120)}, T_{(119)}, ..., T_{(111)}, the criterion {P_{AvgRnk }< 0.05 and P_{ord }< 0.05} reduces to {P_{ord }< 0.05}, since their 95^{th }percentiles under their joint order statistic distribution exceed 1.645 [implying that, if {P_{ord }< 0.05} obtains, then {^{P}AvgRnk < 0.05} will automatically be satisfied]; and, for the remaining order statistics T_{(110)}, T_{(109)}, ..., T_{(1)}, the criterion {P_{AvgRnk }< 0.05 and P_{ord }< 0.05} reduces to {P_{AvgRnk }< 0.05}, equivalently, {T_{(j) }> 1.645}, as their 95^{th }percentiles under the order distribution, eqn. (1), are less than 1.645 [implying that, if {P_{AvgRnk }< 0.05} obtains, then {P_{ord }< 0.05} will automatically be satisfied].
We conclude with a remark concerning multiple testing. Levinson [3] suggested a simple Bonferroni correction for multiple testing when determining the significance of GSMA results. In particular, they used the criterion {P_{AvgRnk }< 0.000417} (0.05 corrected for 120 tests) for evidence that a bin is likely to contain a linked locus or loci. One can improve on this procedure by using Holm's [13] paradigm for multiple testing rather than Bonferroni. We illustrate Holm's [13] procedure by returning to the Lewis [7] study with m = 20 schizophrenia genome scans. As noted above, we used the online data to reconstruct the normalized unweighted GSMA statistics T_{j},j = 1, 2, ..., 120, corresponding to the 120 bins spanning the entire genome. With m = 20 studies, we shall invoke the normal approximation to the distributions of the individual T_{j}.
Lewis [7] had extensively investigated various criteria for linkage from the 20 schizophrenia genome scans, and we shall not reproduce their analyses. Rather, we here illustrate a graphical procedure for the simultaneous evaluation of pvalues from tests on the same data; this procedure is immediately applicable to the simultaneous consideration of the 120 GSMA statistics. The procedure, originally suggested by Schweder and Spjøtvoll [14], consists of a probability plot of the pvalues versus the uniform distribution. Koziol [15] subsequently suggested that Holm's [13] paradigm for multiple testing be applied to Schweder and Spjøtvoll's [14] scenario, for a graphical determination of the number of true hypotheses.
Let us briefly review the Holm [13] method, which is an extension of the Bonferroni method for multiple comparisons. Suppose we compare the smallest pvalue P_{(1) }among n pvalues with α/n and we find that the pvalue is less than α/n. Then our multiple testing problem has been reduced by one test, and we should compare the next smallest pvalue P_{(2) }to . In general, we would compare P_{(i) }with . Holm's [13] stepdown test begins with i = 1, comparing P_{(i) }with , and stops as soon as P_{(i) }exceeds , rejecting at overall level α all prior tests with smaller pvalues. The Holm [13] method, like Bonferroni, makes no assumption on the dependence of tests, but beyond P_{(1) }is less conservative than Bonferroni.
In Figure 3A we present a probability plot of the 120 pvalues corresponding to the 120 individual T_{j }statistics, which we have recomputed from the online Levinson dataset [8]. On this plot, the points corresponding to the "true" hypotheses of no linkage in individual bins should roughly fall along a straight line passing through the origin. We have also superimposed the Bonferroni and Holm boundaries for overall alpha level 0.05 and n = 120 pvalues; but, the two boundaries are virtually indistinguishable. There is little indication of large departures from the global null hypothesis of no linkage.
Figure 3. Probability plot of GSMA schizophrenia statistics. A. Probability plot of the 120 pvalues corresponding to the 120 GSMA statistics T_{1 }, ..., T_{120 }from the 20 schizophrenia genome scans reported in Lewis [7], versus the (expected) uniform distribution. Also depicted are the Bonferroni (solid line) and Holm (dotted line) boundaries at overall alpha level 0.05. B. Inset of Figure 3A, in which we display solely the Bonferroni and Holm boundaries. We have rescaled the yaxis so as to emphasize the differences in the boundaries, and have relabeled the xaxis to correspond to the fact that we here have n = 120 ordered pvalues. C. Inset of Figure 3A, illustrating the 12 smallest ordered pvalues (circles), along with a Holm [13] boundary (solid line) at overall alpha level 0.05. [We are using the integer ordering of the xaxis as in Figure 3B.] Only the first pvalue, corresponding to bin 2.5, falls outside this boundary. The bins depicted here, from left to right, are: 2.5, 3.2, 11.5, 5.5, 20.2, 8.2, 6.1, 2.6, 22.1, 1.6, 1.7, and 5.3.
In Figure 3B we rescale the yaxis, and focus solely on the Bonferroni and Holm boundaries. Differences are most readily apparent for the largest ordered pvalues. On the other hand, with a large number of hypotheses (here 120), the improvement of Holm over Bonferroni at the smallest ordered pvalues is marginal at best. As a reviewer has presciently remarked, the Holm procedure generally is most helpful (advantageous) relative to Bonferroni with only a small number of hypotheses.
In Figure 3C we zoom in on the part of the probability plot nearest the origin; we here have superimposed the Holm [13] boundary. In accord with Lewis [7], we find that only one GSMA statistic achieves statistical significance at overall alpha level 0.05, namely, the statistic corresponding to bin 2.5. [Recall that the Holm and Bonferroni boundaries coincide at the smallest pvalue, P_{(1)}.] That is, in this particular instance, the unweighted GSMA statistics with either Bonferroni or Holm [13] correction for multiple testing identify statistically evidence for linkage on chromosme 2q.
Conclusion
For practitioners utilizing GSMA statistics, the question arises as to whether the methods proposed here as well as in Koziol and Feng [2] are merely of theoretical interest, or have practical import. If one utilizes solely the unweighted GSMA statistic, and chooses to consider its marginal distribution (corresponding to the P_{AvgRnk }formulation of Levinson [2]), then the exact null distribution of the GSMA statistic is available from Wise [1] or Koziol and Feng [2], and should be preferred over any approximate methods. If the exact null distribution is computationally intractable for practitioners, then the Edgeworth approximation of Koziol and Feng [2] provides a simple and accurate means of assessing significance; we would argue that the Edgeworth approximation is preferable to a normal approximation in this instance. When weights are introduced into the GSMA statistic, then the combinatoric arguments of Wise [1] and Koziol and Feng [2] will typically be insufficient to derive the exact null distribution [though we remark that a moment generating function approach patterned after the probability generating function formulation of Koziol and Feng [2] can be brought to bear on this problem.] One can either simulate the null distribution or derive an Edgeworth approximation: we do not believe either method enjoys global advantages over the other. We caution against simple reliance on a normal approximation: in the situation investigated here, Figure 1, the weighted combination of m = 20 individual GSMS statistics, the normal approximation is particularly illfitting in the tails. [Agreement in the tails is of particular relevance to practitioners, as these represent the areas of potentially significant findings (pvalues).] As for the order statistic formulation and the aggregate criteria of Levinson [3], we believe that the theoretical considerations given in this paper can lead to clarification and simplification of testing criteria.
Authors' contributions
JAK conceived, designed and drafted the manuscript. ACF performed the statistical simulation.
Both authors read and approved the final manuscript.
Acknowledgements
This research was supported in part by NIH grant RR00833 to the Scripps General Clinical Research Center. We thank the reviewers for their insightful comments and suggestions.
References

Wise LH, Lanchbury JS, Lewis CM: Metaanalysis of genome searches.
Ann Hum Genet 1999, 63:26372. PubMed Abstract  Publisher Full Text

Koziol JA, Feng AC: A note on the genome scan metaanalysis statistic.
Ann Hum Genet 2004, 68:376380. PubMed Abstract  Publisher Full Text

Levinson DF, Levinson MD, Segurado R, Lewis CM: Genome scan metaanalysis of schizophrenia and bipolar disorder, part I: Methods and power analysis.
Am J Hum Genet 2003, 73:1733. PubMed Abstract  Publisher Full Text

Chiodini BD, Lewis CM: Metaanalysis of 4 coronary heart disease genomewide linkage studies confirms a susceptibility locus on chromosome 3q.
Arterioscler Thromb Vasc Biol 2003, 23:18631868. PubMed Abstract  Publisher Full Text

Demenais F, Kanninen T, Lindgren CM, Wiltshire S, Gaget S, Dandrieux C, Almgren P, Sjogren M, Hattersley A, Dina C, Tuomi T, McCarthy MI, Froguel P, Groop LC: A metaanalysis of four European genome screens (GIFT consortium) shows evidence for a novel region on chromosome 17p11.2q22 linked to type 2 diabetes.
Hum Mol Genet 2003, 12:18651873. PubMed Abstract  Publisher Full Text

Stuart A, Ord JK: Kendall's Advanced Theory of Statistics. 5th edition. New York: Oxford University Press; 1987.
[Distribution Theory, Vol 1.]

Lewis CM, Levinson DF, Wise LH, DeLisi LE, Straub RE, Hovatta I, Williams NM, Schwab SG, Pulver AE, Faraone SV, Brzustowicz LM, Kaufmann CA, Garver DL, Gurling HM, Lindholm E, Coon H, Moises HW, Byerley W, Shaw SH, Mesen A, Sherrington R, O'Neill FA, Walsh D, Kendler KS, Ekelund J, Paunio T, Lonnqvist J, Peltonen L, O'Donovan MC, Owen MJ, Wildenauer DB, Maier W, Nestadt G, Blouin JL, Antonarakis SE, Mowry BJ, Silverman JM, Crowe RR, Cloninger CR, Tsuang MT, Malaspina D, HarkavyFriedman JM, Svrakic DM, Bassett AS, Holcomb J, Kalsi G, McQuillin A, Brynjolfson J, Sigmundsson T, Petursson H, Jazin E, Zoega T, Helgason T: Genome scan metaanalysis of schizophrenia and bipolar disorder, part II: Schizophrenia.
Am J Hum Genet 2003, 73:3448. PubMed Abstract  Publisher Full Text

Levinson Research Page [http://depressiongenetics.med.upenn.edu/metaanalysis/index.htm] webcite

The R Project for Statistical Computing [http://www.rproject.org/] webcite

Genz A: Numerical computation of multivariate normal probabilities.
Journal of Computational and Graphical Statistics 1992, 1:141150.

Genz A: Comparison of methods for the computation of multivariate normal probabilities.

Holm S: A simple sequentially rejective multiple test procedure.

Schweder T, Spjøtvoll E: Plots of pvalues to evaluate many tests simultaneously.

Koziol JA: A note on plots of pvalues to evaluate many tests simultaneously.