Abstract
Background
Published formulas for case-control designs provide sample sizes required to determine that a given disease-exposure odds ratio is significantly different from one, adjusting for a potential confounder and possible interaction.
Results
The formulas are extended from one control per case to F controls per case and adjusted for a potential multi-category confounder in unmatched or matched designs. Interactive FORTRAN programs that compute the formulas are described. The effect of potential disease-exposure-confounder interaction may be explored.
Conclusions
Software is now available for computing adjusted sample sizes for case-control designs.
Background
Breslow and Day [1] and Smith and Day [2] provide asymptotic formulas for the computation of case-control sample sizes required for odds ratios, unadjusted or adjusted for a confounder [1] and for stratified matched designs [2]. We use their notation. Their formulas are extended here to include more than one control per case. The formulas for stratified matched designs were deduced by applying the approach of Breslow and Day [1] (pages 305–6) to Table 7 of [2]. Modification of the formulas for specified interactions [1,3] is also shown. These formulas are based on the logarithm of the odds ratio, for which the normal approximation is more accurate than for the exposure difference, so they are more accurate than the exposure difference formula given in the majority of general methods references [4,5].
Two conversational FORTRAN programs, DAYSMITH and DESIGN, compute the formulas. They were submitted to STATLIB for noncommercial distribution a few years ago, and can be obtained by sending an email message such as "send design.exe from general" to statlib@lib.stat.cmu.edu. The programs produce a table of numbers of cases and controls required for a variety of specifications of Type I and Type II error, adjusted for the confounder, unadjusted, and adjusted for stratified matching, with the strata being the levels of the confounder. The two programs have different input requirements. Program DAYSMITH asks for exactly the items required for the Smith and Day formulas. Program DESIGN accepts alternative input that is converted in the program to the items required for the same formulas. The formulas used are shown in Appendix 1.
Results
The input to program DAYSMITH
The sample sizes computed are for the detection of a given disease-exposure odds ratio, that is, the sample sizes at which a certain statistical test will reject the null hypothesis that the odds ratio is one. The input items are as follows:
R_{E} = the odds ratio to be detected (typically a minimum value),
S = 1 or 2 for one-sided or two-sided type I error,
F = the number of controls per case,
P = the control population exposure probability, and
I = an indicator to request interaction adjustment.
Roughly speaking, interaction in statistics corresponds to effect modification in epidemiology. By not selecting an interaction adjustment, we effectively assume that the disease-exposure odds ratio does not differ across confounder levels. Interaction is discussed further below.
The number of confounder levels, denoted K, is requested next. If K = 1, only unadjusted sample sizes are computed, and no other input is required. Program DESIGN is identical up to this point. For most applications, no confounder adjustment is required, so the program returns unadjusted sample sizes and finishes after a 1 is entered for K. The unadjusted formula [1] is more accurate than the usual unadjusted formulas [4,5], and may therefore produce different sample sizes.
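As a rough illustration of how the unadjusted inputs R_{E}, S, F and P combine on the log odds ratio scale, the following Python sketch uses a generic Wald-type approximation. This is an assumption made for illustration only: it is not the Breslow and Day formula of Appendix 1, the function names are hypothetical, and its results should not be expected to match the output of DAYSMITH or DESIGN.

```python
from math import ceil, log
from statistics import NormalDist


def exposure_prob_cases(r_e, p):
    """Exposure probability among cases implied by odds ratio r_e
    and control exposure probability p."""
    return r_e * p / (1.0 + p * (r_e - 1.0))


def approx_cases_unadjusted(r_e, p, f, alpha=0.05, sides=2, power=0.80):
    """Approximate number of cases (with f controls per case) from a
    generic Wald test on log(odds ratio). Illustrative assumption only."""
    z = NormalDist().inv_cdf
    z_alpha = z(1.0 - alpha / sides)   # critical value for type I error
    z_beta = z(power)                  # quantile for the requested power
    p1 = exposure_prob_cases(r_e, p)
    # Variance of the log odds ratio, multiplied by the number of cases.
    var_per_case = 1.0 / (p1 * (1.0 - p1)) + 1.0 / (f * p * (1.0 - p))
    return ceil((z_alpha + z_beta) ** 2 * var_per_case / log(r_e) ** 2)


cases = approx_cases_unadjusted(r_e=2.0, p=0.03, f=3)
print("cases:", cases, "controls:", 3 * cases)
```

The sketch shows why additional controls per case reduce the required number of cases: the control term in the variance is divided by F.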
If K > 1, one of the levels of the confounder is taken to be a reference level, and is referred to as level one. The order of the levels is otherwise immaterial. The input required next is three numbers for each of the K–1 remaining levels, p_{1i}, p_{2i}, and R_{Ci}, i = 2,..., K, which are
p_{1i} = Pr(C_{i}|E) = among the exposed population, the proportion at level i of the confounder,
p_{2i} = Pr(C_{i}|Ē) = among the unexposed population, the proportion at level i of the confounder, and
R_{Ci} = the disease-confounder odds ratio (with confounder level i versus level 1).
For the reference level, we set R_{C1} = 1 for the formulas that follow. We compute
p_{11} = 1 − ∑_{i=2}^{K} p_{1i} and p_{21} = 1 − ∑_{i=2}^{K} p_{2i}.
Input for program DESIGN
Whereas DAYSMITH asks for the same input as requested in the original references [1–3], we found that alternative input made more sense for our initial applications [6,7], so a second program was written. The input for DESIGN is the same as for DAYSMITH up to and including the number of levels of the confounder, K.
Again, one of the levels of the confounder is taken to be a reference level, and is referred to as level one. The input that is required next is one number for the reference level, r_{1}, and then three (four when interaction is included) numbers for each of the K–1 remaining levels, r_{i}, p_{i}, and R_{Ci}, i = 2,...,K, which are
r_{i} = Pr(E|C_{i}) = the probability of exposure at level i of the confounder,
p_{i} = Pr(C_{i}) = the probability of being in level i of the confounder, and
R_{Ci} = the odds ratio of disease and confounder level i (versus level 1).
For the reference level, we again set R_{C1} = 1.
From Bayes' theorem, we compute
p_{1i}=r_{i}p_{i}/P and
p_{2i}=(1 – r_{i})p_{i}/(1 – P).
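The conversion from DESIGN-style input to the Smith and Day quantities can be sketched in a few lines of Python. The function name and the illustrative numbers are hypothetical, chosen here so that P is consistent with the r_{i} and p_{i}; this is not the FORTRAN source of DESIGN.

```python
def design_to_daysmith(P, r, p):
    """Convert r[i] = Pr(E|C_i) and p[i] = Pr(C_i) (reference level first)
    into p1[i] = Pr(C_i|E) and p2[i] = Pr(C_i|not E) by Bayes' theorem."""
    if len(r) != len(p):
        raise ValueError("r and p must have one entry per confounder level")
    p1 = [ri * pi / P for ri, pi in zip(r, p)]
    p2 = [(1.0 - ri) * pi / (1.0 - P) for ri, pi in zip(r, p)]
    return p1, p2


# Hypothetical three-level confounder (not taken from the paper).
r = [0.10, 0.20, 0.40]
p = [0.50, 0.30, 0.20]
P = sum(ri * pi for ri, pi in zip(r, p))  # overall exposure probability

p1, p2 = design_to_daysmith(P, r, p)
print(p1, sum(p1))
print(p2, sum(p2))
```

When P agrees with the level-specific inputs, as it does by construction here, both converted sets of proportions sum to one.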
We have one more input item than is actually required, and that is used for a check, where we can use the fact that
P = ∑_{i=1}^{K} r_{i}p_{i}.
What we actually do is check the sum Δ = […]. The sum Δ is supposed to be equal to one. If it is not one, then we redefine and report […] and […], unless they are negative. An alternative used in earlier versions was to compute […] and […] and replace p_{ji} with […] for j = 1,2 and i = 1,...,K. This is equivalent to replacing r_{i} with […], i = 1,..., K, which is how the program used to report the change.
An example, adjusting for a confounder
The following example is one of several computations performed for a published research protocol for a study of the association of oral contraceptive (OC) use with cardiovascular risks, controlling for age group [6]. A related protocol [7] has smoking as a confounder.
The numbers entered for P, r_{i}, p_{i}, and R_{Ci}, i = 2,...,K, are all taken from the Saskatchewan government medical database, which includes the entire population from which a case-control sample is to be taken. In many applications, such numbers are not available from a reliable source. In that case, one may try sets of alternative minimum and maximum numbers for a range of results. The maximum sample sizes obtained from such sensitivity analyses would be the conservative recommendation.
Both programs first request the items R_{E} through I. For R_{E}, the outcome of interest is hospitalisation due to certain cardiovascular risks. The exposure is a specific OC with 10% of the market share [7]. Since overall OC prevalence is 30%, P = .03 for that specific OC. Using > to denote the cursor for computer entry, we type:
>2 2 3 .03 0
for R_{E}, S, F, P and I, respectively, then press enter. We then receive the message:
Type the number of confounder levels, and <enter>. Type 1 if no confounder.
We enter 5 levels and press enter.
>5
Now type in the population exposure probability for the reference level of the confounding variable.
This will be put at level 1, so it is Pr(E|C1)
The confounder levels are five age groups, and level 1 corresponds to the youngest age group 15–21, for which we enter the prevalence for a specific OC with 10% of the market share. We type .055 and press enter.
> .055
The reply is:
Now type in, for each of the other 4 level(s) of the confounding variable, Pr(E|Ci), Pr(Ci), and Rc(i), separated by at least one blank or <enter>, where Pr(E|Ci) = in the population at level i of the confounder, the proportion exposed, Pr(Ci) = the probability of being at level i, and Rc(i) = odds ratio of disease and confounder level i (versus level 1).
The following numbers are entered for age groups 22–26, 27–31, 32–39 and 40+:
> .038 .24 2
> .021 .2 8
> .008 .18 8
> .004 .15 28.5
Note that Rc(5) = R_{C5} = 28.5, a very high value. That is to be expected if all older women are included. (For the final protocol [6], a cutoff was made at age 45.) When enter is pressed, we receive some confirmation of the input, and a message that the result is written to file design.out. That is, as currently written, the sample sizes and other output are not automatically shown on the screen, but are saved in "design.out" to be viewed directly there. Appendix 2 (Second attached file, app2.txt, a text file) shows the output from the preceding session, which includes a correction of the input values.
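The reason a correction can be triggered for this session may be seen by comparing the entered P with the exposure probability implied by the level-specific inputs through the law of total probability, P = ∑ r_{i}p_{i}. The short Python sketch below does that arithmetic. It assumes that the reference-level probability is p_{1} = 1 − ∑_{i=2}^{K} p_{i}; it does not reproduce how DESIGN itself adjusts the values.

```python
# Values typed in the session above.
P = 0.03                                  # entered overall exposure probability
r = [0.055, 0.038, 0.021, 0.008, 0.004]   # Pr(E|C_i), reference level first
p_rest = [0.24, 0.20, 0.18, 0.15]         # Pr(C_i) for levels 2..5

# Assumption: the five age groups partition the population,
# so the reference-level probability is the remainder.
p = [1.0 - sum(p_rest)] + p_rest

implied_P = sum(ri * pi for ri, pi in zip(r, p))
ratio = implied_P / P

print("implied P:", round(implied_P, 5))
print("implied P / entered P:", round(ratio, 3))
```

The implied exposure probability is about 0.028, slightly below the entered value of .03, so the redundant input items do not agree exactly.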
Looking at Appendix 2, we see unadjusted sample sizes, those adjusted for age in an unmatched study, and a third set of sample sizes for a matched case-control study. For our example [6], both unmatched and matched designs are considered. With the low value of P and the high value R_{C5}, we see that a large difference in sample sizes required for either design may result. In most applications, however, the differences are not so dramatic.
Adjusting for a matching confounder
Epidemiological literature usually gives formulas for matching which are based on the strong assumption that all sources of extraneous variation among a case and its controls are accounted for [1,8,9]. A third program, DESIGNM, was written to compute such a formula (from [1], p. 294), but DESIGNM does not adjust for a confounding variable, and that strong assumption of implicit matching is rarely justified in case-control studies, so this program was not made freely available. Software that computes sample sizes for conditional logistic regression, such as EGRET SIZ [10], is an alternative to DESIGNM, which is based on Miettinen's test of the Mantel-Haenszel odds ratio for matched case-control designs. The adjustment in DAYSMITH and DESIGN is for stratified matching [2,11,12], where matching is by confounders. This presumes that the eventual analysis will be unconditional [2] and will account for the stratification. Consequently, it is not required that F controls be linked with each case, only that the total number of controls be F times the total number of cases.
Interaction
The literature [1,3,13,15] discusses stratified analysis interaction adjustment only for confounders with K = 2. It is easy, however, to modify the formulas for multilevel interaction. Every occurrence of R_{E} in the formulas (Appendix 1) is replaced by R_{E}R_{Ij}, where R_{Ij} is the interaction factor corresponding to the j^{th} level, j = 2,..., K. (For ∑', put R_{Ij} inside the first sum.) We set R_{I1} = 1.
For two confounder levels, R_{I2}, which is R_{I} in Smith and Day's notation [3], is the multiplicative factor by which the disease-exposure odds ratio at level 2 of the confounder differs from the odds ratio at level 1 when there is confounder-exposure-disease interaction. For R_{Ij}, the contrast is between level j and the reference level (level one).
This adjustment was made available for sensitivity analysis; specifically, to explore how much the sample size result could change if the confounder were in fact an effect modifier. Nevertheless, the adjusted formulas have been used to determine sample size in the presence of gene-environment interaction [13].
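The substitution of R_{E}R_{Ij} for R_{E} amounts to giving each confounder level its own disease-exposure odds ratio. A minimal Python sketch follows; the function name and the interaction factors used are hypothetical and serve only to illustrate the substitution.

```python
def stratum_odds_ratios(r_e, interaction_factors):
    """Return the disease-exposure odds ratio at each confounder level.

    interaction_factors holds R_I2, ..., R_IK; the reference level
    has R_I1 = 1, so its odds ratio is r_e itself.
    """
    factors = [1.0] + list(interaction_factors)
    return [r_e * f for f in factors]


# Hypothetical three-level confounder with R_E = 2.
print(stratum_odds_ratios(2.0, [1.5, 0.5]))  # [2.0, 3.0, 1.0]
```

Setting every interaction factor to one recovers the no-interaction case, in which the odds ratio is R_{E} at every level.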
Discussion
The competitors to these programs are regression-based sample size programs, such as those in EGRET SIZ [10], which compute sample sizes required for unconditional logistic regression. The package nQuery [14] has an unconditional logistic regression option, but is not set up for case-control designs. These may be useful for continuous exposures, and make sense when the final analysis is intended to be such a regression, rather than a stratified analysis, such as a Mantel-Haenszel test, which our programs correspond to. We are unaware of any generally available competitor for stratified analysis.
In a series of papers on sample-size estimation to detect gene-environment interaction, which is a controversial role for sample-size formulas, comparisons have been made between regression-based approaches and the stratified analysis approach [13,15]. One solution is even to consider a case-only design [16]. EGRET SIZ provides no guidance for interaction adjustment, but it probably could be used for that purpose.
When there is more than one confounder, we define one super-confounder, each category of which corresponds to one combination of the categories of the original confounders. For example, if age, with 5 categories, and smoking, with 2 categories, are both confounders, then we define one super-confounder with 10 = 5 × 2 categories. The estimates of r_{i}, p_{i}, and R_{Ci}, i = 2, ...,10, then all have to take age and smoking into account jointly. As the number of confounders and the size of K increase, regression-based sample size programs become more advantageous, since information is not required for every subcategory.
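The construction of the combined confounder can be sketched as a cross-classification. In the Python fragment below, the age groups are those of the example above, the smoking labels are hypothetical, and the first combination is taken as the reference level by assumption.

```python
from itertools import product

age_groups = ["15-21", "22-26", "27-31", "32-39", "40+"]
smoking = ["non-smoker", "smoker"]  # hypothetical labels

# Each combination of an age group and a smoking category is one level
# of the combined confounder.
combined_levels = list(product(age_groups, smoking))

K = len(combined_levels)
print(K)  # 10
for i, level in enumerate(combined_levels, start=1):
    print(i, level)
```

Each of the K − 1 = 9 non-reference levels then needs its own r_{i}, p_{i} and R_{Ci}, which is why the input burden grows quickly with the number of confounders.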
The current programs yield results for 80% and 90% power, but versions are available for alternative powers, from 60% to 95%. A new version may print to the screen, if users want that option, and ask whether sample sizes for a specific power and Type I error are required.
The programs described are for two levels of disease (case vs. control) and of exposure. For several levels of exposure or disease, measures are available which correspond to odds ratios, risk ratios and risk differences [17], and it is not difficult to compute sample size formulas for these. If there is some demand, software to do those calculations may be created.
The Breslow-Day-Smith formulas which we extend utilize the classical method, based on testing. A more modern approach is that based on a confidence interval for the odds ratio [18], which may eventually become a program option. A Bayesian approach seems most suited for the sample size problem, although some issues need to be resolved [19]. Although not yet written, a Bayesian solution will soon be formulated for case-control designs.
Appendix files. Appendix 1: Shows the formulas utilized by DESIGN and DAYSMITH. Appendix 2: Shows output from the DESIGN session described in the main text.
Competing interests
None declared.
Acknowledgement
The author is supported by an Équipe grant from the FRSQ (Fonds de la recherche en santé du Québec). I appreciate the input of Eric Johnson, Sholom Wacholder and Jesse Berlin.
References

1. Breslow NE, Day NE: Statistical Methods in Cancer Research, Vol. 2: The Design and Analysis of Cohort Studies, IARC Scientific Publications No. 82, International Agency for Research on Cancer, Lyon, France, 1987, Sections 7.8–7.9:305–306.
2. Smith PG, Day NE: Matching and confounding in the design and analysis of epidemiological case-control studies. Perspectives in Medical Statistics, J.F. Bithell, R. Coppi, eds. London: Academic Press, 1987, 39–64.
3. Smith PG, Day NE: The design of case-control studies: the influence of confounding and interaction effects.
4. Fleiss JL: Statistical Methods for Rates and Proportions, 2nd Edition, Wiley: New York, 1981.
5. Schlesselman JJ: Case-Control Studies: Design, Conduct, Analysis, Oxford University Press: New York, 1982.
6. Suissa S, Hemmelgarn B, Spitzer WO, Brophy J, Collet JP, Côté R, Downey W, Edouard L, LeClerc J, Paltiel O: The Saskatchewan oral contraceptive cohort study of oral contraceptive use and cardiovascular risks.
7. Spitzer WO, Thorogood M, Heinemann L: Trinational case-control study of oral contraceptives and health.
8. Parker RA, Bregman DJ: Sample size for individually matched case-control studies.
9. Ejigou A: Power and sample size for matched case-control studies.
10. EGRET. Cytel Software Corporation: Cambridge, MA, 1997. (SIZ is a separate module.) [http://www.cytel.com]
11. Woolson RE, Bean JA, Rojas PB: Sample size for case-control studies using Cochran's statistic.
12. Nam J: Sample size determination for case-control studies and the comparison of stratified and unstratified analyses.
13. Hwang SJ, Beaty TH, Liang KY, Coresh J, Khoury MJ: Minimum sample size estimation to detect gene-environment interaction in case-control designs.
14. Elashoff JD: nQuery Advisor release 2.0. Statistical Solutions Ltd.: Cork, Ireland, 1997. [http://www.statsol.ie]
15. Garcia-Closas M, Lubin JH: Power and sample size calculations in case-control studies of gene-environment interactions: comments on different approaches.
16. Yang Q, Khoury MJ, Flanders WD: Sample size requirements in case-only designs to detect gene-environment interaction.
17. Edwardes MD, Baltzan M: The generalization of the odds ratio, relative risk and risk difference to r × k tables. Statistics in Medicine, 2000, 19:1901–1914.
18. O'Neill RT: Sample sizes for estimation of the odds ratio in unmatched case-control studies.
19. Joseph L, Du Berger R, Bélisle P: Bayesian and mixed Bayesian/likelihood criteria for sample size determination. Statistics in Medicine, 1997, 16:769–781.