Assessing subgroup effects with binary data: can the use of different effect measures lead to different conclusions?
1 MRC Biostatistics Unit, Institute of Public Health, Robinson Way, Cambridge CB2 2SR, UK
2 Medical Statistics Unit, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK
BMC Medical Research Methodology 2005, 5:15 doi:10.1186/1471-2288-5-15
Published: 29 April 2005
In order to use the results of a randomised trial, it is necessary to understand whether the overall observed benefit or harm applies to all individuals, or whether some subgroups receive more benefit or harm than others. This decision is commonly guided by a statistical test for interaction. However, with binary outcomes, different effect measures yield different interaction tests. For example, the UK Hip trial explored the impact of ultrasound of infants with suspected hip dysplasia on the occurrence of subsequent hip treatment. Risk ratios were similar between subgroups defined by level of clinical suspicion (P = 0.14), but odds ratios and risk differences differed strongly between subgroups (P < 0.001).
Interaction tests on different effect measures differ because they test different null hypotheses. A graphical technique demonstrates that the difference arises when the subgroup risks differ markedly. We consider that the test of interaction acts as a check on the applicability of the trial results to all included subgroups. The test of interaction should therefore be applied to the effect measure which is least likely a priori to exhibit an interaction. We give examples of how this might be done.
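As a rough illustration of why the three effect measures can disagree, the sketch below applies a simple Wald-type interaction test on the log risk ratio, log odds ratio and risk difference scales to two subgroups. The counts are hypothetical and chosen so that the control risks differ markedly (10% versus 80%) while the risk ratio is 0.5 in both subgroups; they are not the UK Hip trial data. The function name `interaction_tests` and the choice of a Wald test with large-sample variances are assumptions made for the example, not necessarily the method used in the paper.

```python
import math


def interaction_tests(sub_a, sub_b):
    """Wald interaction tests between two subgroups on three scales.

    Each subgroup is a tuple:
    (events_treated, n_treated, events_control, n_control).
    Returns {scale: (z statistic, two-sided p-value)}.
    Hypothetical helper for illustration only.
    """

    def estimates(e1, n1, e0, n0):
        p1, p0 = e1 / n1, e0 / n0
        return {
            # (point estimate, large-sample variance) on each scale
            "log_rr": (math.log(p1 / p0),
                       1 / e1 - 1 / n1 + 1 / e0 - 1 / n0),
            "log_or": (math.log(p1 / (1 - p1)) - math.log(p0 / (1 - p0)),
                       1 / e1 + 1 / (n1 - e1) + 1 / e0 + 1 / (n0 - e0)),
            "rd": (p1 - p0,
                   p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0),
        }

    est_a, est_b = estimates(*sub_a), estimates(*sub_b)
    out = {}
    for scale in est_a:
        diff = est_a[scale][0] - est_b[scale][0]
        se = math.sqrt(est_a[scale][1] + est_b[scale][1])
        z = diff / se
        # Two-sided p-value from the standard normal distribution
        p = math.erfc(abs(z) / math.sqrt(2))
        out[scale] = (z, p)
    return out


# Hypothetical data: low-risk subgroup (10% control risk) and
# high-risk subgroup (80% control risk), risk ratio 0.5 in both.
low_risk = (50, 1000, 100, 1000)
high_risk = (400, 1000, 800, 1000)

results = interaction_tests(low_risk, high_risk)
for scale, (z, p) in results.items():
    print(f"{scale}: z = {z:.2f}, p = {p:.3g}")
```

With these made-up counts the risk ratios are identical across subgroups, so the test on that scale shows no evidence of interaction, whereas the odds ratios (about 0.47 versus 0.17) and the risk differences (-0.05 versus -0.40) differ substantially. Each test addresses a different null hypothesis, namely equality of the subgroup effects on its own scale.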
The choice of interaction test is especially important when the risk of a binary outcome varies widely between subgroups. The interaction test should be pre-specified and should be guided by clinical knowledge.