Electronic search strategies to identify reports of cluster randomized trials in MEDLINE: low precision will improve with adherence to reporting standards
1 Ottawa Hospital Research Institute, Clinical Epidemiology Program, Ottawa, Canada
2 Department of Epidemiology and Community Medicine, University of Ottawa, Ottawa, Canada
3 Faculty of Family Medicine, Faculty of Medicine, Ottawa, Canada
4 Institute of Population Health, University of Ottawa, Ottawa, Canada
5 Department of Information Studies, University of Aberystwyth, UK
6 Department of Medicine, University of Ottawa, Ottawa, Canada
7 Department of Epidemiology and Biostatistics, University of Western Ontario, London, Canada
8 Institute of Health & Society, Newcastle University, Newcastle upon Tyne, UK
9 Robarts Clinical Trials, London, Canada
BMC Medical Research Methodology 2010, 10:15. doi:10.1186/1471-2288-10-15. Published: 16 February 2010
Cluster randomized trials (CRTs) present unique methodological and ethical challenges. Researchers conducting systematic reviews of CRTs (e.g., addressing methodological or ethical issues) require efficient electronic search strategies (filters or hedges) to identify trials in electronic databases such as MEDLINE. According to the CONSORT statement extension to CRTs, the clustered design should be clearly identified in titles or abstracts; however, variability in terminology may make electronic identification challenging. Our objectives were to (a) evaluate sensitivity ("recall") and precision of a well-known electronic search strategy ("randomized controlled trial" as publication type) with respect to identifying CRTs, (b) evaluate the feasibility of new search strategies targeted specifically at CRTs, and (c) determine whether CRTs are appropriately identified in titles or abstracts of reports and whether there has been improvement over time.
We manually examined a wide range of health journals to identify a gold standard set of CRTs. Search strategies were evaluated against the gold standard set, as well as an independent set of CRTs included in previous systematic reviews.
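As a minimal sketch of how a search strategy can be scored against a gold standard set, the Python snippet below computes sensitivity and precision from sets of record identifiers. The function name `evaluate_filter`, the variable names, and all record IDs are hypothetical and invented for illustration; they are not the study's data or software. The Boolean operators AND and OR are modelled here as set intersection and union.

```python
from typing import Set, Tuple


def evaluate_filter(retrieved: Set[int], gold_standard: Set[int]) -> Tuple[float, float]:
    """Return (sensitivity, precision) of a search filter against a gold standard."""
    if not retrieved or not gold_standard:
        raise ValueError("both sets must be non-empty")
    true_positives = len(retrieved & gold_standard)
    sensitivity = true_positives / len(gold_standard)  # share of known CRTs retrieved
    precision = true_positives / len(retrieved)        # share of retrieved records that are CRTs
    return sensitivity, precision


# Hypothetical record IDs, invented for illustration only.
gold_standard = {1, 2, 3, 4}                 # CRTs found by manual examination
rct_pt = {1, 2, 3, 5, 6, 7, 8, 9, 10, 11}    # records retrieved by a publication-type filter
cluster_terms = {1, 2, 4, 5, 12}             # records matching cluster-related textwords

results = {
    "rct_pt": evaluate_filter(rct_pt, gold_standard),
    "rct_pt AND cluster_terms": evaluate_filter(rct_pt & cluster_terms, gold_standard),
    "rct_pt OR cluster_terms": evaluate_filter(rct_pt | cluster_terms, gold_standard),
}

for name, (sens, prec) in results.items():
    print(f"{name}: sensitivity={sens:.2f}, precision={prec:.2f}")
```

In this toy data, the AND combination raises precision at the cost of sensitivity, while the OR combination does the reverse.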
The existing strategy (randomized controlled trial.pt) is sensitive (93.8%) for identifying CRTs but has relatively low precision (9%, number needed to read 11). Combining it with cluster design-related terms using the Boolean operator AND roughly halves the number needed to read to 5 (precision 18.4%); combining with the Boolean operator OR maximizes sensitivity (99.4%) but would require reading 28.6 citations to identify one CRT. Only about 50% of CRTs are clearly identified as cluster randomized in titles or abstracts; approximately 25% can be identified from the reported units of randomization but are not amenable to electronic searching; the remaining 25% cannot be identified except through manual inspection of the full-text article. The proportion of trials clearly identified increased from 28% in 2000-2003 to 60% in 2004-2007 (absolute increase 32%, 95% CI 17 to 47%).
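The number needed to read is the reciprocal of precision, which is how the figures above relate to each other. The short Python sketch below reproduces that arithmetic from the reported values; the function names are hypothetical, and the precision for the OR strategy is back-calculated from the reported 28.6 citations rather than quoted from the results.

```python
def number_needed_to_read(precision: float) -> float:
    """Citations that must be read to find one relevant record (1 / precision)."""
    if not 0 < precision <= 1:
        raise ValueError("precision must be in (0, 1]")
    return 1.0 / precision


def precision_from_nnr(nnr: float) -> float:
    """Inverse relation: precision implied by a number needed to read."""
    if nnr < 1:
        raise ValueError("number needed to read must be at least 1")
    return 1.0 / nnr


# Precision values as reported in the text.
nnr_existing = number_needed_to_read(0.09)    # existing publication-type strategy
nnr_and = number_needed_to_read(0.184)        # combined with cluster terms using AND

# Derived here, not reported: precision implied by 28.6 citations per CRT.
precision_or = precision_from_nnr(28.6)

print(f"Existing strategy: about {nnr_existing:.1f} citations per CRT")
print(f"AND combination:   about {nnr_and:.1f} citations per CRT")
print(f"OR combination:    implied precision about {precision_or:.1%}")
```

Rounded to whole citations, the first two values match the reported numbers needed to read of 11 and 5.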
Reports of CRTs should include the phrase "cluster randomized trial" in titles or abstracts; this will facilitate more accurate indexing of the publication type by reviewers at the National Library of Medicine, and efficient textword retrieval of the subset employing cluster randomization.