Open Access Research article

Ongoing monitoring of data clustering in multicenter studies

Lauren B Guthrie1, Emily Oken1,6*, Jonathan AC Sterne2, Matthew W Gillman1, Rita Patel2, Konstantin Vilchuck5, Natalia Bogdanovich5, Michael S Kramer3,4 and Richard M Martin2

Author Affiliations

1 Obesity Prevention Program, Department of Population Medicine, Harvard Medical School and the Harvard Pilgrim Health Care Institute, Boston, USA

2 School of Social and Community Medicine, University of Bristol, Bristol, UK

3 Department of Pediatrics, McGill University Faculty of Medicine, Montreal, Canada

4 Department of Epidemiology and Biostatistics, McGill University Faculty of Medicine, Montreal, Canada

5 The National Research and Applied Medicine Mother and Child Center, Minsk, Republic of Belarus

6 Obesity Prevention Program, Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, 133 Brookline Avenue, 6th Floor, Boston, MA 02215, USA

BMC Medical Research Methodology 2012, 12:29  doi:10.1186/1471-2288-12-29

Published: 13 March 2012



Multicenter study designs have several advantages, but the possibility of non-random measurement error resulting from procedural differences between the centers is a special concern. While it is possible to address and correct for some measurement error through statistical analysis, proactive data monitoring is essential to ensure high-quality data collection.


In this article, we describe quality assurance efforts aimed at reducing the effect of measurement error in a recent follow-up of a large cluster-randomized controlled trial. These efforts centered on periodic evaluation of intraclass correlation coefficients (ICCs) for continuous measurements. An ICC of 0 indicates that none of the variance in a measurement is attributable to differences between centers, and thus that the data are not clustered by center; an ICC approaching 1 indicates that nearly all of the variance lies between centers.
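
For reference, the ICC for a single outcome is commonly defined as the between-center share of the total variance. The expression below assumes a one-way random-effects model; that model and the symbols are assumptions made here for illustration, since the text does not state which estimator was used.

```latex
\mathrm{ICC} = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_w^2}
```

Here \sigma_b^2 is the variance between centers and \sigma_w^2 is the variance between children within the same center. When \sigma_b^2 = 0 the ICC is 0, which matches the interpretation given above.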


Through our review of early data downloads, we identified several outcomes (including sitting height, waist circumference, and systolic blood pressure) with higher-than-expected ICC values. Further investigation revealed variations in the procedures used by pediatricians to measure these outcomes. We addressed these procedural inconsistencies through written clarification of the protocol and refresher training workshops with the pediatricians. Monitoring at subsequent downloads showed that these efforts had a beneficial effect on data quality: the ICC for sitting height decreased from 0.92 to 0.03, for waist circumference from 0.10 to 0.07, and for systolic blood pressure from 0.16 to 0.12.


We describe a simple but formal mechanism for identifying ongoing problems during data collection. The calculation of the ICC can easily be programmed, and the mechanism has wide applicability, not just to cluster-randomized controlled trials but to any study with multiple centers or multiple observers.
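
As an illustration of how such a check might be programmed, the sketch below estimates the ICC with the standard one-way ANOVA estimator and lists the outcomes whose ICC exceeds a chosen cut-off. It is a minimal sketch, not the authors' implementation: the function names `anova_icc` and `flag_clustered_outcomes`, the `threshold` parameter, and the example values are all hypothetical, and missing measurements are assumed to have been dropped beforehand.

```python
from collections import defaultdict


def anova_icc(values, centers):
    """One-way ANOVA estimate of the ICC for one continuous outcome.

    values  -- measurements, one per child
    centers -- center label for each measurement, in the same order
    """
    groups = defaultdict(list)
    for value, center in zip(values, centers):
        groups[center].append(value)

    k = len(groups)                                   # number of centers
    n_total = sum(len(g) for g in groups.values())    # number of measurements
    if k < 2 or n_total <= k:
        raise ValueError("need at least two centers and replicate measurements")

    grand_mean = sum(sum(g) for g in groups.values()) / n_total
    ss_between = 0.0
    ss_within = 0.0
    for g in groups.values():
        mean_g = sum(g) / len(g)
        ss_between += len(g) * (mean_g - grand_mean) ** 2
        ss_within += sum((y - mean_g) ** 2 for y in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    # Effective cluster size; equals the common size when centers are balanced.
    n0 = (n_total - sum(len(g) ** 2 for g in groups.values()) / n_total) / (k - 1)

    denom = ms_between + (n0 - 1) * ms_within
    if denom == 0:
        return 0.0  # every measurement is identical, so there is no clustering
    # Negative estimates arise by chance when clustering is absent; report 0.
    return max(0.0, (ms_between - ms_within) / denom)


def flag_clustered_outcomes(outcomes, centers, threshold):
    """Return {outcome name: ICC} for outcomes whose ICC exceeds threshold."""
    flagged = {}
    for name, values in outcomes.items():
        icc = anova_icc(values, centers)
        if icc > threshold:
            flagged[name] = icc
    return flagged


# Synthetic values, invented purely for illustration (not study data).
centers = ["A", "A", "A", "B", "B", "B"]
outcomes = {
    "fully_clustered": [1.0, 1.0, 1.0, 5.0, 5.0, 5.0],  # centers differ, no spread within
    "not_clustered": [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],    # same distribution in each center
}
print(flag_clustered_outcomes(outcomes, centers, threshold=0.1))
# -> {'fully_clustered': 1.0}
```

Running a check of this kind at each data download, with a cut-off chosen in advance for each outcome, would yield a short list of measurements whose procedures merit review. The appropriate cut-off depends on how much between-center variation is expected for that outcome.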