Open Access Technical advance

Automated inter-rater reliability assessment and electronic data collection in a multi-center breast cancer study

Soe Soe Thwin1,2*, Kerri M Clough-Gorr1,3, Maribet C McCarty4, Timothy L Lash1,3, Sharon H Alford5, Diana SM Buist6, Shelley M Enger7, Terry S Field8, Floyd Frost9, Feifei Wei4 and Rebecca A Silliman1,3

Author affiliations

1 Geriatrics Section, Department of Medicine, Boston University School of Medicine, Boston, Massachusetts, USA

2 Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts, USA

3 Department of Epidemiology, Boston University School of Public Health, Boston, Massachusetts, USA

4 HealthPartners Research Foundation, Minneapolis, Minnesota, USA

5 Henry Ford Health System, Detroit, Michigan, USA

6 Center for Health Studies, Group Health, Seattle, Washington, USA

7 Department of Research and Evaluation, Kaiser Permanente Medical Care Program, Pasadena, California, USA

8 Meyers Primary Care Institute of Fallon Community Health Plan/Fallon Foundation/University of Massachusetts Medical School, Worcester, Massachusetts, USA

9 Lovelace Respiratory Research Institute, Albuquerque, New Mexico, USA

Citation and License

BMC Medical Research Methodology 2007, 7:23 doi:10.1186/1471-2288-7-23

Published: 18 June 2007



The choice between paper data collection methods and electronic data collection (EDC) methods has become a key question for clinical researchers. There remains a need to examine potential benefits, efficiencies, and innovations associated with an EDC system in a multi-center medical record review study.


A computer-based, automated, menu-driven system with 658 data fields was developed for a cohort study of women aged 65 years or older diagnosed with invasive, histologically confirmed primary breast cancer (N = 1859) at six Cancer Research Network sites. Medical record review with direct data entry into the EDC system was implemented. A system for assessing inter-rater and intra-rater reliability (IRR) was developed using a modified version of the EDC.
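As an illustration of what an automated reliability check on re-abstracted records might compute, the following is a minimal sketch only. The abstract does not state which agreement statistic the study used; percent agreement and Cohen's kappa are assumed here because they are common choices for categorical fields. The function names and the example stage values are hypothetical and are not taken from the study.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of records on which two abstractors entered the same value."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of records")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: product of each rater's marginal proportions per category.
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_expected == 1.0:
        # Both raters used a single identical category; agreement is trivially perfect.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical example: one categorical field abstracted twice from six records.
stage_rater_a = ["I", "II", "II", "III", "I", "II"]
stage_rater_b = ["I", "II", "III", "III", "I", "I"]
print(round(percent_agreement(stage_rater_a, stage_rater_b), 3))
print(round(cohens_kappa(stage_rater_a, stage_rater_b), 3))
```

The same two functions would serve for intra-rater reliability by passing one abstractor's first and repeat entries for the same records.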


Automation of EDC accelerated the flow of study information and resulted in an efficient data collection process. Data collection time was reduced by approximately four months compared with the project schedule, and funded time available for manuscript preparation increased by 12 months. In addition, an innovative modified version of the EDC permitted automated evaluation of inter-rater and intra-rater reliability across the six data collection sites.


Automated EDC is a powerful tool for research efficiency and innovation, especially when multiple data collection sites are involved.