Open Access Research article

Approaches for classifying the indications for colonoscopy using detailed clinical data

Hirut Fassil1, Kenneth F Adams2, Sheila Weinmann3, V Paul Doria-Rose4, Eric Johnson5, Andrew E Williams6, Douglas A Corley7 and Chyke A Doubeni10,8,9*

Author Affiliations

1 University of Massachusetts Medical School, 50 Lake Ave North, Worcester, MA 01655, USA

2 HealthPartners Institute for Education and Research, 8170 33rd Ave. S, Bloomington, MN 55425, USA

3 Center for Health Research Northwest, Kaiser Permanente Northwest, 3800 N. Interstate Avenue, Portland, OR 97227, USA

4 Division of Cancer Control and Population Sciences, National Cancer Institute, National Institutes of Health, 9609 Medical Center Dr., Room 3E438, Bethesda, MD 20892, USA

5 Group Health Research Institute, 1730 Minor Ave #1600, Seattle, WA 98101, USA

6 Center for Health Research Hawaii, Kaiser Permanente Hawaii, 501 Alakawa Street, Honolulu, HI 96817, USA

7 Kaiser Permanente Division of Research, 2000 Broadway, Oakland, CA 94612, USA

8 Department of Family Medicine and Community Health, and the Center for Clinical Epidemiology and Biostatistics at the Perelman School of Medicine, University of Pennsylvania, 222 Blockley Hall, 423 Guardian Drive, Philadelphia, PA 19104, USA

9 The Center for Public Health Initiatives, University of Pennsylvania, Philadelphia, PA 19104, USA

10 The Leonard Davis Institute of Health Economics, University of Pennsylvania, Philadelphia, PA 19104, USA


BMC Cancer 2014, 14:95  doi:10.1186/1471-2407-14-95

Published: 15 February 2014



Accurate indication classification is critical both for obtaining unbiased estimates of colonoscopy effectiveness and for quality improvement efforts, but few systematic classification approaches have been published. The objective of this study was to evaluate the effects of data source and adjudication on indication classification and on estimates of the effectiveness of screening colonoscopy on late-stage colorectal cancer diagnosis risk.


This was an observational study among members of four U.S. health plans. Eligible persons (n = 1039) were aged 55–85 years and had been enrolled for 5 years or longer in their health plans during 2006–2008. Patients were selected based on late-stage colorectal cancer diagnosis in a case–control design; each case patient was matched to 1–2 controls by study site, age, sex, and health plan enrollment duration. Reasons for colonoscopies received in the 10-year period before the reference date were collected from three medical record sources (progress notes, referral notes, and procedure reports) and categorized using an algorithm, with committee adjudication of some tests. We evaluated indication classification concordance before and after adjudication. We then used logistic regression with the Wald chi-square test to compare, for each data source, the estimated effect of screening colonoscopy on late-stage colorectal cancer diagnosis risk with the estimate based on the adjudicated indication.
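As an illustration of the concordance measure, the sketch below computes Cohen's weighted kappa between the indication assigned from one data source and the adjudicated indication. It is a minimal sketch, not the study's code: the integer category codes, the example ratings, and the choice of linear disagreement weights are all assumptions, since the abstract does not state the category ordering or the weighting scheme.

```python
from collections import Counter


def weighted_kappa(ratings_a, ratings_b, n_categories):
    """Cohen's weighted kappa with linear disagreement weights.

    ratings_a, ratings_b: equal-length sequences of integer category
    codes in 0..n_categories-1 (hypothetical ordered indication codes).
    Linear weights are an assumption made for this sketch.
    """
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    if n_categories < 2:
        raise ValueError("need at least two categories")

    n = len(ratings_a)
    observed = Counter(zip(ratings_a, ratings_b))  # joint counts
    marginal_a = Counter(ratings_a)                # row totals
    marginal_b = Counter(ratings_b)                # column totals

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(n_categories):
        for j in range(n_categories):
            # Disagreement weight: 0 on the diagonal, 1 at maximum distance.
            weight = abs(i - j) / (n_categories - 1)
            observed_disagreement += weight * observed[(i, j)] / n
            expected_disagreement += (
                weight * (marginal_a[i] / n) * (marginal_b[j] / n)
            )

    if expected_disagreement == 0:
        raise ValueError("kappa is undefined when all ratings are identical")
    return 1.0 - observed_disagreement / expected_disagreement


# Hypothetical example: six colonoscopies, three ordered indication codes.
source_indication = [0, 0, 1, 1, 2, 2]
adjudicated_indication = [0, 0, 1, 2, 2, 2]
kappa = weighted_kappa(source_indication, adjudicated_indication, 3)
print(round(kappa, 3))
```

Weighted kappa gives partial credit to near-misses between ordered categories, which is why it can differ from simple percent agreement on the same data.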


Classification agreement between each data source and adjudication was 78.8–94.0% (weighted kappa = 0.53–0.72); agreement was highest (weighted kappa = 0.86–0.88) when information from all data sources was considered together. The choice of data source influenced the association between screening colonoscopy and late-stage colorectal cancer diagnosis; estimates based on progress notes were closest to those based on the adjudicated indication (% difference in regression coefficients = 2.4%, p-value = 0.98), as compared to estimates from only referral notes (% difference in coefficients = 34.9%, p-value = 0.12) or procedure reports (% difference in coefficients = 27.4%, p-value = 0.23).
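The comparison of regression coefficients reported above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' procedure: it assumes the percent difference is the absolute difference relative to the adjudicated coefficient, and that the Wald chi-square statistic has 1 degree of freedom with the variance of the difference supplied by the caller. The function names and the numbers in the usage lines are hypothetical.

```python
import math


def percent_difference(beta_source, beta_adjudicated):
    """Absolute difference between two log-odds coefficients, expressed
    as a percentage of the adjudicated coefficient (assumed definition)."""
    if beta_adjudicated == 0:
        raise ValueError("reference coefficient must be non-zero")
    return 100.0 * abs(beta_source - beta_adjudicated) / abs(beta_adjudicated)


def wald_chi_square(beta_source, beta_adjudicated, var_of_difference):
    """Wald chi-square statistic (1 df) and p-value for the hypothesis
    that the two coefficients are equal.

    var_of_difference is the variance of (beta_source - beta_adjudicated).
    It must account for any covariance between the estimates, which
    matters when both come from the same patients.
    """
    if var_of_difference <= 0:
        raise ValueError("variance of the difference must be positive")
    chi2 = (beta_source - beta_adjudicated) ** 2 / var_of_difference
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical coefficients, not values from the study.
pct = percent_difference(beta_source=-0.8, beta_adjudicated=-1.0)
chi2, p = wald_chi_square(-0.8, -1.0, var_of_difference=0.04)
print(f"{pct:.1f}% difference, chi-square = {chi2:.2f}, p = {p:.3f}")
```

A small percent difference combined with a large p-value, as reported for progress notes, indicates that the source-specific estimate is statistically indistinguishable from the adjudicated one.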


There was no single gold-standard source of information in medical records. The estimates of colonoscopy effectiveness from progress notes alone were the closest to estimates using adjudicated indications. Thus, the details in the medical records are necessary for accurate indication classification.