Open Access Research article

Ascertaining invasive breast cancer cases; the validity of administrative and self-reported data sources in Australia

Anna Kemp1,2*, David B Preen1, Christobel Saunders3, C D’Arcy J Holman4, Max Bulsara5, Kris Rogers6 and Elizabeth E Roughead7

Author Affiliations

1 Centre for Health Services Research, School of Population Health, The University of Western Australia, 35 Stirling Hwy, Crawley, WA, 6009, Australia

2 Illawarra Health and Medical Research Institute, Building 32, University of Wollongong, Wollongong, NSW, 2522, Australia

3 School of Surgery, The University of Western Australia, 35 Stirling Hwy, Crawley, WA, 6009, Australia

4 School of Population Health, The University of Western Australia, 35 Stirling Hwy, Crawley, WA, 6009, Australia

5 Institute of Health and Rehabilitation Research, University of Notre Dame, PO Box 1225, Fremantle, WA, 6959, Australia

6 Prevention Research Collaboration, Sydney School of Public Health, University Fisher Road, Sydney, NSW, 2206, Australia

7 Quality Use of Medicines and Pharmacy Research Centre, School of Pharmacy and Medical Sciences, University of South Australia, GPO Box 2471, Adelaide, SA, 5001, Australia


BMC Medical Research Methodology 2013, 13:17  doi:10.1186/1471-2288-13-17

Published: 11 February 2013



Statutory State-based cancer registries are considered the ‘gold standard’ for identifying cancer cases in Australia, but research using self-report or administrative health datasets (e.g. hospital records) may not have linkage to a Cancer Registry and must identify cases by other means. This study investigated the validity of administrative and self-reported data, compared with records in a State-wide Cancer Registry, for identifying invasive breast cancer cases.


Cases of invasive breast cancer recorded on the New South Wales (NSW) Cancer Registry between July 2004 and December 2008 (the study period) were identified for women in the 45 and Up Study. Registry cases were separately compared with suspected cases ascertained from: i) administrative hospital separations records; ii) outpatient medical service claims; iii) prescription medicines claims; and iv) the 45 and Up Study baseline survey. Ascertainment flags included diagnosis codes, surgeries (e.g. lumpectomy), services (e.g. radiotherapy), and medicines used for breast cancer, as well as self-reported diagnosis. Positive predictive value (PPV), sensitivity and specificity were calculated for flags within individual datasets, and for combinations of flags across multiple datasets.
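The three measures named above can be illustrated with a minimal Python sketch. It assumes each flag is available as a set of person identifiers and that the Cancer Registry is the reference standard. The function name `validity`, the identifiers, and the toy cohort of ten women are hypothetical and are not taken from the study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Validity:
    ppv: float
    sensitivity: float
    specificity: float


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN rather than raising when a denominator is empty.
    return numerator / denominator if denominator else float("nan")


def validity(flagged: set, registry_cases: set, cohort: set) -> Validity:
    """Compare one ascertainment flag with registry cases (reference standard)."""
    tp = len(flagged & registry_cases)            # flagged and on the registry
    fp = len(flagged - registry_cases)            # flagged but not on the registry
    fn = len(registry_cases - flagged)            # on the registry but not flagged
    tn = len(cohort - flagged - registry_cases)   # neither flagged nor on the registry
    return Validity(
        ppv=_ratio(tp, tp + fp),
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
    )


# Hypothetical toy data: ten women, four of whom are registry cases.
cohort = set(range(1, 11))
registry_cases = {1, 2, 3, 4}
lumpectomy = {1, 2, 5}
mastectomy = {3}
diagnosis = {1, 2, 3, 5}
radiotherapy = {1, 4}

# A combined flag across datasets:
# (lumpectomy or mastectomy) and (diagnosis or radiotherapy)
combined = (lumpectomy | mastectomy) & (diagnosis | radiotherapy)

result = validity(combined, registry_cases, cohort)
print(result)
```

Representing flags as sets keeps combinations such as "A or B" and "A and B" as plain set union and intersection, so any combined flag can be scored with the same function.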


Of 143,010 women in the 45 and Up Study, 2039 (1.4%) had an invasive breast tumour recorded on the NSW Cancer Registry during the study period. All of the breast cancer flags examined had high specificity (>97.5%). Of the flags from individual datasets, hospital-derived ‘lumpectomy and diagnosis of invasive breast cancer’ and ‘(lumpectomy or mastectomy) and diagnosis of invasive breast cancer’ had the greatest PPV (89% and 88%, respectively), the latter having greater sensitivity (59% and 82%, respectively). The flag with the highest sensitivity and PPV ≥ 85% was ‘diagnosis of invasive breast cancer’ (both 86%). Self-reported breast cancer diagnosis had a PPV of 50% and sensitivity of 85%, and breast radiotherapy had a PPV of 73% and a sensitivity of 58% compared with Cancer Registry records. The combination of flags with the greatest PPV and sensitivity was ‘(lumpectomy or mastectomy) and (diagnosis of invasive breast cancer or breast radiotherapy)’ (PPV and sensitivity 83%).
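A flag can combine very high specificity with a modest PPV because only about 1.4% of the cohort are cases. The sketch below back-calculates the specificity implied by the rounded figures reported for self-reported diagnosis. It assumes person-level counts with one case per woman, so the outputs are approximations for illustration rather than counts from the study. The function name `implied_specificity` is hypothetical.

```python
def implied_specificity(cohort: int, cases: int, sensitivity: float, ppv: float) -> float:
    """Back-calculate specificity from cohort size, case count, sensitivity and PPV."""
    true_pos = sensitivity * cases        # cases the flag picks up
    flagged = true_pos / ppv              # everyone the flag picks up
    false_pos = flagged - true_pos        # flagged women who are not cases
    non_cases = cohort - cases
    return 1 - false_pos / non_cases


# Rounded figures reported above for self-reported breast cancer diagnosis.
spec = implied_specificity(cohort=143_010, cases=2_039, sensitivity=0.85, ppv=0.50)
print(f"Implied specificity: {spec:.1%}")
```

Under these assumptions, roughly as many non-cases as true cases carry the flag, yet those non-cases are only a little over 1% of the women without a registry record, which is consistent with the reported specificity above 97.5%.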


In the absence of Cancer Registry data, administrative and self-reported data can be used to accurately identify cases of invasive breast cancer for sample identification, removing cases from a sample, or risk adjustment. Invasive breast cancer can be accurately identified using hospital-derived diagnosis alone or in combination with surgeries and breast radiotherapy.

45 and Up Study; Sensitivity; Specificity; Positive predictive value; Lumpectomy; Mastectomy; Radiotherapy; Hospital diagnosis; Tamoxifen; Anastrozole; Self-report