Open Access Research article

Markers for early detection of cancer: Statistical guidelines for nested case-control studies

Stuart G Baker1*, Barnett S Kramer2 and Sudhir Srivastava1

Author Affiliations

1 Division of Cancer Prevention, National Cancer Institute, Bethesda, MD, USA

2 Office of Disease Prevention and Medical Applications of Research, National Institutes of Health, Bethesda, MD, USA

BMC Medical Research Methodology 2002, 2:4 doi:10.1186/1471-2288-2-4

Published: 28 February 2002



Recently, many long-term prospective studies have involved serial collection and storage of blood or tissue specimens. This has spurred nested case-control studies in which some of the stored specimens are tested for various markers that might predict cancer. Until now, there has been little guidance on the statistical design and analysis of these studies.


To develop statistical guidelines, we considered the purpose, the types of biases, and the opportunities for extracting additional information.


We propose the following guidelines:

(1) For the clearest interpretation, statistics should be based on false and true positive rates, not odds ratios or relative risks.

(2) To avoid overdiagnosis bias, cases should be diagnosed as a result of symptoms rather than through screening.

(3) To minimize selection bias, the spectrum of control conditions should be the same in study and target screening populations.

(4) To extract additional information, criteria for a positive test should be based on combinations of individual markers and changes in marker levels over time.

(5) To avoid overfitting, the criteria for a positive marker combination developed in a training sample should be evaluated in a random test sample from the same study and, if possible, a validation sample from another study.

(6) To identify biomarkers with true and false positive rates similar to those of mammography, the training, test, and validation samples should each include at least 110 randomly selected subjects without cancer and 70 subjects with cancer.
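
As an illustration of guidelines (1), (5), and (6), the following is a minimal sketch, not the authors' procedure. It assumes a single continuous marker, a rule that calls the test positive when the marker is at or above a threshold, and entirely simulated marker values. The function names, the simulated distributions, and the 5% false positive ceiling are hypothetical choices made for the example. The threshold is chosen on a training sample and its true and false positive rates are then reported on a separate random test sample, each holding 110 subjects without cancer and 70 with cancer.

```python
import random


def rates(marker_values, has_cancer, threshold):
    """Return (true positive rate, false positive rate) for the rule
    'positive when marker value >= threshold'."""
    tp = fn = fp = tn = 0
    for value, cancer in zip(marker_values, has_cancer):
        positive = value >= threshold
        if cancer and positive:
            tp += 1
        elif cancer:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), fp / (fp + tn)


def split_train_test(subjects, seed=0):
    """Randomly split a list of subjects into two halves."""
    shuffled = subjects[:]
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def choose_threshold(train, max_fpr):
    """Among thresholds seen in the training data, return the
    (threshold, TPR, FPR) with the highest TPR whose FPR <= max_fpr,
    or None if no threshold qualifies."""
    values = [v for v, _ in train]
    labels = [c for _, c in train]
    best = None
    for t in sorted(set(values)):
        tpr, fpr = rates(values, labels, t)
        if fpr <= max_fpr and (best is None or tpr > best[1]):
            best = (t, tpr, fpr)
    return best


# Simulated (hypothetical) marker values: each subject is (value, has_cancer).
rng = random.Random(1)
controls = [(rng.gauss(0.0, 1.0), False) for _ in range(220)]
cases = [(rng.gauss(1.5, 1.0), True) for _ in range(140)]

# Split cases and controls separately so that each sample has
# 110 subjects without cancer and 70 with cancer.
train_controls, test_controls = split_train_test(controls, seed=0)
train_cases, test_cases = split_train_test(cases, seed=1)
train = train_controls + train_cases
test = test_controls + test_cases

best = choose_threshold(train, max_fpr=0.05)  # 5% ceiling is an assumption
if best is not None:
    threshold, train_tpr, train_fpr = best
    test_tpr, test_fpr = rates(
        [v for v, _ in test], [c for _, c in test], threshold
    )
    print(f"training: TPR={train_tpr:.2f} FPR={train_fpr:.2f}")
    print(f"test:     TPR={test_tpr:.2f} FPR={test_fpr:.2f}")
```

Because the threshold is tuned on the training sample, the training rates tend to look more favorable than the test rates; the test sample, and where available a validation sample from another study, gives the less optimistic estimate.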


These guidelines ensure good practice in the design and analysis of nested case-control studies of early detection biomarkers.