Open Access Research article

Detecting and correcting the bias of unmeasured factors using perturbation analysis: a data-mining approach

Wen-Chung Lee

Author Affiliations

Research Center for Genes, Environment and Human Health, College of Public Health, National Taiwan University, Rm. 536, No. 17, Xuzhou Rd., Taipei 100, Taiwan

Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Rm. 536, No. 17, Xuzhou Rd., Taipei 100, Taiwan

BMC Medical Research Methodology 2014, 14:18  doi:10.1186/1471-2288-14-18

Published: 5 February 2014

Abstract

Background

The randomized controlled study is the gold-standard research method in biomedicine. In contrast, the validity of a nonrandomized observational study is often questioned because of unknown or unmeasured factors, which may act as confounders, effect modifiers, or both.

Methods

In this paper, the author proposes a perturbation test to detect the bias of unmeasured factors and a perturbation adjustment to correct for such bias. The proposed method circumvents the problem of measuring unknowns by collecting the perturbations of unmeasured factors instead. Specifically, a perturbation is a variable that is readily available (or can be measured easily) and is potentially associated, though perhaps only very weakly, with unmeasured factors. The author conducted extensive computer simulations to provide a proof of concept.
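The idea that many weak perturbations can stand in for an unmeasured factor can be illustrated with a toy simulation. The sketch below is not the author's perturbation test or perturbation adjustment; it swaps in ordinary least-squares regression adjustment on a linear model, and every name and number in it (the function `exposure_coefficient`, the true effect of 1.0, the loading of 0.2, the sample size) is an assumption chosen for illustration only.

```python
import numpy as np


def exposure_coefficient(n_perturbations, n=20000, seed=0):
    """Estimate the exposure effect after adjusting for weak perturbations.

    Hypothetical toy model (all values are illustrative assumptions):
      u : unmeasured factor, never given to the regression
      x : exposure, partly driven by u
      y : outcome, true exposure effect = 1.0, also driven by u
      p : perturbations, each only weakly associated with u
    """
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)
    x = u + rng.normal(size=n)
    y = 1.0 * x + 1.0 * u + rng.normal(size=n)
    # Each column is 0.2 * u plus independent unit-variance noise.
    p = 0.2 * u[:, None] + rng.normal(size=(n, n_perturbations))
    # Ordinary least squares of y on intercept, exposure, and perturbations.
    design = np.column_stack([np.ones(n), x, p])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]  # coefficient on the exposure


for k in (0, 50, 200):
    estimate = exposure_coefficient(k)
    print(f"{k:>3} perturbations: estimated effect = {estimate:.3f} (truth = 1.0)")
```

In this toy setting, the estimate with no perturbations is inflated by the unmeasured factor, and it moves toward the true value as more weakly associated perturbations enter the regression, even though none of them measures the unmeasured factor directly.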

Results

Computer simulations showed that, as the number of perturbation variables obtained through data mining increased, the power of the perturbation test increased progressively, up to nearly 100%. In addition, after the perturbation adjustment, the bias decreased progressively, down to nearly 0%.

Conclusions

The data-mining perturbation analysis described here is recommended for use in detecting and correcting the bias of unmeasured factors in observational studies.

Keywords:
Epidemiologic methods; Confounding; Data mining; Effect modification; Bias; Standardization