Table 1

Results of the original deid program and modified program on the training set and two validation sets

| Feature added/Modified | Number of Free Text Records | Sensitivity/Recall | Specificity | Precision | Accuracy | F-measure |
|---|---|---|---|---|---|---|
| Original deid Program | 500 | 83.4% | 71.6% | 71.0% | 77.0% | 0.77 |
| Modification of deid Program | | | | | | |
| - Replaced deid lists for cities, businesses and medical facilities with Ontario lists and made adjustments for Ontario healthcard numbers and postal codes | 500 | 91.5% | 71.0% | 70.7% | 79.9% | 0.80 |
| - Added RPDB* names to ambiguous names, added PS‡ derived initial name removal replacement names to the unambiguous names and added list of Ontario physicians | 500 | 90.9% | 71.8% | 71.5% | 80.1% | 0.80 |
| - Improved medical eponyms lists | 500 | 90.9% | 71.8% | 71.5% | 80.1% | 0.80 |
| - Added protection for common acronyms and nomenclature | 750 | 92.6% | 72.8% | 72.7% | 81.5% | 0.81 |
| - Added 'do not remove' list | 1000 | 88.3% | 91.4% | 91.3% | 89.9% | 0.90 |
| First Validation | 700 | 86.7% | 91.4% | 91.1% | 89.0% | 0.89 |
| Second Validation | 500 | 80.2% | 87.7% | 87.4% | 83.8% | 0.84 |


*RPDB = Registered Persons Database

‡PS = Practice Solutions
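
As a reading aid, the F-measure column can be related to the Sensitivity/Recall and Precision columns. The sketch below assumes the F-measure reported in Table 1 is the balanced F-measure (the harmonic mean of precision and recall); the table itself does not state the formula, and the function name `f_measure` and the `rows` list are illustrative, not part of the deid program. Under that assumption, the recomputed values agree with the reported ones to two decimal places.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall (both in 0..1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (row label, recall %, precision %, reported F-measure), copied from Table 1.
rows = [
    ("Original deid Program", 83.4, 71.0, 0.77),
    ("Ontario lists, healthcard numbers, postal codes", 91.5, 70.7, 0.80),
    ("RPDB/PS names and Ontario physicians", 90.9, 71.5, 0.80),
    ("Improved medical eponyms lists", 90.9, 71.5, 0.80),
    ("Protection for acronyms and nomenclature", 92.6, 72.7, 0.81),
    ("'do not remove' list", 88.3, 91.3, 0.90),
    ("First Validation", 86.7, 91.1, 0.89),
    ("Second Validation", 80.2, 87.4, 0.84),
]

for label, recall_pct, precision_pct, reported in rows:
    # Convert percentages to fractions before applying the formula.
    computed = f_measure(precision_pct / 100, recall_pct / 100)
    print(f"{label}: computed {computed:.2f}, reported {reported:.2f}")
```

Sensitivity, specificity and accuracy cannot be recomputed the same way from the table alone, because the underlying counts of true and false positives and negatives are not listed.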

Tu et al. BMC Medical Informatics and Decision Making 2010 10:35   doi:10.1186/1472-6947-10-35
