Table 4

Performance on gold standard corpus.

| PHI Type | PHI sub-type | Count | # FNs | # FNs per 100,000 words | Per Category Recall | Per Category Precision |
|---|---|---|---|---|---|---|
| Name | Patient Name | 54 | 0 | 0 | 1.00 | |
| | Patient Name Initial | 2 | 2 | 0.598 | 0.00 | |
| | Relative/Proxy Name | 175 | 4 | 1.195 | 0.977 | |
| | Clinician Name | 593 | 3 | 1.494 | 0.995 | 0.725 |
| Date | Date (not year) | 482 | 26 | 7.769 | 0.946 | |
| | Year | 46 | 11 | 3.287 | 0.761 | 0.713 |
| Location | | 367 | 10 | 4.482 | 0.973 | 0.922 |
| Phone | | 53 | 0 | 0 | 1.00 | 0.898 |
| Age over 89 | | 4 | 1 | 0.299 | 0.750 | 0.600 |
| Undefined | | 3 | 2 | 0.598 | 0.333 | N/A |
| Overall | | 1779 | 59 | 19.720 | 0.967 | 0.749 |

(FNs are false negatives and N/A indicates not applicable)
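
The recall column can be reproduced from the Count and # FNs columns. The snippet below is a minimal sketch, not the authors' code: it assumes that Count is the number of gold-standard PHI instances and that recall is (Count - FNs) / Count. The names `ROWS` and `recall` are illustrative only.

```python
# (PHI sub-type or type, Count, # FNs), copied from Table 4.
ROWS = [
    ("Patient Name", 54, 0),
    ("Patient Name Initial", 2, 2),
    ("Relative/Proxy Name", 175, 4),
    ("Clinician Name", 593, 3),
    ("Date (not year)", 482, 26),
    ("Year", 46, 11),
    ("Location", 367, 10),
    ("Phone", 53, 0),
    ("Age over 89", 4, 1),
    ("Undefined", 3, 2),
]


def recall(count, fns):
    """Assumed definition: detected instances divided by all gold-standard instances."""
    return (count - fns) / count


# Per-row recall, rounded to three decimals as in the table.
per_row = {name: round(recall(count, fns), 3) for name, count, fns in ROWS}

# The Overall row pools counts and false negatives across all rows.
total_count = sum(count for _, count, _ in ROWS)
total_fns = sum(fns for _, _, fns in ROWS)
overall = round(recall(total_count, total_fns), 3)

for name, value in per_row.items():
    print(f"{name}: {value}")
print(f"Overall: {overall}")
```

Under this assumption, the pooled counts give the totals in the Overall row (1779 instances, 59 false negatives).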

Neamatullah et al. BMC Medical Informatics and Decision Making 2008 8:32   doi:10.1186/1472-6947-8-32
