Open Access Research article

A method for encoding clinical datasets with SNOMED CT

Dennis H Lee1, Francis Y Lau1* and Hue Quan2

Author Affiliations

1 School of Health Information Science, University of Victoria, Human & Social Development Building A202, 3800 Finnerty Road (Ring Road), Victoria, BC V8P 5C2, Canada

2 Edmonton Palliative/End of Life Program, Alberta Health Services, Seniors Health, Grey Nuns Community Hospital, 335 St. Marguerite Health Services Centre, 1090 Youville Drive, Edmonton, AB T6L 5X8, Canada

BMC Medical Informatics and Decision Making 2010, 10:53  doi:10.1186/1472-6947-10-53

Published: 17 September 2010



Over the past decade, a growing body of literature has described how the Systematised Nomenclature of Medicine Clinical Terms (SNOMED CT) can be implemented and used in different clinical settings. Yet those charged with incorporating SNOMED CT into their organisation's clinical applications and vocabulary systems have few detailed encoding instructions or worked examples showing how this can be done and what issues are involved. This paper describes a heuristic method for encoding clinical terms in SNOMED CT and illustrates how it was applied to an existing palliative care dataset.


The encoding process involves four steps: identifying the input data items, cleaning them, encoding the cleaned items, and exporting the encoded terms as output term sets. Four outputs are produced: the SNOMED CT reference set, the interface terminology set, the SNOMED CT extension set and the unencodeable term set.
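As a rough illustration of the four steps and four outputs, the following Python sketch routes cleaned terms into one of the output sets. It is not the authors' implementation: the lookup tables, the function names `clean` and `encode`, the routing rules and the identifiers such as `CORE-001` are all hypothetical placeholders, and a real system would query an actual SNOMED CT release instead.

```python
import re

# Hypothetical lookup tables. The identifiers are placeholders,
# NOT real SNOMED CT concept IDs.
CORE_CONCEPTS = {"pain": "CORE-001", "nausea": "CORE-002"}
LOCAL_SYNONYMS = {"sore": "pain"}                 # local wording -> core term
EXTENSION_CANDIDATES = {"comfort care pathway"}   # locally defined concepts


def clean(raw):
    """Collapse whitespace, trim and lower-case a raw data item."""
    return re.sub(r"\s+", " ", raw).strip().lower()


def encode(raw_items):
    """Clean each input item and route it to one of four output sets."""
    outputs = {
        "reference_set": {},              # core term -> concept id
        "interface_terminology_set": {},  # local term -> concept id
        "extension_set": set(),           # terms needing a local extension
        "unencodeable_set": set(),        # terms with no match at all
    }
    for raw in raw_items:
        term = clean(raw)
        if not term:
            continue  # skip empty items left over after cleaning
        if term in CORE_CONCEPTS:
            outputs["reference_set"][term] = CORE_CONCEPTS[term]
        elif term in LOCAL_SYNONYMS:
            target = LOCAL_SYNONYMS[term]
            outputs["interface_terminology_set"][term] = CORE_CONCEPTS[target]
            outputs["reference_set"][target] = CORE_CONCEPTS[target]
        elif term in EXTENSION_CANDIDATES:
            outputs["extension_set"].add(term)
        else:
            outputs["unencodeable_set"].add(term)
    return outputs


result = encode(["  Pain ", "SORE", "Comfort  care pathway", "xyz??", ""])
```

Keeping the four outputs in one dictionary makes the export step a matter of writing each entry to its own file or table; how the authors actually structured their outputs is not stated in this abstract.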


The original palliative care database contained 211 data elements, 145 coded values and 37,248 free text values. We were able to encode ~84% of the terms; another ~8% require further encoding and verification, while terms with a frequency of fewer than five (~7%) were not encoded.
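The frequency cut-off mentioned above can be sketched as a simple counting step. The example below is only an assumption about how such a filter might look: the function name `split_by_frequency`, the normalisation applied before counting, and the sample values are made up for illustration, and the abstract does not say whether the authors counted distinct terms or individual occurrences.

```python
from collections import Counter

MIN_FREQUENCY = 5  # from the text: terms seen fewer than five times were not encoded


def split_by_frequency(values, min_frequency=MIN_FREQUENCY):
    """Count normalised free-text values and split them at the cut-off."""
    counts = Counter(v.strip().lower() for v in values if v.strip())
    to_encode = {t: n for t, n in counts.items() if n >= min_frequency}
    skipped = {t: n for t, n in counts.items() if n < min_frequency}
    return to_encode, skipped


# Made-up sample values, not data from the palliative care database.
sample = ["pain"] * 6 + ["Pain "] + ["rare term"] * 4
to_encode, skipped = split_by_frequency(sample)
```

Normalising before counting matters here, because otherwise variants such as "Pain " and "pain" would be counted separately and could each fall below the threshold.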


The pilot suggests that our SNOMED CT encoding method has the potential to become a general-purpose terminology encoding approach that can be used in different clinical systems.