Open Access Research article

Querying phenotype-genotype relationships on patient datasets using semantic web technology: the example of cerebrotendinous xanthomatosis

María Taboada1*, Diego Martínez2, Belén Pilo3, Adriano Jiménez-Escrig4, Peter N Robinson5 and María J Sobrido6

Author Affiliations

1 Department of Electronics and Computer Science, University of Santiago de Compostela, Edificio Monte da Condesa, Campus Vida, 15782, Spain

2 Department of Applied Physics, University of Santiago de Compostela, Santiago de Compostela, Spain

3 Section of Neurology, Hospital del Sureste, Arganda del Rey, Madrid, Spain

4 Department of Neurology, Hospital Ramon y Cajal, University of Alcalá de Henares, Alcalá de Henares, Spain

5 Institut für Medizinische Genetik und Humangenetik, Charité - Universitätsmedizin, Charité, Berlin, Germany

6 Fundación Pública Galega de Medicina Xenómica, Santiago de Compostela. Center for Biomedical Research on Rare Diseases (CIBERER), Institute of Health Carlos III, Santiago de Compostela, Spain


BMC Medical Informatics and Decision Making 2012, 12:78  doi:10.1186/1472-6947-12-78

Published: 31 July 2012



Semantic Web technology can considerably catalyze translational genetics and genomics research in medicine, where the interchange of information between basic research and clinical levels is crucial. This exchange involves mapping abstract phenotype descriptions from research resources, such as knowledge databases and catalogs, to unstructured datasets produced through experimental methods and clinical practice. Such mapping is especially important for the construction of mutation databases. This paper presents a way of harmonizing abstract phenotype descriptions with patient data from clinical practice, and of querying the resulting dataset about relationships between phenotypes and genetic variants at different levels of abstraction.


Because ontological and terminological resources that have already reached some consensus in biomedicine are currently available, a reuse-based ontology engineering approach was followed. The proposed approach uses the Web Ontology Language (OWL) to represent the phenotype ontology and the patient model, the Semantic Web Rule Language (SWRL) to bridge the gap between phenotype descriptions and clinical data, and the Semantic Query-Enhanced Web Rule Language (SQWRL) to query relevant bidirectional phenotype-genotype relationships. The work tests the use of semantic web technology in the biomedical research domain of cerebrotendinous xanthomatosis (CTX), using a real dataset and ontologies.
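As a rough illustration of how a Horn-like rule can bridge recorded clinical findings and more abstract phenotype descriptions, the following minimal sketch uses plain Python as a stand-in for OWL and SWRL. It is not the authors' rule set: the phenotype hierarchy, the patient identifiers, and the names SUBCLASS_OF, PATIENT_FINDINGS, ancestors and infer_phenotypes are all hypothetical and chosen only for illustration.

```python
# Hypothetical phenotype hierarchy (child -> parent); not taken from the paper.
SUBCLASS_OF = {
    "AchillesTendonXanthoma": "TendonXanthoma",
    "TendonXanthoma": "Xanthomatosis",
    "JuvenileCataract": "Cataract",
}

# Hypothetical clinical records: findings as they might be noted in practice.
PATIENT_FINDINGS = {
    "patient_1": {"AchillesTendonXanthoma"},
    "patient_2": {"JuvenileCataract"},
}


def ancestors(term):
    """Yield the term itself, then each of its ancestors up to the root."""
    while term is not None:
        yield term
        term = SUBCLASS_OF.get(term)


def infer_phenotypes(findings):
    """Apply one Horn-like rule to every patient:
    hasFinding(p, f) AND subClassOf(f, c) -> hasPhenotype(p, c).
    """
    return {
        patient: {c for f in recorded for c in ancestors(f)}
        for patient, recorded in findings.items()
    }


phenotypes = infer_phenotypes(PATIENT_FINDINGS)
```

With the inferred phenotypes in place, a patient recorded with a specific finding can also be retrieved by a query phrased at a more abstract level, which is the kind of harmonization the approach aims for.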


A framework to query relevant bidirectional phenotype-genotype relationships is provided. Phenotype descriptions and patient data were harmonized by defining 28 Horn-like rules in terms of the OWL concepts. In total, 24 patterns of SQWRL queries were designed following the initial list of competency questions. As the approach is based on OWL, the semantics of the framework adopt the standard logical model of the open world assumption.
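The bidirectional queries and the open world assumption can be sketched in a few lines of plain Python. This is only an assumption-laden illustration, not SQWRL and not the authors' query patterns: the facts, the variant labels (variant_A, variant_B) and the function names are hypothetical. The point it tries to show is that a missing fact is reported as unknown rather than as false.

```python
# Hypothetical facts: (patient, phenotype) -> True if observed,
# False if explicitly excluded. Absent pairs are simply unknown.
PHENOTYPE_FACTS = {
    ("patient_1", "TendonXanthoma"): True,
    ("patient_1", "Cataract"): False,
    ("patient_2", "Cataract"): True,
}

# Hypothetical genotype assignment per patient.
PATIENT_VARIANT = {"patient_1": "variant_A", "patient_2": "variant_B"}


def has_phenotype(patient, phenotype):
    """Return True, False, or None; None means 'not known' (open world)."""
    return PHENOTYPE_FACTS.get((patient, phenotype))


def variants_for_phenotype(phenotype):
    """Phenotype -> genotype: variants of patients known to show the phenotype."""
    return sorted(
        {PATIENT_VARIANT[p] for (p, ph), seen in PHENOTYPE_FACTS.items()
         if ph == phenotype and seen}
    )


def phenotypes_for_variant(variant):
    """Genotype -> phenotype: phenotypes known in patients carrying the variant."""
    return sorted(
        {ph for (p, ph), seen in PHENOTYPE_FACTS.items()
         if seen and PATIENT_VARIANT[p] == variant}
    )
```

Under a closed world reading, has_phenotype("patient_2", "TendonXanthoma") would be treated as false; here it returns None, so a partially documented patient is not counted as evidence against a phenotype-genotype relationship.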


This work demonstrates how semantic web technologies can support the flexible representation and computational inference mechanisms required to query patient datasets at different levels of abstraction. The open world assumption is especially well suited to describing phenotype-genotype relationships that are only partially known, in a way that is easily extensible. In future, this type of approach could offer researchers a valuable resource for inferring new data from patient data for statistical analysis in translational research. In conclusion, formalizing phenotype descriptions and mapping them to clinical data are two key elements for interchanging knowledge between basic and clinical research.