
This article is part of the supplement: Selected papers from the Seventh Asia-Pacific Bioinformatics Conference (APBC 2009).

Open Access Research

Identification of histone modifications in biomedical text for supporting epigenomic research

Corinna Kolářik1,2, Roman Klinger1 and Martin Hofmann-Apitius1,2

1Department of Bioinformatics, Fraunhofer Institute Algorithms and Scientific Computing (SCAI) Schloß Birlinghoven, D-53754 Sankt Augustin, Germany

2Department of Applied Life Science Informatics, Bonn-Aachen International Center for Information Technology (B-IT) Dahlmannstrasse 2, D-53113 Bonn, Germany


BMC Bioinformatics 2009, 10(Suppl 1):S28 doi:10.1186/1471-2105-10-S1-S28

Published: 30 January 2009

Abstract

Background

Posttranslational modifications of histones influence the structure of chromatin and thereby take part in the regulation of gene expression. Certain histone modification patterns, distributed over the genome, are connected to cell and tissue differentiation and to the adaptation of organisms to their environment. Abnormal changes, in contrast, influence the development of disease states such as cancer. The mechanisms that regulate histone modifications, and the functions of those modifications, are the subject of epigenomic investigation and are still not completely understood. Biomedical text is a rich source of knowledge on epigenomics, and on histone modifications in particular. It contains information about experimental studies, the conditions used, and the results. To our knowledge, no approach for identifying histone modifications in text has been published so far.

Results

We have developed an approach for identifying histone modifications in biomedical literature with Conditional Random Fields (CRF) and for resolving the recognized histone modification term variants by term standardization. For term identification, F1 measures of 0.84 in 10-fold cross-validation on the training corpus and 0.81 on an independent test corpus were obtained. The standardization correctly transformed 96% of the terms from the training corpus and 98% of the terms from the test corpus. Because no existing terminology exhaustively covers specific histone modification types, we developed a histone modification term hierarchy for use in a semantic text retrieval system.
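To make the reported F1 measure concrete, the following minimal sketch computes it from a set of gold-standard term spans and a set of predicted term spans. It assumes exact-match scoring of spans, which the abstract does not state; the function name `f1_score`, the `Span` type, and the example offsets are hypothetical and are not data from the study.

```python
from typing import Set, Tuple

# One annotated term, given as (start, end) character offsets (assumed representation).
Span = Tuple[int, int]


def f1_score(gold: Set[Span], predicted: Set[Span]) -> float:
    """Return the exact-match F1 between gold and predicted term spans."""
    true_positives = len(gold & predicted)
    if true_positives == 0:
        # Covers empty inputs and the case of no overlap at all.
        return 0.0
    precision = true_positives / len(predicted)  # share of predictions that are correct
    recall = true_positives / len(gold)          # share of gold terms that were found
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Made-up offsets for illustration only.
gold = {(0, 7), (20, 35), (50, 58)}
predicted = {(0, 7), (20, 35), (60, 70), (80, 85)}
score = f1_score(gold, predicted)
```

Here two of the four predicted spans match the three gold spans, so precision is 2/4 and recall is 2/3. A more lenient scheme, such as partial-overlap matching, would generally give different values.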

Conclusion

The developed approach substantially improves the retrieval of articles describing histone modifications. Because text contains contextual information about the studies and experiments performed, identifying histone modifications is the basis for supporting literature-based knowledge discovery and hypothesis generation to accelerate epigenomic research.

