BMC Bioinformatics

This article is part of the supplement: Workshop on Advances in Bio Text Mining

Open Access Poster presentation

Species identification for gene name normalization

Illés Solt1,2*, Domonkos Tikk1,2 and Ulf Leser1

Author Affiliations

1 Knowledge Management in Bioinformatics, Institute for Computer Science, Humboldt-Universität zu Berlin, Unter den Linden 6, 10099 Berlin, Germany

2 Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, H-1117 Budapest, Magyar Tudósok krt 2., Hungary

BMC Bioinformatics 2010, 11(Suppl 5):P5 doi:10.1186/1471-2105-11-S5-P5

Published: 6 October 2010

First paragraph (this article has no abstract)

Protein interaction networks are expensive to construct experimentally. Researchers therefore usually turn to the literature or to domain-specific databases to gather knowledge on currently known interactions. Yet manually collecting knowledge from scientific papers is labor intensive and should be automated to the extent possible. An important step toward this is identifying gene and protein names (termed entities) in text. After identification, gene names must be mapped to database identifiers to connect them to structured knowledge. One particular problem in this step is homonymy, i.e., identical names referring to different genes in different species.
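
The homonymy problem can be illustrated with a minimal Python sketch. It is not the method of this poster, which the paragraph above does not describe; it only shows why a gene name alone may not determine a database identifier, and how a species hint resolves the ambiguity. The lexicon, the identifiers (`PLACEHOLDER:...`), and the function names `candidate_ids` and `normalize` are all hypothetical and invented for illustration.

```python
from typing import Dict, Optional

# Toy lexicon: (lower-cased gene name, species) -> database identifier.
# ASSUMPTION: all identifiers are invented placeholders, not real records.
LEXICON = {
    ("p53", "human"): "PLACEHOLDER:0001",
    ("p53", "mouse"): "PLACEHOLDER:0002",
    ("brca1", "human"): "PLACEHOLDER:0003",
}


def candidate_ids(name: str) -> Dict[str, str]:
    """Return every species -> identifier pair known for a gene name."""
    key = name.lower()
    return {
        species: gene_id
        for (lex_name, species), gene_id in LEXICON.items()
        if lex_name == key
    }


def normalize(name: str, species: Optional[str] = None) -> Optional[str]:
    """Map a gene name to one identifier, or None if unknown or ambiguous."""
    candidates = candidate_ids(name)
    if species is not None:
        # A species hint (e.g. identified from the surrounding text)
        # selects among homonymous names.
        return candidates.get(species)
    if len(candidates) == 1:
        # Only one species uses this name, so no hint is needed.
        return next(iter(candidates.values()))
    # Either the name is unknown or it is homonymous across species.
    return None
```

In this sketch, `normalize("p53")` returns `None` because the name is shared by two species in the toy lexicon, whereas `normalize("p53", "mouse")` returns a single identifier. A real system would have to infer the species from the text rather than receive it as an argument.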