This article is part of the supplement: Proceedings of the 5th International Conference of the Brazilian Association for Bioinformatics and Computational Biology (X-meeting 2009)

Open Access Proceedings

Disclosing ambiguous gene aliases by automatic literature profiling

Roney S Coimbra1,2*, Dana E Vanderwall3 and Guilherme C Oliveira1,2

Author Affiliations

1 Center for Excellence in Bioinformatics, Research Center René Rachou, FIOCRUZ-MG. Rua Araguari, 741, Barro Preto. Belo Horizonte, MG, Brazil

2 Genomics and Computational Biology Group, Research Center René Rachou, FIOCRUZ-MG. Av. Augusto de Lima, 1715, Barro Preto. Belo Horizonte, MG, Brazil

3 Molecular Discovery Research, GlaxoSmithKline, Moore Dr, Research Triangle Park, NC, 27709, USA

BMC Genomics 2010, 11(Suppl 5):S3  doi:10.1186/1471-2164-11-S5-S3

Published: 22 December 2010

Additional files

Additional file 1:

EntrezGene official symbols with PubMed abstracts and their aliases, as classified by the algorithm. The file lists 73 randomly chosen official gene symbols that produced text corpora of PubMed abstracts, together with their aliases. The algorithm classified each alias as “synonyms”, “ambiguous”, “aliases with PubMed abstract but not passing the filters”, or “aliases without PubMed abstracts”.

Format: XLS Size: 43KB

Additional file 2:

EntrezGene official symbols without PubMed abstracts and their aliases. The file lists 27 randomly chosen official gene symbols that did not produce text corpora of PubMed abstracts, together with their aliases. Each alias was classified as either “aliases with PubMed abstract but not passing the filters” or “aliases without PubMed abstracts”.

Format: XLS Size: 25KB

Additional file 3:

Jaccard distances between the official gene symbols and their respective aliases. For 36 genes the distance between the official gene symbol and at least one of its aliases (red circles) exceeded the distance between the official symbol and the internal control (black circles). Green circles represent the distance between the official gene symbol and aliases classified as “synonyms”.

Format: PDF Size: 36KB
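
The comparison described for Additional file 3 can be illustrated with a minimal Python sketch. It assumes each gene symbol, alias, and internal control is represented as a set of terms drawn from its PubMed abstracts; that representation, the function names `jaccard_distance` and `flag_aliases`, and the toy term sets are hypothetical and are not taken from the article. The sketch only shows the standard Jaccard distance and the rule of flagging an alias whose distance from the official symbol exceeds the distance of the internal control.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two sets: 1 - |a & b| / |a | b|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty sets are identical.
        return 0.0
    return 1.0 - len(a & b) / len(union)


def flag_aliases(official_profile, alias_profiles, control_profile):
    """Return the aliases lying farther from the official symbol than the control.

    All three arguments are hypothetical term-set profiles (an assumption of
    this sketch); alias_profiles maps an alias name to its term set.
    """
    threshold = jaccard_distance(official_profile, control_profile)
    return sorted(
        alias
        for alias, profile in alias_profiles.items()
        if jaccard_distance(official_profile, profile) > threshold
    )


# Toy data, invented purely for illustration.
official = {"kinase", "apoptosis", "tumor", "p53"}
control = {"kinase", "apoptosis", "tumor", "cell"}
aliases = {
    "ALIAS_A": {"kinase", "apoptosis", "tumor", "p53"},
    "ALIAS_B": {"yeast", "ribosome", "tumor"},
}
flagged = flag_aliases(official, aliases, control)
```

In this toy case only `ALIAS_B` is flagged, because it shares a single term with the official profile, whereas the control shares three.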
