BMC Bioinformatics

Open Access, Highly Accessed, Database

Text-mining of PubMed abstracts by natural language processing to create a public knowledge base on molecular mechanisms of bacterial enteropathogens

Sam Zaremba1*, Mila Ramos-Santacruz1, Thomas Hampton2, Panna Shetty1, Joel Fedorko1, Jon Whitmore1, John M Greene1, Nicole T Perna3,4, Jeremy D Glasner3, Guy Plunkett4, Matthew Shaker1 and David Pot1

Author Affiliations

1 ERIC-BRC, SRA International Inc, Global Health Sector, Rockville MD, 20852, USA

2 14026 Marblestone Drive, Clifton, VA 20124, USA

3 Genome Center, University of Wisconsin, Madison WI, 53706, USA

4 Laboratory of Genetics, University of Wisconsin, Madison WI, 53706, USA

BMC Bioinformatics 2009, 10:177 doi:10.1186/1471-2105-10-177

Published: 10 June 2009

Additional files

Additional file 1:

Training Set PubMed IDs. Tab-delimited text file with list of PubMed IDs used in training set.

Format: TXT, Size: 3KB

Additional file 2:

Blind Set PubMed IDs. Tab-delimited text file with list of PubMed IDs used in blind set.

Format: TXT, Size: 1KB

Additional file 3:

Paraphrased description of rules. Word document with paraphrased description of extraction rules.

Format: DOC, Size: 36KB