Open Access | Highly Accessed | Research article

MScanner: a classifier for retrieving Medline citations

Graham L Poulter1, Daniel L Rubin2, Russ B Altman3 and Cathal Seoighe1

1UCT NBN Node, Department of Molecular and Cell Biology, University of Cape Town, Cape Town, South Africa

2Stanford Medical Informatics, Stanford University, San Francisco, USA

3Department of Bioengineering and Department of Genetics, Stanford University, San Francisco, USA


BMC Bioinformatics 2008, 9:108 doi:10.1186/1471-2105-9-108

Published: 19 February 2008

Additional files

Additional file 1:

11-point precision-recall curves. 11pointcurves.pdf is a PDF file containing a table of 11-point interpolated precision curves for all experiments in the paper. The interpolated precision at a specified recall is the highest precision found for any value of recall greater than or equal to the specified recall.

Format: PDF Size: 44KB

This file can be viewed with: Adobe Acrobat Reader
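
The definition of interpolated precision given for Additional file 1 can be illustrated with a minimal Python sketch. The function names and the (recall, precision) values below are hypothetical and are not taken from the paper or from the MScanner source code; returning 0.0 when no measured point reaches the requested recall is also an assumption.

```python
def interpolated_precision(points, recall_level):
    """Return the highest precision among points with recall >= recall_level.

    `points` is a list of (recall, precision) pairs.
    Returns 0.0 if no point reaches the requested recall (an assumption).
    """
    candidates = [p for r, p in points if r >= recall_level]
    return max(candidates, default=0.0)


def eleven_point_curve(points):
    """Evaluate interpolated precision at recall 0.0, 0.1, ..., 1.0."""
    levels = [i / 10 for i in range(11)]
    return [(level, interpolated_precision(points, level)) for level in levels]


# Hypothetical (recall, precision) measurements, for illustration only.
example_points = [(0.2, 1.0), (0.4, 0.67), (0.6, 0.5), (0.8, 0.57), (1.0, 0.5)]
curve = eleven_point_curve(example_points)

for recall, precision in curve:
    print(f"recall {recall:.1f}: interpolated precision {precision:.2f}")
```

Because the maximum is taken over all points at or beyond the requested recall, the interpolated curve never increases as recall increases, even where the raw precision values do (as between recall 0.6 and 0.8 in the sample data).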

Additional file 2:

Corpora used in the IEDB comparison. iedb.zip is a ZIP archive containing text files, where each line contains the PubMed ID and completion date of a Medline record. iedb-all-relevant.txt and iedb-all-irrelevant.txt are the relevant and irrelevant cross validation corpora used in the IEDB cross validation. iedb-pre2004-relevant.txt are the relevant training examples for the retrieval comparison. iedb-2004-relevant.txt and iedb-2004-irrelevant.txt are the manually evaluated IEDB query results from 2004 Medline. PubMed IDs for 2004 Medline may be obtained using the PubMed query 2004 [DateCompleted] AND medline [sb].

Format: ZIP Size: 107KB
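
As a rough sketch of how the corpus text files described for Additional file 2 might be read, the Python below parses lines that each hold a PubMed ID and a completion date. The description does not state the field separator or the date format, so whitespace separation and leaving the date as an unparsed string are assumptions; the function name and the sample records are hypothetical.

```python
import io


def read_corpus(handle):
    """Read (pubmed_id, completion_date) pairs from an open text file.

    Assumes whitespace-separated fields with the PubMed ID first.
    The date is kept as a string because its format is not specified.
    """
    records = []
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        pmid = int(fields[0])
        date = fields[1] if len(fields) > 1 else None
        records.append((pmid, date))
    return records


# Hypothetical sample content; real files come from iedb.zip.
sample = io.StringIO("10000001 20040115\n10000002 20031201\n\n")
records = read_corpus(sample)
print(records)
```

With the real archive, the same function could be applied to a file opened with `open("iedb-all-relevant.txt")` after extracting iedb.zip, provided the files follow the assumed layout.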

Additional file 3:

Source code for MScanner. mscanner-20071123.zip is a ZIP archive containing the Python 2.5 source code for MScanner, licensed under the GNU General Public License. It also contains API documentation in HTML format. Updated versions will be made available at http://mscanner.stanford.edu.

Format: ZIP Size: 909KB

Additional file 4:

Sample cross validation corpora. corpora.zip is a ZIP archive containing text files for the PG07, AIDSBio, Radiology, Control and Medline100K sample corpora. Each line contains the PubMed ID and completion date of a Medline record.

Format: ZIP Size: 442KB


© 1999-2009 BioMed Central Ltd unless otherwise stated. Part of Springer Science+Business Media.