MScanner: a classifier for retrieving Medline citations
* Corresponding author: Graham L Poulter graham.poulter@gmail.com
1 UCT NBN Node, Department of Molecular and Cell Biology, University of Cape Town, Cape Town, South Africa
2 Stanford Medical Informatics, Stanford University, San Francisco, USA
3 Department of Bioengineering and Department of Genetics, Stanford University, San Francisco, USA
BMC Bioinformatics 2008, 9:108 doi:10.1186/1471-2105-9-108
Published: 19 February 2008
Additional files
Additional file 1:
11-point precision-recall curves. 11pointcurves.pdf is a PDF file containing a table of 11-point interpolated precision curves for all experiments in the paper. The interpolated precision at a specified recall is the highest precision found for any value of recall greater than or equal to the specified recall.
Format: PDF Size: 44KB
This file can be viewed with: Adobe Acrobat Reader
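The definition of interpolated precision given for Additional file 1 can be sketched in a few lines of Python. This is only an illustration of that definition, not code taken from MScanner: the function names, the representation of a curve as (recall, precision) pairs, and the choice to return 0.0 when no point reaches the requested recall are all assumptions made for the example.

```python
def interpolated_precision(points, recall_level):
    """Highest precision among points with recall >= recall_level.

    `points` is a list of (recall, precision) pairs. Returning 0.0 when
    no point reaches the requested recall is an assumption of this sketch.
    """
    candidates = [p for r, p in points if r >= recall_level]
    return max(candidates) if candidates else 0.0


def eleven_point_curve(points):
    """Interpolated precision at recall levels 0.0, 0.1, ..., 1.0."""
    levels = [i / 10 for i in range(11)]
    return [(level, interpolated_precision(points, level)) for level in levels]


# Hypothetical measurements, invented purely to exercise the functions.
example_points = [(0.2, 1.0), (0.4, 0.5), (0.6, 0.75), (1.0, 0.4)]
for level, precision in eleven_point_curve(example_points):
    print(f"recall {level:.1f}: interpolated precision {precision:.2f}")
```

Because the maximum is taken over all points at or beyond the requested recall, the interpolated curve never increases as recall increases, even when the raw precision values do.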
Additional file 2:
Corpora used in the IEDB comparison. iedb.zip is a ZIP archive containing text files, where each line contains the PubMed ID and completion date of a Medline record. iedb-all-relevant.txt and iedb-all-irrelevant.txt are the relevant and irrelevant corpora used in the IEDB cross validation. iedb-pre2004-relevant.txt contains the relevant training examples for the retrieval comparison. iedb-2004-relevant.txt and iedb-2004-irrelevant.txt are the manually evaluated IEDB query results from 2004 Medline. PubMed IDs for 2004 Medline may be obtained using the PubMed query 2004 [DateCompleted] AND medline [sb].
Format: ZIP Size: 107KB
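A corpus file of this kind, with one PubMed ID and completion date per line, could be read with a sketch like the following. The description above does not state the field separator or the date format, so the code assumes whitespace-separated fields and keeps the date as an uninterpreted string. The function names and the sample lines are hypothetical and are not taken from the MScanner source.

```python
def parse_corpus_line(line):
    """Split one corpus line into (pubmed_id, completion_date).

    Assumes two whitespace-separated fields; the date is left as a
    string because its exact format is not stated in the description.
    """
    fields = line.split()
    if len(fields) != 2:
        raise ValueError(f"expected 2 fields, got {len(fields)}: {line!r}")
    pmid, date = fields
    return int(pmid), date


def read_corpus(lines):
    """Parse an iterable of lines, skipping blank ones."""
    return [parse_corpus_line(line) for line in lines if line.strip()]


# Hypothetical sample lines, invented for illustration only.
sample = ["12345678 20040115\n", "\n", "23456789\t20040220\n"]
print(read_corpus(sample))
```

The same function accepts an open file object, for example `read_corpus(open("iedb-2004-relevant.txt"))`, provided the file really uses the assumed layout.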
Additional file 3:
Source code for MScanner. mscanner-20071123.zip is a ZIP archive containing the Python 2.5 source code for MScanner, licensed under the GNU General Public License. It also contains API documentation in HTML format. Updated versions will be made available at http://mscanner.stanford.edu.
Format: ZIP Size: 909KB
Additional file 4:
Sample cross validation corpora. corpora.zip is a ZIP archive containing text files for the PG07, AIDSBio, Radiology, Control and Medline100K sample corpora. Each line contains the PubMed ID and completion date of a Medline record.
Format: ZIP Size: 442KB
