Exploiting MeSH indexing in MEDLINE to generate a data set for word sense disambiguation
* Corresponding author: Antonio J Jimeno-Yepes antonio.jimeno@gmail.com
1 National Library of Medicine, 8600 Rockville Pike, Bethesda, MD 20894, USA
2 Department of Pharmacology, University of Minnesota Twin Cities, Minneapolis, MN 55155, USA
BMC Bioinformatics 2011, 12:223 doi:10.1186/1471-2105-12-223
Published: 2 June 2011
Additional files
Additional file 1:
Accuracy per ambiguous word. Medline Freq. is the frequency of the term in MEDLINE up to 23 July 2010. NB stands for Naïve Bayes, AEC for Automatic Extracted Corpus, MRD for Machine Readable Dictionary, 2-MRD for 2nd Order Co-occurrence, and JDI for Journal Descriptor Indexing. The possible values for type are A for abbreviations, T for terms, and AT for abbreviations/terms.
Format: CSV Size: 10KB
Additional file 2:
Semantic Type frequency in the MSH WSD set and Metathesaurus concept count.
Format: CSV Size: 3KB
