Relevance of inter-dictionary ambiguities for mining MEDLINE (amb.: ambiguous). The column 'nb. found abstracts' contains the number of MEDLINE abstracts (from within a set of approx. 7 million abstracts) that contain at least one gene/protein name of the respective organisms. The values in the other columns are percentages of the values in the column 'nb. found abstracts'.
| | nb. found abstracts | % amb. abstracts | % amb.+ unique synonym | % amb.+ unique organism | % amb.+ unique synonym or organism |
|---|---|---|---|---|---|
| human-mouse | 2 761 987 | 60.5 | 23.1 | 37.8 | 46.5 |
| human-rat | 2 238 212 | 64.5 | 27.2 | 43.5 | 52.1 |
| mouse-rat | 2 532 682 | 58.2 | 24.2 | 17.1 | 33.7 |
Fundel and Zimmer BMC Bioinformatics 2006 7:372 doi:10.1186/1471-2105-7-372
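Because every percentage column is relative to 'nb. found abstracts', approximate absolute abstract counts can be recovered by simple multiplication. The sketch below does this for the three dictionary pairs; the names `ROWS`, `to_counts`, and `overlap_percent` are illustrative and not taken from the paper. The overlap figure additionally assumes that the last column is the union of the two preceding ones, which the caption does not state explicitly.

```python
# Values transcribed from the table: (nb. found abstracts, % amb. abstracts,
# % amb.+ unique synonym, % amb.+ unique organism, % amb.+ unique synonym or organism)
ROWS = {
    "human-mouse": (2_761_987, 60.5, 23.1, 37.8, 46.5),
    "human-rat": (2_238_212, 64.5, 27.2, 43.5, 52.1),
    "mouse-rat": (2_532_682, 58.2, 24.2, 17.1, 33.7),
}


def to_counts(total, *percentages):
    """Convert percentages of `total` into rounded absolute counts."""
    return [round(total * p / 100) for p in percentages]


def overlap_percent(synonym, organism, either):
    """Share with both a unique synonym and a unique organism.

    Assumption: `either` is the union of the two other columns, so the
    overlap follows from inclusion-exclusion.
    """
    return round(synonym + organism - either, 1)


for pair, (total, amb, syn, org, either) in ROWS.items():
    counts = to_counts(total, amb, syn, org, either)
    print(pair, counts, overlap_percent(syn, org, either))
```

Since the published percentages are rounded to one decimal place, the derived counts are estimates rather than exact figures.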