Pitfalls in applying text mining to scientific literature

Neefs, Jean-Marc

doi:10.1186/1471-2105-11-S5-O4

Volume 11 Supplement 5

Workshop on Advances in Bio Text Mining

Oral presentation
Open access
Published: 06 October 2010

BMC Bioinformatics volume 11, Article number: O4 (2010)

Numbers and data mining are easy. Our numerical system has 10 digits, any combination is possible, and every measured value can be captured in a number. Large quantities of measurements can be analysed efficiently using incredibly powerful calculators, and the resulting information can be shown in simple, clear graphs.

Text is hard. Hundreds of letters and millions of different combinations can be used in the personal interpretation of information, in words and phrases that reflect one's personality rather than objective measurements. Depending on context and language, the same expression carries totally different information, or no meaning at all.
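The context dependence described above can be illustrated with a minimal sketch. It assumes a hypothetical two-entry lexicon of gene symbols and a deliberately naive matcher (`naive_gene_tagger` is an illustrative name, not a tool from the presentation) that ignores case and context.

```python
import re

# Hypothetical mini-lexicon of gene symbols (illustrative only, not from the talk).
GENE_LEXICON = {
    "CAT": "catalase",
    "REST": "RE1 silencing transcription factor",
}

def naive_gene_tagger(sentence, lexicon=GENE_LEXICON):
    """Return lexicon symbols found as whole words, ignoring case and context."""
    tokens = re.findall(r"[A-Za-z0-9]+", sentence)
    return [t.upper() for t in tokens if t.upper() in lexicon]

sentences = [
    "CAT expression was reduced in the treated cells.",   # gene mention
    "The cat was allowed to rest after the procedure.",   # everyday words
]

for s in sentences:
    # The matcher cannot tell a gene symbol from an ordinary English word.
    print(naive_gene_tagger(s), "<-", s)
```

The second sentence yields two gene "hits" although it mentions no gene at all, which is the kind of false positive that context-free matching tends to produce.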

Text mining requires 'education' at different levels: to provide information; to capture, store and retrieve that information; and to interpret the results of the mining process.

I will provide a few examples of text mining tools in daily practice.

Author information

Authors and Affiliations

Janssen Pharmaceutica, 2340 Beerse, Belgium
Jean-Marc Neefs

Corresponding author

Correspondence to Jean-Marc Neefs.

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Neefs, JM. Pitfalls in applying text mining to scientific literature. BMC Bioinformatics 11 (Suppl 5), O4 (2010). https://doi.org/10.1186/1471-2105-11-S5-O4
