This article is part of the supplement: Workshop on Advances in Bio Text Mining

Open Access Oral presentation

Integrating text mining into high-throughput assay analysis

K Bretonnel Cohen

Center for Computational Pharmacology and The MITRE Corporation, USA

BMC Bioinformatics 2010, 11(Suppl 5):O3. doi:10.1186/1471-2105-11-S5-O3

The electronic version of this article is the complete one and can be found online at: http://www.biomedcentral.com/1471-2105/11/S5/O3


Published: 6 October 2010

© 2010 Cohen; licensee BioMed Central Ltd.

Oral presentation

There are two basic paradigms of use cases for the output of text mining tools in the genomics subfield of biomedical natural language processing (BioNLP). In the more common one, the tool is designed to produce output that will be viewed by an individual researcher, typically a database curator or a bench scientist. This use case has been heavily studied, thanks in part to shared tasks like BioCreative and TREC Genomics. The other use case is the integration of text mining into high-throughput assay analysis, which includes tasks ranging from sequence alignment to the evaluation of gene expression array data. In the past, text mining has been applied to these areas either as a post-processing step or as an integrated part of the analysis algorithm. More recently, our lab has developed Hanalyzer, a 3R tool for the knowledge-based analysis of high-throughput assays, based on Reading, Reasoning, and Reporting about experimental results in the context of pre-existing knowledge, including knowledge from text mining. The tool is designed to facilitate exploration of experimental data, explanation of observed patterns in light of what is already known about the entities involved, and generation of novel hypotheses. Results from a study of craniofacial development demonstrate that the system can be used to explain patterns in gene expression and to generate a set of hypotheses about the roles of four genes not previously known to be involved in tongue development. These hypotheses were experimentally validated by in situ hybridization and may have clinical consequences related to cleft lip and palate.