Open Access Research article

Text-derived concept profiles support assessment of DNA microarray data for acute myeloid leukemia and for androgen receptor stimulation

Rob Jelier1, Guido Jenster2, Lambert CJ Dorssers3, Bas J Wouters4, Peter JM Hendriksen2, Barend Mons1, Ruud Delwel4 and Jan A Kors1

1 Department of Medical Informatics, Erasmus MC – University Medical Center, Rotterdam, The Netherlands

2 Department of Urology, Erasmus MC – University Medical Center, Rotterdam, The Netherlands

3 Department of Pathology, Erasmus MC – University Medical Center, Rotterdam, The Netherlands

4 Department of Hematology, Erasmus MC – University Medical Center, Rotterdam, The Netherlands

BMC Bioinformatics 2007, 8:14 doi:10.1186/1471-2105-8-14

Published: 18 January 2007

Abstract

Background

High-throughput experiments, such as those using DNA microarrays, typically yield hundreds of genes potentially relevant to the process under study, which makes these experiments difficult to interpret. Here, we propose and evaluate an approach to find functional associations between large numbers of genes and other biomedical concepts from free-text literature. For each gene, a profile of related concepts is constructed that summarizes the context in which the gene is mentioned in the literature. We assign a weight to each concept in the profile based on a likelihood ratio measure. Gene concept profiles can then be clustered to find related genes and other concepts.
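
The profile construction described above can be illustrated with a minimal sketch. The abstract does not state which likelihood ratio measure or which profile comparison is used, so this example assumes Dunning's log-likelihood ratio (G²) on a 2x2 document co-occurrence table and cosine similarity between profiles. The toy corpus, the gene names, and the functions `g2`, `concept_profile` and `cosine` are hypothetical and are not part of Anni.

```python
import math
from collections import Counter


def g2(k11, k12, k21, k22):
    """Dunning's log-likelihood ratio (G^2) for a 2x2 contingency table.

    Assumption: a stand-in for the unspecified likelihood ratio measure.
    """
    def h(*counts):
        total = sum(counts)
        return sum(k * math.log(k / total) for k in counts if k > 0)

    return 2.0 * (h(k11, k12, k21, k22)
                  - h(k11 + k12, k21 + k22)
                  - h(k11 + k21, k12 + k22))


def concept_profile(gene, docs):
    """Weight every concept that co-occurs with `gene` in a document."""
    gene_docs = [d for d in docs if gene in d]
    other_docs = [d for d in docs if gene not in d]
    in_gene = Counter(c for d in gene_docs for c in d if c != gene)
    in_other = Counter(c for d in other_docs for c in d)
    profile = {}
    for concept, k11 in in_gene.items():
        k12 = len(gene_docs) - k11       # gene without concept
        k21 = in_other[concept]          # concept without gene
        k22 = len(other_docs) - k21      # neither
        # Keep only concepts over-represented alongside the gene.
        if k11 * k22 > k12 * k21:
            profile[concept] = g2(k11, k12, k21, k22)
    return profile


def cosine(p, q):
    """Cosine similarity between two sparse concept profiles."""
    dot = sum(w * q[c] for c, w in p.items() if c in q)
    norm = (math.sqrt(sum(w * w for w in p.values()))
            * math.sqrt(sum(w * w for w in q.values())))
    return dot / norm if norm else 0.0


# Hypothetical toy corpus: each document is the set of concepts it mentions.
docs = [
    {"GENE_A", "monocyte", "differentiation"},
    {"GENE_A", "monocyte"},
    {"GENE_B", "monocyte", "inflammation"},
    {"GENE_B", "monocyte"},
    {"GENE_C", "lysosome", "secretion"},
    {"GENE_C", "lysosome"},
    {"prostate", "secretion"},
    {"inflammation", "cytokine"},
]

profile_a = concept_profile("GENE_A", docs)
profile_b = concept_profile("GENE_B", docs)
profile_c = concept_profile("GENE_C", docs)

print(cosine(profile_a, profile_b))  # share "monocyte", so similarity is positive
print(cosine(profile_a, profile_c))  # no shared concepts, so similarity is zero
```

In this sketch, two genes that never appear in the same document can still receive a high similarity when their profiles share heavily weighted concepts, which is one way implicit associations could surface. A pairwise similarity matrix built this way could then be passed to any standard clustering routine.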

Results

The experimental validation was done in two steps. We first applied our method to a controlled test set. After this proved successful, the datasets from two DNA microarray experiments were analyzed in the same way, and the results were evaluated by domain experts. The first dataset was a gene-expression profile that characterizes the cancer cells of a group of acute myeloid leukemia patients. For this group of patients, the biological background of the cancer cells is largely unknown. Using our methodology, we found an association of these cells with monocytes, which agreed with other experimental evidence. The second dataset consisted of genes differentially expressed following androgen receptor stimulation in a prostate cancer cell line. Based on this analysis, we put forward a hypothesis about the biological processes induced in these cells: secretory lysosomes are involved in the production of prostatic fluid, and their development and/or secretion are androgen-regulated processes.

Conclusion

Our method can be used to analyze DNA microarray datasets based on information explicitly and implicitly available in the literature. We provide a publicly available tool, dubbed Anni, for this purpose.

