Open Access | Highly Accessed | Software

CoPub Mapper: mining MEDLINE based on search term co-publication

Blaise TF Alako1, Antoine Veldhoven2, Sjozef van Baal3, Rob Jelier4, Stefan Verhoeven1, Ton Rullmann1, Jan Polman1 and Guido Jenster2

1Department of Molecular Design & Informatics, Organon NV, P.O. Box 20, 5340 BH Oss, The Netherlands

2Department of Urology, Erasmus MC, P.O. Box 1738, 3000 DR Rotterdam, The Netherlands

3Department of Genetics, Erasmus MC, Rotterdam, The Netherlands

4Department of Medical Informatics, Erasmus MC, Rotterdam, The Netherlands


BMC Bioinformatics 2005, 6:51 doi:10.1186/1471-2105-6-51

Published: 11 March 2005

Abstract

Background

High-throughput microarray analyses yield many differentially expressed genes that are potentially responsible for the biological process of interest. To identify biological similarities between genes, publications were identified in MEDLINE in which pairs of gene names, or a gene name and a specific keyword, were co-mentioned.

Results

MEDLINE search strings for 15,621 known genes and 3,731 keywords were generated and validated. PubMed IDs were retrieved from MEDLINE, and the relative probability of co-occurrence was determined for all gene-gene and gene-keyword pairs. To assess gene clustering according to literature co-publication, 150 genes comprising 8 sets with known connections (same pathway, same protein complex, same cellular localization, etc.) were run through the program. Receiver operating characteristic (ROC) analyses showed that most gene sets clustered much better than expected by chance. To test the grouping of genes from real microarray data, 221 differentially expressed genes from a microarray experiment were analyzed with CoPub Mapper, which yielded several relevant clusters of genes with biological process and disease keywords. In addition, all genes were hierarchically clustered against all keywords to reveal a complete grouping of published genes based on co-occurrence.
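The abstract does not state how the relative probability of co-occurrence is computed. As a minimal sketch only, one common choice is the ratio of the observed co-publication probability to the probability expected if the two terms were published independently. The function name `relative_cooccurrence`, the term labels, the PubMed ID sets, and the corpus size below are all hypothetical and are not taken from CoPub Mapper.

```python
def relative_cooccurrence(pmids_a, pmids_b, total_records):
    """Observed co-publication probability divided by the probability
    expected under independence (an assumed scoring, not the paper's)."""
    if not pmids_a or not pmids_b:
        return 0.0  # a term with no hits cannot co-occur with anything
    p_a = len(pmids_a) / total_records
    p_b = len(pmids_b) / total_records
    p_ab = len(pmids_a & pmids_b) / total_records  # records mentioning both
    return p_ab / (p_a * p_b)


# Hypothetical PubMed ID sets returned by the search string of each term.
hits = {
    "GENE_A": {1, 2, 3, 4},
    "GENE_B": {3, 4, 5},
    "apoptosis": {4, 5, 6, 7, 8},
}
TOTAL = 100  # hypothetical number of records in the corpus

score_gene_gene = relative_cooccurrence(hits["GENE_A"], hits["GENE_B"], TOTAL)
score_gene_keyword = relative_cooccurrence(hits["GENE_A"], hits["apoptosis"], TOTAL)
print(score_gene_gene, score_gene_keyword)
```

Under this assumed scoring, a value above 1 means the pair is co-mentioned more often than independence would predict. Computing it for every gene against every keyword gives a matrix that could then be passed to a hierarchical clustering routine.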

Conclusion

The CoPub Mapper program allows for quick and versatile querying of co-published genes and keywords and can be successfully used to cluster predefined groups of genes and microarray data.

