PSE: A tool for browsing a large amount of MEDLINE/PubMed abstracts with gene names and common words as the keywords
Pharmaceutical Research Laboratories, Pharmaceutical Division, Kirin Brewery Co. Ltd., 3 Miyahara, Takasaki, Gunma 370-1295, Japan
BMC Bioinformatics 2005, 6:295 doi:10.1186/1471-2105-6-295Published: 10 December 2005
MEDLINE/PubMed (hereinafter called PubMed) is one of the most important literature databases for the biological and medical sciences, but it is impossible to read all related records due to the sheer size of the repository. We usually have to repeatedly enter keywords in a trial-and-error manner to extract useful records. Software which can reduce such a laborious task is therefore required.
We developed a web-based software, the PubMed Sentence Extractor (PSE), which parses large number of PubMed abstracts, extracts and displays the co-occurrence sentences of gene names and other keywords, and some information from EntrezGene records. The result links to whole abstracts and other resources such as the Online Mendelian Inheritance in Men and Reference Sequence. While PSE executes at the sentence-level when evaluating the existence of keywords, the popular PubMed operates at the record-level. Therefore, the relationship between the two keywords, a gene name and a common word, is more accurately captured by PSE than PubMed. In addition, PSE shows the list of keywords and considers the synonyms and variations on gene names. Through these functions, PSE would reduce the task of searching through records for gene information.
We developed PSE in order to extract useful records efficiently from PubMed. This system has four advantages over a simple PubMed search; the reduction in the amount of collected literatures, the showing of keyword lists, the consideration for synonyms and variations on gene names, and the links to external databases. We believe PSE is helpful in collecting necessary literatures efficiently in order to find research targets. PSE is freely available under the GPL licence as additional files to this manuscript.