Using BioMed Central's open access full-text corpus for text mining research

BioMed Central has so far published 35238 articles of peer-reviewed biomedical research. All of them are covered by our open access license agreement, which allows free distribution and re-use of the full-text article, including the highly structured XML version.

As a result, BioMed Central's research article corpus is ideally suited for use by text mining researchers.

New! An XSLT preview stylesheet, which will render any BioMed Central article XML file into HTML, is now available:

preview.xsl (37K)

Sample code for developers, demonstrating the use of the stylesheet, is also available:

How to download BioMed Central's corpus

1. By FTP

Server: ftp.biomedcentral.com
Directory: /content/
Username: datamining
Password: $8Xguppy

File/directory           Description
/content/index.xml       An index of all research articles, in timestamp order (the timestamp is the date on which the XML became available)
/content/articles/       A subdirectory containing the full-text XML file for each article, each named after its unique identifier, i.e. [ui].xml
/content/articles.zip    A single ZIP-compressed file containing all the full-text XML files
Remember to set the FTP transfer mode to BINARY.
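
The FTP route described above can be sketched in Python using only the standard library. This is a minimal sketch, not official BioMed Central sample code: it assumes the server still accepts the login details listed above, and the function names download_corpus_zip and iter_articles are hypothetical. ftplib's retrbinary always transfers in binary mode, which takes care of the BINARY requirement.

```python
import ftplib
import zipfile
import xml.etree.ElementTree as ET

# Connection details as listed on this page; they may no longer be valid.
HOST = "ftp.biomedcentral.com"
USER = "datamining"
PASSWORD = "$8Xguppy"


def download_corpus_zip(dest_path="articles.zip"):
    """Fetch /content/articles.zip over FTP and save it to dest_path."""
    with ftplib.FTP(HOST) as ftp:
        ftp.login(USER, PASSWORD)
        ftp.cwd("/content/")
        with open(dest_path, "wb") as out:
            # retrbinary switches the connection to binary (TYPE I) mode.
            ftp.retrbinary("RETR articles.zip", out.write)
    return dest_path


def iter_articles(zip_source):
    """Yield (file name, parsed XML root) for each .xml member of the archive.

    zip_source may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if name.lower().endswith(".xml"):
                with archive.open(name) as handle:
                    yield name, ET.parse(handle).getroot()


# Example usage (performs a network download, so it is left commented out):
# path = download_corpus_zip()
# for name, root in iter_articles(path):
#     print(name, root.tag)
```

Downloading the single ZIP file is usually simpler than fetching each [ui].xml file individually, since it needs only one transfer; iter_articles then reads the articles straight from the archive without unpacking it to disk.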

2. Via the Open Archives Initiative Protocol for Metadata Harvesting (OAI protocol)

The OAI protocol is an HTTP/XML web service standard for the exchange of data between archives and repositories. Full-text XML is one of the metadata formats that the BioMed Central OAI protocol interface supports. See BioMed Central's OAI page for more details.

Use the following OAI 'set' to download all open access research articles via BioMed Central's OAI interface:
articletype:research
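
A harvest of this set could look roughly like the following Python sketch, which uses the standard OAI ListRecords verb and follows resumption tokens until the server reports no more pages. The base URL and the metadata prefix for full-text XML are not given on this page, so both are left as caller-supplied arguments (the values in the usage comment are placeholders, not real endpoints); the function names are hypothetical as well.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# XML namespace defined by the OAI protocol (version 2.0) for responses.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def build_url(base_url, metadata_prefix=None, oai_set=None, token=None):
    """Build a ListRecords request URL.

    When continuing a harvest, the protocol requires resumptionToken to be
    the only argument besides the verb.
    """
    if token:
        params = {"verb": "ListRecords", "resumptionToken": token}
    else:
        params = {
            "verb": "ListRecords",
            "metadataPrefix": metadata_prefix,
            "set": oai_set,
        }
    return base_url + "?" + urllib.parse.urlencode(params)


def parse_page(xml_data):
    """Return (list of record elements, resumption token or None)."""
    root = ET.fromstring(xml_data)
    records = root.findall(f".//{OAI_NS}record")
    token_el = root.find(f".//{OAI_NS}resumptionToken")
    has_token = token_el is not None and token_el.text and token_el.text.strip()
    return records, (token_el.text.strip() if has_token else None)


def harvest(base_url, metadata_prefix, oai_set="articletype:research"):
    """Yield every record in the set, requesting pages until none remain."""
    token = None
    while True:
        url = build_url(base_url, metadata_prefix, oai_set, token)
        with urllib.request.urlopen(url) as response:
            records, token = parse_page(response.read())
        yield from records
        if not token:
            break


# Example usage (placeholder values; see BioMed Central's OAI page for the
# real base URL and the metadata prefix that selects full-text XML):
# for record in harvest("http://example.org/oai", "PREFIX_FOR_FULL_TEXT"):
#     print(record.find(f"{OAI_NS}header/{OAI_NS}identifier").text)
```

An empty resumptionToken element marks the last page of a list, which is why parse_page treats an empty token the same as a missing one.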

Publish your text mining research with BioMed Central

BioMed Central is keen to publish high quality research in the area of text mining and biomedical literature analysis.

See this list of recent publications on this topic that have appeared in BioMed Central's journals.

All research articles published by BioMed Central are covered by our open access policy, and so are freely available without subscription.

For more information about submitting an article, visit the BMC Bioinformatics home page.

More information

For more information on using BioMed Central's articles for text mining purposes, contact info@biomedcentral.com.

© 1999-2008 BioMed Central Ltd unless otherwise stated