Open Access | Highly Accessed | Software

XML schemas for common bioinformatic data types and their application in workflow systems

Philipp N Seibel*1, Jan Krüger*2, Sven Hartmeier*2, Knut Schwarzer3, Kai Löwenthal2, Henning Mersch4, Thomas Dandekar1 and Robert Giegerich2

1Department of Bioinformatics, Biocenter, University of Würzburg, Würzburg, Germany

2Bioinformatics Group, Practical Computer Science Department, Faculty of Technology, Bielefeld University, Bielefeld, Germany

3Department of Bioinformatics, UKG, University of Göttingen, Göttingen, Germany

4Distributed Systems and Grid Computing, Central Institute for Applied Mathematics, Research Centre Jülich, Jülich, Germany

* Contributed equally

BMC Bioinformatics 2006, 7:490 doi:10.1186/1471-2105-7-490

Published: 6 November 2006

Abstract

Background

Today, there is a growing need in bioinformatics to combine available software tools into chains, thus building complex applications from existing single-task tools. To create such workflows, the tools involved have to be able to work with each other's data – therefore, a common set of well-defined data formats is needed. Unfortunately, current bioinformatic tools use a great variety of heterogeneous formats.

Results

Acknowledging the need for common formats, the Helmholtz Open BioInformatics Technology network (HOBIT) identified several basic data types used in bioinformatics, developed appropriate format descriptions formally defined by XML schemas, and incorporated them in a Java library (BioDOM). These schemas currently cover sequence, sequence alignment, RNA secondary structure and RNA secondary structure alignment formats in a form that is independent of any specific program, thus enabling seamless interoperation of different tools. All XML formats are available at http://bioschemas.sourceforge.net; the BioDOM library can be obtained at http://biodom.sourceforge.net.
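
To illustrate the general idea of a program-independent exchange format, the following is a minimal sketch in Python rather than the Java of BioDOM. It converts FASTA text into a small XML document and reads it back, as two tools in a chain might do. The element names `sequences` and `sequence`, the `id` attribute, and both function names are hypothetical and chosen only for illustration; they are not the HOBIT schemas and not the BioDOM API.

```python
import xml.etree.ElementTree as ET


def fasta_to_xml(fasta_text):
    """Convert FASTA text into a small XML document (hypothetical layout)."""
    # "sequences"/"sequence" are illustrative names, not the HOBIT schema.
    root = ET.Element("sequences")
    entry = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the identifier.
            header = line[1:].split()
            seq_id = header[0] if header else "unnamed"
            entry = ET.SubElement(root, "sequence", {"id": seq_id})
            entry.text = ""
        elif entry is not None:
            # Sequence data may span several lines; concatenate them.
            entry.text += line
    return ET.tostring(root, encoding="unicode")


def read_sequences(xml_text):
    """Parse the XML document back into a dict of id -> sequence string."""
    root = ET.fromstring(xml_text)
    return {s.get("id"): s.text or "" for s in root.findall("sequence")}


# A producing tool writes XML; a consuming tool reads it without
# needing to know the producer's native format.
xml_doc = fasta_to_xml(">seq1 example\nACGU\nGGCA\n>seq2\nUUAG\n")
sequences = read_sequences(xml_doc)
```

In a real workflow, the consuming tool would additionally validate the document against the published XML schema before processing it; that step is omitted here because it needs a schema-aware parser outside the Python standard library.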

Conclusion

The HOBIT XML schemas and the BioDOM library simplify adding XML support to newly created and existing bioinformatic tools, enabling these tools to interoperate seamlessly in workflow scenarios.

