BMC Bioinformatics

MeMo: a hybrid SQL/XML approach to metabolomic data management for functional genomics

Irena Spasić1*, Warwick B Dunn1, Giles Velarde1, Andy Tseng1, Helen Jenkins2, Nigel Hardy2, Stephen G Oliver3 and Douglas B Kell1

Author Affiliations

1 School of Chemistry, Faraday Building, The University of Manchester, Manchester, M60 1QD, UK

2 Department of Computer Science, The University of Wales, Aberystwyth, SY23 3DB, UK

3 Faculty of Life Sciences, Michael Smith Building, The University of Manchester, Manchester, M13 9PT, UK

BMC Bioinformatics 2006, 7:281 doi:10.1186/1471-2105-7-281

Published: 5 June 2006

Abstract

Background

Genome sequencing projects have exposed how limited our knowledge of gene function is: for example, S. cerevisiae has 5,000–6,000 genes, of which nearly 1,000 have an uncertain function. The gross influence of such genes on the behaviour of the cell can be observed using large-scale metabolomic studies. The metabolomic data produced need to be structured and annotated in a machine-usable form to facilitate the exploration of the hidden links between the genes and their functions.

Description

MeMo is a formal model for representing metabolomic data and the associated metadata. Two predominant platforms (SQL and XML) are used to encode the model. MeMo has been implemented as a relational database using a hybrid approach combining the advantages of the two technologies. It represents a practical solution for handling the sheer volume and complexity of the metabolomic data effectively and efficiently. The MeMo model and the associated software are available at http://dbkgroup.org/memo/.

Conclusion

The maturity of relational database technology is used to support efficient data processing. The scalability and self-descriptiveness of XML are used to simplify the relational schema and facilitate the extensibility of the model necessitated by the creation of new experimental techniques. Special consideration is given to data integration issues as part of the systems biology agenda. MeMo has been physically integrated and cross-linked to related metabolomic and genomic databases. Semantic integration with other relevant databases has been supported through ontological annotation. Compatibility with other data formats is supported by automatic conversion.
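
As an illustration of the hybrid idea only, the following minimal Python sketch keeps stable, frequently queried attributes in ordinary relational columns and stores technique-specific settings as an XML fragment in a single text column. The table name `sample_run`, its columns, the XML element names and all values are hypothetical and are not taken from the MeMo schema; SQLite stands in for whichever relational engine is actually used.

```python
import sqlite3
import xml.etree.ElementTree as ET


def build_demo_db():
    """Create an in-memory database with one hybrid table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE sample_run (
               run_id       INTEGER PRIMARY KEY,
               strain       TEXT NOT NULL,   -- relational: queried often
               technique    TEXT NOT NULL,   -- relational: queried often
               settings_xml TEXT NOT NULL    -- XML: varies per technique
           )"""
    )
    # Illustrative rows only; each technique carries different settings,
    # so no schema change is needed when a new technique appears.
    rows = [
        ("strain_A", "GC-MS",
         "<settings><column>example-column</column>"
         "<oven_start_c>70</oven_start_c></settings>"),
        ("strain_A", "DIMS",
         "<settings><ion_mode>positive</ion_mode></settings>"),
    ]
    conn.executemany(
        "INSERT INTO sample_run (strain, technique, settings_xml) "
        "VALUES (?, ?, ?)",
        rows,
    )
    return conn


def settings_for(conn, technique):
    """Filter with SQL on relational columns, then parse the XML part."""
    cursor = conn.execute(
        "SELECT settings_xml FROM sample_run "
        "WHERE technique = ? ORDER BY run_id",
        (technique,),
    )
    result = []
    for (xml_text,) in cursor:
        root = ET.fromstring(xml_text)
        # Flatten the direct children of <settings> into a dictionary.
        result.append({child.tag: child.text for child in root})
    return result


if __name__ == "__main__":
    demo = build_demo_db()
    print(settings_for(demo, "GC-MS"))
```

In this arrangement the SQL engine handles selection over the fixed columns, while adding a setting for a new experimental technique only changes the XML fragment and leaves the relational schema untouched.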