Email updates

Keep up to date with the latest news and content from BMC Proceedings and BioMed Central.

This article is part of the supplement: Selected Proceedings of the 6th International Symposium on Bioinformatics Research and Applications (ISBRA'10)

Open Access Proceedings

MetaPath: identifying differentially abundant metabolic pathways in metagenomic datasets

Bo Liu12 and Mihai Pop12*

Author affiliations

1 Center for Bioinformatics and Computational Biology, Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742, USA

2 Department of Computer Science, University of Maryland-College Park, College Park, MD 20742, USA

For all author emails, please log on.

Citation and License

BMC Proceedings 2011, 5(Suppl 2):S9  doi:10.1186/1753-6561-5-S2-S9

Published: 28 April 2011

Abstract

Background

Enabled by rapid advances in sequencing technology, metagenomic studies aim to characterize entire communities of microbes bypassing the need for culturing individual bacterial members. One major goal of metagenomic studies is to identify specific functional adaptations of microbial communities to their habitats. The functional profile and the abundances for a sample can be estimated by mapping metagenomic sequences to the global metabolic network consisting of thousands of molecular reactions. Here we describe a powerful analytical method (MetaPath) that can identify differentially abundant pathways in metagenomic datasets, relying on a combination of metagenomic sequence data and prior metabolic pathway knowledge.

Methods

First, we introduce a scoring function for an arbitrary subnetwork and find the max-weight subnetwork in the global network by a greedy search algorithm. Then we compute two p values (pabund and pstruct) using nonparametric approaches to answer two different statistical questions: (1) is this subnetwork differentically abundant? (2) What is the probability of finding such good subnetworks by chance given the data and network structure? Finally, significant metabolic subnetworks are discovered based on these two p values.

Results

In order to validate our methods, we have designed a simulated metabolic pathways dataset and show that MetaPath outperforms other commonly used approaches. We also demonstrate the power of our methods in analyzing two publicly available metagenomic datasets, and show that the subnetworks identified by MetaPath provide valuable insights into the biological activities of the microbiome.

Conclusions

We have introduced a statistical method for finding significant metabolic subnetworks from metagenomic datasets. Compared with previous methods, results from MetaPath are more robust against noise in the data, and have significantly higher sensitivity and specificity (when tested on simulated datasets). When applied to two publicly available metagenomic datasets, the output of MetaPath is consistent with previous observations and also provides several new insights into the metabolic activity of the gut microbiome. The software is freely available at http://metapath.cbcb.umd.edu webcite.