This article is part of the supplement: Ninth Annual MCBIOS Conference. Dealing with the Omics Data Deluge

Open Access Proceedings

PAGED: a pathway and gene-set enrichment database to enable molecular phenotype discoveries

Hui Huang1,2, Xiaogang Wu1,2,3, Madhankumar Sonachalam2, Sammed N Mandape1, Ragini Pandey2, Karl F MacDorman1, Ping Wan4* and Jake Y Chen1,2,3*

Author affiliations

1 School of Informatics, Indiana University, Indianapolis, IN 46202, USA

2 Indiana Center for Systems Biology and Personalized Medicine, Indiana University, Indianapolis, IN 46202, USA

3 MedeoLinx, LLC, Indianapolis, IN 46280, USA

4 Capital Normal University, Beijing, 100048, China

Citation and License

BMC Bioinformatics 2012, 13(Suppl 15):S2  doi:10.1186/1471-2105-13-S15-S2

Published: 11 September 2012



Over the past decade, pathway and gene-set enrichment analysis has become central to the study of high-throughput functional genomics. Because pathway data are poorly annotated and incomplete, researchers have begun to combine pathway and gene-set enrichment analysis with network module-based approaches to identify crucial relationships between different molecular mechanisms.


To meet the new challenge of molecular phenotype discovery, we have developed an integrated online database, the Pathway And Gene Enrichment Database (PAGED), to enable comprehensive searches for disease-specific pathways, gene signatures, microRNA targets, and network modules by integrating gene-set-based prior knowledge as molecular patterns from multiple levels: the genome, transcriptome, post-transcriptome, and proteome.


The online database we developed, PAGED, is by far the most comprehensive public compilation of gene sets. In its current release, PAGED contains a total of 25,242 gene sets, 61,413 genes, 20 organisms, and 1,275,560 records from five major categories. Beyond its size, the advantage of PAGED lies in exploring relationships between gene sets as gene-set association networks (GSANs). Using colorectal cancer expression data analysis as a case study, we demonstrate how to query this database resource to discover crucial pathways, gene signatures, and gene network modules specific to colorectal cancer functional genomics.
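To make the idea of a gene-set association network concrete, the following is a minimal sketch and not PAGED's actual method. It assumes that two gene sets are linked when their member genes overlap, measured here by Jaccard similarity. The similarity measure, the `min_similarity` threshold, the function names, and the toy gene sets are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_gsan(gene_sets, min_similarity=0.2):
    """Return edges (name_a, name_b, similarity) for every pair of gene sets
    whose Jaccard similarity is at least min_similarity.

    The overlap measure and threshold are illustrative assumptions.
    """
    edges = []
    # Sort by name so the edge list has a deterministic order.
    for (name_a, genes_a), (name_b, genes_b) in combinations(sorted(gene_sets.items()), 2):
        similarity = jaccard(genes_a, genes_b)
        if similarity >= min_similarity:
            edges.append((name_a, name_b, similarity))
    return edges


# Toy gene sets with placeholder gene names (not real data).
toy_gene_sets = {
    "pathway_A": {"G1", "G2", "G3", "G4"},
    "signature_B": {"G2", "G3", "G5"},
    "module_C": {"G6", "G7"},
}

edges = build_gsan(toy_gene_sets)
for name_a, name_b, similarity in edges:
    print(f"{name_a} -- {name_b} (Jaccard = {similarity:.2f})")
```

In this toy example only `pathway_A` and `signature_B` share genes, so they form the single edge of the network, while `module_C` remains an isolated node.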


This integrated online database lays a foundation for developing tools beyond third-generation pathway analysis approaches for discovering molecular phenotypes, especially for disease-associated pathway/gene-set enrichment analysis.