This article is part of the supplement: Ninth Annual MCBIOS Conference. Dealing with the Omics Data Deluge
IPAD: the Integrated Pathway Analysis Database for Systematic Enrichment Analysis
1 Department of Academic and Institutional Resources and Technology, University of North Texas Health Science Center, Fort Worth, USA
2 Department of Forensic and Investigative Genetics, University of North Texas Health Science Center, Fort Worth, USA
BMC Bioinformatics 2012, 13(Suppl 15):S7 doi:10.1186/1471-2105-13-S15-S7
Published: 11 September 2012
Next-Generation Sequencing (NGS) technologies and Genome-Wide Association Studies (GWAS) generate millions of reads and hundreds of datasets, and there is an urgent need for better ways to accurately interpret and distill such large amounts of data. Extensive pathway and network analyses allow for the discovery of highly significant pathways from a set of disease vs. healthy samples in NGS and GWAS data. Knowledge of the activation of these processes will lead to elucidation of the complex biological pathways affected by drug treatment, to patient stratification studies of new and existing drug treatments, and to an understanding of the underlying anti-cancer drug effects. There are approximately 141 biological human pathway resources as of January 2012 according to the Pathguide database. However, most currently available resources do not contain disease, drug or organ specificity information such as disease-pathway, drug-pathway, and organ-pathway associations. Systematically integrating pathway, disease, drug and organ specificity is becoming increasingly crucial for understanding the interrelationships between signaling, metabolic and regulatory pathways, drug action, disease susceptibility, and organ specificity from high-throughput omics data (genomics, transcriptomics, proteomics and metabolomics).
We designed the
IPAD is a comprehensive database covering about 22,498 genes, 25,469 proteins, 1,956 pathways, 6,704 diseases, 5,615 drugs, and 52 organs integrated from databases including BioCarta, KEGG, NCI-Nature curated, Reactome, CTD, PharmGKB, DrugBank, and HOMER. The database has a web-based user interface that allows users to perform enrichment analysis starting from genes/proteins/molecules and inter-association analysis starting from a pathway, disease, drug, or organ.
Moreover, the quality of the database was validated in the context of existing biological knowledge and against a "gold standard" constructed from reputable and reliable sources. Two case studies are also presented to demonstrate: 1) self-validation of enrichment analysis and inter-association analysis on brain-specific markers, and 2) identification of previously undiscovered components by enrichment analysis in a prostate cancer study.
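As a rough illustration of what enrichment analysis from a gene list involves, the sketch below scores the overlap between a query gene set and one pathway's gene set with an upper-tail hypergeometric test. This is a commonly used statistic and is an assumption here; the abstract does not state which test IPAD applies. The function name, the gene symbols, and the background size are all hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(query_genes, pathway_genes, background_size):
    """Upper-tail hypergeometric p-value, P(overlap >= observed).

    Assumes (not from the source) that both gene sets are drawn from a
    common background of `background_size` genes.
    """
    query = set(query_genes)
    pathway = set(pathway_genes)
    n = len(query)            # number of query genes
    K = len(pathway)          # number of genes annotated to the pathway
    k = len(query & pathway)  # observed overlap
    N = background_size
    if n > N or K > N:
        raise ValueError("gene sets cannot exceed the background size")
    # Sum the probabilities of seeing k, k+1, ..., min(n, K) overlapping genes.
    favourable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1)
    )
    return favourable / comb(N, n)


# Hypothetical example: 3 query genes, a 4-gene pathway, 10 background genes.
p_value = hypergeom_enrichment_p(
    {"GENE_A", "GENE_B", "GENE_C"},
    {"GENE_A", "GENE_B", "GENE_D", "GENE_E"},
    background_size=10,
)
```

In practice one p-value would be computed per pathway (or disease, drug, or organ) and then corrected for multiple testing; that step is omitted from this sketch.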
IPAD is a new resource for analyzing, identifying, and validating pathway, disease, drug, and organ specificity and their inter-associations. The statistical method we developed for enrichment and similarity measurement and the two criteria we described for setting the threshold parameters can be extended to other enrichment applications. Enriched pathways, diseases, drugs, organs and their inter-associations can be searched, displayed, and downloaded from our online user interface. The current IPAD database can help users address a wide range of questions related to biological pathways, disease susceptibility, drug targets, and organ specificity in human disease studies.
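One simple way to quantify an inter-association between two entities, such as a pathway and a disease, is the Jaccard index of their associated gene sets. The sketch below uses that measure purely as an assumption; the abstract does not specify the similarity measure IPAD uses, and the function name and gene lists are hypothetical placeholders.

```python
def jaccard_similarity(genes_a, genes_b):
    """Jaccard index of two gene sets: |A ∩ B| / |A ∪ B| (0.0 if both are empty)."""
    a, b = set(genes_a), set(genes_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical gene sets for one pathway and one disease.
pathway_genes = {"TP53", "BRCA1", "EGFR"}
disease_genes = {"TP53", "EGFR", "KRAS", "MYC"}
similarity = jaccard_similarity(pathway_genes, disease_genes)  # 2 shared / 5 total
```

A threshold on such a score, combined with an enrichment p-value cutoff, is one plausible way to decide which associations to report, though the specific criteria used by IPAD are not given in this abstract.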