Open Access Highly Accessed Database

Pancreatic Expression database: a generic model for the organization, integration and mining of complex cancer datasets

Claude Chelala1*, Stephan A Hahn2, Hannah J Whiteman1, Sayka Barry1, Deepak Hariharan1, Tomasz P Radon1, Nicholas R Lemoine1 and Tatjana Crnogorac-Jurcevic1

Author Affiliations

1 Centre for Molecular Oncology, Institute of Cancer & CR-UK Clinical Centre, Barts & The London School of Medicine (QMUL), Charterhouse Square, London EC1M 6BQ, UK

2 Molecular GI-Onkologie (MGO), University of Bochum, Germany

BMC Genomics 2007, 8:439  doi:10.1186/1471-2164-8-439

Published: 28 November 2007

Abstract

Background

Pancreatic cancer is the 5th leading cause of cancer death in both males and females. In recent years, a wealth of gene and protein expression studies has been published, broadening our understanding of pancreatic cancer biology. Because of the explosive growth in publicly available data from multiple sources, it is becoming increasingly difficult for individual researchers to integrate these data into their current research programmes. The Pancreatic Expression database, a generic web-based system, aims to close this gap by providing the research community with an open access tool that allows researchers not only to mine currently available pancreatic cancer data sets but also to include their own data in the database.

Description

Currently, the database holds 32 datasets comprising 7636 gene expression measurements extracted from 20 different published gene or protein expression studies of various pancreatic cancer types, pancreatic precursor lesions (PanINs) and chronic pancreatitis. The pancreatic data are stored in a data management system based on BioMart technology, alongside human genome gene and protein annotations and sequence, homologue, SNP and antibody data. The database can be queried through a web-based interface or through web services, using combined criteria from pancreatic data (disease stages, regulation, differential expression, expression, platform technology, publication) and/or public data (antibodies, genomic region, gene-related accessions, ontology, expression patterns, multi-species comparisons, protein data, SNPs). Thus, our database connects otherwise disparate data sources and allows relatively simple navigation between all data types and annotations.
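
As an illustration of how combined criteria might be expressed for a BioMart-style web service, the following minimal sketch builds a query document in the general shape of BioMart's XML query format (a Query element containing a Dataset with Filter and Attribute children). The dataset, filter and attribute names used here are hypothetical placeholders, not the database's actual identifiers, and the Query-level settings are assumed defaults. No service endpoint is contacted.

```python
import xml.etree.ElementTree as ET


def build_biomart_query(dataset, filters, attributes):
    """Return a BioMart-style XML query string for a single dataset.

    filters maps filter names to values (the selection criteria);
    attributes lists the columns to return.
    """
    # Query-level settings below are assumed defaults, not taken from the paper.
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",
        "header": "1",
        "uniqueRows": "1",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    for name, value in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


# All names below are hypothetical and chosen only to mirror the criteria
# listed in the abstract (disease stage, regulation, platform, publication).
example_query = build_biomart_query(
    dataset="pancreatic_expression",
    filters={"disease_stage": "PanIN", "regulation": "up"},
    attributes=["gene_symbol", "platform", "publication"],
)
print(example_query)
```

In practice the resulting XML would be submitted to the database's web service; the real dataset, filter and attribute identifiers would need to be taken from the service's own configuration.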

Conclusion

The database structure and content provide a powerful, high-speed data-mining tool for cancer research. It can be used for target discovery (e.g. of biomarkers from body fluids), identification and analysis of genes associated with the progression of cancer, cross-platform meta-analysis, SNP selection for pancreatic cancer association studies, cancer gene promoter analysis and mining of cancer ontology information. The data model is generic and can easily be extended and applied to other types of cancer. The database is available online with no restrictions for the scientific community at http://www.pancreasexpression.org/.