Open Access | Highly Accessed | Methodology article

A search engine to identify pathway genes from expression data on multiple organisms

Chunnuan Chen*1, Matthew T Weirauch*1, Corey C Powell1, Alexander C Zambon2 and Joshua M Stuart1

1Department of Biomolecular Engineering, University of California, Santa Cruz, California, 95064, USA

2Department of Medicine, Gladstone Institute of Cardiovascular Disease, San Francisco, California 94158, USA

* Contributed equally

BMC Systems Biology 2007, 1:20 doi:10.1186/1752-0509-1-20

Published: 4 May 2007

Abstract

Background

The completion of several genome projects showed that most genes have not yet been characterized, especially in multicellular organisms. Although most genes have unknown functions, a large collection of data is available describing their transcriptional activities under many different experimental conditions. In many cases, the coregulation of a set of genes across a set of conditions can be used to infer roles for genes of unknown function.

Results

We developed a search engine, the Multiple-Species Gene Recommender (MSGR), which scans gene expression datasets from multiple organisms to identify genes that participate in a genetic pathway. The MSGR takes a query consisting of a list of genes that function together in a genetic pathway from one of six organisms: Homo sapiens, Drosophila melanogaster, Caenorhabditis elegans, Saccharomyces cerevisiae, Arabidopsis thaliana, and Helicobacter pylori. Using a probabilistic method to merge searches, the MSGR identifies genes that are significantly coregulated with the query genes in one or more of those organisms. The MSGR achieves its highest accuracy for many human pathways when searches are combined across species. We describe specific examples in which new genes were identified as participating in a neuromuscular signaling pathway and a cell-adhesion pathway.
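
The abstract does not say how each search is scored or how searches are merged, so the following is only an illustrative sketch of the general idea and not the MSGR's actual method. It assumes that coregulation is measured by mean Pearson correlation with the query genes, that rank fractions can stand in for per-organism p-values, and that Fisher's method (which assumes independent tests) can stand in for the probabilistic merge. All function names are hypothetical.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def coregulation_scores(expr, query):
    """Score each non-query gene by its mean correlation with the query genes.

    expr maps gene name -> list of expression values (one per condition).
    Assumption: mean Pearson correlation is the coregulation measure.
    """
    q = [g for g in query if g in expr]
    if not q:
        raise ValueError("no query gene is present in the expression data")
    return {
        gene: mean(pearson(profile, expr[g]) for g in q)
        for gene, profile in expr.items()
        if gene not in q
    }


def rank_fractions(scores):
    """Convert scores to rank fractions in (0, 1]; the best-scoring gene gets 1/N.

    Assumption: the rank fraction is used as a crude p-value-like quantity.
    """
    ordered = sorted(scores, key=scores.get, reverse=True)
    n = len(ordered)
    return {gene: (i + 1) / n for i, gene in enumerate(ordered)}


def fisher_combine(pvalues):
    """Combine k p-values with Fisher's method (chi-square with 2k degrees of freedom)."""
    k = len(pvalues)
    half = -sum(math.log(p) for p in pvalues)  # X/2, where X = -2 * sum(ln p)
    # Closed-form survival function of a chi-square with even df = 2k.
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(k))


# Toy data for one organism: four conditions, two query genes, two candidates.
expr = {
    "q1": [1, 2, 3, 4],
    "q2": [2, 4, 6, 8],
    "x": [1, 2, 3, 5],   # rises with the query genes
    "y": [4, 3, 2, 1],   # falls as the query genes rise
}
scores = coregulation_scores(expr, ["q1", "q2"])
fractions = rank_fractions(scores)
```

In a multi-organism setting, the rank fraction obtained for a gene (or its ortholog) in each organism would be passed together to `fisher_combine`, so that a gene ranked near the top in several organisms receives a smaller combined value than one ranked highly in only one.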

Conclusion

The search engine can scan large collections of gene expression data for new genes that are significantly coregulated with a pathway of interest. By integrating searches across organisms, the MSGR can identify pathway members whose coregulation is either ancient or newly evolved.

