BMC Bioinformatics
Methodology article
Gene Vector Analysis (Geneva): A unified method to detect differentially-regulated gene sets and similar microarray experiments
Stephen W Tanner1 and Pankaj Agarwal2
1 Bioinformatics program, University of California, San Diego, La Jolla, CA 92093-0419, USA
2 Computational Biology, GlaxoSmithKline Pharmaceuticals R&D, 709 Swedeland Road, UW2230, King of Prussia, PA 19406-0939, USA
BMC Bioinformatics 2008, 9:348 doi:10.1186/1471-2105-9-348
Published: 22 August 2008
Abstract
Background
Microarray experiments measure changes in the expression of thousands of genes. The resulting lists of genes with changes in expression are then searched for biologically related sets using several divergent methods such as the Fisher Exact Test (as used in multiple GO enrichment tools), Parametric Analysis of Gene Expression (PAGE), Gene Set Enrichment Analysis (GSEA), and the connectivity map.
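As a concrete illustration of the first of these approaches, the following is a minimal sketch of a one-sided Fisher exact test for gene set enrichment, written as a hypergeometric tail probability. It is not taken from the article or from any particular GO enrichment tool; the function name `enrichment_p_value` and its parameter names are assumptions made for this example.

```python
from math import comb


def enrichment_p_value(n_universe, n_set, n_changed, n_overlap):
    """One-sided Fisher exact (hypergeometric tail) p-value.

    Hypothetical helper, not part of Geneva. Returns the probability of
    seeing at least `n_overlap` members of a gene set among `n_changed`
    genes drawn without replacement from `n_universe` genes, of which
    `n_set` belong to the gene set.
    """
    total = comb(n_universe, n_changed)
    upper = min(n_set, n_changed)
    # Sum the hypergeometric probabilities for every overlap >= n_overlap.
    tail = sum(
        comb(n_set, k) * comb(n_universe - n_set, n_changed - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy numbers: 10 genes measured, 5 in the set, 5 changed, all 5 overlap.
p = enrichment_p_value(10, 5, 5, 5)
```

With these toy numbers only one of the 252 possible draws contains all five set members, so the p-value is 1/252. Real analyses would also need a multiple-testing correction across gene sets, which this sketch omits.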
Results
We describe an analytical method (Geneva: Gene Vector Analysis) to relate genes to biological properties and to other similar experiments in a uniform way. This new method accepts both gene sets and gene lists/vectors as input queries, and can effectively query databases consisting of sets of biologically related genes, or of results from other microarray experiments. We also present an improvement to the null model estimate by using the empirical background distribution drawn from previous experiments. We validated Geneva by rediscovering a number of previous findings, and by finding significant relationships within microarrays in the GEO repository.
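The idea of an empirical background null model can be sketched as follows. This is an illustrative assumption about how such an estimate might be computed, not the Geneva implementation: the abstract does not specify the score statistic, so the scores here are arbitrary numbers, and the names `build_background` and `empirical_p_value` are hypothetical.

```python
from bisect import bisect_left


def build_background(scores):
    """Sort background scores once so that later queries are fast.

    `scores` stands in for similarity scores obtained by running the same
    query against a corpus of previous experiments (an assumption).
    """
    return sorted(scores)


def empirical_p_value(observed, background_sorted):
    """Fraction of background scores >= observed.

    Adds 1 to numerator and denominator so the estimate is never zero.
    """
    n = len(background_sorted)
    # bisect_left gives the count of background scores strictly below observed.
    n_ge = n - bisect_left(background_sorted, observed)
    return (n_ge + 1) / (n + 1)


# Toy background of four scores; two of them are >= 0.5.
background = build_background([0.1, 0.5, 0.9, 0.3])
p_mid = empirical_p_value(0.5, background)   # (2 + 1) / (4 + 1) = 0.6
p_high = empirical_p_value(2.0, background)  # (0 + 1) / (4 + 1) = 0.2
```

Because the background is sorted once and reused, each query costs only a binary search.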
Conclusion
Provided a reasonable corpus of previous experiments is available, this method is more accurate than the class label permutation model, especially for data sets with a limited number of replicates. Geneva is, moreover, computationally faster because the background distributions can be precomputed. We also provide a standard evaluation data set based on 5 pairs of related experiments that should share similar functional relationships and 28 pairs of unrelated experiments from GEO. Discovering relationships amongst GEO data sets has implications for drug repositioning, and for understanding relationships between diseases and drugs.