Open Access Software

GECKO: a complete large-scale gene expression analysis platform

Joachim Theilhaber1*, Anatoly Ulyanov1, Anish Malanthara2, Jack Cole3, Dapeng Xu1, Robert Nahf4, Michael Heuer5, Christoph Brockel1 and Steven Bushnell1

Author Affiliations

1 Cambridge Genomics Center, Sanofi-Aventis, 26 Landsdowne Street, Cambridge, MA 02139, USA

2 Sanofi-Aventis, Genomics and Scientific Computation, Route 202–206, Bridgewater, NJ 08807, USA

3 Fast Gun Software, Inc., 180 Myrtle St., Wrentham, MA 02093, USA

4 Sanofi-Aventis Tucson Selectide, 1580 E. Hanley Blvd., Tucson, AZ 85737, USA

5 Center for Computational Genomics and Bioinformatics, University of Minnesota, 426 Church Street SE, Minneapolis, MN 55455, USA


BMC Bioinformatics 2004, 5:195  doi:10.1186/1471-2105-5-195

Published: 10 December 2004



Gecko (Gene Expression: Computation and Knowledge Organization) is a complete, high-capacity centralized gene expression analysis system, developed in response to the needs of a distributed user community.


Based on a client-server architecture, with a centralized repository that typically holds many tens of thousands of Affymetrix scans, Gecko includes automatic processing pipelines for uploading data from remote sites, a database, a computational engine implementing ~50 different analysis tools, and a client application. The available analysis tools include clustering methods, principal component analysis, supervised classification with feature selection and cross-validation, multi-factorial ANOVA, statistical contrast calculations, and various post-processing tools for extracting data at given error rates or significance levels. Because its architecture is open, Gecko also allows new algorithms to be integrated. The Gecko framework is very general: non-Affymetrix and non-gene expression data can be analyzed as well. A unique feature of the Gecko architecture is the concept of the Analysis Tree (strictly speaking, a directed acyclic graph), in which all successive results of ongoing analyses are saved. This approach has proven invaluable in allowing a large (~100 users) and distributed community to share results, and to return repeatedly, over a span of years, to older and potentially very complex analyses of gene expression data.
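The Analysis Tree idea can be illustrated with a minimal Python sketch. It is not Gecko's implementation or API: the names `AnalysisNode`, `AnalysisGraph`, `add` and `lineage` are hypothetical, and the sketch only assumes what the abstract states, namely that every result is saved and that a result may depend on more than one earlier result, which makes the structure a directed acyclic graph and not a strict tree.

```python
from dataclasses import dataclass, field


@dataclass
class AnalysisNode:
    """One saved result: the tool that produced it and the results it used."""
    node_id: int
    tool: str
    parents: tuple = ()


@dataclass
class AnalysisGraph:
    """Append-only store of results; edges point from a result to its inputs."""
    nodes: dict = field(default_factory=dict)

    def add(self, tool, parents=()):
        # A new result may only depend on results that already exist,
        # so adding a node can never create a cycle.
        for p in parents:
            if p not in self.nodes:
                raise KeyError(f"unknown parent result: {p}")
        node_id = len(self.nodes) + 1
        self.nodes[node_id] = AnalysisNode(node_id, tool, tuple(parents))
        return node_id

    def lineage(self, node_id):
        """Return the ids of every result that node_id depends on, ascending."""
        seen = set()
        stack = list(self.nodes[node_id].parents)
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.nodes[current].parents)
        return sorted(seen)


# Hypothetical analysis session (tool names are illustrative only).
graph = AnalysisGraph()
raw = graph.add("uploaded scans")
anova = graph.add("ANOVA", [raw])
pca = graph.add("PCA", [raw])
merged = graph.add("post-processing", [anova, pca])  # two parents: a DAG, not a tree
```

Because nothing is ever overwritten, a call such as `graph.lineage(merged)` recovers the full chain of steps behind a result, which is the property that would let users revisit an old analysis long after it was run.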


The Gecko system is being made publicly available as free software. In whole or in part, the Gecko framework should prove useful to users and system developers with a broad range of analysis needs.