This article is part of the supplement: UT-ORNL-KBRIN Bioinformatics Summit 2010

Open Access poster presentation

Serendipitous discoveries in microarray analysis

Sally R Ellingson1*, Charles A Phillips2, Randy Glenn3, Douglas Swanson3, Thomas Ha3, Daniel Goldowitz3 and Michael A Langston2

Author Affiliations

1 Genome Science and Technology Program, University of Tennessee, Knoxville, TN 37996, USA

2 Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, TN 37996, USA

3 Centre for Molecular Medicine and Therapeutics, University of British Columbia, Vancouver, BC V5Z 4H4, Canada

BMC Bioinformatics 2010, 11(Suppl 4):P24  doi:10.1186/1471-2105-11-S4-P24

Published: 23 July 2010

© 2010 Ellingson et al; licensee BioMed Central Ltd.


Background

Current microarray technologies let scientists perform very large-scale gene expression experiments. To find significance in the resulting expression data, it is common to use clustering algorithms to group genes with similar expression patterns. Clusters often contain related genes, such as co-regulated genes or genes in the same biological pathway. However, it is too expensive and time-consuming to test experimentally all of the relationships found in large-scale microarray experiments, so the clusters most worth following up must be identified computationally. Many bioinformatics tools are available for inferring the significance of microarray experiments and cluster analyses.

Materials and methods

In this project we reviewed several existing tools and used a combination of them to narrow down the number of significant clusters from a microarray experiment. Microarray data were obtained from the Cerebellar Gene Regulation in Time and Space (Cb GRiTS) database [2]. The data were clustered using paraclique, a graph-based clustering algorithm. Each cluster was evaluated using the Gene-Set Cohesion Analysis Tool (GCAT) [3], ONTO-Pathway Analysis [4], and Allen Brain Atlas data [1]. The clusters with the lowest p-values in each of the three analysis methods were then investigated further to identify good candidate clusters for experimental confirmation of gene relationships.
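The selection step can be sketched in a few lines of Python. This is only one possible reading of the procedure, not the authors' implementation: it assumes each analysis yields one p-value per cluster and that a cluster is kept when it ranks among the k lowest p-values in every analysis. The function names, the cutoff k, the cluster IDs, and all p-values below are hypothetical.

```python
from typing import Dict, List, Set


def top_k(pvalues: Dict[str, float], k: int) -> Set[str]:
    """Return the IDs of the k clusters with the smallest p-values."""
    # Ties are broken by cluster ID so the result is deterministic.
    ranked = sorted(pvalues, key=lambda cid: (pvalues[cid], cid))
    return set(ranked[:k])


def candidate_clusters(analyses: List[Dict[str, float]], k: int) -> List[str]:
    """Clusters that rank among the k lowest p-values in every analysis."""
    if not analyses:
        return []
    shared = set.intersection(*(top_k(p, k) for p in analyses))
    return sorted(shared)


# Hypothetical p-values for four clusters under three analyses
# (illustrative numbers only, not results from the study).
gcat = {"c1": 0.001, "c2": 0.200, "c3": 0.004, "c4": 0.030}
pathway = {"c1": 0.010, "c2": 0.002, "c3": 0.008, "c4": 0.500}
atlas = {"c1": 0.020, "c2": 0.400, "c3": 0.015, "c4": 0.010}

print(candidate_clusters([gcat, pathway, atlas], k=2))  # prints ['c3']
```

Intersecting the top-ranked sets is a strict criterion. A rank-sum or combined p-value would be a looser alternative, and the abstract does not say which rule was applied.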

Results and conclusion

While looking for genes important to cerebellar development, we serendipitously came across interesting clusters related to neural diseases. For example, we found two clusters that contain genes known to be associated with the Parkinson’s disease, Huntington’s disease, and Alzheimer’s disease pathways. Both clusters had low p-values in all three analyses and show very similar expression patterns, but at different expression levels. Such unexpected discoveries help unlock the real power of high-throughput data analysis.
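To illustrate how two expression profiles can share a pattern while differing in level, the sketch below compares two hypothetical profiles with Pearson correlation. The abstract does not state which similarity measure was used, so Pearson correlation is an assumption here, and the profile values are invented for illustration.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sum(a * b for a, b in zip(dx, dy)) / denom


# Hypothetical profiles over five time points: same shape, shifted level.
low = [1.0, 2.0, 4.0, 3.0, 1.5]
high = [v + 5.0 for v in low]

r = pearson(low, high)                  # close to 1.0: same pattern
gap = sum(high) / 5 - sum(low) / 5      # 5.0: different mean level
print(round(r, 6), round(gap, 6))
```

Because the correlation subtracts each profile's mean, a constant offset in expression level leaves it unchanged. This is why a correlation-based comparison can report near-identical patterns for clusters expressed at different levels.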


References

  1. Allen Brain Atlas: Home. Allen Institute for Brain Science. Web 2009.

  2. Cb GRiTS Database. Web 2009.

  3. GCAT: Gene-set Cohesion Analysis Tool. The University of Memphis. Web 2009.

  4. Intelligent Systems and Bioinformatics Laboratory. Web 2009.