This article is part of the supplement: UT-ORNL-KBRIN Bioinformatics Summit 2011

Meeting abstract

categoryCompare: high-throughput data meta-analysis using gene annotations

Robert M Flight1, Jeffrey C Petruska1, Benjamin J Harrison1 and Eric C Rouchka2*

Author Affiliations

1 Department of Anatomical Sciences and Neurobiology, University of Louisville, Louisville, KY, 40292, USA

2 Department of Computer Engineering and Computer Science, University of Louisville, Louisville, KY, 40292, USA

BMC Bioinformatics 2011, 12(Suppl 7):A16  doi:10.1186/1471-2105-12-S7-A16

The electronic version of this article is the complete one.

Published: 5 August 2011

© 2011 Flight et al; licensee BioMed Central Ltd.

This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Many current meta-analysis studies of DNA microarray and other high-throughput data concentrate on deriving a concordant list of genes across many experiments, in order to discover the "true" genes responsible for a particular disease process, biological pathway, or cellular response (Figure 1A). However, by concentrating on the genes in common, similarities or differences that exist at a pathway or process level may be missed.

Figure 1. (A) Usual method of high-throughput experiment meta-analysis, comparing gene lists (L1 and L2) directly. (B) categoryCompare compares the gene lists on the basis of enriched annotations.


We describe a meta-analysis approach that allows comparison and contrast of gene lists at the level of categorical annotation (pathway or Gene Ontology annotations). This categorical evaluation compares enriched annotations between gene lists (Figure 1B) and displays the results graphically to allow intuitive visualization and exploration of the similarities and differences. False discovery correction via simulation is implemented to control for the effect of differently sized gene lists as inputs.
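
To make the idea of comparing enriched annotations concrete, the following is a minimal Python sketch. It is not the categoryCompare API (categoryCompare is an R/Bioconductor package). The function names, the toy gene identifiers, the three annotation categories, and the raw p-value cutoff of 0.05 are all assumptions made for illustration. Enrichment is scored here with a one-sided hypergeometric test, a common choice that the abstract does not confirm, and the simulation-based false discovery correction is not reproduced.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n genes are drawn from a universe of N genes,
    K of which carry the annotation (one-sided hypergeometric test)."""
    upper = min(n, K)
    total = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return total / comb(N, n)


def enriched_categories(gene_list, annotations, universe, alpha=0.05):
    """Return {category: p-value} for categories over-represented in gene_list.
    alpha is an uncorrected cutoff (an assumption of this sketch)."""
    genes = set(gene_list) & universe
    enriched = {}
    for category, members in annotations.items():
        members = set(members) & universe
        k = len(genes & members)
        if k == 0:
            continue  # no annotated genes in the list, nothing to test
        p = hypergeom_pvalue(k, len(genes), len(members), len(universe))
        if p <= alpha:
            enriched[category] = p
    return enriched


def compare_lists(list1, list2, annotations, universe, alpha=0.05):
    """Compare two gene lists by their enriched categories, not their genes."""
    e1 = set(enriched_categories(list1, annotations, universe, alpha))
    e2 = set(enriched_categories(list2, annotations, universe, alpha))
    return {"both": e1 & e2, "list1_only": e1 - e2, "list2_only": e2 - e1}


# Hypothetical toy data: 100 genes and three annotation categories.
universe = {f"g{i}" for i in range(100)}
annotations = {
    "axon growth": ["g0", "g1", "g2", "g3"],
    "inflammation": ["g4", "g5", "g6", "g7"],
    "metabolism": ["g8", "g9", "g10", "g11"],
}
list1 = ["g0", "g1", "g4", "g5", "g6"]
list2 = ["g2", "g3", "g8", "g9", "g10"]

gene_overlap = set(list1) & set(list2)
comparison = compare_lists(list1, list2, annotations, universe)
print("genes in common:", gene_overlap)
print("category comparison:", comparison)
```

In this toy case the two lists share no genes, so a direct gene-level comparison reports nothing in common, while the category-level comparison finds one annotation enriched in both lists and one specific to each list.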


The approach was tested using two gene lists: genes involved in the response to denervation in muscle (a literature compendium) and in skin (experimentally determined). The categorical comparison highlights known biological processes that are common to the two cases, while also making it easy to see areas of difference that are not apparent from examining the gene lists alone.


categoryCompare is available as a Bioconductor package, and a web interface (using RApache) has also been developed to facilitate use in the wider research community.


The authors gratefully acknowledge the Christopher and Dana Reeve Foundation, KY Spinal Cord and Head Injury Research Trust, the Paralyzed Veterans of America, and NIH Grants: P20RR016481-10, P20RR016481-09S1, P30ES014443-04 for funding.