Open Access Technical Note

Human gene correlation analysis (HGCA): A tool for the identification of transcriptionally co-expressed genes

Ioannis Michalopoulos (1,*), Georgios A Pavlopoulos (2), Apostolos Malatras (1,3), Alexandros Karelas (1), Myrto-Areti Kostadima (4,5), Reinhard Schneider (6,7) and Sophia Kossida (5)

Author Affiliations

1 Cryobiology of Stem Cells, Centre of Immunology and Transplantation, Biomedical Research Foundation, Academy of Athens, Soranou Efessiou 4, Athens, 11527, Greece

2 ESAT-SCD/IBBT-K.U. Leuven Future Health Department, Katholieke Universiteit Leuven, Kasteelpark Arenberg 10, Heverlee-Leuven, 3001, Belgium

3 Department of Cell Biology and Biophysics, Faculty of Biology, University of Athens, Panepistimiopolis, Athens, 15701, Greece

4 Wellcome Trust Genome Campus, European Bioinformatics Institute, Cambridge, CB10 1SD, United Kingdom

5 Bioinformatics & Medical Informatics Team, Biomedical Research Foundation, Academy of Athens, Soranou Efessiou 4, Athens, 11527, Greece

6 Structural and Computational Biology Unit, European Molecular Biology Laboratory, Meyerhofstrasse 1, Heidelberg, 69117, Germany

7 Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Campus Belval, avenue des Hauts-Fourneaux 7, Esch sur Alzette, L-4362, Luxembourg


BMC Research Notes 2012, 5:265  doi:10.1186/1756-0500-5-265

Published: 6 June 2012



Bioinformatics and high-throughput technologies such as microarrays allow the expression levels of large numbers of genes to be measured simultaneously, helping us to understand the molecular mechanisms of various biological processes in a cell.


We calculate the Pearson correlation coefficient (r-value) between probe set signal values from Affymetrix Human Genome Microarray samples and cluster the human genes according to the r-value correlation matrix using the Neighbour Joining (NJ) clustering method. A hypergeometric distribution is applied to the text annotations of the probe sets to quantify term overrepresentation. The aim of the tool is to identify genes that are closely correlated with a given gene of interest and/or to predict its biological function from the annotations of the respective gene cluster.
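The two statistical steps named above, the pairwise r-value and the hypergeometric overrepresentation test, can be illustrated with a minimal Python sketch. This is not the HGCA implementation: the probe set names, signal values, annotation counts and function names below are all hypothetical and chosen only for illustration. The Neighbour Joining clustering step is not shown.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length signal vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def top_correlated(signals, driver, k):
    """Return the k probe sets with the highest r-value against the driver."""
    ranked = sorted(
        ((pearson_r(signals[driver], values), name)
         for name, values in signals.items() if name != driver),
        reverse=True,
    )
    return [name for _, name in ranked[:k]]


def hypergeom_pvalue(population, annotated, drawn, hits):
    """Upper-tail probability P(X >= hits) of the hypergeometric distribution.

    population: all annotated probe sets; annotated: those carrying the term;
    drawn: size of the correlated group; hits: group members carrying the term.
    """
    upper = min(annotated, drawn)
    total = math.comb(population, drawn)
    tail = sum(
        math.comb(annotated, i) * math.comb(population - annotated, drawn - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical toy data: four probe sets measured across four samples.
signals = {
    "probe_A": [1.0, 2.0, 3.0, 4.0],
    "probe_B": [2.1, 3.9, 6.2, 8.0],
    "probe_C": [4.0, 3.0, 2.0, 1.0],
    "probe_D": [1.0, 3.0, 2.0, 4.0],
}

closest = top_correlated(signals, "probe_A", 2)

# Hypothetical counts: 20 probe sets in total, 5 carry a given term,
# and 3 of the 4 probe sets in the correlated group carry it.
p_value = hypergeom_pvalue(population=20, annotated=5, drawn=4, hits=3)
```

In this toy example the driver `probe_A` is most strongly correlated with `probe_B` and then `probe_D`, while `probe_C` is perfectly anti-correlated. A small `p_value` would suggest that the term is overrepresented in the correlated group relative to the background.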


Human Gene Correlation Analysis (HGCA) is a tool to classify human genes according to their coexpression levels and to identify overrepresented annotation terms in correlated gene groups. It is available at: webcite.

Keywords: Microarray analysis; Gene annotation; Gene coexpression; Functional annotation