This article is part of the supplement: Selected Proceedings of the First Summit on Translational Bioinformatics 2008

Open Access Proceedings

Identifying hypothetical genetic influences on complex disease phenotypes

Benjamin J Keller1,3* and Richard C McEachin2,3

Author Affiliations

1 Eastern Michigan University, Computer Science Department, Ypsilanti, MI 48197, USA

2 Department of Psychiatry, University of Michigan, Ann Arbor, MI 48109, USA

3 National Center for Integrative Biomedical Informatics, Ann Arbor, MI 48109, USA

BMC Bioinformatics 2009, 10(Suppl 2):S13  doi:10.1186/1471-2105-10-S2-S13

Published: 5 February 2009



Statistical interactions between disease-associated loci of complex genetic diseases suggest that genes from these regions are involved in a common mechanism impacting, or impacted by, the disease. The computational problem we address is to discover relationships among genes from these interacting regions that may explain the observed statistical interaction and the role of these genes in the disease phenotype.


We describe a heuristic algorithm for generating hypothetical gene relationships from loci associated with a complex disease phenotype. This approach, called Prioritizing Disease Genes by Analysis of Common Elements (PDG-ACE), mines biomedical keywords from text descriptions of genes and uses them to relate genes close to disease-associated loci. A keyword common to, and significantly over-represented in, a pair of gene descriptions may represent a preliminary hypothesis about the biological relationship between the genes, and suggest the role the genes play in the disease phenotype.
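The shared-keyword idea in the paragraph above can be illustrated with a minimal sketch. It is not the PDG-ACE implementation. The function names, the substring matching, and the over-representation score are all assumptions made for illustration. The score is the chance that two randomly chosen genes from the background set both mention the keyword, and the statistic actually used by PDG-ACE may differ.

```python
from itertools import product


def keyword_hits(description, vocabulary):
    """Return the vocabulary keywords found in one gene description.

    Assumption: plain case-insensitive substring matching stands in for
    whatever text-mining step the real method uses.
    """
    text = description.lower()
    return {kw for kw in vocabulary if kw.lower() in text}


def shared_keyword_scores(descriptions, locus_a, locus_b, vocabulary, alpha=0.05):
    """List keywords shared by gene pairs drawn from two loci.

    descriptions: dict mapping gene name -> description text for all
                  background genes (must include the genes at both loci).
    locus_a, locus_b: gene names near each of two interacting loci.
    Returns (gene_a, gene_b, keyword, p) tuples sorted by p, where p is
    the probability (hypothetical scoring, see lead-in) that two genes
    picked at random without replacement both mention the keyword.
    """
    n = len(descriptions)
    if n < 2:
        raise ValueError("need at least two background gene descriptions")

    hits = {gene: keyword_hits(text, vocabulary)
            for gene, text in descriptions.items()}
    # Number of background genes whose description mentions each keyword.
    doc_freq = {kw: sum(kw in h for h in hits.values()) for kw in vocabulary}

    results = []
    for gene_a, gene_b in product(locus_a, locus_b):
        for kw in hits[gene_a] & hits[gene_b]:
            k = doc_freq[kw]
            p = (k * (k - 1)) / (n * (n - 1))
            if p <= alpha:  # keep only keywords that are rare in the background
                results.append((gene_a, gene_b, kw, p))
    return sorted(results, key=lambda r: r[3])
```

Under this scoring, a keyword that appears in nearly every gene description (for example, a generic term such as "protein") gets a probability close to 1 and is filtered out, while a keyword shared by the pair but rare in the background is kept as a candidate hypothesis for follow-up.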


Our experiments show that the approach recovers previously published relationships and does not report relationships where none exist. The results also indicate that the approach is robust to differences in keyword vocabulary. We outline a brief case study in which results from a recently published Type 2 Diabetes association study are used to identify potential hypotheses.