Open Access, Highly Accessed, Research article

Clique-based data mining for related genes in a biomedical database

Tsutomu Matsunaga1, Chikara Yonemori1, Etsuji Tomita2,3 and Masaaki Muramatsu4,5

1Research and Development Headquarters, NTT DATA Corporation, Tokyo, 135-8671, Japan

2The Advanced Algorithms Research Laboratory, The University of Electro-Communications, Tokyo, 182-8585, Japan

3Research and Development Initiative, Chuo University, Tokyo, 112-8551, Japan

4Medical Research Institute, Tokyo Medical and Dental University, Tokyo, 101-0062, Japan

5Research Institute, HuBit Genomix Inc, Tokyo, 102-0092, Japan


BMC Bioinformatics 2009, 10:205 doi:10.1186/1471-2105-10-205

Published: 1 July 2009

Abstract

Background

Progress in the life sciences depends on integrating biomedical knowledge about numerous genes in order to formulate hypotheses on the genetic mechanisms behind various biological phenomena, including diseases. There is thus a strong need for a way to search biomedical databases automatically and comprehensively for related genes, such as genes in the same family and genes encoding components of the same pathway. Here we address the extraction of related genes by searching a biomedical relational graph for densely connected subgraphs, which are modeled as cliques.

Results

We constructed a graph whose nodes were gene or disease pages and whose edges were the hyperlinks between those pages in the Online Mendelian Inheritance in Man (OMIM) database. By enumerating cliques computationally, we obtained over 20,000 sets of related genes (called 'gene modules'). The modules included genes in the same family, genes for proteins that form a complex, and genes for components of the same signaling pathway. Experiments using 'metabolic syndrome'-related gene modules show that the modules give a coherent, holistic picture that helps in interpreting relations among genes.
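To make the idea of clique-based extraction concrete, the following is a minimal sketch and not the authors' implementation. It assumes that hyperlinks are treated as undirected edges, that a 'gene module' corresponds to a maximal clique of at least three pages, and it uses the textbook Bron-Kerbosch procedure with pivoting as a stand-in for whatever enumeration algorithm the paper actually uses. The page labels (`GENE_A`, `DISEASE_X`, and so on) and the function names are hypothetical.

```python
from collections import defaultdict


def build_graph(links):
    """Build an undirected adjacency map from (page_a, page_b) hyperlink pairs.

    Assumption: link direction is ignored and self-links are dropped.
    """
    adj = defaultdict(set)
    for a, b in links:
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def maximal_cliques(adj, min_size=3):
    """Enumerate maximal cliques (Bron-Kerbosch with pivoting).

    Only cliques with at least `min_size` nodes are returned;
    the threshold is an illustrative choice, not taken from the paper.
    """
    found = []

    def expand(r, p, x):
        # r: current clique, p: candidates, x: already-processed nodes
        if not p and not x:
            if len(r) >= min_size:
                found.append(frozenset(r))
            return
        # Pivot on the node with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return found


# Hypothetical toy data: pairs of pages connected by a hyperlink.
links = [
    ("GENE_A", "GENE_B"), ("GENE_A", "GENE_C"), ("GENE_B", "GENE_C"),
    ("GENE_C", "DISEASE_X"), ("DISEASE_X", "GENE_D"), ("GENE_C", "GENE_D"),
    ("GENE_A", "GENE_E"),
]
modules = maximal_cliques(build_graph(links))
for module in modules:
    print(sorted(module))
```

In this toy graph, the pair `GENE_A`/`GENE_E` is also a maximal clique but is filtered out by the size threshold. Because cliques may overlap (here `GENE_C` belongs to both reported modules), a single gene can appear in several modules.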

Conclusion

We presented a data mining approach that extracts related genes by enumerating cliques. The extracted gene sets provide a holistic picture useful for comprehending complex disease mechanisms.


© 1999-2009 BioMed Central Ltd unless otherwise stated. Part of Springer Science+Business Media.