This article is part of the supplement: A Semantic Web for Bioinformatics: Goals, Tools, Systems, Applications
GraphFind: enhancing graph searching by low support data mining techniques
1 Dipartimento di Matematica e Informatica, Università di Catania, Catania, 95125, Italy
2 Dipartimento di Scienze Biomediche, Università di Catania, Catania, 95125, Italy
3 Courant Institute of Mathematical Sciences, New York University, New York, 10012, USA
BMC Bioinformatics 2008, 9(Suppl 4):S10 doi:10.1186/1471-2105-9-S4-S10
Published: 25 April 2008
Biomedical and chemical databases are large and rapidly growing. Graphs naturally model such data. To fully exploit the wealth of information in these graph databases, systems that search for all exact or approximate occurrences of a query graph play a key role. To make graph searching efficient, advanced methods for indexing, representing and matching graphs have been proposed.
This paper presents GraphFind. The system implements efficient graph searching algorithms together with advanced filtering techniques that allow approximate search. It allows users to select candidate subgraphs rather than entire graphs. It also implements an effective data storage scheme that relies in part on low-support data mining.
GraphFind is compared with Frowns, GraphGrep and gIndex. Experiments show that GraphFind outperforms the compared systems on a very large collection of small graphs. The proposed low-support mining technique, which applies to any searching system, also allows a significant reduction in index space.
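The general filter-then-verify idea behind index-based graph searching, combined with a low-support index, can be illustrated with a minimal Python sketch. This is not GraphFind's implementation. It assumes, purely for illustration, that the indexed features are unordered pairs of endpoint labels on edges (a simple stand-in for richer features), that graphs are undirected with no self-loops, and that "low support" means a feature occurs in at most a chosen fraction of the database graphs. All names (`build_index`, `candidates`, `contains`, `search`, `max_support`) are hypothetical.

```python
from collections import Counter

# A graph is a pair (labels, edges): labels maps node id -> label,
# edges is a set of frozenset({u, v}). Self-loops are assumed absent.

def make_graph(labels, edge_list):
    return labels, {frozenset(e) for e in edge_list}

def edge_features(graph):
    """Count each unordered pair of endpoint labels over the graph's edges."""
    labels, edges = graph
    feats = Counter()
    for e in edges:
        u, v = tuple(e)
        feats[tuple(sorted((labels[u], labels[v])))] += 1
    return feats

def build_index(database, max_support):
    """Keep only features found in at most max_support (a fraction) of the graphs."""
    per_graph = [edge_features(g) for g in database]
    support = Counter()
    for feats in per_graph:
        support.update(feats.keys())
    kept = {f for f, n in support.items() if n / len(database) <= max_support}
    return kept, [{f: c for f, c in feats.items() if f in kept} for feats in per_graph]

def candidates(query, index):
    """Filter step: discard graphs that lack a kept feature the query needs."""
    kept, per_graph = index
    needed = {f: c for f, c in edge_features(query).items() if f in kept}
    return [i for i, feats in enumerate(per_graph)
            if all(feats.get(f, 0) >= c for f, c in needed.items())]

def contains(target, query):
    """Verify step: backtracking search for a label-preserving subgraph match."""
    t_labels, t_edges = target
    q_labels, q_edges = query
    q_nodes = list(q_labels)

    def extend(mapping):
        if len(mapping) == len(q_nodes):
            return True
        q = q_nodes[len(mapping)]
        for t in t_labels:
            if t in mapping.values() or t_labels[t] != q_labels[q]:
                continue
            # Every query edge from q to an already mapped node must exist in the target.
            ok = all(frozenset((t, mapping[p])) in t_edges
                     for p in mapping if frozenset((q, p)) in q_edges)
            if ok:
                mapping[q] = t
                if extend(mapping):
                    return True
                del mapping[q]
        return False

    return extend({})

def search(query, database, index):
    return [i for i in candidates(query, index) if contains(database[i], query)]

# Toy database of three labeled chains: C-C-O, C-C-C, C-C-N.
db = [
    make_graph({0: "C", 1: "C", 2: "O"}, [(0, 1), (1, 2)]),
    make_graph({0: "C", 1: "C", 2: "C"}, [(0, 1), (1, 2)]),
    make_graph({0: "C", 1: "C", 2: "N"}, [(0, 1), (1, 2)]),
]
index = build_index(db, max_support=0.5)
query = make_graph({0: "C", 1: "O"}, [(0, 1)])
print(search(query, db, index))  # [0]
```

In this toy database the C-C feature occurs in every graph, so it cannot rule out any candidate and is dropped from the index, while the rarer C-O and C-N features are kept. Because features missing from the index are simply ignored during filtering, the filter never discards a true match; the verification step decides the final answer.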