An overview of the SegMine methodology. The methodology consists of four main steps: data preprocessing; identification of differentially expressed gene sets; clustering of rules describing differentially expressed gene sets; and link discovery, graph visualization and exploration. The data preprocessing step (1) takes normalized microarray data as input and produces a ranked list of genes. Identification of differentially expressed gene sets (2) is performed by the SEGS algorithm, which uses the GO and KEGG ontologies and Entrez interactions to construct gene sets using SEGS operators, hierarchy information, and solution space search parameters. Rules composed of ontology terms, which describe gene sets that SEGS found to be statistically significant according to three enrichment tests, are sent to the agglomerative hierarchical clustering component (3), which groups similar rules and separates dissimilar ones. Finally, link discovery and graph visualization (4) are provided by Biomine, which can perform neighbourhood search as well as search for connections between two query sets. Note that SegMine supports the construction of Biomine queries composed of individual genes, gene sets, ontology terms, rules composed of these terms, or even whole clusters.
Podpečan et al. BMC Bioinformatics 2011 12:416 doi:10.1186/1471-2105-12-416
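
The rule clustering in step (3) can be illustrated with a minimal sketch. The caption does not specify the distance measure or linkage used, so the sketch below assumes, purely for illustration, that each rule is represented by the set of genes it covers, that rules are compared by Jaccard distance, and that clusters are merged by average linkage until a distance threshold is exceeded. The function names, rule names, gene identifiers and threshold are all hypothetical and are not part of SegMine.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two sets of gene identifiers."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster_rules(rule_genes, threshold):
    """Average-linkage agglomerative clustering of rules (illustrative only).

    rule_genes: dict mapping a rule name to the set of genes it covers
                (an assumed representation, not SegMine's own).
    threshold:  merging stops once the closest pair of clusters is
                farther apart than this distance.
    Returns a sorted list of clusters, each a sorted list of rule names.
    """
    # Start with every rule in its own cluster.
    clusters = [[name] for name in sorted(rule_genes)]

    def linkage(c1, c2):
        # Average pairwise distance between members of two clusters.
        dists = [jaccard_distance(rule_genes[x], rule_genes[y])
                 for x in c1 for y in c2]
        return sum(dists) / len(dists)

    while len(clusters) > 1:
        # Find the closest pair of clusters.
        (i, j), best = min(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best > threshold:
            break
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(clusters)


# Hypothetical rules and the genes they cover.
example_rules = {
    "rule_A": {"g1", "g2", "g3"},
    "rule_B": {"g2", "g3", "g4"},
    "rule_C": {"g7", "g8"},
    "rule_D": {"g8", "g9"},
}
example_clusters = cluster_rules(example_rules, threshold=0.7)
```

With these toy inputs, rules that share most of their genes end up in the same cluster, while rules with no genes in common stay apart. A lower threshold yields more, smaller clusters.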