System architecture. The system comprises four major parts: heterogeneous biological network integration, seed node selection, pathway identification, and differential expression analysis. The large integrated biological network was constructed and stored in a MySQL database. After ambiguous vertices (resolved according to the genes' official symbols) and the duplicated interactions between them were stripped away, the k-shortest path algorithm was applied to obtain the shortest pathways between the given seed nodes. The seed nodes are either specified by users or selected from transcription factors. The identified pathways were scored using gene expression values as edge weights. Finally, the top-scoring n pathways were selected and analyzed further.
Chao et al. BMC Medical Genomics 2011 4:23 doi:10.1186/1755-8794-4-23
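The path-finding and scoring steps described in the caption can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a simple best-first enumeration of loopless paths on an unweighted graph rather than any specific published k-shortest path algorithm, and the scoring rule (mean of per-edge weights, where an edge weight is the mean absolute expression of its two endpoints) is an assumption chosen only for illustration. The function names `k_shortest_paths`, `score_path`, and `top_pathways`, and the toy node labels and expression values, are all hypothetical.

```python
import heapq
from itertools import count


def k_shortest_paths(graph, source, target, k):
    """Return up to k loopless paths from source to target, fewest edges first.

    graph maps each node to an iterable of its neighbours. Partial paths are
    expanded in order of length, so complete paths are found shortest first.
    """
    tie = count()  # tie-breaker so the heap never compares two path lists
    heap = [(0, next(tie), [source])]
    found = []
    while heap and len(found) < k:
        cost, _, path = heapq.heappop(heap)
        node = path[-1]
        if node == target:
            found.append(path)
            continue
        for nxt in sorted(graph.get(node, ())):
            if nxt not in path:  # keep paths loopless
                heapq.heappush(heap, (cost + 1, next(tie), path + [nxt]))
    return found


def score_path(path, expression):
    """Score a path as the mean of its edge weights (assumed scoring rule).

    Assumption: an edge weight is the mean absolute expression value of the
    two genes it connects; genes without a value contribute 0.
    """
    weights = [
        (abs(expression.get(a, 0.0)) + abs(expression.get(b, 0.0))) / 2
        for a, b in zip(path, path[1:])
    ]
    return sum(weights) / len(weights) if weights else 0.0


def top_pathways(graph, seeds, expression, k, n):
    """Collect up to k paths per ordered seed pair and keep the n best scoring."""
    candidates = []
    for s in seeds:
        for t in seeds:
            if s != t:
                candidates.extend(k_shortest_paths(graph, s, t, k))
    ranked = sorted(candidates, key=lambda p: score_path(p, expression), reverse=True)
    return ranked[:n]


# Toy data, invented purely for illustration (not taken from the paper).
toy_graph = {"A": {"B", "C"}, "B": {"D"}, "C": {"D"}, "D": set()}
toy_expression = {"A": 1.0, "B": -3.0, "C": 0.5, "D": 2.0}
best = top_pathways(toy_graph, ["A", "D"], toy_expression, k=2, n=1)
```

The enumeration is exponential in the worst case, so on a large integrated network a dedicated k-shortest path algorithm would be needed; the sketch only shows how seed pairs, candidate paths, and expression-based ranking fit together.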