Predicting genetic interactions with random walks on biological networks
1 Biomolecular Science and Engineering Program, UC Santa Barbara, Santa Barbara, CA, USA
2 Department of Computer Science, UC Santa Barbara, Santa Barbara, CA, USA
BMC Bioinformatics 2009, 10:17 doi:10.1186/1471-2105-10-17
Published: 12 January 2009
Several studies have demonstrated that synthetic lethal genetic interactions between gene mutations indicate functional redundancy between molecular complexes and pathways. These observations help explain why organisms tolerate single gene deletions for a large majority of genes. For example, system-wide gene knockout/knockdown studies in S. cerevisiae and C. elegans revealed non-viable phenotypes for only 18% and 10% of the genome, respectively. It has been postulated that this low percentage of essential genes reflects extensive genetic buffering within genomes. Consistent with this hypothesis, systematic double-knockout screens in S. cerevisiae and C. elegans show that, on average, 0.5% of tested gene pairs are synthetic sick or synthetic lethal. While knowledge of synthetic lethal interactions provides valuable insight into molecular functionality, testing all gene pairs is a daunting task for molecular biologists, because the number of pairs grows combinatorially with the number of genes. Still, mapping pairwise interactions between genes is essential to discovering functional relationships between molecular complexes and pathways, as these relationships form the basis of genetic robustness. To reduce the experimental workload, computational techniques that accurately predict genetic interactions can help target the most likely candidate interactions.
Building on previous studies that analyzed properties of network topology to predict genetic interactions, we apply random walks on biological networks to accurately predict pairwise genetic interactions. Furthermore, we incorporate all published non-interactions into our algorithm for measuring the topological relatedness between two genes. We apply our method to S. cerevisiae and C. elegans datasets and, using a decision tree classifier, integrate diverse biological networks and show that our method outperforms established methods.
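To make the idea of measuring topological relatedness with a random walk concrete, the following is a minimal sketch. It uses a random walk with restart, a common formulation chosen here purely for illustration; the abstract does not specify which walk variant the method uses. The toy adjacency matrix, the function name `random_walk_with_restart`, and the `restart_prob` value are all hypothetical.

```python
import numpy as np

def random_walk_with_restart(adjacency, start, restart_prob=0.3,
                             tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities for a walker restarting at `start`.

    Illustrative only: the walk variant and restart_prob are assumptions,
    not the formulation used in the paper.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    # Column-normalize so column j holds the transition probabilities out of node j.
    col_sums = adjacency.sum(axis=0)
    col_sums[col_sums == 0] = 1.0  # isolated nodes: avoid division by zero
    transition = adjacency / col_sums

    restart = np.zeros(adjacency.shape[0])
    restart[start] = 1.0
    p = restart.copy()
    for _ in range(max_iter):
        p_next = (1 - restart_prob) * transition @ p + restart_prob * restart
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

# Hypothetical toy network: five genes connected in a chain 0-1-2-3-4.
adjacency = np.array([
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
])
scores = random_walk_with_restart(adjacency, start=0)
print(scores)
```

In this sketch, genes that are topologically close to the start gene receive higher scores than distant ones. Scores of this kind, computed on several networks, could then serve as input features for a classifier such as a decision tree.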
By applying random walks on biological networks, we were able to predict synthetic lethal interactions at a true positive rate of 95 percent against a false positive rate of 10 percent in S. cerevisiae. Similarly, in C. elegans, we achieved a true positive rate of 95 percent against a false positive rate of 7 percent. Furthermore, we demonstrate that the inclusion of non-interacting gene pairs results in a considerable performance improvement.
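For readers unfamiliar with these metrics, the sketch below shows how a true positive rate and a false positive rate are computed from predicted and known labels. The label vectors are made-up toy data and the function name `tpr_fpr` is hypothetical; none of it reproduces the study's results.

```python
def tpr_fpr(predicted, actual):
    """Return (true positive rate, false positive rate) for binary labels.

    1 = synthetic lethal, 0 = non-interacting.
    """
    tp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 1)
    fn = sum(1 for p, a in zip(predicted, actual) if p == 0 and a == 1)
    fp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 0)
    tn = sum(1 for p, a in zip(predicted, actual) if p == 0 and a == 0)
    return tp / (tp + fn), fp / (fp + tn)

# Hypothetical toy data: 4 known interacting pairs, 6 known non-interacting pairs.
actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
tpr, fpr = tpr_fpr(predicted, actual)
print(tpr, fpr)
```

Note that the false positive rate can only be estimated when known non-interacting pairs are available, which is one reason published non-interactions are useful for evaluation.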
We presented a method based on random walks that accurately captures aspects of network topology towards the goal of classifying potential genetic interactions as either synthetic lethal or non-interacting. Our method, which is generalizable to all types of biological networks, is likely to perform well with limited information, as estimated by holding out large portions of the synthetic lethal interactions and non-interactions.