Predicting functional upstream open reading frames in Saccharomyces cerevisiae
1 Department of Applied Mechanics, Chalmers University of Technology, SE-412 96 Göteborg, Sweden
2 School of Computing, Science and Engineering, University of Salford, Salford, M5 4WT, UK
3 Department of Computer Science and Engineering, Chalmers University of Technology, SE-412 96 Göteborg, Sweden
4 Department of Mathematical Sciences, Chalmers University of Technology and the University of Gothenburg, SE-412 96 Göteborg, Sweden
5 Department of Zoology, University of Gothenburg, Box 463, SE-405 30 Göteborg, Sweden
6 Department of Cell and Molecular Biology, Lundberg Laboratory, University of Gothenburg, PO BOX 462, SE-405 30 Göteborg, Sweden
BMC Bioinformatics 2009, 10:451 doi:10.1186/1471-2105-10-451
Published: 30 December 2009
Some upstream open reading frames (uORFs) regulate gene expression (i.e., they are functional) and can play key roles in keeping organisms healthy. However, how uORFs are involved in gene regulation is not yet fully understood. A complete picture of their role is expected to require a large number of experimentally verified functional uORFs. Unfortunately, wet-lab experiments to verify that uORFs are functional are expensive.
In this paper, a new computational approach to predicting functional uORFs in the yeast Saccharomyces cerevisiae is presented. Our approach is based on inductive logic programming and makes use of a novel combination of knowledge about biological conservation, Gene Ontology annotations and genes' responses to different conditions. Our method results in a set of simple and informative hypotheses with an estimated sensitivity of 76%. The hypotheses predict 301 further genes to have 398 novel functional uORFs. Three of these 301 genes (RPC11, TPK1, and FOL1) have been hypothesised by a related study, following wet-lab experiments, to have functional uORFs. A comparison with another related study suggests that eleven of the predicted functional uORFs, from genes LDB17, HEM3, CIN8, BCK2, PMC1, FAS1, APP1, ACC1, CKA2, SUR1, and ATH1, are strong candidates for wet-lab experimental studies.
Learning-based prediction of functional uORFs can achieve high sensitivity. The predictions made in this study can serve as a list of candidates for subsequent wet-lab verification and might help to elucidate the regulatory roles of uORFs.
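For readers unfamiliar with the metric, sensitivity is the fraction of truly functional uORFs that a set of hypotheses labels as functional, that is, true positives divided by the sum of true positives and false negatives. The following is a minimal sketch of that calculation only. The function name `sensitivity` and the toy labels are assumptions made for illustration; they are not data or code from this study, and the sketch does not reproduce the estimation procedure used here.

```python
def sensitivity(actual, predicted):
    """Return TP / (TP + FN) for two equal-length sequences of booleans.

    actual[i]    -- True if example i is known to be functional
    predicted[i] -- True if the hypotheses label example i as functional
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    # True positives: functional examples that were predicted functional.
    tp = sum(1 for a, p in zip(actual, predicted) if a and p)
    # False negatives: functional examples that were missed.
    fn = sum(1 for a, p in zip(actual, predicted) if a and not p)
    if tp + fn == 0:
        raise ValueError("sensitivity is undefined without positive examples")
    return tp / (tp + fn)


# Hypothetical toy labels, NOT taken from the study.
actual = [True, True, True, True, False, False]
predicted = [True, True, True, False, True, False]
print(f"sensitivity = {sensitivity(actual, predicted):.2f}")
```

Sensitivity ignores false positives, so a high value does not by itself indicate how many of the predicted uORFs will turn out to be functional.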