BMC Bioinformatics

Open Access Methodology article

An accurate and interpretable model for siRNA efficacy prediction

Jean-Philippe Vert1*, Nicolas Foveau2, Christian Lajaunie1 and Yves Vandenbrouck2

Author Affiliations

1 Centre for Computational Biology, Ecole des Mines de Paris, 35 rue Saint-Honoré, 77300 Fontainebleau, France

2 Laboratoire de Biologie, Informatique, Mathématiques, Département Réponse et Dynamique Cellulaire, CEA Grenoble, 17 rue des Martyrs, 38054 Grenoble, France

BMC Bioinformatics 2006, 7:520 doi:10.1186/1471-2105-7-520

Published: 30 November 2006

Abstract

Background

The use of exogenous small interfering RNAs (siRNAs) for gene silencing has quickly become a widespread molecular tool for studying gene function and identifying new drug targets. Although considerable progress has recently been made in understanding how the RNAi pathway mediates gene silencing, designing potent siRNAs remains challenging.

Results

We propose a simple linear model for siRNA efficacy prediction that combines basic features of siRNA sequences. Trained and tested on a large, recently released dataset of siRNA sequences, it predicts potency as accurately as more complex state-of-the-art models, with the advantage of being directly interpretable. Analysis of the linear model allows us to detect and quantify the effect of nucleotide preferences at particular positions, confirming previously known observations and revealing new ones. We also detect and quantify a strong propensity of potent siRNAs to contain short asymmetric motifs in their sequence, and show that, surprisingly, these motifs alone contain at least as much relevant information for potency prediction as the position-specific nucleotide preferences.
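To make the idea of a linear model over basic sequence features concrete, the following is a minimal sketch, not the authors' implementation. It assumes each siRNA is a fixed-length string over A, C, G, U, encodes it with position-specific nucleotide indicators plus counts of short motifs, and fits the weights by ridge regression. The function names, the motif length cap of 3, and the ridge penalty are all illustrative assumptions, and the fitting procedure is a stand-in because the abstract does not specify one.

```python
import itertools
import numpy as np

NUCLEOTIDES = "ACGU"

def position_features(seq):
    """One-hot encode the nucleotide at each position (4 values per position)."""
    x = np.zeros(len(seq) * 4)
    for i, nt in enumerate(seq):
        x[4 * i + NUCLEOTIDES.index(nt)] = 1.0
    return x

def motif_features(seq, max_len=3):
    """Count every motif of length 1..max_len, wherever it occurs in seq.

    max_len=3 is an assumed cap chosen for illustration.
    """
    counts = []
    for k in range(1, max_len + 1):
        for motif in itertools.product(NUCLEOTIDES, repeat=k):
            m = "".join(motif)
            counts.append(sum(seq[i:i + k] == m for i in range(len(seq) - k + 1)))
    return np.array(counts, dtype=float)

def featurize(seq):
    """Concatenate position-specific and motif-count features."""
    return np.concatenate([position_features(seq), motif_features(seq)])

def fit_linear(seqs, efficacies, ridge=1.0):
    """Fit linear weights by ridge regression (an assumed stand-in estimator).

    A constant column is appended as an intercept; for simplicity it is
    penalized like the other weights.
    """
    X = np.stack([featurize(s) for s in seqs])
    X = np.hstack([X, np.ones((len(seqs), 1))])
    y = np.asarray(efficacies, dtype=float)
    A = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ y)

def predict(weights, seq):
    """Predicted efficacy is the dot product of features and weights."""
    return float(np.append(featurize(seq), 1.0) @ weights)

# Toy usage with made-up sequences and efficacies (not real data).
toy_seqs = ["ACGUACGUACGUACGUACG", "UUUUUUUUUUUUUUUUUUU"]
toy_weights = fit_linear(toy_seqs, [0.9, 0.1])
```

Because the prediction is a plain weighted sum, each weight can be read directly as the contribution of one feature, for example a given nucleotide at a given position or one extra occurrence of a given motif. That is the sense in which such a model is interpretable.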

Conclusion

The model proposed for prediction of siRNA potency is as accurate as a state-of-the-art nonlinear model and is easily interpretable in terms of biological features. It is freely available on the web at http://cbio.ensmp.fr/dsir