Open Access Data Note

An efficient algorithm for systematic analysis of nucleotide strings suitable for siRNA design

Ancha Baranova1,2*, Jonathan Bode3, Ganiraju Manyam1,4 and Maria Emelianenko3

Author Affiliations

1 School of Systems Biology, George Mason University, Fairfax VA, USA

2 Research Center for Medical Genetics, RAMS, Moskvorechie Str., 1, Moscow, Russian Federation

3 Department of Mathematical Sciences, George Mason University, Fairfax VA 22030, USA

4 Bioinformatics & Computational Biology Dept, The UT MD Anderson Cancer Center, Houston, TX, USA

BMC Research Notes 2011, 4:168  doi:10.1186/1756-0500-4-168

Published: 27 May 2011

Abstract

Background

The "off-target" silencing effect hinders the development of siRNA-based therapeutic and research applications. Existing solutions for finding possible locations of siRNA seats within a large database of genes are too slow, miss a portion of the targets, or are simply not designed to handle a very large number of queries. We propose a new approach that reduces computational time compared with existing techniques.

Findings

The proposed method employs tree-based storage in the form of a modified truncated suffix tree to sort all possible short substrings within a given set of strings (i.e., a transcriptome). Using the new algorithm, we pre-computed a list of the best siRNA locations within each human gene ("siRNA seats"). siRNAs designed to reside within siRNA seats are less likely to hybridize off-target. These siRNA seats could be used as input for traditional "set-of-rules" siRNA design software. The list of siRNA seats is available through a public database at http://web.cos.gmu.edu/~gmanyam/siRNA_db/search.php
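As a rough illustration of the idea described in the findings above, the following sketch builds a depth-bounded suffix trie over a toy set of transcripts and reports the windows whose substring occurs in only one gene. It is not the authors' implementation: the names `Node`, `build_truncated_trie`, `genes_containing`, and `unique_windows`, the toy sequences, and the window length of 4 are all assumptions made for illustration, and the sketch considers exact matches on a single strand only.

```python
class Node:
    """One trie node: child links plus the set of genes passing through it."""
    __slots__ = ("children", "genes")

    def __init__(self):
        self.children = {}
        self.genes = set()


def build_truncated_trie(transcripts, depth):
    """Insert every suffix of every sequence, truncated to `depth` characters."""
    root = Node()
    for gene_id, seq in transcripts.items():
        for start in range(len(seq)):
            node = root
            for ch in seq[start:start + depth]:
                child = node.children.get(ch)
                if child is None:
                    child = Node()
                    node.children[ch] = child
                child.genes.add(gene_id)
                node = child
    return root


def genes_containing(root, query):
    """Return the genes that contain `query` (len(query) must be <= depth)."""
    node = root
    for ch in query:
        node = node.children.get(ch)
        if node is None:
            return set()
    return set(node.genes)


def unique_windows(root, gene_id, seq, k):
    """Start positions in `seq` whose k-mer is found in no other gene."""
    return [
        i
        for i in range(len(seq) - k + 1)
        if genes_containing(root, seq[i:i + k]) == {gene_id}
    ]


# Hypothetical toy "transcriptome"; real inputs would be full transcripts.
transcripts = {"geneA": "ACGTACGGA", "geneB": "TTACGTTGC"}
K = 4  # illustrative window length, far shorter than a real siRNA
root = build_truncated_trie(transcripts, depth=K)

for gene_id, seq in transcripts.items():
    print(gene_id, unique_windows(root, gene_id, seq, K))
```

In this toy input the 4-mers ACGT and TACG occur in both sequences, so the windows starting at those positions are excluded for each gene. Bounding the trie depth keeps memory proportional to the number of distinct short substrings rather than to the full suffix set, which is the general motivation for truncating a suffix structure when only short queries matter.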

Conclusions

In an attempt to perform top-down prediction of human siRNAs with minimized off-target hybridization, we developed an efficient algorithm that employs suffix-tree-based storage of substrings. Applications of this approach are not limited to optimal siRNA design; it can also be useful for other tasks that involve selecting characteristic strings specific to individual genes. These strings could then be used as siRNA seats, as specific probes for gene expression studies with oligonucleotide-based microarrays, for the design of molecular beacon probes for real-time PCR and, more generally, for the design of any type of PCR primer.