Email updates

Keep up to date with the latest news and content from BMC Bioinformatics and BioMed Central.

Open Access Methodology article

Directed acyclic graph kernels for structural RNA analysis

Kengo Sato123*, Toutai Mituyama2, Kiyoshi Asai24 and Yasubumi Sakakibara23

Author Affiliations

1 Japan Biological Informatics Consortium (JBIC), 2-45 Aomi, Koto-ku, Tokyo 135-8073, Japan

2 Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-42 Aomi, Koto-ku, Tokyo 135-0064, Japan

3 Department of Biosciences and Informatics, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama, Kanagawa 223-8522, Japan

4 Department of Computational Biology, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8561, Japan

For all author emails, please log on.

BMC Bioinformatics 2008, 9:318  doi:10.1186/1471-2105-9-318

Published: 22 July 2008

Abstract

Background

Recent discoveries of a large variety of important roles for non-coding RNAs (ncRNAs) have been reported by numerous researchers. In order to analyze ncRNAs by kernel methods including support vector machines, we propose stem kernels as an extension of string kernels for measuring the similarities between two RNA sequences from the viewpoint of secondary structures. However, applying stem kernels directly to large data sets of ncRNAs is impractical due to their computational complexity.

Results

We have developed a new technique based on directed acyclic graphs (DAGs) derived from base-pairing probability matrices of RNA sequences that significantly increases the computation speed of stem kernels. Furthermore, we propose profile-profile stem kernels for multiple alignments of RNA sequences which utilize base-pairing probability matrices for multiple alignments instead of those for individual sequences. Our kernels outperformed the existing methods with respect to the detection of known ncRNAs and kernel hierarchical clustering.

Conclusion

Stem kernels can be utilized as a reliable similarity measure of structural RNAs, and can be used in various kernel-based applications.