This article is part of the supplement: Proceedings of the 21st International Conference on Genome Informatics (GIW2010)

Open Access Research

RecMotif: a novel fast algorithm for weak motif discovery

He Quan Sun*, Malcolm Yoke Hean Low, Wen Jing Hsu and Jagath C Rajapakse

Author Affiliations

School of Computer Engineering, Nanyang Technological University, 639798, Singapore

BMC Bioinformatics 2010, 11(Suppl 11):S8  doi:10.1186/1471-2105-11-S11-S8

Published: 14 December 2010

Abstract

Background

Weak motif discovery in DNA sequences is an important but unresolved problem in computational biology. Previous algorithms for this problem usually require a large amount of memory or a long execution time. In this paper, we propose a fast and memory-efficient algorithm, RecMotif, which is guaranteed to discover all motifs for a given (l, d) setting, where l is the motif length and d is the maximum number of mutations between a motif instance and the true motif.
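The (l, d) definition above can be illustrated with a short sketch. It only checks whether a given candidate is consistent with an (l, d) setting; it is not the RecMotif algorithm. It assumes that "mutations" means position-wise substitutions (Hamming distance) and that a motif must have at least one instance in every sequence, which is the usual planted-motif formulation but is not stated in the abstract. The function names are hypothetical.

```python
from typing import List, Tuple


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


def find_instances(sequence: str, motif: str, d: int) -> List[Tuple[int, str]]:
    """Return (start, window) for every length-l window within d mutations of motif."""
    l = len(motif)
    return [
        (i, sequence[i:i + l])
        for i in range(len(sequence) - l + 1)
        if hamming(sequence[i:i + l], motif) <= d
    ]


def is_ld_motif(sequences: List[str], motif: str, d: int) -> bool:
    """Assumed criterion: every sequence holds at least one instance of motif."""
    return all(find_instances(s, motif, d) for s in sequences)


if __name__ == "__main__":
    seqs = ["TTACGTTT", "ACGAAA"]
    print(find_instances(seqs[0], "ACGA", 1))
    print(is_ld_motif(seqs, "ACGA", 1))
```

Verifying a known candidate is cheap. The hard part of the problem, which this sketch does not address, is that the true motif is unknown and the search space grows quickly with l and d.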

Results

Comparisons with several recently proposed algorithms show that RecMotif scales better to longer and weaker motifs. For instance, it solves open challenge cases such as (40, 14) within 5 hours, whereas the other algorithms compared failed because of either excessive execution time or insufficient memory. For real biological sequences, such as E. coli CRP, RecMotif accurately discovers the motif instances with (l, d) set to (18, 6) in less than 1 second, which is faster than the other algorithms compared.

Conclusions

RecMotif is a novel algorithm with a space complexity of only O(m^2 n), where m is the number of sequences in the data and n is the length of the sequences.