Open Access Research article

PMS5: an efficient exact algorithm for the (ℓ, d)-motif finding problem

Hieu Dinh, Sanguthevar Rajasekaran* and Vamsi K Kundeti

Author Affiliations

Department of CSE, University of Connecticut, Storrs, CT 06269, USA

BMC Bioinformatics 2011, 12:410 doi:10.1186/1471-2105-12-410

Published: 24 October 2011

Abstract

Background

Motifs are patterns found in biological sequences that are vital for understanding gene function, human disease, drug design, etc. They are helpful in finding transcriptional regulatory elements, transcription factor binding sites, and so on. As a result, the problem of identifying motifs is crucial in biology.

Results

Many facets of the motif search problem have been identified in the literature. One of them is (ℓ, d)-motif search, also called Planted Motif Search (PMS). The PMS problem has been well investigated and shown to be NP-hard. An algorithm for PMS that always finds all the (ℓ, d)-motifs in a given input set is called an exact algorithm, and in this paper we focus on exact algorithms only. All known exact algorithms for PMS take time exponential in some of the underlying parameters in the worst case. This does not mean, however, that exact algorithms cannot solve practical instances within a reasonable amount of time. In this paper, we propose a fast algorithm, PMS5, that can solve the well-known challenging instances of PMS: (21, 8) and (23, 9). No prior exact algorithm could solve these instances. Our algorithm takes about 10 hours on the challenging instance (21, 8) and about 54 hours on the challenging instance (23, 9), running on a single 2.4 GHz PC with 3 GB RAM. The implementation of PMS5 is freely available on the web at http://www.pms.engr.uconn.edu/downloads/PMS5.zip.
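To make the problem statement concrete, the following is a minimal brute-force sketch of exact (ℓ, d)-motif search. It assumes the standard formulation: a string M of length ℓ is an (ℓ, d)-motif if every input sequence contains some ℓ-mer within Hamming distance d of M. This is not PMS5, PMS1, or PMSPrune. It enumerates all |Σ|^ℓ candidates, so it is usable only for tiny ℓ and serves only to illustrate what an exact algorithm must output. The function names and the DNA alphabet default are assumptions made for illustration.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(candidate, seq, d):
    """True if some l-mer of seq is within Hamming distance d of candidate."""
    l = len(candidate)
    return any(
        hamming(candidate, seq[i:i + l]) <= d
        for i in range(len(seq) - l + 1)
    )


def brute_force_motifs(seqs, l, d, alphabet="ACGT"):
    """Return every (l, d)-motif of seqs by exhaustive enumeration.

    Illustrative only: runs in time exponential in l, which is why
    practical exact algorithms need much stronger pruning.
    """
    motifs = []
    for letters in product(alphabet, repeat=l):
        candidate = "".join(letters)
        # Exactness: the candidate must have a near-occurrence in EVERY sequence.
        if all(occurs_within(candidate, s, d) for s in seqs):
            motifs.append(candidate)
    return motifs


if __name__ == "__main__":
    toy = ["ACGT", "ACGA", "TCGT"]
    print(brute_force_motifs(toy, 4, 1))
```

Even at this toy scale the cost is visible: for ℓ = 21 over a four-letter alphabet the candidate space has 4^21 (about 4.4 × 10^12) strings, which is why instances such as (21, 8) are considered challenging.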

Conclusions

We present an efficient algorithm, PMS5, that uses some novel ideas and combines them with the well-known algorithms PMS1 and PMSPrune. PMS5 can tackle the large challenging instances (21, 8) and (23, 9). We therefore hope that PMS5 will help biologists discover longer motifs in the future.