This article is part of the supplement: The 2010 International Conference on Bioinformatics and Computational Biology (BIOCOMP 2010): Genomics
Parallel progressive multiple sequence alignment on reconfigurable meshes
1 Department of Information Technology, Clayton State University, Morrow, GA 30260, USA
2 Department of Computer Science, Georgia State University, Atlanta, GA 30303, USA
3 Department of Computer Science, Sun Yat-sen University, P.R.C
BMC Genomics 2011, 12(Suppl 5):S4 doi:10.1186/1471-2164-12-S5-S4
Published: 23 December 2011
One of the most fundamental and challenging tasks in bioinformatics is to identify related sequences and their hidden biological significance. The most popular and proven method for this task is to align multiple sequences together. However, multiple sequence alignment is computationally intensive. In addition, advances in DNA/RNA and protein sequencing techniques have produced a volume of sequences that exceeds the capability of traditional computing models. An effective parallel multiple sequence alignment model that addresses these issues is therefore in great demand.
We design O(1) run-time solutions for both local and global dynamic programming pair-wise alignment algorithms on the reconfigurable mesh computing model. To align m sequences with maximum length n, we combine the parallel pair-wise dynamic programming solutions with newly designed parallel components. This reduces the run-time complexity of the progressive multiple sequence alignment algorithm from O(m × n^4) to O(m) using O(m × n^3) processing units for scoring schemes that use three distinct values for match/mismatch/gap-extension. The general solution to the multiple sequence alignment algorithm takes O(m × n^4) processing units and completes in O(m) time.
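For orientation, the global pair-wise dynamic programming alignment that the parallel solution targets can be sketched sequentially as follows. This is a minimal single-processor baseline, not the reconfigurable mesh algorithm. The function name `global_alignment_score` and the score values (match = 1, mismatch = -1, gap = -2) are illustrative assumptions chosen as one example of a scheme with three distinct values.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Sequential global alignment score with a linear gap penalty.

    Hypothetical helper for illustration only; the default score
    values are assumptions, not taken from the article.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty string costs one gap per symbol.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Each cell depends on its upper, left, and upper-left neighbours.
    for i in range(1, rows):
        for j in range(1, cols):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + substitution,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,               # gap in b
                score[i][j - 1] + gap,               # gap in a
            )
    return score[-1][-1]
```

Filling the table takes O(n^2) time for two sequences of length n on one processor; the dependency of each cell on its three neighbours is what a parallel model has to resolve to reach constant time.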
To our knowledge, this is the first time the progressive multiple sequence alignment algorithm has been completely parallelized with O(m) run-time. We also provide a new parallel algorithm for the Longest Common Subsequence (LCS) problem with O(1) run-time using O(n^3) processing units. This is a substantial improvement over the current best constant-time algorithm, which uses O(n^4) processing units.
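As a point of reference for the LCS result, the standard sequential dynamic programming solution can be sketched as below. It is a textbook single-processor baseline, not the constant-time mesh algorithm described above, and the name `lcs_length` is a hypothetical choice for illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of a and b.

    Textbook sequential dynamic programming, keeping only the
    previous row of the table to save memory.
    """
    prev = [0] * (len(b) + 1)
    for ch in a:
        curr = [0]
        for j, other in enumerate(b, start=1):
            if ch == other:
                # Extend the common subsequence ending before both symbols.
                curr.append(prev[j - 1] + 1)
            else:
                # Drop one symbol from either sequence and keep the best.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]
```

For two sequences of length n this runs in O(n^2) time on one processor, which is the cost the parallel algorithm trades for additional processing units.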