Open Access Methodology article

Local search for the generalized tree alignment problem

Andrés Varón and Ward C Wheeler*

Author Affiliations

Division of Invertebrate Zoology, American Museum of Natural History, New York, NY - 10024, USA

BMC Bioinformatics 2013, 14:66  doi:10.1186/1471-2105-14-66

Published: 26 February 2013

Abstract

Background

A phylogeny postulates shared ancestry relationships among organisms in the form of a binary tree. Phylogenies attempt to answer an important question in biology: what are the ancestor-descendant relationships between organisms? At the core of every biological problem lies a phylogenetic component: the patterns observed in nature are the product of complex interactions, constrained by the template that ancestors provide. The problem of simultaneously estimating a tree and an alignment under Maximum Parsimony is known in combinatorial optimization as the Generalized Tree Alignment Problem (GTAP). The GTAP is the Steiner Tree Problem under the sequence edit distance, whereas the Steiner Tree Problem is typically presented under the Manhattan or Hamming distances. Like many biologically interesting problems, the GTAP is NP-Hard.
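As an illustration of the objective only, the sketch below scores one candidate solution: a fixed tree whose internal (ancestral) nodes have already been assigned sequences. The GTAP asks for the tree and the internal sequences that minimize this total, which is the hard part and is not attempted here. The unit-cost edit distance, the function names, and the toy sequences are all assumptions made for the example; they are not the cost model or the implementation used in the article.

```python
from typing import Dict, List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance (insertions, deletions, substitutions cost 1).

    Assumption: the unit-cost scheme is chosen for illustration only.
    """
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # match or substitute
            ))
        prev = curr
    return prev[-1]


def tree_cost(edges: List[Tuple[str, str]], sequences: Dict[str, str]) -> int:
    """Sum of edit distances over all edges of a tree with assigned sequences."""
    return sum(edit_distance(sequences[u], sequences[v]) for u, v in edges)


# Hypothetical toy instance: four observed leaves (A-D) and two internal
# nodes (X, Y) whose sequences are one candidate assignment, not an optimum
# computed by any search.
sequences = {
    "A": "ACGT", "B": "ACT", "C": "AGGT", "D": "AGT",
    "X": "ACGT", "Y": "AGGT",
}
edges = [("A", "X"), ("B", "X"), ("X", "Y"), ("C", "Y"), ("D", "Y")]

print(tree_cost(edges, sequences))
```

A local search heuristic would repeatedly propose small changes to the tree, re-estimate the internal sequences, and keep a change only when a total of this kind decreases.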

Results

The accuracy of the GTAP has been evaluated experimentally: phylogenies selected using the GTAP from unaligned sequences are competitive with the best methods and algorithms available. Here, we implement and experimentally explore existing and new local search heuristics for the GTAP using simulated and real data.

Conclusions

When applied to real data, the methods presented here improve on the execution time of the best existing local search heuristics by more than three orders of magnitude.

Keywords:
Tree alignment; Tree search; Phylogeny; Sequence alignment; Direct optimization