Local search for the generalized tree alignment problem
Division of Invertebrate Zoology, American Museum of Natural History, New York, NY - 10024, USA
BMC Bioinformatics 2013, 14:66 doi:10.1186/1471-2105-14-66Published: 26 February 2013
A phylogeny postulates shared ancestry relationships among organisms in the form of a binary tree. Phylogenies attempt to answer an important question posed in biology: what are the ancestor-descendent relationships between organisms? At the core of every biological problem lies a phylogenetic component. The patterns that can be observed in nature are the product of complex interactions, constrained by the template that our ancestors provide. The problem of simultaneous tree and alignment estimation under Maximum Parsimony is known in combinatorial optimization as the Generalized Tree Alignment Problem (GTAP). The GTAP is the Steiner Tree Problem for the sequence edit distance. Like many biologically interesting problems, the GTAP is NP-Hard. Typically the Steiner Tree is presented under the Manhattan or the Hamming distances.
Experimentally, the accuracy of the GTAP has been subjected to evaluation. Results show that phylogenies selected using the GTAP from unaligned sequences are competitive with the best methods and algorithms available. Here, we implement and explore experimentally existing and new local search heuristics for the GTAP using simulated and real data.
The methods presented here improve by more than three orders of magnitude in execution time the best local search heuristics existing to date when applied to real data.