A practical approximation algorithm for solving massive instances of hybridization number for binary and nonbinary trees
1 Centrum Wiskunde & Informatica (CWI), P.O. Box 94079, 1090 GB, Amsterdam, The Netherlands
2 Department of Knowledge Engineering (DKE), Maastricht University, P.O. Box 616, 6200 MD, Maastricht, The Netherlands
3 ISEM, CNRS – Université Montpellier II, Place Eugène Bataillon, 34095 Montpellier, France
BMC Bioinformatics 2014, 15:127 doi:10.1186/1471-2105-15-127Published: 5 May 2014
Reticulate events play an important role in determining evolutionary relationships. The problem of computing the minimum number of such events to explain discordance between two phylogenetic trees is a hard computational problem. Even for binary trees, exact solvers struggle to solve instances with reticulation number larger than 40-50.
Here we present CYCLEKILLER and NONBINARYCYCLEKILLER, the first methods to produce solutions verifiably close to optimality for instances with hundreds or even thousands of reticulations.
Using simulations, we demonstrate that these algorithms run quickly for large and difficult instances, producing solutions that are very close to optimality. As a spin-off from our simulations we also present TERMINUSEST, which is the fastest exact method currently available that can handle nonbinary trees: this is used to measure the accuracy of the NONBINARYCYCLEKILLER algorithm. All three methods are based on extensions of previous theoretical work (SIDMA 26(4):1635-1656, TCBB 10(1):18-25, SIDMA 28(1):49-66) and are publicly available. We also apply our methods to real data.