Simulating gene trees under the multispecies coalescent and time-dependent migration
Citation and License
BMC Evolutionary Biology 2013, 13:44 doi:10.1186/1471-2148-13-44Published: 18 February 2013
The multispecies coalescent model has become popular in recent years as a framework to infer a species phylogeny from multilocus genetic data collected from multiple individuals. The model assumes that speciation occurs at a specific point in time, after which the two sub-species evolve in total isolation. However in reality speciation may occur over an extended period of time, during which sister lineages remain in partial contact. Inference of multispecies phylogenies under those conditions is difficult. Indeed even designing simulators which correctly sample gene histories under these conditions is non-trivial.
In this paper we present a method and software which simulates gene trees under both the multispecies coalescent and migration. Our approach allows for both population sizes and migration rates to change over the species lifetime. Also, migration rates are specified in units of fraction of emigrants per time unit, which makes them easier to interpret. Overall this setup covers a wide range of migration scenarios. The software can be used to investigate properties of gene trees under different migration settings and to generate test cases for programs which infer species trees and/or migration from sequence data. Using simulated data we investigate the effect of migrations between sister lineages on the inference of multispecies phylogenies and on post analysis detection.
Our results indicate that while estimation of species tree topology can be quite robust to the presence of gene flow, the inference and detection of migration is problematic, even with methods based on full likelihood models.