A new method for modeling coalescent processes with recombination
- Equal contributors
1 Institute of Applied Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
2 Max Planck Independent Research Group on Population Genomics, Chinese Academy of Sciences and Max Planck Society (CAS-MPG) Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China
3 School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
4 Department of Mathematics, School of Science, Beijing Jiaotong University, Beijing 100044, China
BMC Bioinformatics 2014, 15:273 doi:10.1186/1471-2105-15-273Published: 11 August 2014
Recombination plays an important role in the maintenance of genetic diversity in many types of organisms, especially diploid eukaryotes. Recombination can be studied and used to map diseases. However, recombination adds a great deal of complexity to the genetic information. This renders estimation of evolutionary parameters more difficult. After the coalescent process was formulated, models capable of describing recombination using graphs, such as ancestral recombination graphs (ARG) were also developed. There are two typical models based on which to simulate ARG: back-in-time model such as ms and spatial model including Wiuf&Hein’s, SMC, SMC’, and MaCS.
In this study, a new method of modeling coalescence with recombination, Spatial Coalescent simulator (SC), was developed, which considerably improved the algorithm described by Wiuf and Hein. The present algorithm constructs ARG spatially along the sequence, but it does not produce any redundant branches which are inevitable in Wiuf and Hein’s algorithm. Interestingly, the distribution of ARG generated by the present new algorithm is identical to that generated by a typical back-in-time model adopted by ms, an algorithm commonly used to model coalescence. It is here demonstrated that the existing approximate methods such as the sequentially Markov coalescent (SMC), a related method called SMC′, and Markovian coalescent simulator (MaCS) can be viewed as special cases of the present method. Using simulation analysis, the time to the most common ancestor (TMRCA) in the local trees of ARGs generated by the present algorithm was found to be closer to that produced by ms than time produced by MaCS. Sample-consistent ARGs can be generated using the present method. This may significantly reduce the computational burden.
In summary, the present method and algorithm may facilitate the estimation and description of recombination in population genomics and evolutionary biology.