Open Access Software

ACG: rapid inference of population history from recombining nucleotide sequences

Brendan D O'Fallon

Author affiliations

ARUP Labs, 500 Chipeta Way, Salt Lake City, UT, 84108, USA

Citation and License

BMC Bioinformatics 2013, 14:40  doi:10.1186/1471-2105-14-40

Published: 5 February 2013

Abstract

Background

Reconstruction of population history from genetic data often requires Monte Carlo integration over the genealogy of the samples. Among tools that perform such computations, few can consider genetic histories that include recombination events, which precludes their use on most alignments of nuclear DNA. Explicitly accounting for recombination requires modeling the history of the sequences with an Ancestral Recombination Graph (ARG) in place of a simple tree, which presents significant computational challenges.

Results

ACG is an extensible desktop application that uses a Bayesian Markov chain Monte Carlo procedure to estimate the posterior probability of an evolutionary model conditional on an alignment of genetic data. The ancestry of the sequences is represented by an ARG, which is estimated from the data jointly with the other model parameters. Importantly, ACG computes the full Felsenstein likelihood of the ARG, not a pairwise or composite likelihood. Several strategies are used to speed computations, and ACG is roughly 100x faster than a similar, recombination-aware program.
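
As an illustration of what a full Felsenstein likelihood involves, the following is a minimal sketch of the pruning recursion for a single site on a single fixed tree. It is not ACG's implementation: the Jukes-Cantor (JC69) substitution model, the nested-list tree encoding, and the function names `jc69_prob`, `partial`, and `site_likelihood` are all assumptions chosen for brevity. In an ARG, a computation of this kind would apply to the non-recombining tree at each site.

```python
import math

BASES = "ACGT"

def jc69_prob(i, j, t):
    """JC69 probability of going from base index i to base index j
    along a branch of length t (expected substitutions per site)."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if i == j else 0.25 - 0.25 * e

def partial(node):
    """Partial likelihoods: P(observed tips below node | node is in state x).

    A tip is a one-letter string; an internal node is a list of
    (child, branch_length) pairs. This encoding is hypothetical.
    """
    if isinstance(node, str):
        return [1.0 if b == node else 0.0 for b in BASES]
    result = [1.0] * 4
    for child, t in node:
        child_l = partial(child)
        for x in range(4):
            # Sum over the child's possible states, weighted by the
            # transition probability along the connecting branch.
            result[x] *= sum(jc69_prob(x, y, t) * child_l[y] for y in range(4))
    return result

def site_likelihood(root):
    """Likelihood of one site, with uniform base frequencies at the root."""
    return sum(0.25 * l for l in partial(root))

# Example: tips A and A share an ancestor, which joins tip G at the root.
tree = [([("A", 0.1), ("A", 0.1)], 0.05), ("G", 0.15)]
print(site_likelihood(tree))
```

Under the usual assumption that sites evolve independently given the genealogy, the likelihood of an alignment is the product of such per-site values. A composite likelihood would instead combine terms computed from subsets of the data, such as pairs of sequences.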

Conclusions

Modeling the ancestry of the sequences with an ARG allows ACG to estimate the evolutionary history of recombining nucleotide sequences. ACG can accurately estimate the posterior distribution of population parameters such as the (scaled) population size and recombination rate, as well as many aspects of the recombinant history, including the positions of recombination breakpoints, the distribution of time to most recent common ancestor along the sequence, and the non-recombining trees at individual sites. Multiple substitution models and population size models are provided. ACG also provides a richly informative graphical interface that allows users to view the evolution of model parameters and likelihoods in real time.

Keywords:
Coalescent; ARG; Ancestral recombination graph; Bayesian inference; Recombination