Methodology article (Open Access)

Fully Bayesian tests of neutrality using genealogical summary statistics

Alexei J Drummond1,2 and Marc A Suchard3,4

1 Bioinformatics Institute, University of Auckland, Private Bag 92019, Auckland, New Zealand

2 Department of Computer Science, University of Auckland, Private Bag 92019, Auckland, New Zealand

3 Departments of Biomathematics and Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, California, USA

4 Department of Biostatistics, UCLA School of Public Health, Los Angeles, California, USA


BMC Genetics 2008, 9:68. doi:10.1186/1471-2156-9-68

Published: 31 October 2008

Abstract

Background

Many data summary statistics have been developed to detect departures from the neutral expectations of evolutionary models. However, questions about the neutrality of the evolution of genetic loci within natural populations remain difficult to assess. One critical cause of this difficulty is that most methods for testing neutrality make simplifying assumptions simultaneously about the mutational model and the population size model. Consequently, rejecting the null hypothesis of neutrality under these methods could result from violations of either or both assumptions, making interpretation troublesome.

Results

Here we harness posterior predictive simulation to exploit summary statistics of both the data and model parameters to test the goodness-of-fit of standard models of evolution. We apply the method to test the selective neutrality of molecular evolution in non-recombining gene genealogies, and we demonstrate its utility on four real data sets, identifying significant departures from neutrality in human influenza A virus, even after controlling for variation in population size.
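As an illustration of the general idea of posterior predictive simulation, the following Python sketch computes a posterior predictive p-value for a single summary statistic. It is a toy example under stated assumptions, not the procedure used in the article: the statistic is the number of segregating sites S rather than a genealogical summary statistic, the model is a constant-size coalescent with infinite-sites mutation, and the posterior draws of theta are hypothetical gamma-distributed values standing in for MCMC output. All function names are illustrative.

```python
import random

def poisson_draw(lam, rng):
    """Poisson(lam) draw: count unit-rate arrivals that land in [0, lam]."""
    count = 0
    t = rng.expovariate(1.0)
    while t < lam:
        count += 1
        t += rng.expovariate(1.0)
    return count

def simulate_segregating_sites(n, theta, rng):
    """Simulate S for n sequences (constant-size coalescent, infinite sites)."""
    tree_length = 0.0
    for k in range(n, 1, -1):
        # While k lineages remain, the waiting time to the next coalescence
        # is exponential with rate k(k-1)/2 (time in coalescent units).
        tree_length += k * rng.expovariate(k * (k - 1) / 2.0)
    # Mutations fall on the tree as a Poisson process with rate theta/2.
    return poisson_draw(theta / 2.0 * tree_length, rng)

def posterior_predictive_p(s_obs, n, theta_draws, rng):
    """Fraction of replicate data sets whose S is at least the observed S."""
    exceed = sum(
        simulate_segregating_sites(n, theta, rng) >= s_obs
        for theta in theta_draws
    )
    return exceed / len(theta_draws)

if __name__ == "__main__":
    rng = random.Random(1)
    # Assumption: hypothetical posterior draws of theta, standing in for
    # the output of a real MCMC analysis.
    theta_draws = [rng.gammavariate(20.0, 0.25) for _ in range(2000)]
    p = posterior_predictive_p(s_obs=30, n=10, theta_draws=theta_draws, rng=rng)
    print(f"posterior predictive p-value for S: {p:.3f}")
```

One replicate data set is simulated per posterior draw, so the resulting p-value averages over parameter uncertainty instead of conditioning on a point estimate. A value close to 0 or 1 would suggest that the fitted model does not reproduce the observed statistic well.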

Conclusion

Importantly, by employing a full model-based Bayesian analysis, our method separates the effects of demography from the effects of selection. The method also allows multiple summary statistics to be used in concert, thus potentially increasing sensitivity. Furthermore, our method remains useful in situations where analytical expectations and variances of summary statistics are not available. This aspect has great potential for the analysis of temporally spaced data, an expanding area previously neglected owing to the limited availability of theory and methods.

