Open Access Software

Fast "coalescent" simulation

Paul Marjoram1* and Jeff D Wall2

Author Affiliations

1 Department of Preventive Medicine, University of Southern California, Los Angeles, CA 90089-9011, USA

2 Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA

BMC Genetics 2006, 7:16  doi:10.1186/1471-2156-7-16

Published: 15 March 2006

Abstract

Background

The amount of genome-wide molecular data is increasing rapidly, as is interest in developing methods appropriate for such data. Consequently, there is a growing need for methods that can simulate such data efficiently. In this paper we implement the sequentially Markovian coalescent algorithm described by McVean and Cardin and present a further modification to that algorithm that slightly improves the closeness of the approximation to the full coalescent model. The algorithm ignores a class of recombination events that are known to affect the behavior of the genealogy of the sample but that do not appear to affect the behavior of generated samples to any substantial degree.
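
To illustrate the sequential idea, the following is a minimal sketch for a sample of two sequences, not the software described in this paper. It assumes the standard formulation of the sequentially Markovian coalescent: the genealogy is generated from left to right along the region, and at each recombination breakpoint the lineage above a uniformly chosen point on the current tree is discarded and the detached lineage re-coalesces with what remains. The function name `smc_two_sequences`, the parameter `rho` (scaled recombination rate per unit length), and the time scale (pairwise coalescence at rate 1) are assumptions made for this illustration. The further modification presented in this paper is not included.

```python
import random


def smc_two_sequences(rho, length=1.0, seed=None):
    """Sketch of the sequentially Markovian coalescent for a sample of two.

    Returns a list of (start, end, tmrca) segments covering [0, length].
    Assumed units: time is scaled so that two lineages coalesce at rate 1,
    and each lineage recombines at rate rho / 2 per unit length per unit time.
    """
    rng = random.Random(seed)
    t = rng.expovariate(1.0)  # coalescence time at the left-hand end
    if rho <= 0:
        return [(0.0, length, t)]  # no recombination: a single tree

    segments = []
    pos = 0.0
    while True:
        # Two branches of length t each, so breakpoints occur along the
        # sequence at rate (rho / 2) * 2t = rho * t.
        nxt = pos + rng.expovariate(rho * t)
        if nxt >= length:
            segments.append((pos, length, t))
            return segments
        segments.append((pos, nxt, t))
        # The breakpoint falls at a uniform height u on the current tree.
        # The old lineage above u is discarded, and the detached lineage
        # coalesces with the other lineage at rate 1, starting from u.
        u = rng.uniform(0.0, t)
        t = u + rng.expovariate(1.0)
        pos = nxt


if __name__ == "__main__":
    for start, end, tmrca in smc_two_sequences(rho=20.0, seed=1):
        print(f"[{start:.4f}, {end:.4f})  TMRCA = {tmrca:.4f}")
```

Because each new tree depends only on the tree immediately to its left, the work grows with the number of breakpoints rather than with the size of a full ancestral recombination graph, which is the source of the speed gain in this kind of approximation.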

Results

We show that our software can simulate large chromosomal regions, such as those appropriate in a consideration of genome-wide data, several orders of magnitude faster than existing coalescent algorithms.

Conclusion

This algorithm provides a useful resource for those needing to simulate large quantities of data for chromosomal-length regions, using an approach that is much more efficient than simulation under traditional coalescent models.