BMC Bioinformatics


Open Access Software

Fregene: Simulation of realistic sequence-level data in populations and ascertained samples

Marc Chadeau-Hyam1*, Clive J Hoggart1, Paul F O'Reilly1, John C Whittaker2, Maria De Iorio1 and David J Balding1

Author Affiliations

1 Department of Epidemiology and Public Health, Imperial College, St Mary's Campus, Norfolk Place, London, W2 1PG, UK

2 Non-communicable Disease Epidemiology Unit, London School of Hygiene & Tropical Medicine, Keppel Street, London, WC1E 7HT, UK


BMC Bioinformatics 2008, 9:364 doi:10.1186/1471-2105-9-364

Published: 8 September 2008

Abstract

Background

FREGENE simulates sequence-level data over large genomic regions in large populations. Because, unlike coalescent simulators, it works forwards through time, it allows complex scenarios of selection, demography, and recombination to be modelled simultaneously. FREGENE tracks sites under selection in detail, which provides the opportunity to test theoretical predictions and gain new insights into mechanisms of selection. We describe here the main functionalities of both FREGENE and SAMPLE, a companion program that can replicate association study datasets.
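To illustrate what simulating forwards through time means, the following is a minimal sketch of a haploid Wright-Fisher model for a single biallelic site under selection. It is not FREGENE's algorithm, which handles whole sequences, recombination and demography. The function name `wright_fisher_forward` and all of its parameters are hypothetical and chosen only for this illustration.

```python
import random


def wright_fisher_forward(pop_size, generations, init_freq, sel_coeff, seed=0):
    """Toy forward-in-time haploid Wright-Fisher model for one biallelic site.

    Illustrative assumption only, not FREGENE's implementation.
    Requires sel_coeff > -1. Returns the derived-allele frequency in each
    generation, starting with generation 0.
    """
    rng = random.Random(seed)
    count = round(init_freq * pop_size)
    trajectory = [count / pop_size]
    for _ in range(generations):
        freq = count / pop_size
        # Selection: the derived allele has relative fitness 1 + sel_coeff,
        # so its expected frequency among offspring is reweighted.
        p_sel = freq * (1 + sel_coeff) / (1 + sel_coeff * freq)
        # Drift: each of the pop_size offspring independently inherits the
        # derived allele with probability p_sel (binomial sampling).
        count = sum(1 for _ in range(pop_size) if rng.random() < p_sel)
        trajectory.append(count / pop_size)
    return trajectory


if __name__ == "__main__":
    # Example: a weakly advantageous allele starting at 10% frequency.
    traj = wright_fisher_forward(pop_size=1000, generations=200,
                                 init_freq=0.1, sel_coeff=0.02, seed=42)
    print(f"final frequency: {traj[-1]:.3f}")
```

Because each generation is built explicitly from the previous one, selection, changes in population size or migration can in principle be applied at any step, which is the flexibility that the forward approach offers over coalescent simulation.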

Results

We report detailed analyses of six large simulated datasets that we have made publicly available. Three demographic scenarios are modelled: one panmictic, one substructured with migration, and one complex scenario that mimics the principal features of genetic variation in major worldwide human populations. For each scenario there is one neutral simulation and one with a complex pattern of selection.

Conclusion

FREGENE and the simulated datasets will be valuable for assessing the validity of models for selection, demography and population genetic parameters, as well as the efficacy of association studies. FREGENE's principal advantages are modelling flexibility and computational efficiency. It is open source and object-oriented, so it can be customised and its range of models extended.