Email updates

Keep up to date with the latest news and content from BMC Bioinformatics and BioMed Central.

Open Access Software

SeqSIMLA: a sequence and phenotype simulation tool for complex disease studies

Ren-Hua Chung* and Chung-Chin Shih

Author Affiliations

Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan, Miaoli, Taiwan

For all author emails, please log on.

BMC Bioinformatics 2013, 14:199  doi:10.1186/1471-2105-14-199

Published: 20 June 2013

Abstract

Background

Association studies based on next-generation sequencing (NGS) technology have become popular, and statistical association tests for NGS data have been developed rapidly. A flexible tool for simulating sequence data in either unrelated case–control or family samples with different disease and quantitative trait models would be useful for evaluating the statistical power for planning a study design and for comparing power among statistical methods based on NGS data.

Results

We developed a simulation tool, SeqSIMLA, which can simulate sequence data with user-specified disease and quantitative trait models. We implemented two disease models, in which the user can flexibly specify the number of disease loci, effect sizes or population attributable risk, disease prevalence, and risk or protective loci. We also implemented a quantitative trait model, in which the user can specify the number of quantitative trait loci (QTL), proportions of variance explained by the QTL, and genetic models. We compiled recombination rates from the HapMap project so that genomic structures similar to the real data can be simulated.

Conclusions

SeqSIMLA can efficiently simulate sequence data with disease or quantitative trait models specified by the user. SeqSIMLA will be very useful for evaluating statistical properties for new study designs and new statistical methods using NGS. SeqSIMLA can be downloaded for free at http://seqsimla.sourceforge.net webcite.