This article is part of the supplement: Genetic Analysis Workshop 15: Gene Expression Analysis and Approaches to Detecting Multiple Functional Loci
Genetic Analysis Workshop 15: simulation of a complex genetic model for rheumatoid arthritis in nuclear families including a dense SNP map with linkage disequilibrium between marker loci and trait loci
1 Division of Epidemiology and Community Health, School of Public Health, and Institute of Human Genetics, University of Minnesota, 1300 S Second Street, Suite 300, Minneapolis, Minnesota 55454, USA
2 Division of Epidemiology and Community Health, School of Public Health, University of Minnesota, 1300 South 2nd Street, Suite 300, Minneapolis, Minnesota 55454, USA
3 Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota 55454, USA
BMC Proceedings 2007, 1(Suppl 1):S4. Published: 18 December 2007
Data for Problem 3 of the Genetic Analysis Workshop 15 (GAW 15) were generated by computer simulation in an attempt to mimic some of the genetic and epidemiological features of rheumatoid arthritis (RA), such as its population prevalence, sex ratio, risk to siblings of affected individuals, association with cigarette smoking, the strong effect of genotype in the HLA region, and other genetic effects. A complex genetic model including epistasis and genotype-by-environment interaction was applied to a population of 1.9 million nuclear families of size four. From this population we selected 1500 families with both offspring affected and 2000 unrelated, unaffected individuals, all of whose first-degree relatives were unaffected. This process was repeated to produce 100 replicate data sets. In addition, we generated marker data for 22 autosomes, consisting of a genome-wide set of 730 simulated STRP markers and 9,187 SNP markers, plus an additional 17,820 SNP markers on chromosome 6. Appropriate linkage disequilibrium between markers, and between trait loci and markers, was modelled using HapMap Phase 1 data (http://www.hapmap.org/downloads/phasing/2005-03_phaseI/). The code base for this project was written primarily in the Octave programming language, but it is being ported to the R language and developed into a larger project for general genetic simulation called GenetSim (http://genetsim.org/). All of the source code that was used to generate the GAW 15 Problem 3 data is freely available for download at http://genetsim.org/gaw15/.
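The ascertainment scheme described above (simulate a large population of four-person nuclear families, then keep only those meeting a selection rule) can be illustrated with a minimal sketch. This is not the GAW 15 code, which was written in Octave, and it does not reproduce the authors' genetic model: it is written in Python and assumes a single hypothetical biallelic risk locus with a multiplicative per-allele risk, with no epistasis, sex effect, smoking effect, or marker data. All function names and parameter values (`risk_allele_freq`, `base_risk`, `risk_ratio`, `max_draws`) are illustrative assumptions. The control rule is also simplified to "no member of the family is affected".

```python
import random


def simulate_family(rng, risk_allele_freq=0.2, base_risk=0.01, risk_ratio=5.0):
    """Simulate one nuclear family of size four: father, mother, two offspring.

    Toy model (an assumption, not the GAW 15 model): one biallelic locus,
    and each copy of the risk allele multiplies disease risk by risk_ratio.
    """
    def founder_genotype():
        # Two alleles drawn independently; 1 marks the risk allele.
        return tuple(int(rng.random() < risk_allele_freq) for _ in range(2))

    def is_affected(genotype):
        risk = min(1.0, base_risk * risk_ratio ** sum(genotype))
        return rng.random() < risk

    father, mother = founder_genotype(), founder_genotype()
    # Mendelian transmission: each offspring receives one randomly chosen
    # allele from the father and one from the mother.
    offspring = [(rng.choice(father), rng.choice(mother)) for _ in range(2)]
    genotypes = [father, mother] + offspring
    return {"genotypes": genotypes,
            "affected": [is_affected(g) for g in genotypes]}


def ascertain(n_families, keep, rng, max_draws=1_000_000):
    """Draw families until n_families satisfy keep(), or max_draws is reached."""
    selected = []
    for _ in range(max_draws):
        if len(selected) == n_families:
            break
        family = simulate_family(rng)
        if keep(family):
            selected.append(family)
    return selected


def both_offspring_affected(family):
    # Indices 2 and 3 are the two offspring (an affected sib pair).
    return family["affected"][2] and family["affected"][3]


def nobody_affected(family):
    # Simplified control rule: every simulated family member is unaffected.
    return not any(family["affected"])


if __name__ == "__main__":
    rng = random.Random(15)
    cases = ascertain(10, both_offspring_affected, rng)
    controls = ascertain(10, nobody_affected, rng)
    print(len(cases), "affected-sib-pair families;", len(controls), "control families")
```

Passing the selection rule as a function keeps the sampling loop identical for case families and control families, so a different ascertainment criterion only requires a new predicate.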