This article is part of the supplement: Genetic Analysis Workshop 13: Analysis of Longitudinal Family Data for Complex Diseases and Related Risk Factors
Genetic Analysis Workshop 13: Simulated longitudinal data on families for a system of oligogenic traits
1 Department of Epidemiology, University of Texas M.D. Anderson Cancer Center, Houston, Texas
2 University of Southern California, Los Angeles, California, USA
BMC Genetics 2003, 4(Suppl 1):S3 doi:10.1186/1471-2156-4-S1-S3Published: 31 December 2003
The Genetic Analysis Workshop 13 simulated data aimed to mimic the major features of the real Framingham Heart Study data that formed Problem 1, but under a known inheritance model and with 100 replicates, so as to allow evaluation of the statistical properties of various methods. The pedigrees used were the 330 real pedigree structures (comprising 4692 individuals) with some minor changes to protect confidentiality. Fifty trait genes and 399 microsatellite markers were simulated by gene dropping on 22 autosomal chromosomes. Assuming random ascertainment of families, a system of eight longitudinal quantitative traits (designed to be similar to those in the real data) was generated with a wide range of heritabilities, including some pleiotropic and interactive effects. Genes could affect either the baseline level or the rate of change of the phenotype. Hypertension diagnosis and treatment were simulated with treatment availability, compliance, and efficacy depending on calendar year. Nongenetic traits of smoking and alcohol were generated as covariates for other traits. Death was simulated as a hazard rate depending upon age, sex, smoking, cholesterol, and systolic blood pressure.
After the complete data were simulated, missing data indicators were generated based on logistic models fitted to the real data, involving the subject's history of previous missing values, together with that of their spouses, parents, siblings, and offspring, as well as marital status, only-child indicators, current value at certain simulated traits, and the data collection pattern on the cohort into which each subject was ascertained.