This article is part of the supplement: Genetic Analysis Workshop 13: Analysis of Longitudinal Family Data for Complex Diseases and Related Risk Factors

Open Access Proceedings

Genetic Analysis Workshop 13: Simulated longitudinal data on families for a system of oligogenic traits

E Warwick Daw1*, John Morrison2, Xiaojun Zhou1 and Duncan C Thomas2

Author Affiliations

1 Department of Epidemiology, University of Texas M.D. Anderson Cancer Center, Houston, Texas, USA

2 University of Southern California, Los Angeles, California, USA

BMC Genetics 2003, 4(Suppl 1):S3 doi:10.1186/1471-2156-4-S1-S3

Published: 31 December 2003

Abstract

The Genetic Analysis Workshop 13 simulated data aimed to mimic the major features of the real Framingham Heart Study data that formed Problem 1, but under a known inheritance model and with 100 replicates, so as to allow evaluation of the statistical properties of various methods. The pedigrees used were the 330 real pedigree structures (comprising 4692 individuals) with some minor changes to protect confidentiality. Fifty trait genes and 399 microsatellite markers were simulated by gene dropping on 22 autosomal chromosomes. Assuming random ascertainment of families, a system of eight longitudinal quantitative traits (designed to be similar to those in the real data) was generated with a wide range of heritabilities, including some pleiotropic and interactive effects. Genes could affect either the baseline level or the rate of change of the phenotype. Hypertension diagnosis and treatment were simulated with treatment availability, compliance, and efficacy depending on calendar year. Nongenetic traits of smoking and alcohol were generated as covariates for other traits. Death was simulated as a hazard rate depending upon age, sex, smoking, cholesterol, and systolic blood pressure.
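As an illustration of the gene-dropping step mentioned above, the following is a minimal sketch, not the authors' simulation program. It assumes a single unlinked locus with no recombination, founders drawn independently from population allele frequencies, and a pedigree listed with parents before children. The function name `gene_drop`, the toy pedigree, and the allele frequencies are all hypothetical.

```python
import random


def gene_drop(pedigree, allele_freqs, rng):
    """Simulate genotypes at one locus by dropping founder alleles down a pedigree.

    pedigree: list of (person, father, mother) tuples, parents listed before
              their children; founders have father and mother set to None.
    allele_freqs: dict mapping allele label -> population frequency (assumed).
    rng: a random.Random instance, so that replicates are reproducible.
    """
    alleles = list(allele_freqs)
    weights = [allele_freqs[a] for a in alleles]
    genotypes = {}
    for person, father, mother in pedigree:
        if father is None and mother is None:
            # Founder: draw both alleles from the population frequencies.
            genotypes[person] = tuple(rng.choices(alleles, weights=weights, k=2))
        else:
            # Non-founder: each parent transmits one of its two alleles
            # with probability 1/2 (Mendelian segregation).
            paternal = rng.choice(genotypes[father])
            maternal = rng.choice(genotypes[mother])
            genotypes[person] = (paternal, maternal)
    return genotypes


# Hypothetical three-generation pedigree, for illustration only.
pedigree = [
    ("gf", None, None),
    ("gm", None, None),
    ("m", None, None),
    ("f", "gf", "gm"),
    ("c1", "f", "m"),
    ("c2", "f", "m"),
]
genotypes = gene_drop(pedigree, {"A": 0.7, "a": 0.3}, random.Random(13))
```

Repeating the call with different seeds would give independent replicates on the same pedigree structure. A realistic simulation of linked markers would also need to model recombination along each chromosome, which this sketch omits.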

After the complete data were simulated, missing data indicators were generated from logistic models fitted to the real data. These models involved the subject's history of previous missing values, together with that of their spouses, parents, siblings, and offspring, as well as marital status, only-child indicators, the current values of certain simulated traits, and the data collection pattern for the cohort into which each subject was ascertained.
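The missing-data step can be sketched in the same spirit. This is a hedged illustration only: the covariate names (`prev_missing`, `spouse_missing`), the intercept, and the coefficients below are invented for the example, whereas the models described above were fitted to the real data and used many more predictors.

```python
import math
import random


def missing_probability(intercept, coefs, covariates):
    """Logistic model: P(missing) = 1 / (1 + exp(-(intercept + sum(b * x))))."""
    eta = intercept + sum(coefs[name] * covariates[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-eta))


def simulate_missing_pattern(n_visits, intercept, coefs, spouse_missing, rng):
    """Generate 0/1 missing indicators visit by visit for one subject.

    Each visit's probability depends on the subject's own previous indicator
    and on the spouse's indicator at the same visit (both hypothetical
    covariates chosen for this sketch).
    """
    indicators = []
    for visit in range(n_visits):
        covariates = {
            "prev_missing": indicators[-1] if indicators else 0,
            "spouse_missing": spouse_missing[visit],
        }
        p = missing_probability(intercept, coefs, covariates)
        indicators.append(1 if rng.random() < p else 0)
    return indicators


# Made-up coefficients, for illustration only.
coefs = {"prev_missing": 1.5, "spouse_missing": 0.8}
pattern = simulate_missing_pattern(
    n_visits=5,
    intercept=-2.0,
    coefs=coefs,
    spouse_missing=[0, 0, 1, 1, 0],
    rng=random.Random(2003),
)
```

With positive coefficients, a subject who missed the previous examination, or whose spouse missed the current one, is more likely to have a missing value, which mimics the clustering of missingness within subjects and families.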