Open Access Methodology article

On coding genotypes for genetic markers with multiple alleles in genetic association study of quantitative traits

Tao Wang

Author Affiliations

Division of Biostatistics, Institute for Health and Society, Medical College of Wisconsin, Milwaukee, WI 53226, USA

BMC Genetics 2011, 12:82  doi:10.1186/1471-2156-12-82

Published: 21 September 2011



In genetic association studies of quantitative traits using F∞ models, how the marker genotypes are coded and how the model parameters are interpreted are important for constructing hypothesis tests and making statistical inferences. To date, work on coding marker genotypes for F∞ models has focused mainly on the biallelic case. A thorough treatment of genotype coding and parameter interpretation for F∞ models is needed, especially for genetic markers with multiple alleles.


In this study, we formulate F∞ genetic models under various regression model frameworks and introduce three genotype coding schemes for genetic markers with multiple alleles. Starting from an allele-based modeling strategy, we first describe a regression framework for modeling the expected genotypic values at given markers. Then, as an extension of the biallelic case, we introduce three coding schemes for constructing fully parameterized one-locus F∞ models and discuss the relationships between the model parameters and the expected genotypic values. Next, under a simplified modeling framework for the expected genotypic values, we consider several reduced one-locus F∞ models derived from the three coding schemes, focusing on the estimability and interpretation of their model parameters. Finally, we explore some extensions of the one-locus F∞ models to two loci, addressing several fully parameterized as well as reduced two-locus F∞ models.


The genotype coding schemes provide different ways to construct F∞ models for association testing of multi-allele genetic markers with quantitative traits. Which coding scheme to apply depends on how conveniently it supports statistical inference on the parameters of research interest. Based on these F∞ models, standard regression model fitting tools can be used to estimate and test various genetic effects through statistical contrasts, with adjustment for environmental factors.
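
As a rough illustration of how a coded multi-allele marker can be passed to a standard regression tool, the sketch below builds a design matrix for a hypothetical three-allele marker and fits it by ordinary least squares. The additive allele-count coding with one reference allele is an assumption chosen for simplicity; it is not necessarily one of the three coding schemes introduced in the article, and the function name, allele labels, and effect sizes are all made up for the example.

```python
import numpy as np

def allele_count_design(genotypes, alleles, reference):
    """Build an intercept column plus one allele-count column (0, 1 or 2)
    for every allele except the chosen reference allele.

    This additive coding is an illustrative assumption, not the article's scheme.
    """
    kept = [a for a in alleles if a != reference]
    X = np.ones((len(genotypes), 1 + len(kept)))
    for i, (a1, a2) in enumerate(genotypes):
        for j, a in enumerate(kept, start=1):
            # Count how many copies of allele `a` the genotype carries.
            X[i, j] = (a1 == a) + (a2 == a)
    return X, kept

# Hypothetical marker with three alleles and simulated genotypes.
alleles = ["A1", "A2", "A3"]
rng = np.random.default_rng(0)
genotypes = [tuple(rng.choice(alleles, size=2)) for _ in range(200)]

X, kept = allele_count_design(genotypes, alleles, reference="A1")

# Made-up effects: intercept, per-copy effect of A2, per-copy effect of A3.
true_beta = np.array([10.0, 1.5, -0.5])
y = X @ true_beta + rng.normal(scale=0.1, size=len(genotypes))

# Ordinary least squares fit of the trait on the coded genotypes.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(dict(zip(["intercept"] + kept, np.round(beta_hat, 2))))
```

Environmental covariates could be adjusted for by appending further columns to `X` before fitting, and contrasts among the fitted coefficients could then be tested with any standard linear-model routine.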