Analysis of heterogeneity and epistasis in physiological mixed populations by combined structural equation modelling and latent class analysis
1 Department of Clinical Biochemistry and Molecular Biology, University Hospital of Copenhagen, Denmark
2 Research Centre for Prevention and Health, Copenhagen County, Glostrup University Hospital, Denmark
3 Psychiatric Research Centre, Skt. Hans Hospital, Roskilde, Denmark
BMC Genetics 2008, 9:43 doi:10.1186/1471-2156-9-43
Published: 8 July 2008
Biological systems are interacting molecular networks in which genetic variation contributes to phenotypic heterogeneity. This heterogeneity is traditionally modelled as a dichotomous trait (e.g. affected vs. non-affected), which is far too simplistic given the complexity of such networks and the genetic variation within them.
In this study on type 2 diabetes mellitus, heterogeneity was resolved in a latent class framework combined with structural equation modelling, using phenotypic indicators of distinct physiological processes. We modelled the clinical condition "the metabolic syndrome", which is known to be a heterogeneous and polygenic condition with a clinical endpoint (type 2 diabetes mellitus). Genetic factors were not included in the model presented here, and no genetic model was assumed except that genes operate in networks. The impact of stratifying the study population on genetic interaction was demonstrated by analysing several genes previously associated with the metabolic syndrome and type 2 diabetes mellitus.
The analysis revealed 19 distinct subpopulations with different propensities to develop diabetes mellitus within a large healthy study population. The allocation of subjects to subpopulations was highly accurate, with an entropy measure of nearly 0.9. Although very few gene variants were directly associated with the metabolic syndrome in the total study sample, almost one third of all possible epistatic interactions were highly significant. In particular, the number of interactions increased after stratifying the study population, suggesting that interactions are masked in heterogeneous populations. In addition, the genetic variance increased by an average of 35-fold when analysed in the subpopulations.
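To illustrate what an entropy measure of classification accuracy expresses, the sketch below computes the normalised (relative) entropy commonly reported by latent class software from a matrix of posterior class membership probabilities. This is an assumption: the text above does not state which entropy definition was used. The function name `relative_entropy` and the toy probabilities are hypothetical and are not data from the study. Values near 1 indicate that subjects are assigned to classes almost without ambiguity; values near 0 indicate that the assignment is no better than uniform guessing.

```python
import math


def relative_entropy(posteriors):
    """Normalised entropy of a latent class assignment.

    posteriors: list of rows, one per subject; each row holds that
    subject's posterior membership probabilities over the K classes.
    Returns 1 - sum_i sum_k(-p_ik * ln p_ik) / (n * ln K).
    Assumed definition; the source does not specify the formula used.
    """
    n = len(posteriors)
    k = len(posteriors[0])
    if n == 0 or k < 2:
        raise ValueError("need at least one subject and two classes")
    total = 0.0
    for row in posteriors:
        for p in row:
            if p > 0.0:  # the limit of p * ln(p) as p -> 0 is 0
                total -= p * math.log(p)
    return 1.0 - total / (n * math.log(k))


# Hypothetical toy inputs (not study data).
crisp = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]    # unambiguous assignment
uniform = [[1 / 3, 1 / 3, 1 / 3]] * 2          # completely ambiguous
mostly_clear = [[0.95, 0.03, 0.02], [0.05, 0.90, 0.05]]

print(relative_entropy(crisp))
print(relative_entropy(uniform))
print(relative_entropy(mostly_clear))
```

With 19 classes, a value close to 0.9 under this definition would mean that most subjects have a posterior probability concentrated on a single subpopulation.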
The major conclusions from this study are, first, that the likelihood of detecting true associations between genetic variants and complex traits increases tremendously when they are studied in physiologically homogeneous subpopulations and when epistasis is included in the analysis, and second, that epistasis (i.e. genetic networks) is ubiquitous and should be the basis for modelling any biological process.