This article is part of the supplement: Genetic Analysis Workshop 13: Analysis of Longitudinal Family Data for Complex Diseases and Related Risk Factors

Open Access Proceedings

Empirically derived phenotypic subgroups – qualitative and quantitative trait analyses

Marsha A Wilcox*, Diego F Wyszynski, Carolien I Panhuysen, Qianli Ma, Agustin Yip, John Farrell and Lindsay A Farrer

Author Affiliations

Genetics Program, Department of Medicine, Boston University School of Medicine, 715 Albany Street, Boston, Massachusetts, 02118 USA


BMC Genetics 2003, 4(Suppl 1):S15  doi:10.1186/1471-2156-4-S1-S15

Published: 31 December 2003



The Framingham Heart Study has contributed a great deal to advances in medicine. Most of the phenotypes investigated have been univariate traits (quantitative or qualitative). The aims of this study are to derive multivariate traits by identifying homogeneous groups of people and assigning both qualitative and quantitative trait scores; to assess the heritability of the derived traits; and to conduct both qualitative and quantitative linkage analysis on one of the heritable traits.


Multiple correspondence analysis, a nonparametric analogue of principal components analysis, was used for data reduction. Two-stage clustering, using both k-means and agglomerative hierarchical clustering, was used to cluster individuals based on the axis (factor) scores obtained from the data reduction. The probability of cluster membership was calculated using binary logistic regression. Heritability was calculated using SOLAR, which was also used for the quantitative trait analysis. GENEHUNTER-PLUS was used for the qualitative trait analysis.
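The two-stage clustering step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the linkage method, the distance measure, or the number of first-stage clusters, so centroid linkage, squared Euclidean distance, and the parameter names `k_initial` and `n_final` are all assumptions made for illustration. The input is assumed to be a list of per-person axis (factor) scores, such as those produced by the data reduction.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    k = len(centroids)
    return [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]


def kmeans(points, k, iterations=50, seed=0):
    """Stage 1 (assumed form): Lloyd's k-means. Returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        labels = assign(points, centroids)
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old centroid if a cluster has lost all its members.
            new_centroids.append(mean_point(members) if members else centroids[j])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, assign(points, centroids)


def agglomerate(centroids, sizes, n_final):
    """Stage 2 (assumed centroid linkage): repeatedly merge the two closest
    groups until n_final remain. Returns {stage-1 index: final group index}."""
    groups = [
        {"members": [i], "center": c, "size": s}
        for i, (c, s) in enumerate(zip(centroids, sizes))
        if s > 0
    ]
    while len(groups) > n_final:
        pairs = [
            (sq_dist(groups[a]["center"], groups[b]["center"]), a, b)
            for a in range(len(groups))
            for b in range(a + 1, len(groups))
        ]
        _, a, b = min(pairs)
        ga, gb = groups[a], groups[b]
        total = ga["size"] + gb["size"]
        # Size-weighted mean of the two centers is the merged group's centroid.
        center = tuple(
            (x * ga["size"] + y * gb["size"]) / total
            for x, y in zip(ga["center"], gb["center"])
        )
        merged = {"members": ga["members"] + gb["members"], "center": center, "size": total}
        groups = [g for i, g in enumerate(groups) if i not in (a, b)] + [merged]
    return {i: final_id for final_id, g in enumerate(groups) for i in g["members"]}


def two_stage_cluster(points, k_initial, n_final, seed=0):
    """Run k-means into k_initial small clusters, then merge them hierarchically."""
    centroids, labels = kmeans(points, k_initial, seed=seed)
    sizes = [labels.count(j) for j in range(k_initial)]
    mapping = agglomerate(centroids, sizes, n_final)
    return [mapping[lab] for lab in labels]
```

Calling `two_stage_cluster(scores, k_initial=20, n_final=4)` would return one group label per person; the values 20 and 4 here are placeholders, although four matches the number of groups reported in the results. A common motivation for a two-stage design is that k-means cheaply reduces many individuals to a small set of centroids, which makes the hierarchical step inexpensive.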


We found four phenotypically distinct groups. Membership in the smallest group, whose characteristics were consistent with atherogenic dyslipidemia, was heritable (38%, p < 1 × 10⁻⁶). We found both qualitative and quantitative LOD scores above 3 on chromosomes 11 and 14 (11q13, 14q23, 14q31). There were two Kong & Cox LOD scores above 1.0, on chromosome 6 (6p21) and chromosome 11 (11q23).


This approach may be useful for identifying genetic heterogeneity in complex phenotypes by clarifying the phenotype definition prior to linkage analysis. Some of our findings are in regions linked to elements of atherogenic dyslipidemia and related diagnoses; others may be novel or may be false positives.