Epidemiology Department, School of Public Health and Tropical Medicine, Tulane University, New Orleans, USA

Tulane Center for Cardiovascular Health, Tulane University Health Sciences Center, New Orleans, USA

Center for Human Genetics, Duke University, Durham, NC, USA

School of Life Science, Nanjing University, Nanjing, PR China

Biostatistics Department, School of Public Health and Tropical Medicine, Tulane University, New Orleans, USA

Abstract

Background

Quantitative traits often underlie risk for complex diseases. For example, weight and body mass index (BMI) underlie the human abdominal obesity-metabolic syndrome. Many attempts have been made to identify quantitative trait loci (QTL) over the past decade, including association studies. However, a single QTL is often capable of affecting multiple traits, a quality known as gene pleiotropy. Gene pleiotropy may therefore cause a loss of power in association studies focused only on a single trait, whether based on single or multiple markers.

Results

We propose using principal-component-based multivariate regression (PCBMR) to test for gene pleiotropy, and we evaluate the method comprehensively. PCBMR generates one or more independent canonical variables from the principal components of the original traits and tests for association with these new variables by multivariate regression. Systematic simulation studies showed that PCBMR kept the type I error at about 5%, was at least as powerful as unadjusted single-trait association tests, and was more powerful than Bonferroni-adjusted single-trait tests under pleiotropy. PCBMR-based pleiotropic association studies of abdominal obesity-metabolic syndrome and its possible linkage to chromosomal band 3q27 identified 11 susceptibility genes with significant associations. Some of these genes had previously been reported to be associated with metabolic traits, whereas others had never been identified as metabolism-associated genes.

Conclusions

PCBMR is a computationally efficient and powerful test for gene pleiotropy. Application of PCBMR to abdominal obesity-metabolic syndrome indicated the existence of gene pleiotropy affecting this syndrome.

Background

Quantitative traits often underlie increased risk for complex diseases. To understand the genetic basis of such traits, each trait is often separately tested for association with one or more markers. This approach has two disadvantages: 1) independent tests of each trait may lead to issues related to multiple testing; and 2) if a locus affects two or more traits, a single-trait study may lose the power to detect a pleiotropic effect, where a single gene influences multiple phenotypic traits.

In the past decade, simultaneous analysis of multiple traits in the context of linkage mapping of quantitative trait loci (QTL) has attracted much attention. Three approaches to simultaneous analysis have been developed and broadly applied, the first of which is generalization of maximum likelihood (ML)

The resolution of QTL linkage mapping is generally low (typically ≥ 10 cM)

In this study, we propose to integrate two common methods, principal components and multivariate regression, to test for association by analyzing multiple traits simultaneously. Although both methods are well established, their combination, principal-component-based multivariate regression (PCBMR), has not been comprehensively evaluated. We therefore evaluated the power and type I error of PCBMR using simulations that varied the pleiotropic effect, the linkage disequilibrium (LD), the proportion of trait correlation contributed by the tested QTL, and the number of traits. We also used PCBMR to examine pleiotropic effects on multiple traits of human abdominal obesity-metabolic syndrome.

Human abdominal obesity-metabolic syndrome

Results

Simulation 1, differences in extent of QTL pleiotropic effect

The correlation coefficients between traits _{1 }_{2 }_{1 }_{2 }

Table 1. Type I error and power (%) of data sets of simulation 1

| Effect (b) | PCBMR GEN | PCBMR ADD | PCBMR DOM | PCBMR REC | Single-trait GEN | Single-trait ADD | Single-trait DOM | Single-trait REC |
|---|---|---|---|---|---|---|---|---|
| 0 | 5.1 | 4.5 | 5.3 | 5.6 | 5.7(2.8) | 5.8(3.0) | 6.1(2.9) | 4.9(3.0) |
| 0.1 | 5.8 | 5.4 | 6.2 | 4.7 | 5.6(3.1) | 5.8(3.3) | 5.9(3.0) | 5.3(2.8) |
| 0.2 | 10.8* | 12 | 11.2 | 6.8 | 8.9(4.8) | 10.9(6.1) | 10.9(5.7) | 6.4(3.4) |
| 0.3 | 14.1* | 18.8* | 18.3* | 9.1* | 12.2(8.6) | 14.4(9.2) | 14.6(9.6) | 7.3(4.4) |
| 0.4 | 21.4* | 26.8* | 25.2* | 11.3 | 15.9(10.0) | 20.5(14.5) | 19.9(13.1) | 10.4(6.3) |
| 0.5 | 31.9* | 41.9* | 36.7* | 15.7* | 24.3(14.8) | 29.1(20.3) | 27.3(18.1) | 13.6(8.7) |
| 0.6 | 45.4* | 54.9* | 50.1* | 21.3* | 31.6(23.2) | 39.9(30.0) | 36.1(26.7) | 17.2(10.8) |
| 0.7 | 60.3* | 71.4* | 65.0* | 26.5* | 41.9(31.3) | 50.5(40.1) | 47.2(36.9) | 21.6(14.1) |
| 0.8 | 71.9* | 81.9* | 77.3* | 30.9* | 53.3(43.6) | 63.6(51.9) | 58.2(46.9) | 24.2(18.0) |
| 0.9 | 81.7* | 90.8* | 84.3* | 41.7* | 62.5(50.4) | 72.7(62.2) | 66.1(55.6) | 30.4(21.5) |
| 1 | 91.4* | 95.2* | 92.8* | 48.9* | 72.8(62.8) | 82.0(73.4) | 76.7(67.3) | 36.7(27.0) |

(In the single-trait columns, the values outside the parentheses are the power (b > 0) or type I error (b = 0) of the single-trait association test without multiple-test adjustment (SATN), and the values inside the parentheses are the power (b > 0) or type I error (b = 0) of the single-trait association test with Bonferroni adjustment (SATB). An asterisk (*) indicates that the power of PCBMR is significantly better than that of SATN. GEN: general model without assumption of genetic inheritance; ADD: additive effect model; DOM: dominant model; REC: recessive model.)

Simulation 2, differences in extent of LD between a marker and pleiotropic QTL

Correlation coefficients between _{1 }_{2 }

Table 2. Type I error and power (%) of data sets of simulation 2

| LD (r) | PCBMR GEN | PCBMR ADD | PCBMR DOM | PCBMR REC | Single-trait GEN | Single-trait ADD | Single-trait DOM | Single-trait REC |
|---|---|---|---|---|---|---|---|---|
| 0 | 5 | 4.2 | 4.6 | 5.5 | 5.8(2.5) | 5.3(3.1) | 5.7(3.1) | 5.7(3.1) |
| 0.1 | 4.1 | 4.8 | 5.3 | 4.7 | 5.5(2.8) | 6.0(3.4) | 6.6(3.9) | 5.2(2.7) |
| 0.2 | 9.2 | 11 | 10.1 | 8.2 | 9.9(5.4) | 12.3(6.5) | 10.2(6.3) | 8.0(4.3) |
| 0.3 | 14.8 | 19.6* | 16.2* | 11.7 | 14.0(8.1) | 17.5(10.5) | 14.6(8.4) | 10.4(6.2) |
| 0.4 | 18.7* | 24.3* | 20.5* | 11.7 | 15.9(10.2) | 19.6(11.9) | 16.5(10.7) | 10.4(6.2) |
| 0.5 | 26.6* | 32.9* | 29.7* | 11.7 | 20.1(13.8) | 23.8(17.4) | 23.4(15.9) | 10.4(6.2) |
| 0.6 | 50.2* | 62.8* | 56.2* | 26.6* | 36.6(27.9) | 47.0(36.3) | 41.4(31.2) | 21.2(13.8) |
| 0.7 | 49.2* | 61.6* | 56.1* | 23.4* | 36.7(27.2) | 46.8(34.6) | 40.6(29.4) | 20.7(12.5) |
| 0.8 | 72.6* | 81.5* | 76.7* | 37.6* | 52.0(42.3) | 63.1(52.2) | 57.2(46.0) | 27.9(18.9) |
| 0.9 | 80.8* | 89.7* | 86.2* | 37.6* | 62.0(50.4) | 71.9(62.2) | 67.9(57.0) | 27.9(18.9) |
| 1 | 91.4* | 95.2* | 92.8* | 48.9* | 72.8(62.8) | 82.0(73.4) | 76.7(67.3) | 36.7(27.0) |

(See Table 1)

Simulation 3, trait correlation between effects of two QTL and an environmental variable

The correlation coefficients between simulated traits _{1 }_{2 }_{ρ}(b)_{ρ}(b) _{ρ}(b) = 0_{ρ}(b) _{ρ}(b) = 0.4%_{ρ}(b) = 15.7%

Table 3. Type I error and power (%) of data sets of simulation 3

| Effect (b) | PCBMR GEN | PCBMR ADD | PCBMR DOM | PCBMR REC | Single-trait GEN | Single-trait ADD | Single-trait DOM | Single-trait REC |
|---|---|---|---|---|---|---|---|---|
| 0 | 4.7 | 5 | 5.6 | 3.9 | 5.0(3.2) | 5.3(2.7) | 5.7(3.3) | 4.2(2.4) |
| 0.5 | 7.6 | 8.8 | 7.5 | 6.4 | 7.4(4.1) | 8.4(4.7) | 7.1(4.6) | 6.1(3.3) |
| 1 | 18.1 | 22.4 | 22 | 11.2 | 17.8(11.2) | 22.4(16.2) | 21.2(14.1) | 11.2(7.2) |
| 1.5 | 35.3 | 45.2 | 40.1 | 20.3 | 35.8(24.8) | 45.8(34.3) | 40.2(28.4) | 20.2(13.8) |
| 2 | 60.4 | 70.1 | 66 | 29.2 | 60.2(50.3) | 70.2(59.8) | 66.5(54.9) | 28.9(22.0) |
| 2.5 | 79.2 | 87.4 | 82.5 | 40.8 | 79.1(70.1) | 86.8(79.0) | 82.2(75.1) | 40.2(30.2) |
| 3 | 91.2 | 95.9 | 93.4 | 50.6 | 91.1(85.5) | 95.8(91.7) | 93.3(88.6) | 50.7(40.4) |
| 3.5 | 97.2 | 99.3 | 97.6 | 62 | 97.6(94.9) | 99.4(98.0) | 97.8(96.0) | 62.5(50.6) |
| 4 | 99.6 | 99.7 | 99.5 | 75.4 | 99.4(98.9) | 99.7(99.5) | 99.5(99.0) | 75.9(63.9) |

(See Table 1. Based on equation 3, the percentages of trait correlation contributed by the tested QTL are 0%, 0.4%, 1.5%, 3.3%, 5.7%, 8.7%, 12.0%, 15.7% and 19.5%, corresponding to b = 0, 0.5, 1, 1.5, 2, 2.5, 3, 3.5 and 4.)

Simulation 4, pleiotropic effects on more than two traits

Under this simulation strategy, the number of traits affected by the QTL ranged from 2 to 10. The correlation coefficients between any pair of simulated traits were all ≥0.97 and the expected percentage of correlation contributed by the tested QTL was 8.7%. For all numbers of traits, PCBMR generated one canonical variable for the association test. Results are presented in Table

Table 4. Type I error and power (%) of data sets of simulation 4

| Traits | PCBMR GEN | PCBMR ADD | PCBMR DOM | PCBMR REC | Single-trait GEN | Single-trait ADD | Single-trait DOM | Single-trait REC |
|---|---|---|---|---|---|---|---|---|
| 2 | 79.1 | 87.4 | 82.5 | 40.8 | 79.1(70.1) | 86.8(79.0) | 82.2(75.1) | 40.2(30.2) |
| 3 | 80.6 | 87.2 | 83.1 | 40.2 | 80.1(67.0) | 86.8(75.7) | 82.5(70.1) | 40.0(25.7) |
| 4 | 79.4 | 87.6 | 83.3 | 41.2 | 79.6(62.2) | 88.3(72.9) | 83.3(65.6) | 42.4(24.6) |
| 5 | 78.6 | 87.8 | 83.2 | 40.9 | 78.9(59.1) | 87.3(69.6) | 83.3(63.8) | 41.2(18.0) |
| 6 | 78.4 | 86.3 | 83 | 40.1 | 78.6(57.2) | 86.9(68.4) | 82.8(60.7) | 40.8(17.9) |
| 7 | 79.2 | 86.2 | 82.4 | 40.1 | 78.5(54.6) | 86.4(65.7) | 82.7(58.5) | 40.7(14.6) |
| 8 | 80.3 | 85.5 | 82.4 | 40.7 | 80.0(51.6) | 86.2(63.3) | 82.1(56.0) | 40.8(13.7) |
| 9 | 77.9 | 85.6 | 83.1 | 42.5 | 78.1(52.3) | 86.3(61.5) | 82.3(53.9) | 42.9(15.3) |
| 10 | 78.7 | 87.3 | 82.6 | 42.4 | 78.8(47.4) | 87.7(60.4) | 83.0(51.4) | 42.4(14.7) |

(See Table 1)

Pleiotropic Association Studies of Traits of Abdominal Obesity-Metabolic Syndrome

A total of 1,196 subjects with 5,529 SNPs in the candidate region of chromosome 3 (at 182-227cM or 173.4-198.8 Mb) made up the study population. Quality control measures included the removal of SNPs with minor allele frequencies of ≤0.01 and Hardy-Weinberg equilibrium p-values of ≤1e^{-5}, leaving 4,769 SNPs in the study. The characteristics of the study participants are summarized in Table ^{2}), hip circumference (HIP, in cm), plasma insulin level (INSULIN, in μU/mL) and plasma insulin/glucose ratio (I/G). The pairwise correlation coefficients (

Table 5. Characteristics of study participants

| Sex | N | AGE (yrs) | WEIGHT (kg) | WAIST (cm) | BMI (kg/m^{2}) | HIP (cm) | INSULIN (μU/mL) | I/G |
|---|---|---|---|---|---|---|---|---|
| Male | 517 | 36.2 (4.4) | 91.8 (20.6) | 98.4 (15.9) | 29.1 (6.2) | 107.6 (11.6) | 12.8 (9.6) | 0.14 (0.09) |
| Female | 679 | 35.7 (4.6) | 78.8 (22.2) | 89.3 (17.7) | 29.5 (8.0) | 110.2 (15.7) | 13.2 (14.7) | 0.15 (0.16) |

Values are mean (standard deviation).

Table 6. Pairwise correlation coefficients, r, between adjusted traits

| | WEIGHT | BMI | WAIST | HIP | INSULIN | I/G |
|---|---|---|---|---|---|---|
| **WEIGHT** | 1.00 | | | | | |
| **BMI** | **0.95** | 1.00 | | | | |
| **WAIST** | **0.93** | **0.91** | 1.00 | | | |
| **HIP** | **0.93** | **0.92** | **0.89** | 1.00 | | |
| **INSULIN** | 0.46 | 0.46 | 0.47 | 0.41 | 1.00 | |
| **I/G** | 0.43 | 0.43 | 0.44 | 0.38 | **0.97** | 1.00 |

The results of the PCBMR pleiotropic association studies based on the GEN model are presented in Figures ^{-5}) are summarized in Tables

**Pleiotropic association study of WEIGHT, HIP, BMI and WAIST by PCBMR based on the general model in the candidate region (182-227 cM) of chromosome 3.** There are 4,769 SNPs in total. The x-axis is the SNP position and the y-axis is the negative logarithm of the p-value, i.e. -log_{10}(P).

**Pleiotropic association study of INSULIN and I/G by PCBMR based on the general model in the candidate region (182-227 cM) of chromosome 3.** There are 4,769 SNPs in total. The x-axis is the SNP position and the y-axis is the negative logarithm of the p-value, i.e. -log_{10}(P).

Table 7. Significant pleiotropic association with WEIGHT, HIP, BMI and WAIST

| SNP | -Log(P) | POSITION | Function |
|---|---|---|---|
| rs11721044 | 5.19 (5.88^{1}) | 174.64 | NLGN1 (intron) |
| rs11926347 | 6.00 (6.46^{1}) | 185.21 | ABCC5 (intron) |
| rs9843456 | 5.88 (6.70^{3}) | 192.85 | |
| rs1916636 | 6.33 (7.15^{3}) | 192.85 | |

Positions are in megabases (Mb). The value in parentheses is the -log(P) for the smallest p-value among the genetic models, and the superscript gives the corresponding model: additive (1), dominant (2) or recessive (3).

Table 8. Significant pleiotropic association with INSULIN and I/G

| SNP | POSITION | -LOG(P) | Function |
|---|---|---|---|
| rs669552 | 173.54 | 5.20 (5.79^{3}) | FNDC3B (intron) |
| rs6786075 | 175.12 | 5.76 (6.56^{3}) | NLGN1 (intron) |
| rs9854235 | 175.25 | 6.21 (6.73^{3}) | NLGN1 (intron) |
| rs6445137 | 175.26 | 10.03 (10.75^{3}) | NLGN1 (intron) |
| rs6798572 | 175.32 | 6.01 | NLGN1 (intron) |
| rs12493995 | 175.89 | 17.34 (18.29^{3}) | |
| rs9878945 | 176.22 | 5.80 (6.21^{3}) | NAALADL2 (intron) |
| rs9809218 | 176.56 | 7.17 | NAALADL2 (intron) |
| rs11920602 | 178.39 | 11.59 (12.40^{3}) | TBL1XR1 (intron) |
| rs17633881 | 178.83 | 5.05 (5.72^{3}) | |
| rs6797848 | 180.17 | 5.63 (6.36^{3}) | |
| rs7611854 | 180.17 | 5.63 (6.36^{3}) | |
| rs11927983 | 180.82 | 5.09 (5.89^{1}) | NDUFB5 (intron) |
| rs4854964 | 181.40 | 5.15 (5.87^{3}) | |
| rs1525276 | 181.43 | 5.13 (5.75^{3}) | |
| rs7643438 | 181.47 | 6.36 (7.18^{3}) | |
| rs9869409 | 181.48 | 15.44 (16.28^{3}) | |
| rs7647526 | 181.63 | 5.45 | |
| rs7650795 | 181.67 | 8.55 (9.11^{3}) | |
| rs6803379 | 181.68 | 43.76 (44.14^{3}) | |
| rs11926347 | 185.21 | 109.86 (110.37^{3}) | ABCC5 (intron) |
| rs6798973 | 185.67 | 5.73 (6.55^{3}) | |
| rs6786711 | 187.56 | 5.52 (6.32^{3}) | DGKG (intron) |
| rs6795506 | 187.81 | 109.05 (110.37^{3}) | AHSG (near gene 5') |
| rs2082940 | 188.06 | 5.50 (6.30^{3}) | ADIPOQ (3' UTR) |
| rs7628649 | 188.07 | 5.25 (6.06^{3}) | |
| rs16863863 | 190.20 | 5.36 (6.11^{3}) | |
| rs7614680 | 190.57 | 6.06 (6.81^{3}) | |
| rs1515495 | 191.00 | 6.81 | TP63 (intron) |
| rs4571225 | 191.81 | 6.44 (7.05^{3}) | IL1RAP (intron) |
| rs9821331 | 191.81 | 7.43 (8.32^{3}) | IL1RAP (intron) |
| rs9865681 | 191.83 | 12.63 (13.59^{3}) | IL1RAP (intron) |
| rs902192 | 194.60 | 5.66 (6.41^{3}) | |
| rs768858 | 198.56 | 8.15 (8.87^{3}) | |

Refer to Table 7 for the position units and the meaning of the values in parentheses.

For the first trait group of WEIGHT, BMI, WAIST, and HIP, PCBMR generated a single canonical variable that explained 94.1% of the variance. With Bonferroni adjustment, PCBMR using the GEN model found four SNPs with significant pleiotropic association (p <^{-5}

For the second trait group of INSULIN and I/G, PCBMR also generated a single canonical variable, and this variable explained 98.6% of the variance. Using the GEN model, thirty-four SNPs passed Bonferroni significance level (Figure

SNP rs11926347 in ABCC5 showed significant pleiotropic association in both groups and the p-value was extremely small in the second group of traits (^{-5 }_{10}(P) was 6.46, 5.92, and 2.12 for the first group's traits and 11.65, 4.74, and 110.37 for the second group's traits for additive, dominant, and recessive models, respectively. These results indicate that the additive model best suits the first trait group's association and the recessive model best suits the second trait group's association. After dropping the single homozygote, analyses based on different genetic models generated the same results. The significant association was absent in the second group's traits (-log_{10}(p) = 1.14), but still present in the first group's traits (-log_{10}(p) = 5.2). These results indicated that allele '

Table 9. Summary of rs11926347 in ABCC5

| | A/A | G/A | G/G |
|---|---|---|---|
| Frequency | 1 | 45 | 1150 |
| AGE (yrs) | 26.5 | 36.06(5.32) | 35.97(4.43) |
| Male (proportion) | 0 | 0.49 | 0.43 |
| BMI (kg/m^{2}) | 48.08 | 34.50(10.26) | 29.11(7.03) |
| WEIGHT (kg) | 144.8 | 99.33(30.03) | 83.79(21.83) |
| WAIST (cm) | 129.1 | 103.3(22.5) | 92.8(17.2) |
| HIP (cm) | 146.27 | 117.23(19.63) | 108.74(13.79) |
| INSULIN (μU/mL) | 247 | 16.11(12.04) | 12.69(10.68) |
| I/G | 2.68 | 0.17(0.12) | 0.15(0.11) |

(Hardy-Weinberg equilibrium (HWE) test: p-value = 0.37)

Discussion

Most current association studies have been based on single trait-single marker or single trait-multiple marker tests. These kinds of studies lose power in identifying genes with pleiotropic effects. In some cases, genes with pleiotropy may be found by separately testing each trait. However, two major issues make this strategy not always appropriate. First, pleiotropic effects for each trait may be too weak to be identified. Second, multiple testing problems may either lower the power or inflate the type I error. It is therefore important to develop methods that can test for association by analyzing multiple traits simultaneously.

In this paper, we present PCBMR, a method that detects pleiotropic effects by combining principal component analysis with multivariate regression. PCBMR generates a set of independent canonical variables based on principal components. Each canonical variable is a combination of multiple traits, and together the selected variables explain at least 80% of the trait variation. The canonical variables are analyzed simultaneously by multivariate regression, and the PCBMR statistic is simply the sum of the test statistics for the individual canonical variables. PCBMR is computationally efficient and can be implemented easily in most statistical packages, which makes it fast and feasible not only for candidate-gene association studies but also for genome-wide association studies (GWAS).

Comprehensive studies of simulated data showed that PCBMR has well-controlled type I error, about 5%, when the tested marker has no pleiotropic effect (simulations 1 and 3) or is in linkage equilibrium with the pleiotropic QTL (simulation 2). The power of PCBMR depends on the extent of the pleiotropic effect and on the LD between the marker and the QTL: larger pleiotropic effects and higher LD result in greater power (simulations 1 and 2). When the trait correlation caused by pleiotropy was not strong (simulation 1), the number of canonical variables was the same as the number of traits and the power was reasonably high, even compared with SATN. When there were strong correlations among traits (simulation 4), the reduced number of variables resulted in fewer degrees of freedom for the PCBMR test, and the power of PCBMR was as high as that of SATN. However, SATN always has a much higher type I error than PCBMR because of multiple testing. PCBMR was robust to conflicting effects from environmental factors or other, untested QTLs (simulation 3). In all cases, multi-trait association analyses using PCBMR were much more powerful than multiple single-trait association analyses using SATB. For all tests, the multiple traits studied simultaneously by PCBMR were compared with the single trait that had the best power as determined by SATN and SATB. The present study showed that PCBMR is at least as powerful as SATN and more powerful than SATB under pleiotropy.

PCBMR has great extensibility. For equations (_{i, }^{2 }distribution, is again the simple sum of statistics from separate regressions of a canonical variable on single or multiple covariates.

Comparisons of power estimates among PCBMR, SATB, and SATN in this study were based on analyses of the simulated additive model. To verify these findings, we repeated the studies on simulated dominant and recessive models and reached the same conclusion: pleiotropic association studies by PCBMR are more powerful than single-trait association studies by either SATN or SATB (results not shown here). We also observed the influence of model mismatch. For example, a pleiotropic study based on an additive model lost power when the true model was dominant or recessive, whereas all studies based on the general model had acceptable power. In contrast to the additive model, which assumes a linear trend of genotypic effect, and the dominant and recessive models, which assume equal effects of two genotypes for an SNP, the general model estimates the effect of each genotype separately without any restriction. Therefore, PCBMR based on the general model has the advantage of testing for a pleiotropic effect when a complex trait has no obvious Mendelian inheritance.
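The four genetic models differ only in how an SNP genotype enters the regression. The following Python sketch illustrates one common coding. It assumes genotypes are stored as minor-allele counts (0, 1 or 2) and that dominance refers to the minor allele; the function names and this coding convention are illustrative assumptions, not taken from the study.

```python
def encode_genotype(minor_allele_count, model):
    """Return the regression predictors for one genotype under a genetic model.

    minor_allele_count: 0, 1 or 2 copies of the minor allele (assumed coding).
    model: "ADD", "DOM", "REC" or "GEN".
    """
    g = minor_allele_count
    if g not in (0, 1, 2):
        raise ValueError("genotype must be 0, 1 or 2")
    if model == "ADD":   # linear trend in the number of minor alleles
        return [float(g)]
    if model == "DOM":   # heterozygote and minor homozygote share one effect
        return [1.0 if g >= 1 else 0.0]
    if model == "REC":   # heterozygote and major homozygote share one effect
        return [1.0 if g == 2 else 0.0]
    if model == "GEN":   # one indicator per non-reference genotype, no restriction
        return [1.0 if g == 1 else 0.0, 1.0 if g == 2 else 0.0]
    raise ValueError("unknown model: " + model)


def marker_degrees_of_freedom(model):
    """Number of marker parameters tested per canonical variable."""
    return len(encode_genotype(0, model))
```

Under this coding the general model spends two parameters per canonical variable and the other three models spend one, which is the usual trade-off between flexibility and power when the true mode of inheritance is unknown.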

As a real example, PCBMR was applied to test association with six traits of abdominal obesity-metabolic syndrome (weight, waist circumference, BMI, hip circumference, plasma insulin, and insulin/glucose ratio) in the Bogalusa Heart Study cohort. The traits were clustered into two groups based on two previously identified linkage peaks

Although this study illustrates many advantages of PCBMR, there are also some challenges to be faced in terms of practical application. In contrast to pleiotropic linkage studies that map a QTL to a large locus

Another challenge is to decide which traits should be studied simultaneously by PCBMR. Some strategies may help to address this challenge. Candidate traits could be those related to each other in the same pathway leading to a disease or symptom. For example, greater weight and BMI are correlated with obesity. Candidate traits could also include traits with linkage to the same region, such as two groups of traits with linkage peaks in two separate loci, as found in our studies of abdominal obesity-metabolic syndrome. Nevertheless, it is possible that two traits without much correlation may be strongly affected by a common gene. For example, in our simulation

PCA is an important tool for data mining that transforms a larger number of correlated variables into a smaller number of independent variables,

**Factor analysis-based study of pleiotropic association. **Table of significant pleiotropic association and figure of p-values of SNPs in linkage region.


In spite of its potential challenges, PCBMR is a powerful and computationally efficient method of studying the huge amounts of genetic data generated by advanced technology,

Conclusion

In summary, we propose the use of PCBMR, a computationally efficient method for testing gene pleiotropy. Although PCBMR is a combination of two established methods (principal components and multivariate regression), we are the first to comprehensively evaluate this technique in its combined form. The simulation studies described here indicate that this method is powerful for different kinds of pleiotropy. In spite of some challenges for its use in practical studies, PCBMR can greatly increase the power of association studies under pleiotropy and can broaden understanding of a gene's functions as well as its pathway and mechanisms. PCBMR is not only a useful method for candidate-gene based studies; as the generation of high-throughput expression data becomes increasingly efficient, PCBMR can be used to study pleiotropy in analyses of massive amounts of data, such as GWAS.

Methods

Principal Component Based Multivariate Regression (PCBMR)

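Given a set of $m$ traits $Y = (Y_1, Y_2, \ldots, Y_m)^T$ with mean vector $\mu$ and covariance matrix $\Sigma$, PCBMR uses the method of principal component analysis (PCA) on the standardized traits

$$Y^S = (V^{1/2})^{-1}(Y - \mu),$$

where $V^{1/2}$ is the diagonal matrix of trait standard deviations. The covariance matrix of $Y^S$ is $\mathrm{Cov}(Y^S) = (V^{1/2})^{-1}\,\Sigma\,(V^{1/2})^{-1} = \Gamma\Lambda\Gamma^T$, where Γ is the matrix of eigenvectors and Λ is the diagonal matrix of eigenvalues.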

PCA finds the weighting vector ^{1}, ..., δ^{p})^{T }^{T}Y^{S }

_{1}, z_{2}, ..., z_{m}]^{T }^{T}Y _{i }_{j}^{S }_{ij}Λ_{jj})^{1/2}, and the sum of squares of correlations between all _{i }_{j}^{S}_{i }

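Suppose $z_1, z_2, \ldots, z_k$ are the $k$ canonical variables, and each $z_i$ is normally distributed with mean $\mu_i$ and variance $\sigma_i^2$. The density of $z_i$ can be written in exponential-family form as

$$f(z_i; \theta_i, \phi_i) = \exp\left\{\frac{z_i\theta_i - b(\theta_i)}{a(\phi_i)} + c(z_i, \phi_i)\right\},$$

where $\theta_i = \mu_i$, $\phi_i = \sigma_i^2$, $a(\phi_i) = \phi_i$, $b(\theta_i) = \theta_i^2/2$, and $c(z_i, \phi_i) = -[z_i^2/\phi_i + \log(2\pi\phi_i)]/2$.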

In multivariate regression, PCBMR takes the canonic link. The mean regression model is _{i }= Xβ_{i}+Wτ_{i}_{i }_{i }_{0}

We define the full model as the one without restriction of _{0 }_{0}_{ij }_{i }

The LRT statistic T is -2[logL(_{i }= μ_{i }= Xβ_{i}+Wτ_{i}

The mean estimates, _{i }_{i }^{2 }distributed LRT statistic for testing marker association with canonical trait _{i }_{i }^{2 }distribution with degrees of freedom equal to the difference of parameter numbers between the full and the nested model. A large T causing rejection of _{0 }_{i }≠ 0 and the presence of association attributable to the pleiotropic effects of multiple markers.
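The procedure above can be sketched in a few lines of Python for the simplest case. This is a minimal illustration under stated assumptions, not the authors' implementation: it handles only two traits (so the eigen-decomposition of the 2 x 2 correlation matrix has the closed form used below), codes the marker additively as 0/1/2, omits covariates, keeps the first component alone when it explains at least 80% of the variance (the threshold mentioned in the Discussion), and sums the per-component likelihood-ratio statistics. All function and parameter names are hypothetical.

```python
import math


def standardize(values):
    """Center and scale a trait to mean 0 and sample standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def lrt_statistic(z, g):
    """LRT statistic for H0: slope = 0 in z = a + b*g + e with normal errors."""
    n = len(z)
    z_bar, g_bar = sum(z) / n, sum(g) / n
    sxx = sum((x - g_bar) ** 2 for x in g)
    sxy = sum((x - g_bar) * (y - z_bar) for x, y in zip(g, z))
    rss_nested = sum((y - z_bar) ** 2 for y in z)   # intercept-only model
    rss_full = rss_nested - sxy ** 2 / sxx          # intercept + genotype
    return max(n * math.log(rss_nested / rss_full), 0.0)


def pcbmr_two_traits(y1, y2, genotypes, min_explained=0.8):
    """Sketch of a PCBMR-style test for two traits and one additive marker."""
    s1, s2 = standardize(y1), standardize(y2)
    n = len(genotypes)
    r = sum(a * b for a, b in zip(s1, s2)) / (n - 1)
    sign = 1.0 if r >= 0 else -1.0
    root2 = math.sqrt(2.0)
    # Eigenvectors of [[1, r], [r, 1]] are (1, sign)/sqrt(2) and (1, -sign)/sqrt(2),
    # with eigenvalues 1 + |r| and 1 - |r|.
    components = [
        [(a + sign * b) / root2 for a, b in zip(s1, s2)],
        [(a - sign * b) / root2 for a, b in zip(s1, s2)],
    ]
    pc1_explained = (1.0 + abs(r)) / 2.0
    k = 1 if pc1_explained >= min_explained else 2
    statistic = sum(lrt_statistic(components[i], genotypes) for i in range(k))
    # Chi-square upper tail: closed forms for 1 and 2 degrees of freedom.
    if k == 1:
        p_value = math.erfc(math.sqrt(statistic / 2.0))
    else:
        p_value = math.exp(-statistic / 2.0)
    return {"n_components": k, "pc1_explained": pc1_explained,
            "statistic": statistic, "p_value": p_value}
```

For two traits with correlation r, the first component explains (1 + |r|)/2 of the variance. With r = 0.97, the rounded correlation reported between INSULIN and I/G, this gives about 98.5%, close to the 98.6% reported for that trait group.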

Simulation Studies

The power of PCBMR may depend on many factors; some of these are: 1) the extent of the QTL pleiotropic effect; 2) the extent of LD between the tested marker and the pleiotropic QTL; 3) the portion of the trait correlation contributed by the tested QTL relative to the portion contributed by other QTL and environmental factors; and 4) the number of traits in the study. For each simulation, 1,000 datasets were generated. Type I error and power were calculated as percentages of the datasets, with _{1}, Y_{2}, ...Y_{k }_{1}, U_{2}, ..., U_{k }_{1}_{2}_{k }_{1}, E_{2}, ..., E_{k }
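Because each type I error or power estimate is a proportion over 1,000 simulated datasets, it carries Monte Carlo sampling error. The sketch below, with hypothetical function names and an assumed nominal significance level of 0.05, shows how a rejection rate and its binomial standard error could be computed from a list of per-dataset p-values.

```python
import math


def rejection_rate(p_values, alpha=0.05):
    """Percentage of simulated datasets whose p-value falls below alpha."""
    rejected = sum(1 for p in p_values if p < alpha)
    return 100.0 * rejected / len(p_values)


def monte_carlo_standard_error(rate, n_datasets):
    """Binomial standard error of an estimated proportion (rate in [0, 1])."""
    return math.sqrt(rate * (1.0 - rate) / n_datasets)


# With 1,000 datasets and a true type I error of 5%, the standard error is
# about 0.7 percentage points.
se_at_nominal_level = monte_carlo_standard_error(0.05, 1000)
```

Under these assumptions, an estimated type I error within roughly 1.4 percentage points of 5% is consistent with the nominal level.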

Simulation 1, different extents of pleiotropic effects in QTL

The minor allele frequency of QTL is _{1 }_{1}_{1}_{1 }_{2 }_{2}_{2}_{2}_{1 }_{2}_{1 }_{2 }_{1 }_{2 }**0 **and standard deviation _{1}~E_{2}~N(0, 2^{2})_{1 }= b_{2 }= b

Simulation 2, different extents of LD between a marker and a pleiotropic QTL

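In this situation, the QTL and the tested marker each had a minor allele frequency of 0.2 ($p_1 = 0.2$ and $q_1 = 0.2$). Writing $A_1, A_2$ and $B_1, B_2$ for the alleles at the two loci and $D$ for the LD coefficient between them, the haplotype frequencies are $P(A_1B_1) = p_1q_1 + D$, $P(A_1B_2) = p_1(1-q_1) - D$, $P(A_2B_1) = (1-p_1)q_1 - D$, and $P(A_2B_2) = (1-p_1)(1-q_1) + D$.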
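The four haplotype frequencies can be computed directly from the two allele frequencies and the LD level. The sketch below assumes that the LD measure r used in the simulation is the usual correlation coefficient, r = D / sqrt(p1(1 - p1)q1(1 - q1)); that relation, the allele labels A and B, and the function name are assumptions made for illustration.

```python
import math


def haplotype_frequencies(p1, q1, r):
    """Two-locus haplotype frequencies from allele frequencies and LD r.

    p1, q1: frequencies of alleles A1 and B1 at the two loci.
    r: LD correlation coefficient, assumed to equal
       D / sqrt(p1 * (1 - p1) * q1 * (1 - q1)).
    """
    d = r * math.sqrt(p1 * (1.0 - p1) * q1 * (1.0 - q1))
    freqs = {
        "A1B1": p1 * q1 + d,
        "A1B2": p1 * (1.0 - q1) - d,
        "A2B1": (1.0 - p1) * q1 - d,
        "A2B2": (1.0 - p1) * (1.0 - q1) + d,
    }
    # Reject LD values that are impossible for these allele frequencies.
    if min(freqs.values()) < -1e-12:
        raise ValueError("r is outside the feasible range for p1 and q1")
    return freqs


# Allele frequencies of 0.2 at both loci, as in simulation 2, with r = 0.5.
example = haplotype_frequencies(0.2, 0.2, 0.5)
```

With equal allele frequencies at both loci, r = 1 makes the two mixed haplotypes vanish, so the marker carries the same information as the QTL itself.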

Simulation 3, trait correlation based on the effects of two QTL and an environmental variable

Two linear regression models, _{1 }_{1}_{1}_{1}+W*d_{1}+E_{1 }_{2 }_{2}_{2}_{2}+W*d_{2}_{2}_{1 }_{2}_{1 }_{2 }_{1}~E_{2}~N(0, 0.5^{2})_{1 }= b_{2 }= b _{1 }_{2 }_{1 }_{2 }_{1}_{2}_{1 }_{2}

The proportion of the correlation contributed by QTL _{ρ}(b)

so _{ρ}(b)

Simulation 4, pleiotropic effects on more than two traits

Based on the linear regression model, _{i }_{i}^{2})_{i }= (i-1)*50

Power and type I error were estimated for PCBMR under the four simulation conditions. For comparison, we conducted single-trait association studies using classical linear regression with (SATB) and without (SATN) Bonferroni adjustment. For single-trait association studies, only the trait with the largest power or type I error was presented in the paper. Based on different assumptions of the genetic models, there are four possible ways of processing the

Power comparison by binomial exact test

Without loss of generality, we created indicator variables _{1 }_{2 }_{1 }_{2 }_{i}M_{1i}|(Σ_{i}M_{1i}+Σ_{i}M_{2i}) = N_{m }_{m}_{i}M_{1i }> N_{m}/2_{i}M_{1i }< N_{m}/2
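An exact one-sided binomial comparison of this kind can be sketched as follows. The sketch assumes only what the surviving formula suggests: the two indicator sums are compared conditional on their total N_m, and under equal power the first sum follows a Binomial(N_m, 0.5) distribution. How the indicators are defined for each dataset is not specified here, and the function names are hypothetical.

```python
from math import comb


def binomial_upper_tail(successes, trials):
    """Exact P(X >= successes) for X ~ Binomial(trials, 0.5)."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    tail = sum(comb(trials, k) for k in range(successes, trials + 1))
    return tail / 2 ** trials


def compare_indicator_sums(sum_method1, sum_method2):
    """One-sided exact p-value that method 1 has the larger indicator sum."""
    n_m = sum_method1 + sum_method2
    return binomial_upper_tail(sum_method1, n_m)
```

For example, `compare_indicator_sums(60, 40)` is below 0.05, while equal sums give a p-value above 0.5, so no difference is declared.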

Pleiotropic Association Studies of Abdominal Obesity-Metabolic Syndrome

We applied PCBMR to search for markers associated with multiple traits related to abdominal obesity-metabolic syndrome in the Bogalusa Heart Study, a community-based investigation of the evolution of cardiovascular disease risk beginning in childhood _{i }= U+AGE*b_{1}+AGE^{2}*b_{2}+SEX+E_{i}_{i}^{-5}.
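The Bonferroni significance level follows from the number of SNPs tested. The short sketch below assumes a family-wise level of 0.05 (an assumption; the 4,769 SNPs retained after quality control are taken from the Results) and expresses the threshold on the -log10(P) scale used in the tables and figures.

```python
import math


def bonferroni_threshold(n_tests, family_wise_alpha=0.05):
    """Per-test significance level under Bonferroni adjustment."""
    return family_wise_alpha / n_tests


n_snps = 4769                                  # SNPs retained after quality control
threshold = bonferroni_threshold(n_snps)       # about 1.05e-5
neg_log10_threshold = -math.log10(threshold)   # about 4.98
```

Under this assumption, any SNP with -log10(P) above roughly 4.98 passes the adjusted threshold.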

Authors' contributions

HM developed and implemented the method. HM and WC performed the simulations, analysis and interpretation of the data. All authors participated in planning and discussion of the study. All authors read and approved the final manuscript.

Acknowledgements

This study was supported by grants 0855082E and 0555168B from the American Heart Association, AG-16592 from the National Institute on Aging, and HL-38844 from the National Heart, Lung, and Blood Institute.

Electronic-Database Information

Online Mendelian Inheritance in Man (OMIM),