This article is part of the supplement: The Framingham Heart Study 100,000 single nucleotide polymorphisms resource.

Genome-wide association and linkage analyses of hemostatic factors and hematological phenotypes in the Framingham Heart Study

1 The National Heart, Lung and Blood Institute's Framingham Heart Study, Framingham, MA, USA
2 Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
3 Broad Institute of Harvard University and Massachusetts Institute of Technology, Cambridge, MA, USA
4 Office of Biostatistics Research, NHLBI, National Institutes of Health, Bethesda, MD, USA
5 Royal North Shore Hospital, Sydney, Australia
6 Cardiology Division, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
BMC Medical Genetics 2007, 8(Suppl 1):S12 doi:10.1186/1471-2350-8-S1-S12
The electronic version of this article is the complete one and can be found online at: http://www.biomedcentral.com/1471-2350/8/S1/S12
© 2007 Yang et al; licensee BioMed Central Ltd.

Abstract

Background

Increased circulating levels of hemostatic factors as well as anemia have been associated with increased risk of cardiovascular disease (CVD). Known associations between hemostatic factors and sequence variants at genes encoding these factors explain only a small proportion of total phenotypic variation. We sought to confirm known putative loci and identify novel loci that may influence either trait in genome-wide association and linkage analyses using the Affymetrix GeneChip 100K single nucleotide polymorphism (SNP) set.

Methods

Plasma levels of circulating hemostatic factors (fibrinogen, factor VII, plasminogen activator inhibitor-1, von Willebrand factor, tissue plasminogen activator, D-dimer) and hematological phenotypes (platelet aggregation, viscosity, hemoglobin, red blood cell count, mean corpuscular volume, mean corpuscular hemoglobin concentration) were obtained in approximately 1000 Framingham Heart Study (FHS) participants from 310 families. Population-based association analyses using generalized estimating equations (GEE), family-based association tests (FBAT), and multipoint variance components linkage analyses were performed on the multivariable adjusted residuals of hemostatic and hematological phenotypes.

Results

In association analysis, the lowest GEE p-value for hemostatic factors was p = 4.5 × 10^-16 for factor VII at SNP rs561241, a variant located near the F7 gene and in complete linkage disequilibrium (LD) (r^2 = 1) with the Arg353Gln F7 SNP previously shown to account for 9% of total phenotypic variance. The lowest GEE p-value for hematological phenotypes was 7 × 10^-8 at SNP rs2412522 on chromosome 4 for mean corpuscular hemoglobin concentration. We present the top 25 most significant GEE results, with p-values in the range of 10^-6 to 10^-5, for hemostatic or hematological phenotypes.
In relating 100K SNPs to known candidate genes, we identified two SNPs (rs1582055, rs4897475) in erythrocyte membrane protein band 4.1-like 2 (EPB41L2) associated with hematological phenotypes (GEE p < 10^-3). In linkage analyses, the highest LOD score for hemostatic factors was 3.3 for factor VII on chromosome 10 around 15 Mb, and for hematological phenotypes, LOD 3.4 for hemoglobin on chromosome 4 around 55 Mb. All GEE and FBAT association and variance components linkage results can be found at http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?id=phs000007

Conclusion

Using genome-wide association methodology, we have successfully identified a SNP in complete LD with a sequence variant previously shown to be strongly associated with factor VII, providing proof of principle for this approach. Further study of additional strongly associated SNPs and linked regions may identify novel variants that influence the inter-individual variability in hemostatic factors and hematological phenotypes.

Background

The relationship of hemostasis and thrombosis with atherothrombotic cardiovascular disease has been extensively studied in the past decades. Elevated circulating levels of hemostatic factors, such as fibrinogen [1-3], plasminogen activator inhibitor (PAI-1) [4,5], von Willebrand factor (vWF) [6], tissue plasminogen activator (tPA) [4,5,7], factor VII (FVII) [8], and D-dimer [9,10] are linked to the development of atherothrombosis and are risk markers for coronary heart disease (CHD), stroke and other cardiovascular disease (CVD) events.
In addition to coagulation proteins, the cellular and rheological components of circulating blood have been implicated in CHD, stroke and peripheral arterial disease. These include hematological phenotypes such as hematocrit (HCT), hemoglobin (Hgb), red blood cell count (RBCC) and size, mean corpuscular volume (MCV) and mean corpuscular hemoglobin (MCH) [11,12], as well as measures of platelet aggregation (induced by adenosine 5'-diphosphate (ADP), epinephrine (Epi) and collagen, respectively) [12,13], and viscosity [14,15].

Cis-acting sequence variants in several genes, namely fibrinogen-β (FGB), fibrinogen-α (FGA), fibrinogen-γ (FGG), FVII (F7), and PAI-1 (SERPINE1), have been associated with corresponding levels of circulating hemostatic factors. By comprehensively characterizing common genetic variation at each of these loci, we have recently clarified that cis-acting variants, in sum, explain a modest proportion of phenotypic variation, ranging from 1% to 10% [16,17]. For hematological variables such as hematocrit and hemoglobin, sequence variation in the major hemoglobin genes is well described to be associated with anemias, such as beta- and alpha-thalassemia, and sickle cell anemia [18-20].

Systematic searches for novel genes beyond the known genetic determinants influencing these phenotypes have been carried out using genome-wide linkage analyses with microsatellite markers: chromosome regions that may harbor novel loci influencing fibrinogen, PAI-1 [21,22], hematocrit, Hgb, RBCC, MCV and MCH [23,24] have been identified. However, linkage scans with microsatellite markers generally had low power to detect loci with small effects and lacked precision in localizing the loci; thus, few novel loci have been identified.
The recent completion of a genome-wide scan using the Affymetrix GeneChip Human Mapping 100K single nucleotide polymorphism (SNP) set on participants in the Framingham Heart Study offered the opportunity to conduct a genome-wide association study (GWAS) and linkage scan for variants that influence hemostatic factors and hematological phenotypes.

Methods

Study participants and genotyping methods

The Framingham Heart Study design and the genotyping of the Affymetrix GeneChip Human Mapping 100K SNP set on Framingham Heart Study participants are detailed in the overview of this project [25]. To avoid potential bias due to genotyping artifacts, we limited the association analyses to 70,987 SNPs on autosomes with minor allele frequency (MAF) ≥ 10%, genotyping call rate ≥ 80%, and Hardy-Weinberg equilibrium test p-value ≥ 0.001.

Measurements of hemostatic factors and hematological phenotypes

Venous blood samples of the Framingham Heart Study Offspring Cohort taken at the first and second examination cycles (1971–1975 and 1979–1983) were used to measure Hgb, RBCC, MCV and MCH, and samples taken at the fifth examination cycle (1991–1995) were used to measure all the hemostatic factors, platelet aggregation, D-dimer, and viscosity. Fibrinogen was additionally measured at the sixth (1995–1998) and seventh (1998–2001) examination cycles, and PAI-1 antigen levels at the sixth examination cycle. Details of the assessment of hemostatic factor levels have been described previously [17,26]. Plasma fibrinogen levels were measured using the Clauss method [27]. Plasma PAI-1 antigen, tPA antigen, von Willebrand factor and FVII antigen were assessed using enzyme-linked immunosorbent assays.

The determination of hematological phenotypes has been detailed previously. Platelet aggregation was performed according to the method of Born [28]. The reagents used were epinephrine, ADP and collagen.
The percent extent of aggregation in response to epinephrine and ADP was determined in duplicate at varying concentrations (0.01 to 15 mmol/L). For each subject, the aggregation response (yes/no) to a fixed concentration of arachidonic acid (5 mg/mL) was also tested. The collagen lag time was measured in response to 1.9 mmol/L collagen. Participants who were taking aspirin were excluded from the analyses of platelet aggregation phenotypes as well as PAI-1 and tPA.

HCT was measured by the Wintrobe method [29]. Blood was collected in a balanced oxalate tube and spun at 5000 rpm for 20 minutes. The percent of total blood volume that was due to red blood cells was determined visually against a calibrated scale. MCV is the average volume of an individual's red blood cells, determined as the ratio of HCT to RBCC. MCH is the average amount of hemoglobin in an individual's red blood cells, determined as the ratio of Hgb to RBCC.

Statistical methods

Standardized multivariable adjusted residuals of the hemostatic and hematological phenotypes were computed and used in all the linkage and association analyses. Covariates used in the adjustments were chosen based on what has been reported in the literature as potential risk factors for hemostatic factors or hematological phenotypes. Hardy-Weinberg equilibrium was examined using an exact chi-square test statistic [30]. Association between each SNP and each hemostatic or hematological phenotype was examined using a population-based association method via generalized estimating equations (GEE) [31] and a family-based association test (FBAT) [32], assuming an additive genetic model. Variance components linkage analyses were conducted using a subset of SNPs with pairwise r^2 < 0.5. Details of both association and linkage methods are described in the overview of this project [25].
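To make the additive genetic model concrete, the following minimal sketch standardizes a set of covariate-adjusted residuals and regresses them on the number of copies of a coded allele. It is an illustration only: the function names, genotypes and residual values are hypothetical, and the plain least-squares slope used here treats individuals as independent, so it is a simplified stand-in for, not a reproduction of, the GEE and FBAT analyses used in the study.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    centre = mean(values)
    spread = pstdev(values)
    return [(v - centre) / spread for v in values]


def additive_dosage(genotype, coded_allele):
    """Count copies (0, 1 or 2) of coded_allele in a genotype such as 'AG'."""
    return sum(1 for allele in genotype if allele == coded_allele)


def ols_slope(x, y):
    """Least-squares slope of y on x; treats all individuals as independent."""
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Made-up genotypes and covariate-adjusted residuals, for illustration only.
genotypes = ["AA", "AG", "GG", "AG", "AA", "GG"]
adjusted_residuals = [-0.9, 0.1, 1.2, -0.2, -1.1, 0.9]

# Additive coding: each extra copy of the coded allele shifts the trait by beta.
dosage = [additive_dosage(g, "G") for g in genotypes]
beta = ols_slope(dosage, standardize(adjusted_residuals))
print(f"estimated change per copy of G: {beta:.2f} SD")
```

Because the phenotype is standardized before the regression, the slope is expressed in standard deviation units per allele copy. In family data a method that accounts for correlation among relatives would be needed to obtain valid standard errors.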
In secondary analyses, we combined the GEE association test results across multiple phenotypes that may share a common pathway, to reduce the type I error rate and possibly detect SNPs with smaller effect sizes. We ranked SNPs first by the number of GEE test p-values less than 0.01, and then by the geometric mean of the GEE test p-values. We also examined the β coefficient from the GEE regression, which is the change in the phenotype, in standard deviation units, per additional copy of the alphabetically second allele (for example, allele G for a SNP with alleles A and G). This analysis was conducted for a phenotype assessed using multiple measurement methods, such as platelet aggregation induced by ADP, collagen, and Epi, or for a phenotype with serial measurements, such as fibrinogen level measured at examination cycles 5, 6 and 7.

We attempted to identify associations of 100K SNPs in or within 60 kilo base pairs (kb) of selected candidate genes previously reported to be associated with hemostatic factors or hematological phenotypes. For hemostatic factors and platelet aggregation phenotypes, we included the following candidate genes in the search: F7, the fibrinogen gene cluster (FGB, FGA, FGG), SERPINE1, plasminogen activator-tissue (PLAT), vWF and integrin beta 3 (ITGB3). For hematological phenotypes excluding platelet aggregation, we included erythropoietin receptor (EPOR), erythropoietin (EPO), erythrocyte membrane protein band 4.1-like 2 (EPB41L2), Kruppel-like factor 1 (KLF1), heme binding protein 2 (HEBP2), the hemoglobin gene cluster on chromosome 11: hemoglobin-β chain complex (HBB), hemoglobin-δ (HBD), hemoglobin-γ A (HBG1), hemoglobin-γ G (HBG2), hemoglobin-ε 1 (HBE1), and the hemoglobin gene cluster on chromosome 16: hemoglobin-α 1 (HBA1), hemoglobin-α 2 (HBA2), hemoglobin-μ (HBM).
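The two-level ranking used in the secondary analyses (number of GEE p-values below 0.01 first, geometric mean of the p-values second) can be sketched as follows. The SNP labels and p-values are invented for illustration, and the function names are hypothetical rather than taken from the study's software.

```python
import math


def geometric_mean(p_values):
    """Geometric mean, computed on the log scale for numerical stability."""
    return math.exp(sum(math.log(p) for p in p_values) / len(p_values))


def rank_snps(p_by_snp, threshold=0.01):
    """Order SNPs by count of p-values below threshold (descending),
    breaking ties by the geometric mean of the p-values (ascending)."""
    def sort_key(item):
        _, p_values = item
        hits = sum(1 for p in p_values if p < threshold)
        return (-hits, geometric_mean(p_values))

    return [snp for snp, _ in sorted(p_by_snp.items(), key=sort_key)]


# Hypothetical p-values for three related phenotypes per SNP.
example = {
    "snp_a": [0.005, 0.008, 0.009],  # three p-values below 0.01
    "snp_b": [0.001, 0.002, 0.2],    # two p-values below 0.01
    "snp_c": [0.004, 0.006, 0.007],  # three below 0.01, smaller geometric mean
}
print(rank_snps(example))
```

Ranking on the count first favors SNPs with consistent evidence across phenotypes over SNPs with a single very small p-value, which is the stated aim of combining tests across related traits.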
Results

Table 1 displays the hemostatic and hematological phenotypes analyzed in this study, as well as the number of individuals, examination cycles, and covariates used in multivariable models. The sample size ranged from 702 to 1073. Traits measured at multiple examinations were analyzed using multivariable adjusted residuals from each examination measure, and also the average of all the multivariable adjusted residuals from individual examination cycles.

Table 1. Description of hemostatic factors, hematological phenotypes, and covariates adjustment

Among individuals who were included in the genotyping and had at least one hemostatic factor or platelet aggregation phenotype measured at examination cycle five, 52% were women, mean age was 52 years, and 6% had prevalent CVD. Among individuals who were included in the genotyping and had at least one hematological phenotype measured at examination cycle one or two, 52% were women, with a mean age over the two examinations of 36 years, and 2% had prevalent CVD.

Association between SNPs and hemostatic and hematological phenotypes

We report the 25 SNPs with lowest GEE association test p-values in Table 2 for hemostatic factors, and in Table 3 for hematological phenotypes. The lowest GEE p-value (4.5 × 10^-16) for hemostatic factors was obtained from the test of association between circulating levels of FVII and rs561241; this SNP resides near the F7 gene on chromosome 13 and is in complete linkage disequilibrium (LD) (r^2 = 1) with the Arg353Gln F7 SNP (rs6046) we previously reported to account for 9% of total phenotypic variance [16].
The lowest GEE p-value (6.9 × 10^-8) for hematological phenotypes was obtained in the test of association between MCH and rs1397048 on chromosome 11 near the olfactory receptor genes olfactory receptor, family 5, subfamily AP, member 2 (OR5AP2), olfactory receptor, family 5, subfamily AR, member 1 (OR5AR1), olfactory receptor, family 9, subfamily G, member 1 (OR9G1) and olfactory receptor, family 9, subfamily G, member 4 (OR9G4). The 25 SNPs with lowest FBAT association test p-values are presented in Additional file 1, Table A1 and Table A2, respectively.

Table 2. The 25 SNPs with lowest GEE association test p-values with hemostatic factors measured at exam 5

Table 3. The 25 SNPs with lowest GEE association test p-values with hematological phenotypes

Linkage results

Maximum multipoint LOD scores greater than 2 and the 1.5-LOD support intervals around the maximum LOD scores are presented in Table 4. The highest LOD score for hemostatic factors was 3.3 for factor VII at approximately 15 Mb on chromosome 10. The highest LOD for hematological phenotypes was 3.4 for Hgb at approximately 55 Mb on chromosome 4.

Table 4. Maximum LOD scores (≥2) on each chromosome for hemostatic factors and hematological phenotypes

Combining association tests across multiple phenotypes

The top 10 SNPs with the largest number of p-values < 0.01 and the lowest mean p-values are reported in Tables 5 and 6 for platelet aggregation phenotypes and fibrinogen levels, respectively. The top ranked SNP for platelet aggregation was rs10500631 on chromosome 11, located near an olfactory gene cluster.
The p-values of the GEE association test for ADP-, collagen- and epinephrine-induced platelet aggregation levels with this SNP were all less than 0.01, with average p-value 0.007 over the three tests. The range of the regression coefficients was 0.19–0.24, indicating the effect size was consistently estimated across the three phenotypes.

Table 5. Top 10 ranked SNPs in combining GEE association tests of ADP-induced, collagen-induced and Epi-induced platelet aggregation levels

Table 6. Top 10 ranked SNPs in combining GEE association tests of fibrinogen levels measured at examination cycles 5, 6 and 7

For fibrinogen, the top ranked SNP was rs4861952 on chromosome 4, which was also listed in Table 2 as one of the 25 SNPs most significantly associated with hemostatic factors. This SNP was consistently associated with fibrinogen levels across three examination cycles, with effect size ranging from -0.28 to -0.17.

Association of SNPs in known candidate genes

100K SNPs residing in or near known candidate genes for hemostatic factors are presented in Table 7. Among the candidate genes for hemostatic factors, no 100K SNP was in or within 60 kb of PLAT. Only SNPs in or near the rest of the candidate genes (F7, FGG, FGA, FGB, ITGB3, SERPINE1 and vWF) are presented. Among all these associations, three reached nominal significance (p-value < 0.05): rs561241 for factor VII, and rs6950982 and rs6956010 for PAI-1.

Table 7. Association between SNPs in/near known hemostatic candidate genes, and the corresponding phenotypes

Among the candidate genes for hematological traits, no 100K SNP was in or within 60 kb of EPOR, EPO, KLF1, HBA1, HBA2 or HBM. For the rest of the candidate genes, associations between hematological phenotypes and 100K SNPs in/near EPB41L2, the beta hemoglobin gene cluster on chromosome 11 (HBB, HBD, HBG1, HBG2, HBE1), and HEBP2 are presented in Table 8.
The most significant associations were SNP rs1582055 near EPB41L2 with hematocrit (p = 7.7 × 10^-5), Hgb (p = 2.9 × 10^-4), and RBCC (p = 3.9 × 10^-4), and SNP rs4897475 with hematocrit (p = 1.6 × 10^-4) and Hgb (p = 6.0 × 10^-4).

Table 8. Association between hematological phenotypes and SNPs in/near known candidate genes

Discussion

We conducted a GWAS and genome-wide linkage analyses for hemostatic factors and hematological phenotypes measured in Framingham Heart Study Offspring participants. We identified a highly significant association between factor VII level and SNP rs561241, which is in complete LD with the F7 SNP rs6046 (Arg353Gln) previously demonstrated to explain about 9% of total phenotypic variation. This association is significant after Bonferroni correction for multiple testing (we used a conservative α = 5 × 10^-8), and confirms the strong association at this locus that has previously been reported by us and others. This SNP was also significant at a nominal α level of 0.05 for FBAT (p-value = 3.4 × 10^-4) and for the linkage test (LOD = 1.8, p-value = 0.002), but not after Bonferroni correction. That may be explained by the well known fact that FBAT and linkage tests are less powerful than population-based association tests. In this study, FBAT lacks power to detect variants that explain a small proportion of variance. It is difficult to distinguish true positives from false ones among FBAT results because few 100K SNPs evidently explain a large proportion of variance for hemostatic factors or hematological phenotypes. Given that there is no evidence for major population substructure in FHS [33] and that GEE testing offers greater power, we emphasize our population-based GEE analysis results in this report.

Linkage analyses have the same problem of low power to detect small effects. However, a linkage peak can be caused by loci in linkage but not in LD with the SNPs, or by several loci of small effects in the region.
Thus linkage peaks deserve additional attention. For example, we identified a linkage peak on chromosome 10 for multivariable adjusted factor VII. The SNP underneath the peak is rs2400107. However, its GEE association p-value was 0.52. This could occur because rs2400107 was linked but not in LD with the trait locus (or loci) under the peak, because this linkage peak was caused by several loci of small effects, or because this peak was a false positive. Therefore, a more careful examination of the association results of SNPs under the linkage peak, possibly along with additional genotyping, may be needed to confirm the linkage results.

Among the SNPs with top GEE p-values in single phenotype or multiple phenotype analyses, only a few resided near genes with a known or likely role in hemostasis, thrombosis or hematological biology. For hemostatic factors, the cis-acting SNP rs561241 near the F7 gene was associated with factor VII. For hematological phenotypes, we identified rs6811964 near PDGFC, platelet derived growth factor-C. It has been shown that PDGFC is highly expressed in vascular smooth muscle cells, renal mesangial cells and platelets, and is likely involved in platelet biology [34]. This SNP was associated with Epi-induced platelet aggregation (P = 10^-5, Table 3), with ADP-induced platelet aggregation at nominal significance (P = 0.02), and with collagen-induced platelet aggregation at borderline nominal significance (P = 0.08). Other associations were found with SNPs in genes not clearly related to the phenotypes, or with SNPs that are not in known genes. These associations, together with other findings from this GWAS, must be viewed as hypotheses that warrant further testing in other cohorts.

Although we only summarized results for multivariable adjusted phenotypes, we have also conducted linkage and association analyses for age-sex adjusted phenotypes.
It is possible that the effects of some loci are mediated through the covariates included in multivariable adjustment, so that these loci are associated only with age and sex adjusted phenotypes. Among the 52 SNPs that were associated with age and sex adjusted hemostatic factors or hematological phenotypes with a GEE p-value less than or equal to 10^-5, 28 SNPs had a GEE p-value greater than 10^-5 with multivariable adjusted phenotypes. However, no age and sex adjusted GEE p-value for the 28 SNPs reached genome-wide significance (p-value < 5 × 10^-8), and no new highly plausible candidate genes resided within 60 kb of these SNPs. The full disclosure results of all analyses, including the age-sex adjusted analyses, can be viewed at http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?id=phs000007.

There are some limitations to this study. The participants are Caucasian and thus the results may not be generalizable to other racial groups. The study sample size was relatively small, and as such, we may have insufficient power to detect small effects. To avoid worsening the multiple testing problem, we performed only sex-pooled and not sex-specific analyses. SNPs that are associated with a phenotype only in women or only in men may therefore have gone undetected in the current study.

The advantages of this study are that we had family data, which enabled us to also apply family-based association tests that are robust to population admixture, and linkage analyses that can detect loci not in LD but in linkage with any 100K SNP. The study subjects were recruited without regard to their phenotypic values, which makes the analyses of multiple phenotypes possible without the need to correct for ascertainment bias. Finally, compared with studies focused only on SNPs within candidate genes, GWAS approaches are unbiased and as such they have the advantage of detecting novel genes or confirming genes that are not well known to have an influence on a phenotype.
However, since the current GWAS uses only a subset of all the SNPs in HapMap [35], it may miss some genes due to lack of coverage. For the same reason, GWAS data usually are not enough to study a candidate gene comprehensively. To understand the roles played by each SNP in a candidate gene, additional genotyping and both single-SNP and haplotype analyses are needed. A large GWAS involving more than 550,000 SNPs in more than 9000 participants of FHS will be available for analysis later in 2007, providing increased power for detection of smaller effects for the hemostatic and hematological phenotypes.

Conclusion

In summary, we have tested for association and linkage using the Affymetrix 100K SNPs and a set of hemostatic factor and hematological phenotypes. We have confirmed a previously reported association, providing proof of principle (a "positive control") for the GWAS approach. Our results provide a set of hypotheses that warrant testing in additional studies.

Abbreviations

ADP = adenosine 5'-diphosphate; bp = base pair(s); CHD = coronary heart disease; CVD = cardiovascular disease; CHR = chromosome; EPB41L2 = erythrocyte membrane protein band 4.1-like 2; EPO = erythropoietin; EPOR = erythropoietin receptor; FBAT = family-based association test; FGB = fibrinogen-β; FGA = fibrinogen-α; FGG = fibrinogen-γ; FHS = Framingham Heart Study; FVII = factor VII; GAW = Genetic Analysis Workshop; GEE = generalized estimating equations; GWAS = genome-wide association study; HBA1 = hemoglobin-α 1; HBA2 = hemoglobin-α 2; HBB = hemoglobin-β chain complex; HBD = hemoglobin-δ; HBE1 = hemoglobin-ε 1; HBG1 = hemoglobin-γ A; HBG2 = hemoglobin-γ G; HBM = hemoglobin-μ; HCT = hematocrit; HEBP2 = heme binding protein 2; Hgb = hemoglobin; ITGB3 = integrin, beta 3 (platelet glycoprotein IIIa, antigen CD61); kb = kilo base pairs (= 1,000 bp); KLF1 = Kruppel-like factor 1; LD = linkage disequilibrium; MAF = minor allele frequency; MCV = mean corpuscular volume; MCH = mean corpuscular hemoglobin; Mb = mega base pairs (= 1,000,000 bp); OR5AP2 = olfactory receptor, family 5, subfamily AP, member 2; OR5AR1 = olfactory receptor, family 5, subfamily AR, member 1; OR9G1 = olfactory receptor, family 9, subfamily G, member 1; OR9G4 = olfactory receptor, family 9, subfamily G, member 4; PAI-1 = plasminogen activator inhibitor-1; PDGFC = platelet derived growth factor-C; RBCC = red blood cell count; SERPINE1 = serpin peptidase inhibitor, clade E, member 1; SNP = single nucleotide polymorphism; tPA = tissue plasminogen activator; vWF = von Willebrand factor; WBC = white blood cell.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

QY took a leading role in study design, results summarization and interpretation, and manuscript drafting. SK contributed to the study design, summarization and interpretation of the results, and the manuscript revision. JPL contributed to the study of hematological phenotypes. GHT contributed to the measurement and study of the hemostatic traits. COD contributed to study design, results summarization and manuscript drafting and revision. All authors read and approved the final manuscript.

Acknowledgements

This work is supported by National Institutes of Health/National Heart, Lung & Blood Institute (NHLBI) Contract N01-HC-25195. A portion of the research was conducted using the BU Linux Cluster for Genetic Analysis (LinGA) funded by the NIH NCRR (National Center for Research Resources) Shared Instrumentation grant (1S10RR163736-01A1). We express our gratitude to the Framingham Heart Study participants, and thank our collaborators Emelia Benjamin and Martin G. Larson for their helpful input.

This article has been published as part of BMC Medical Genetics Volume 8 Supplement 1, 2007: The Framingham Heart Study 100,000 single nucleotide polymorphisms resource. The full contents of the supplement are available online at http://www.biomedcentral.com/1471-2350/8?issue=S1.

References