Open Access Methodology article

Phenotype-genotype association grid: a convenient method for summarizing multiple association analyses

Daniel Levy1,2,3,4,5*, Steven R DePalma6, Emelia J Benjamin2, Christopher J O'Donnell1,2,7, Helen Parise8, Joel N Hirschhorn10,6,9, Ramachandran S Vasan2, Seigo Izumo11 and Martin G Larson2,8

Author Affiliations

1 National Heart, Lung, and Blood Institute, Bethesda, MD, USA

2 National Heart, Lung, and Blood Institute's Framingham Heart Study, Framingham, MA, USA

3 Cardiology Division, Beth Israel-Deaconess Medical Center, Boston, MA, USA

4 Division of Cardiology

5 Department of Preventive Medicine, Boston University School of Medicine, Boston, MA, USA

6 Department of Genetics, Harvard Medical School and Howard Hughes Medical Institute, Boston, MA, USA

7 Division of Cardiology, Massachusetts General Hospital, Boston, MA, USA

8 Department of Mathematics and Statistics, Boston University, Boston, MA, USA

9 Divisions of Genetics and Endocrinology, Children's Hospital, Boston, MA, USA

10 Broad Center at Harvard and MIT, Cambridge, MA, USA

11 Novartis Research Institute, Cambridge, MA, USA


BMC Genetics 2006, 7:30  doi:10.1186/1471-2156-7-30

Published: 22 May 2006

Abstract

Background

High-throughput genotyping generates vast amounts of data for analysis; results can be difficult to summarize succinctly. A single project may involve genotyping many genes with multiple variants per gene and analyzing each variant in relation to numerous phenotypes, using several genetic models and population subgroups. Hundreds of statistical tests may be performed for a single SNP, thereby complicating interpretation of results and inhibiting identification of patterns of association.

Results

To facilitate visual display and summary of large numbers of association tests of genetic loci with multiple phenotypes, we developed a Phenotype-Genotype Association (PGA) grid display. A database-backed web server was used to create PGA grids from phenotypic and genotypic data (sample sizes, means and standard errors, and P-values for association). HTML pages were generated using Tcl scripts on an AOLserver platform with an Oracle database and the ArsDigita Community System web toolkit. The grids are interactive: a mouse click on an individual cell displays its summary data (i.e., least squares means for a given SNP and phenotype under a specified genetic model and study sample). PGA grids can be used to visually summarize results of individual SNP associations, gene-environment associations, or haplotype associations.
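As a rough illustration of the idea, the following Python sketch renders a small grid of association results as an HTML table, with one row per phenotype, one column per SNP, and each cell shaded by its P-value. It is not the authors' Tcl/AOLserver implementation: the function names, the P-value thresholds and colors, and the use of a `title` tooltip in place of the click-through summary are all assumptions made for this example.

```python
from html import escape

# Hypothetical P-value cutoffs and colors; the source does not specify a scheme.
THRESHOLDS = [(0.001, "#b30000"), (0.01, "#e34a33"), (0.05, "#fdbb84")]
NEUTRAL = "#ffffff"


def cell_color(p):
    """Return the color of the first cutoff that p falls below."""
    for cutoff, color in THRESHOLDS:
        if p < cutoff:
            return color
    return NEUTRAL


def render_pga_grid(results, phenotypes, snps):
    """Build an HTML table with one row per phenotype and one column per SNP.

    results maps (phenotype, snp) -> dict with keys "p", "n", "mean", "se".
    """
    header = "<tr><th></th>" + "".join(f"<th>{escape(s)}</th>" for s in snps) + "</tr>"
    rows = ["<table>", header]
    for ph in phenotypes:
        cells = [f"<th>{escape(ph)}</th>"]
        for s in snps:
            r = results.get((ph, s))
            if r is None:
                cells.append("<td></td>")  # no test recorded for this pair
                continue
            # The tooltip stands in for the per-cell summary shown on click.
            tip = f"n={r['n']}, mean={r['mean']}, SE={r['se']}, P={r['p']}"
            color = cell_color(r["p"])
            cells.append(f'<td style="background:{color}" title="{escape(tip)}"></td>')
        rows.append("<tr>" + "".join(cells) + "</tr>")
    rows.append("</table>")
    return "\n".join(rows)


if __name__ == "__main__":
    # Made-up values purely for demonstration; not results from the study.
    demo = {
        ("phenotype_1", "snp_A"): {"p": 0.004, "n": 100, "mean": 1.5, "se": 0.2},
        ("phenotype_2", "snp_B"): {"p": 0.30, "n": 100, "mean": 0.9, "se": 0.3},
    }
    print(render_pga_grid(demo, ["phenotype_1", "phenotype_2"], ["snp_A", "snp_B"]))
```

Keying the results by (phenotype, SNP) pairs keeps missing tests explicit as empty cells. A separate grid per genetic model or study sample could be produced by passing a different `results` mapping.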

Conclusion

The PGA grid, which permits interactive exploration of large numbers of association test results, is easily adapted and can serve as a common display format for large-scale genetic studies. Adopting such a format would help reduce publication bias and would simplify the task of summarizing large-scale association studies.