Email updates

Keep up to date with the latest news and content from BMC Bioinformatics and BioMed Central.

Open Access Highly Accessed Software

SNPEVG: a graphical tool for GWAS graphing with mouse clicks

Shengwen Wang, Daniel Dvorkin and Yang Da*

Author Affiliations

Department of Animal Science, University of Minnesota, St. Paul, USA

For all author emails, please log on.

BMC Bioinformatics 2012, 13:319  doi:10.1186/1471-2105-13-319

The electronic version of this article is the complete one and can be found online at: http://www.biomedcentral.com/1471-2105/13/319


Received:11 July 2012
Accepted:24 November 2012
Published:30 November 2012

© 2012 Wang et al.; licensee BioMed Central Ltd.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Background

Genome-wide association studies (GWAS) using single nucleotide polymorphism (SNP) markers generate large quantities of tests results. Global and local graphical viewing of the test results is an effective approach to digest and interpret GWAS results.

Results

SNPEVG is a set of graphical tools for instant global and local viewing and graphing of GWAS results for all chromosomes and for each trait. The current version includes three programs, SNPEVG1, SNPEVG2 and SNPEVG3. SNPEVG1 is a graphical tool for SNP effect viewing of P-values allowing multiple traits. The total number of graphs that can be generated by one ‘Run’ is n(c + 2), where n is number of ‘traits’ with 0 < n ≤ 100, and c is the number of chromosomes. SNP effect viewing and graphing is accomplished through a user friendly graphical user interface (GUI) that provides a wide-range of options for the user to choose. The GUI can produce the Manhattan plot, the Q-Q plot of all SNP effects, and graphs for SNP effects by chromosome by clicking one command. Any or all the graphs can be saved with publication quality by clicking one command. SNPEVG2 is for the viewing and graphing of multiple traits on the same graph with options to graph any or all of the traits, customizable colors and user specified Y1 or Y2 axis for each traits. The SNPEVG3 program uses the output file of single-locus test results from the epiSNP computer package as the input file. Each chromosome figure can display three genetic effects (genotypic, additive and dominance effects), and the number of observations.

Conclusions

The SNPEVG package is a versatile, flexible and efficient graphical tool for rapid digestion of large quantities of GWAS results with mouse clicks.

Background

GWAS analysis generally yields large quantities of test results. Global and local graphical viewing of the test results is an effective approach and often is a necessary step for interpreting GWAS results. A widely used graphical viewing of GWAS results is the Manhattan plot, which provides a global graphic view of GWAS results of all chromosomes for a trait on one graph to quickly identify genome locations with the most significant SNP effects [1,2]. Following this global view, detailed graphical examination of each chromosome is helpful for further understanding the GWAS results, and more graphical work often is needed for effective presentation of the GWAS results. The purpose of the SNPEVG package is to provide a graphical tool for rapid digestion of GWAS results and to accomplish large quantities of graphical tasks of GWAS analysis in a seamless fashion.

Implementation

The SNPEVG computer package is implemented in the C++ programming language. The object orientation feature of the C++ language enables the efficient software development cycle by easy reuse of modules for different applications with similar features. The SNPEVG computer package used the Qt library under the terms of the GNU Lesser General Public License (LGPL) version 2.1 as shown [3].

Results and discussion

SNPEVG Version 3.2 includes three graphical programs: SNPEVG1, SNPEVG2 and SNPEVG3. SNPEVG1 is for graphing effects of one trait per graph for up to 100 traits, SNPEVG2 is for graphing multiple traits on the same graph, and SNPEVG3 processes directly uses an output file of EPISNP or EPISNPmpi [2] as the input file. Both SNPEVG1 and SNPEVG2 using the same format of the input file, which contain name, chromosome number and chromosome position of each SNP marker, and P-values of statistical tests from any method. Each program has a scalable GUI allowing efficient and flexible use of computer screen and allows the production of graphical images with user defined vertical/horizontal ratios. Each program can be launched multiple times by mouse click of the executable program so that the user can compare graphical effects of different graph options simultaneously. SNPEVG 3.2 is available from Additional files 1 and 2 or from the website at http://animalgene.umn.edu webcite. Full features of the SNPEVG package are described in the SNPEVG user manual [4] Additional file 3.

Additional file 1. SNPEVG_Win_3.2.zip: contains all executables and libraries for Microsoft Windows platforms and the user manual of SNPEVG package version 3.2.

Format: ZIP Size: 14.2MB Download fileOpen Data

Additional file 2. SNPEVG_Mac_3.2.zip: contains all executables for Mac OS X platforms and the user manual of SNPEVG package version 3.2.

Format: ZIP Size: 19.7MB Download fileOpen Data

Additional file 3. A Graphical Tool For SNP Effect Viewing and Graphing. (PDF 6838 kb)

Format: PDF Size: 6.7MB Download file

This file can be viewed with: Adobe Acrobat ReaderOpen Data

The SNPEVG1 program

SNPEVG1 supports a maximum 100 traits. The GUI (Figure 1A) has numerous graphical options for Manhattan plots, including user-customized colors (Figure 1B), shading P-values below the threshold P-value line (Figure 1C), and scalable pixel size proportional to P-values [5] (Figure 1A 1C), and displaying P-values above the specified cut-off P-value (Figure 1D). Each Manhattan plot uses true chromosome size defined by the starting and ending SNP marker positions of the chromosome. P-values for the unknown chromosome are displayed in sequential order of SNP markers rather than chromosome positions. Manhattan plots and Q-Q plots (Figure 1E) provide global view of test results for each trait. In addition to global viewing, the GUI produces graphs for each chromosome and each trait. For each chromosome, P-values can be presented as connected lines (Figure 2A) or separate symbols (Figure 2B). The total number of graphs that can be generated is n(c + 2), where n is the number of ‘traits’ with 0 < n ≤ 100, and c is the number of chromosomes. Assuming 30 traits and 30 chromosomes per trait, the program produces 960 graphs for interactive viewing by one click of ‘run’, including 30 Manhattan plots, 30 Q-Q plots and 900 chromosome graphs. The upper-right window of the GUI (Figure 1A) is the ‘Graph list’ by trait, showing a list of graphs produced by the ‘Run’ button. The user can turn off Manhattan and Q-Q plots, scroll the chromosome graphs of each trait using the up or down arrow key, and switch between traits using the left or right key. Any selected graph, or graphs for selected traits, or all graphs can be saved as graphical images with publication quality by clicking a button on the GUI. SNPEVG1 requires a simple text input file with the following columns: CHR, POSITION, SNP, and P-VALUE columns, where CHR = chromosome number, POSITION = chromosomal position of the SNP marker, SNP = name of the SNP marker, and P-VALUE is the P-value for a trait.

thumbnailFigure 1. SNPEVG1 GUI and Manhattan and Q-Q plots for global viewing and graphing of GWAS results. A: The user friendly GUI of SNPEVG1 offers user interactive viewing and graphing of global and local test results. B: The ‘Manhattan setting’ plate for customized chromosome color, fixed pixel size, or dynamic pixel size proportional to P-values in Manhattan plot. C: A Manhattan plot with color Template 1, proportional pixel size, and shading of P-values below the threshold P-value line. D: A Manhattan plot with color Template 2, proportional pixel size, and elimination of P-values below the line of cut-off P-value. E: A Q-Q plot.

thumbnailFigure 2. Chromosome graphs of SNPEVG1. A: Chromosome figure of SNPEVG1 with connecting line between adjacent data points. B: Chromosome figure of SNPEVG1 without connecting line between adjacent data points.

The SNPEVG2 program

SNPEVG2 is designed to display P-values of multiple traits on the same graph. Each chromosome figure can display P-values in log scale or the original values of a variable on either Y1 or Y2 axis (up to 100 traits) (Figure 3A). The Y2 axis can be used to display a variable unrelated to P-values such as minor allele frequency or allele frequency difference between the best and worst individuals, allowing the production of more flexible and informative graphs than using Y1 axis presenting P-values only. The chromosome graphs can be crowded and difficult to view if the number of traits is large. This problem can be solved by the option to select traits to display, to customize the color of each trait or switch Y1 and Y2 axes using the ‘Setting’ button on the GUI (Figure 3B). Each Y axis, Y1 or Y2, can have its own threshold P-value or cut-off P-value (Figures 3A and C). SNPEVG2 requires a simple text input file with the same format as for SNPEVG1, i.e., CHR, POSITION, SNP, and P-VALUE columns, where CHR = chromosome number, POSITION = chromosomal position of the SNP marker, SNP = name of the SNP marker, and P-VALUE is the P-value for a trait.

thumbnailFigure 3. SNPEVG2 program GUI and chromosome graphs of SNPEVG2 and SNPEVG3. A: The user friendly GUI of SNPEVG2 offers user interactive viewing and graphing of global and local test results from multiple traits on the same graphs. B: The ‘Effect setting’ plate for the selection of traits to be displayed, the selection of Y1 or Y2 axis, and customized chromosome color for each trait. C: A chromosome graph with two threshold value lines for Y1 and Y2 axes, and elimination of P-values below the line of cut-off P-value line. D: Chromosome figure of SNPEVG3 with connecting line between adjacent data points. E: Chromosome figure of SNPEVG3 without connecting line between adjacent data points.

The SNPEVGconvert program

The SNPEVGconvert program is designed to convert an output file from any GWAS analysis software to the format of SNPEVG1 and SNPEVG2. With this format conversion program, virtually any GWAS software could SNPEVG1 and SNPEVG2. To use this program, the user only needs to specify the number of columns in the original files and identify the column numbers to be printed in the input file for SNPEVG1 and SNPEVG2.

The SNPEVG3 program

SNPEVG3 is developed for graphical analysis of GWAS using the output file of single-locus test results of EPISNP or EPISNPmpi [2] as the input file for drawing figures. SNPEVG3 has similar GUI features as SNPEVG1, but it does not have the limit of 100 traits. This program draws graphs for P-values of additive, dominance and genotypic effects on the Y1 axis and draws sample size on the Y2 axis. The P-values can be displayed with lines connecting adjacent data points (Figure 3D) or use symbols without connecting lines (Figure 3E). The user has an option to draw a figure by a sorted effect such as additive or dominance effect.

Evaluation of sample size limitations

Currently, SNPEVG1, SNPEVG2 and SNPEVG3 have a Microsoft Windows 32-bit version and a 64-bit version for Mac OS X 10.6 or newer. A Windows 64-bit version is expected to become available at a later time. For practical purposes, either the 32-bit or the 64-bit version would be powerful enough for real GWAS data sets. For a single trait, the 32-bit version could process 10 million markers per trait in about 30 seconds but failed for 12 million markers, and the 64-bit version could process 30 million markers in 80.62 seconds (Table 1). For multiple traits, the number of markers that can be processed per trait is approximately the numbers in Table 1 divided by the number of traits (Table 2).

Table 1. Processing time and limit of SNP markers for SNPEVG1 and SNPEVG3 of 32-bit (Windows 7) and 64-bit versions (Mac OS X) for one trait

Table 2. Processing time and limit of SNP markers for SNPEVG1 and SNPEVG2 of 32-bit (Windows 7) and 64-bit versions (Mac OS X) for 100 traits

Conclusions

The SNPEVG package is a versatile and efficient graphical tool for rapid digestion of large quantities of test results from GWAS and can be customized for graphical viewing and drawing of non-GWAS information such as allele frequency differences.

Availability and requirements

Project name: SNPEVG

Project homepage: http://animalgene.umn.edu/ webcite

Operating system(s): Microsoft Windows 7, Mac OS X 10.6 or newer

Other requirements: none.

License: none.

Any restrictions to use by non-academics: none.

Abbreviations

GWAS: Genome-wide association study; SNP: Single nucleotide polymorphism.

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

SW is the author of SNPEVG1, SNPEVG2, and SNPEVG3. DD is the author of the EPISNPPLOT program that is partially used in SNPEVG3. YD designed most functions of the computing tools, and is the lead writer of the manuscript. All authors read and approved this manuscript.

Acknowledgements

This research is supported by USDA National Institute of Food and Agriculture Grant no. 2011-67015-30333 and by project MN-16-043 of the Agricultural Experiment Station at the University of Minnesota.

References

  1. Zhao JH: Gap: Genetic analysis package.

    J Stat Softw 2007., 23(i08)

    http://www.jstatsoft.org/v23/i08/paper webcite

    OpenURL

  2. Ma L, Runesha HB, Dvorkin D, Garbe JR, Da Y: Parallel and serial computing tools for testing single-locus and epistatic SNP effects of quantitative traits in genome-wide association studies.

    BMC Bioinformatics 2008, 9(1):315. PubMed Abstract | BioMed Central Full Text | PubMed Central Full Text OpenURL

  3. GNU Lesser General Public License, version 2.1. 1999.

    http://www.gnu.org/licenses/old-licenses/lgpl-2.1.html webcite

    OpenURL

  4. Wang S, Dvorkin D, Da Y:

    A graphical tool for SNP effect viewing and graphing, version 3.2. 2012.

    http://animalgene.umn.edu webcite

    OpenURL

  5. Cole JB:

    Data Structures and Visualization. ADSA/ASAS Joint Annual Meeting. 2011.

    http://aipl.arsusda.gov/publish/presentations/ADSA11/ADSA11_jbc_files/frame.htm webcite

    OpenURL