Open Access Highly Accessed Software

GACT: a Genome build and Allele definition Conversion Tool for SNP imputation and meta-analysis in genetic association studies

Arvis Sulovari1,2 and Dawei Li1,3,4*

Author Affiliations

1 Department of Microbiology and Molecular Genetics, University of Vermont, 05405 Burlington, VT, USA

2 Cell, Molecular and Biomedical Sciences Graduate Program, University of Vermont, 05405 Burlington, VT, USA

3 Department of Computer Science, University of Vermont, 05405 Burlington, VT, USA

4 Neuroscience, Behavior and Health Initiative, University of Vermont, 05405 Burlington, VT, USA

BMC Genomics 2014, 15:610  doi:10.1186/1471-2164-15-610

Published: 19 July 2014

Abstract

Background

Genome-wide association studies (GWAS) have successfully identified genes associated with complex human diseases. Although much of the heritability remains unexplained, combining single nucleotide polymorphism (SNP) genotypes from multiple studies for meta-analysis will increase the statistical power to identify new disease-associated variants. Meta-analysis requires the same allele definition (nomenclature) and genome build across the individual studies. Imputation, commonly used prior to meta-analysis, requires the same consistency. However, the genotypes from various GWAS are generated using different genotyping platforms, arrays or SNP-calling approaches, resulting in the use of different genome builds and allele definitions. Incorrectly assuming that combined GWAS share an identical allele definition leads to a large portion of genotypes being discarded or to incorrect association findings. There is no published tool that predicts and converts among all major allele definitions.
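To illustrate why mismatched allele definitions cause discarded genotypes or wrong associations, the following is a minimal sketch of one well-known case: the same SNP reported on opposite DNA strands. It is not GACT's implementation. The function names (`flip`, `is_ambiguous`, `harmonize`) and the returned labels are hypothetical, and the sketch covers only strand complementation, not the full set of allele definitions the tool handles.

```python
# Illustrative sketch only; names and labels are hypothetical, not GACT's API.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def flip(alleles):
    """Return the allele pair as read from the opposite DNA strand."""
    return tuple(COMPLEMENT[a] for a in alleles)


def is_ambiguous(alleles):
    """A/T and C/G SNPs read the same on both strands, so strand cannot be inferred."""
    return set(flip(alleles)) == set(alleles)


def harmonize(study_alleles, reference_alleles):
    """Classify how a study SNP's allele pair relates to a reference allele pair."""
    study, ref = set(study_alleles), set(reference_alleles)
    if is_ambiguous(study_alleles):
        return "ambiguous"      # cannot be resolved from the alleles alone
    if study == ref:
        return "same_strand"    # alleles can be combined directly
    if set(flip(study_alleles)) == ref:
        return "flipped"        # complement the study alleles before combining
    return "mismatch"           # would be discarded in a naive merge


print(harmonize(("A", "G"), ("T", "C")))
```

In this sketch an A/G SNP compared against T/C is classified as "flipped": the two studies describe the same variant, but a merge that assumes identical definitions would treat it as a mismatch and discard it.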

Results

In this study, we developed GACT (Genome build and Allele definition Conversion Tool), which predicts and converts between any of the common SNP allele definitions and between the major genome builds. We also assessed several factors that may affect imputation quality. Including singletons in the reference was detrimental, whereas ambiguous SNPs had no measurable effect. Unexpectedly, excluding genotypes with a missing rate > 0.001 (40% of study SNPs) did not significantly decrease imputation quality, especially for rare SNPs; quality was even significantly higher than for imputation with singletons in the reference.
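The two filters discussed above, per-SNP missing rate and singleton status, can be sketched as follows. This is an assumption-laden illustration rather than the procedure used in the study: genotypes are assumed to be coded as alternate-allele counts (0, 1, 2) with `None` for a missing call, and all function names are hypothetical. Only the 0.001 threshold is taken from the text.

```python
# Illustrative sketch; the genotype coding and function names are assumptions.

def missing_rate(genotypes):
    """Fraction of missing calls (None) for one SNP across samples."""
    return sum(g is None for g in genotypes) / len(genotypes)


def minor_allele_count(genotypes):
    """Count copies of the less frequent allele among called genotypes."""
    called = [g for g in genotypes if g is not None]
    alt = sum(called)                        # copies of the alternate allele
    return min(alt, 2 * len(called) - alt)   # two alleles per called sample


def is_singleton(genotypes):
    """A singleton is observed exactly once among all called alleles."""
    return minor_allele_count(genotypes) == 1


def passes_missing_filter(genotypes, max_missing=0.001):
    """Keep a SNP only if its missing rate does not exceed the threshold."""
    return missing_rate(genotypes) <= max_missing


snp = [0, 0, 1, 0]
print(is_singleton(snp), passes_missing_filter(snp))
```

With a threshold as strict as 0.001, a single missing call among fewer than 1,000 samples is enough to exclude a SNP, which is consistent with a large share of study SNPs being removed by this filter.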

Conclusion

GACT is a new, powerful, and user-friendly tool, available in both command-line and interactive online versions, that can accurately predict and convert between any of the common allele definitions and between genome builds for genome-wide meta-analysis and imputation of genotypes from SNP arrays or deep sequencing, particularly for data from dbGaP and other public databases.

GACT software

http://www.uvm.edu/genomics/software/gact

Keywords:
Allele definition (nomenclature); Genome build; Genome-wide association study (GWAS); Imputation; Meta-analysis