Department of Mathematical Sciences, Michigan Technological University, 1400 Townsend Drive, Houghton, MI 49931, USA

Abstract

Background

We have conducted a genome-wide association study on the Genetic Analysis Workshop (GAW) 16 rheumatoid arthritis data using a multilocus score test based on the wavelet transform, which we proposed recently. The wavelet-based test automatically adjusts for the amount of noise suppressed from the data. The power of the test is also increased by using the genetic information contained in the spatial ordering of single-nucleotide polymorphisms on a chromosome.

Results

After adjusting for the effect of population stratification, the test identified some previously discovered rheumatoid arthritis susceptibility loci (

Conclusion

This new test provides a useful tool in genome-wide association studies.

Background

In genome-wide association studies, the first choice of tests is usually a single-marker test. If a single-nucleotide polymorphism (SNP) has a strong association with a disease, single-marker tests should have higher power than multilocus tests. Multilocus tests can achieve higher power if several SNPs are associated with the disease. However, the potential high power of multilocus tests could be diminished as an increased number of markers results in an increased number of degrees of freedom. Therefore, reducing the number of degrees of freedom is essential to increasing the power of multilocus tests. Different strategies were introduced to reduce the number of degrees of freedom. Tests based on haplotype sharing (the longest continuous interval of matching alleles between haplotypes) effectively reduce the number of degrees of freedom

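Many multilocus association tests are not affected by permuting the spatial order of SNPs; thus, they do not use the information contained in the ordering of SNPs. For example, the results of logistic regression will not change if the order of the SNPs is permuted. The same is true for the test obtained by fitting a regression function with one SNP followed by Bonferroni correction to find the global p-value. Principal-components regression (PCReg) is also unaffected. Let G be the genotype matrix, with eigendecomposition G^{T}G/(n - 1) = VΛV^{-1}. PCReg regresses the phenotype on GV_{1}, where V_{1} is the first several columns of V. If the columns of G are permuted by a permutation matrix P, then (GP)^{T}(GP)/(n - 1) = P^{T}VΛ(P^{T}V)^{-1}, the eigenvalues are not changed by the permutation, and the matrix of eigenvectors becomes P^{T}V. The regression model is therefore unchanged, because GP(P^{T}V_{1}) = GV_{1}.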
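The permutation argument can be checked numerically on a toy example. The sketch below is only an illustration: the function names, the small genotype matrix, and the use of power iteration on the uncentered matrix G^{T}G are assumptions chosen to keep the example short and dependency-free, not part of the analysis. It shows that reordering the columns of a genotype matrix leaves the leading eigenvalue unchanged and merely reorders the entries of the leading eigenvector.

```python
def gram(G):
    """Return G^T G for a matrix stored as a list of rows."""
    m = len(G[0])
    return [[sum(row[a] * row[b] for row in G) for b in range(m)]
            for a in range(m)]


def permute_columns(G, perm):
    """Column a of the result is column perm[a] of G."""
    return [[row[c] for c in perm] for row in G]


def top_eigenpair(A, n_iter=500):
    """Power iteration for the leading eigenpair of a symmetric matrix.

    The positive starting vector suits this toy case, where A has
    non-negative entries and therefore a positive leading eigenvector.
    """
    m = len(A)
    v = [1.0 + 0.1 * k for k in range(m)]
    for _ in range(n_iter):
        w = [sum(A[a][b] * v[b] for b in range(m)) for a in range(m)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit-length vector v.
    lam = sum(v[a] * sum(A[a][b] * v[b] for b in range(m)) for a in range(m))
    return lam, v


# Hypothetical genotypes: 6 individuals (rows) by 4 SNPs (columns).
G = [[0, 1, 2, 1],
     [1, 1, 0, 2],
     [2, 0, 1, 0],
     [0, 2, 1, 1],
     [1, 0, 0, 2],
     [2, 1, 1, 0]]
perm = [2, 0, 3, 1]  # an arbitrary reordering of the SNPs

lam, v = top_eigenpair(gram(G))
lam_perm, v_perm = top_eigenpair(gram(permute_columns(G, perm)))
```

Because the eigenvector entries move together with the SNP columns, the component score of each individual, and hence any regression on those scores, is the same before and after the reordering.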

We recently proposed a score test based on wavelet transform

Methods

The North American Rheumatoid Arthritis Consortium (NARAC) data for Genetic Analysis Workshop (GAW) 16 comprise 2,062 individuals: 868 cases and 1,194 controls. These individuals were genotyped on the 550k Illumina SNP chip. We analyzed the 22 autosomal chromosomes in this report. SNPs satisfying any of the following criteria were excluded: missing genotype rate > 0.05, or minor allele frequency < 0.05, or having ... The k^{th} window is from the (8(k - 1) + 1)^{th} SNP to the 8k^{th} SNP. Let G_{k} be the corresponding eight columns of the genotype matrix G. The genotypes in G_{k} were recoded to maximize the number of positive pairwise correlations. This removes any ambiguity in the coding of genotypes. In our simulation studies, the recoding increased the power of the test while keeping the type I error rate in check.
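The marker filtering and windowing can be sketched as follows. This is a minimal illustration, not the code used in the analysis: it applies only the two exclusion thresholds that are stated explicitly (missing genotype rate and minor allele frequency) and then splits the remaining SNPs into consecutive non-overlapping windows of eight. The function names, the 0/1/2 allele-count coding with `None` for missing calls, and the decision to drop trailing SNPs that do not fill a complete window are assumptions made for this example.

```python
def passes_qc(calls, max_missing=0.05, min_maf=0.05):
    """Return True if one SNP passes the missing-rate and MAF filters.

    calls: per-individual allele counts (0, 1, or 2), None for a missing call.
    """
    observed = [c for c in calls if c is not None]
    if not observed:
        return False
    missing_rate = (len(calls) - len(observed)) / len(calls)
    allele_freq = sum(observed) / (2 * len(observed))
    maf = min(allele_freq, 1 - allele_freq)
    # A SNP is excluded when missing rate > 0.05 or MAF < 0.05.
    return missing_rate <= max_missing and maf >= min_maf


def nonoverlapping_windows(n_snps, size=8):
    """Return 0-based (start, end) index pairs, end exclusive.

    Trailing SNPs that do not fill a whole window are dropped (assumption).
    """
    return [(s, s + size) for s in range(0, n_snps - size + 1, size)]
```

For example, `nonoverlapping_windows(20)` yields two windows covering the first 16 retained SNPs.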

We applied EIGENSTRAT to adjust for population stratification. Before the adjustment, the genomic control inflation factor was λ_{GC} = 1.399, which was reduced to 1.025 after the adjustment. To avoid a possible bias, all chromosomes except 6 and 8 were included to produce the eigenvectors, which were used to adjust genotypes on chromosomes other than 6. Chromosome 6 was used to produce eigenvectors to adjust genotypes on chromosome 6.
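For readers unfamiliar with the genomic control inflation factor, the sketch below uses the common median-based definition: the median of the observed 1-degree-of-freedom χ^{2} statistics divided by the median of the χ^{2} distribution with 1 degree of freedom (about 0.4549). The text does not state which variant was used to obtain the values reported here, so the definition, the function name, and the input format are assumptions.

```python
from statistics import median

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats):
    """Median-based inflation factor from single-SNP 1-df chi-square statistics."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN
```

A value close to 1 suggests little inflation of the test statistics, whereas a value well above 1 points to confounding such as population stratification.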

Consider the ^{th }window and the corresponding matrix of adjusted genotype ^{th }row of _{i1}, _{i2}, ..., _{im}) of the ^{th }individual. Subtract the mean of the ^{th }column of _{ij}) from _{ij }such that the mean of each column of _{1}, _{2}, ..., _{n}) be the adjusted phenotype of _{i})) = _{i}_{j }under the null hypothesis can be estimated by

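The global p-values were obtained from 5,000 permutations of the phenotypes, which give a matrix (T_{ij}), where T_{ij} is the absolute value of the test statistic for the j^{th} window using the i^{th} set of permuted phenotypes. Let M_{i} = max_{j} T_{ij} be the maximum absolute value of the test statistic using the i^{th} set of permuted phenotypes. Let T_{j} be the absolute value of the test statistic for the j^{th} window using the adjusted phenotypes. The global p-value of the j^{th} window is the proportion of M_{i} > T_{j}: global p_{j} = #{M_{i}|M_{i} > T_{j}}/5,000.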
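The max-statistic permutation procedure can be written compactly. In this sketch, `observed` holds the absolute test statistic of each window for the real phenotypes, and `permuted` holds one row of absolute window statistics per permuted phenotype set; both names, and the assumption that the statistics have already been computed, are illustrative choices rather than details of the original implementation.

```python
def global_p_values(observed, permuted):
    """Global p-value per window from the permutation maxima.

    observed: absolute statistic of each window for the real phenotypes.
    permuted: one list per permutation, with the absolute statistic of
              every window for that set of permuted phenotypes.
    """
    # Largest statistic over all windows within each permutation.
    maxima = [max(row) for row in permuted]
    n_perm = len(maxima)
    # Proportion of permutation maxima that exceed the observed statistic.
    return [sum(m > t for m in maxima) / n_perm for t in observed]
```

Comparing every window against the same distribution of maxima accounts for the number of windows tested, so no separate multiple-testing correction is applied to these p-values. With 5,000 permutations, the smallest nonzero global p-value that can be reported is 1/5,000 = 0.0002.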

Results and discussion

After correcting for population stratification, significant signals were found only on chromosomes 6 and 9. Four windows on chromosome 6 attracted our attention. The first window (rs9268005, rs3130340, rs3115553, rs9268132, rs926070, rs6935269, rs7775397, rs17422797) contains rs3130340, which has been associated with bone mineral density and fractures

We applied the wavelet-based test on a moving window of eight SNPs with overlapping (the first window contains SNPs 1-8, the second window contains SNPs 2-9, etc.) on a 550-kb region of chromosome 6 for fine mapping. The results are shown in Figure ^{2 }test. If ^{2 }test was not significant around rs3130340 (
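The overlapping moving window used for fine mapping differs from a non-overlapping scan only in its step size. The helper below is a hypothetical illustration (its name and parameters are not from the original analysis) that lists the first and last SNP of each window in 1-based positions, matching the numbering used in the text.

```python
def moving_windows(n_snps, size=8, step=1):
    """Return 1-based inclusive (first, last) SNP positions of each window."""
    return [(s, s + size - 1) for s in range(1, n_snps - size + 2, step)]
```

With the default step of 1, a region of 10 SNPs gives the windows 1-8, 2-9, and 3-10; setting `step=8` gives consecutive windows with no shared SNPs.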

**p-Values on 22 chromosomes, and comparison of windows**. The plot on the left contains the

**Comparison with single marker test**. Comparisons of the wavelet-based test and the Armitage χ^{2 }test on chromosome 6. The plots on the left are for the whole chromosome 6, and the plots on the right are fine mapping results for a 550-kb region of chromosome 6. The triangles indicate the positions (from left to right) of rs3130340 (associated with bone mineral density and RA), rs2076530 (associated with sarcoidosis and RA), HLA-DRB1 (Note: no data from 32.54 Mb to 32.67 Mb), and rs6457617 (associated with RA).

Conclusion

A wavelet-based multilocus score test was applied in a genome-wide association study on RA data followed by fine mapping of regions identified in our genome-wide association study. Several statistically significant risk loci for RA were identified after adjustment for population stratification. Some windows contain genes and/or SNPs (

List of abbreviations used

NARAC: North American Rheumatoid Arthritis Consortium; PCReg: Principal-components regression; RA: Rheumatoid arthritis; SNP: Single-nucleotide polymorphism

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

RJ and JD both contributed to the development of the statistical test, provided simulation strategies, and drafted the manuscript. RJ also participated in and guided the numerical calculations. YD carried out part of the programming work. All authors read and approved the manuscript.

Acknowledgements

This research was partially supported by National Institutes of Health grant GM069940-01A2. The authors thank the reviewers for their helpful suggestions, which greatly improved the paper.

This article has been published as part of