Department of Statistics, Carnegie Mellon University, Pittsburgh, USA

Department of Statistics, Universidade Federal de São Carlos, São Carlos, Brazil

Department of Psychiatry, University of São Paulo, São Paulo, Brazil

Department of Statistics, University of Brasília, Brasília, Brazil

Department of Statistics, University of São Paulo, São Paulo, Brazil

Abstract

Background

The evaluation of associations between genotypes and diseases in a case-control framework plays an important role in genetic epidemiology. This paper focuses on the evaluation of the homogeneity of both genotypic and allelic frequencies. The traditional test that is used to check allelic homogeneity is known to be valid only under Hardy-Weinberg equilibrium, a property that may not hold in practice.

Results

We first describe the flaws of the traditional (chi-squared) tests for both allelic and genotypic homogeneity. Besides the known problem of the allelic procedure, we show that whenever these tests are used, an incoherence may arise: sometimes the genotypic homogeneity hypothesis is not rejected, but the allelic hypothesis is. As we argue, this is logically impossible. Some recently proposed methods implicitly rely on the idea that this does not happen. In an attempt to correct this incoherence, we describe an alternative frequentist approach that is appropriate even when Hardy-Weinberg equilibrium does not hold. It is then shown that the problem remains and is intrinsic to frequentist procedures. Finally, we introduce the

Conclusions

Contrary to more traditional approaches, the

Background

One of the main goals in genetic epidemiology is the evaluation of associations between specific genotypes or alleles and a certain disease. Association studies are usually performed in a case-control framework in which one or several polymorphisms of candidate genes are evaluated in a group of cases (that is, patients who have the disease) and in a group of controls from the same population (that is, healthy individuals).

Several statistical tests are usually employed for this scenario. Among them, Cochran-Armitage test for trends

HWE plays an important role in genetic studies, in particular when testing for allelic homogeneity

In the present paper, we focus on two hypotheses: 1. homogeneity of the genotypic frequencies; and 2. homogeneity of the allelic frequencies. Usually, data in such studies are summarized in two different ways _{A} sub genes would contribute to a disorder due to methamphetamine use. It is worth noting that Table

Genotypic frequencies for the data set presented in

| **Group** | **AA** | **AB** | **BB** | **Total** |
|---|---|---|---|---|
| Case | 55 | 83 | 50 | 188 |
| Control | 24 | 42 | 39 | 105 |

Allelic frequencies derived from Table

| **Group** | **A** | **B** | **Total** |
|---|---|---|---|
| Case | 193 | 183 | 376 |
| Control | 90 | 120 | 210 |
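The allelic counts follow from the genotypic counts because each AA individual carries two A alleles and each AB individual carries one of each. A minimal sketch of that derivation, using the counts tabulated above (the function name `allele_counts` is illustrative, not from the paper's routine):

```python
def allele_counts(n_aa, n_ab, n_bb):
    """Return (A count, B count, total alleles) for one group of individuals."""
    n_a = 2 * n_aa + n_ab  # each AA carries two A alleles, each AB carries one
    n_b = 2 * n_bb + n_ab  # each BB carries two B alleles, each AB carries one
    return n_a, n_b, n_a + n_b


# Genotype counts (AA, AB, BB) for cases and controls.
genotypes = {"Case": (55, 83, 50), "Control": (24, 42, 39)}
alleles = {group: allele_counts(*counts) for group, counts in genotypes.items()}

for group, counts in alleles.items():
    print(group, counts)
```

Note that the allelic table has twice as many observations as there are individuals, which is the "doubling of the sample size" that the traditional allelic test relies on.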

The aims of this paper are four-fold: (1) to describe how the analysis of such data is usually conducted and to emphasize its known flaw (namely, lack of robustness to departures from HWE); (2) to describe an exact frequentist approach that is correct from a classical point of view; (3) to present a Bayesian method to deal with the problem; and (4) to advocate the use of the Bayesian solution by demonstrating why it is preferable to the others. The main argument is based on an undesirable logical inconsistency that can happen whenever

The paper is organized as follows. Section Methods contains three subsections: Usual Procedures, which introduces the notation that is used throughout the paper, discusses the usual methods to deal with the problem and argues why the test for allelic homogeneity is wrong when there are departures from HWE;

Methods

Here, we formally describe three different approaches to deal with the problem described: the usual procedure, a correct frequentist proposal and a Bayesian solution.

Usual Procedures

We begin by describing the statistical model that is used to deal with the problem approached in this paper (namely, a product of multinomials) and also how the hypotheses of interest are usually tested in the genetics literature. For more details, see

Let **X** = (

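Genotypic frequencies (probabilities).

| **Group** | **AA** | **AB** | **BB** | **Total** |
|---|---|---|---|---|
| Case | $x_{AA}$ ($\theta_{AA}$) | $x_{AB}$ ($\theta_{AB}$) | $x_{BB}$ ($\theta_{BB}$) | |
| Control | $y_{AA}$ ($\gamma_{AA}$) | $y_{AB}$ ($\gamma_{AB}$) | $y_{BB}$ ($\gamma_{BB}$) | |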

Considering observations from different individuals to be statistically independent, we have **X**|

which is the product of two multinomial distributions.

The first hypothesis to be tested (null hypothesis), namely that there is no difference in genotypic frequencies between the groups, may be formally expressed as

The usual procedure to test

where ^{G} has asymptotic distribution

The second hypothesis states that there is no difference in allelic frequencies between the groups. This hypothesis, which will be made formal in the next section, is usually tested by considering the allelic frequencies in both samples, $x_{A} = 2x_{AA} + x_{AB}$ and $y_{A} = 2y_{AA} + y_{AB}$, as in Table

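Allelic frequencies from Table

| **Group** | **A** | **B** | **Total** |
|---|---|---|---|
| Case | $x_{A} = 2x_{AA} + x_{AB}$ | $x_{B} = 2x_{BB} + x_{AB}$ | $2(x_{AA} + x_{AB} + x_{BB})$ |
| Control | $y_{A} = 2y_{AA} + y_{AB}$ | $y_{B} = 2y_{BB} + y_{AB}$ | $2(y_{AA} + y_{AB} + y_{BB})$ |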

More formally, the statistic considered is

where

Applying the traditional tests to data from Table
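As an illustration of the two traditional procedures, the sketch below applies Pearson's chi-square test to the 2×3 genotype table (2 degrees of freedom) and to the derived 2×2 allele table (1 degree of freedom) for the case-control counts given earlier. It assumes the standard Pearson statistic without continuity correction; the helper names are illustrative and only the closed-form tail probabilities for 1 and 2 degrees of freedom are implemented.

```python
import math


def pearson_chi2(table):
    """Pearson chi-square statistic for an r x c contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def chi2_sf(stat, df):
    """Upper-tail chi-square probability; closed forms for df = 1 and 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("only df = 1 or 2 is supported in this sketch")


# Rows are cases and controls; columns are AA, AB, BB.
genotypes = [[55, 83, 50], [24, 42, 39]]
# Allele table obtained by "doubling" the sample: columns are A and B.
alleles = [[2 * aa + ab, 2 * bb + ab] for aa, ab, bb in genotypes]

p_genotypic = chi2_sf(pearson_chi2(genotypes), df=2)
p_allelic = chi2_sf(pearson_chi2(alleles), df=1)
print(round(p_genotypic, 3), round(p_allelic, 3))
```

Under these assumptions the genotypic p-value is about 0.15 and the allelic one is just below 0.05. At the 5% level the allelic hypothesis would be rejected while the genotypic hypothesis would not, which is the incoherence discussed in this paper.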

A Different Frequentist Test

Some attempts to correct the above-mentioned allelic test so that it works even when the HWE assumption is not met are considered by

Here we show another solution that has the advantages of being exact, unconditional, and computationally efficient, even for large data sets. Moreover, it is defined in the same parametric space Θ as the genotypic test. Essentially, this test is derived by noticing that the hypothesis that allele frequencies are the same in both groups can be written in terms of the original parametric space as

Note that this formulation is always true, independent of the Hardy-Weinberg equilibrium restriction, and does not involve changing either the sample space or the parametric space.
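To make the formulation concrete, one way of writing allelic homogeneity in the genotypic parameter space is sketched below. The symbols $\theta$ (cases) and $\gamma$ (controls) for the genotype probabilities are an assumed notation, and $H_0^{A}$ is a label introduced here only for illustration. The frequency of allele A in a group is the probability of AA plus half the probability of AB, so equality of allele frequencies is a single linear constraint:

```latex
H_0^{A}:\quad \theta_{AA} + \tfrac{1}{2}\,\theta_{AB} \;=\; \gamma_{AA} + \tfrac{1}{2}\,\gamma_{AB}
% Plug-in estimates from the case-control counts given earlier:
%   cases:    (2 \cdot 55 + 83) / (2 \cdot 188) = 193/376 \approx 0.513
%   controls: (2 \cdot 24 + 42) / (2 \cdot 105) =  90/210 \approx 0.429
```

Genotypic homogeneity ($\theta = \gamma$ componentwise) clearly implies this constraint, which is why the two hypotheses are nested.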

The chi-square statistic may be used to test this hypothesis:

Here,

and then using the relations

Maximization of Equation (3) can be efficiently done by using numerical methods such as Newton’s method ^{A∗} can then be compared to a ^{A∗} under the null hypothesis and compute the proportion of these that are larger than the observed statistic on the sample. This is the (estimate of the) exact
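The simulation step described above (draw samples under the null hypothesis and report the proportion of simulated statistics at least as large as the observed one) can be sketched as follows. This is a simplified illustration, not the paper's routine: the statistic `allele_freq_gap` is a hypothetical stand-in for the chi-square statistic, and the null genotype probabilities are taken to be the pooled sample frequencies instead of the numerically maximized estimate.

```python
import random


def sample_counts(n, probs, rng):
    """Draw multinomial genotype counts (AA, AB, BB) for n individuals."""
    draws = rng.choices((0, 1, 2), weights=probs, k=n)
    return [draws.count(g) for g in (0, 1, 2)]


def allele_freq_gap(case, control):
    """Stand-in statistic: absolute difference in A-allele frequency."""
    def freq_a(counts):
        return (2 * counts[0] + counts[1]) / (2 * sum(counts))
    return abs(freq_a(case) - freq_a(control))


def monte_carlo_p_value(case, control, null_case, null_control,
                        statistic, n_sims=2000, seed=0):
    """Proportion of null simulations whose statistic is >= the observed one."""
    rng = random.Random(seed)
    observed = statistic(case, control)
    hits = 0
    for _ in range(n_sims):
        sim_case = sample_counts(sum(case), null_case, rng)
        sim_control = sample_counts(sum(control), null_control, rng)
        if statistic(sim_case, sim_control) >= observed:
            hits += 1
    return hits / n_sims


case, control = (55, 83, 50), (24, 42, 39)
total = sum(case) + sum(control)
# Assumption: pooled frequencies as a simple point satisfying the null.
pooled = [(x + y) / total for x, y in zip(case, control)]
p_value = monte_carlo_p_value(case, control, pooled, pooled, allele_freq_gap)
print(p_value)
```

Because the genotype counts are simulated directly, no Hardy-Weinberg assumption enters the reference distribution. Replacing `allele_freq_gap` and the pooled null point by the statistic and maximizer described in the text would give the test proper.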

This test is very similar to the ones recently introduced by

The allelic

Bayesian Solution

Bayesian methods are the alternative inductive way to deal with such a problem. These methods are widely used nowadays because they allow prior knowledge from the researcher and scientific community to be incorporated into the analysis (see

In this paper, we choose to use the FBST (**θ**). We note that it is not necessary to attribute a different probability to each of the hypotheses: it is only necessary to specify

Suppose one is interested in testing the null hypothesis **θ** ∈

The measure of evidence proposed, the e-value, is defined by

In words,

Implementation of the FBST procedure requires two simple steps, which can be performed numerically:

•

•
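A minimal sketch of an FBST computation for the genotypic homogeneity hypothesis, assuming the usual two steps are (i) maximizing the posterior density over the null set and (ii) integrating the posterior over the region where the density does not exceed that maximum. It assumes independent Dirichlet posteriors for cases and controls with symmetric prior parameter `prior`, uses the closed-form constrained maximizer available for this particular hypothesis, and estimates the integral by Monte Carlo. Function names are illustrative.

```python
import math
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet(alpha) vector by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def log_kernel(theta, gamma, post_case, post_control):
    """Log posterior density up to an additive constant."""
    return (sum((a - 1) * math.log(t) for a, t in zip(post_case, theta))
            + sum((b - 1) * math.log(g) for b, g in zip(post_control, gamma)))


def fbst_genotypic_evalue(case, control, prior=1.0, n_sims=20000, seed=0):
    rng = random.Random(seed)
    post_case = [x + prior for x in case]
    post_control = [y + prior for y in control]
    # Step 1 (optimization): under theta == gamma the density is maximized
    # at weights proportional to a_i + b_i - 2 (assumed strictly positive).
    weights = [a + b - 2 for a, b in zip(post_case, post_control)]
    theta_star = [w / sum(weights) for w in weights]
    sup_null = log_kernel(theta_star, theta_star, post_case, post_control)
    # Step 2 (integration): posterior mass where the density is <= sup_null.
    below = 0
    for _ in range(n_sims):
        theta = sample_dirichlet(post_case, rng)
        gamma = sample_dirichlet(post_control, rng)
        if log_kernel(theta, gamma, post_case, post_control) <= sup_null:
            below += 1
    return below / n_sims


e_value = fbst_genotypic_evalue((55, 83, 50), (24, 42, 39))
print(e_value)
```

For the allelic hypothesis the optimization step has no such closed form and would need a numerical constrained maximizer; the integration step is unchanged.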

More details on the implementation of the FBST procedure can be found in

• Empirical power analysis

• Reference sensitivity analysis and paraconsistent logic

•

• Bayesian decision-theoretic approach

• An asymptotically consistent threshold for a given confidence level (

The prior distribution for **γ** in the routine that was implemented and is available on the website is a Dirichlet distribution, as well as the prior distribution for

Note that in this case the posterior distribution is also the product of two independent Dirichlet distributions (since the Dirichlet is conjugate to the multinomial distribution). Their parameters are $(x_{AA} + a_{AA}, x_{AB} + a_{AB}, x_{BB} + a_{BB})$ and $(y_{AA} + b_{AA}, y_{AB} + b_{AB}, y_{BB} + b_{BB})$, respectively. Simulation of the Dirichlet distribution can be efficiently done by sampling from Gamma distributions; see $a_{i}$ and $b_{i}$) are equal to 1,
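The Gamma-based simulation mentioned above can be sketched in a few lines: independent Gamma draws with the Dirichlet parameters as shapes, divided by their sum, yield one Dirichlet draw. The example below assumes uniform prior parameters equal to 1 and uses the case genotype counts; the variable names are illustrative.

```python
import random


def sample_dirichlet(alpha, rng):
    """One Dirichlet(alpha) draw: normalize independent Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


rng = random.Random(42)
case_counts = (55, 83, 50)   # observed (AA, AB, BB) counts for cases
prior = (1.0, 1.0, 1.0)      # assumed uniform Dirichlet prior
posterior = [x + a for x, a in zip(case_counts, prior)]

samples = [sample_dirichlet(posterior, rng) for _ in range(5000)]
mean_aa = sum(s[0] for s in samples) / len(samples)
# The exact posterior mean of the AA probability is 56 / 191.
print(round(mean_aa, 3), round(posterior[0] / sum(posterior), 3))
```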

The FBST procedure can be used in general, not only for testing allelic homogeneity. In particular, it can be used to test Hardy-Weinberg equilibrium, as shown by

**for HWE: real data.** Geometric representation of the HWE hypothesis (green curve), FBST tangential set (continuous ellipse) and 99% credible set (dashed ellipse): data from real samples.

**for HWE: simulated data.** Geometric representation of the HWE hypothesis (green curve), FBST tangential set (continuous ellipse) and 99% credible set (dashed ellipse): data from simulated samples (case 26 from Table

We see that while both groups from Figure

When testing genotypic and allelic homogeneity using FBST and uniform priors ($a_{i} = b_{i} = 1$ for all

Results and Discussion

We begin this Section by summarizing the results of the analyses for data presented in ^{G} is the traditional ^{G} is the ^{A} is the ^{A} is the

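Significance indices for homogeneity for data presented in Table

| **Genotypes: p** | **Genotypes: e** | **Alleles: p** | **Alleles: p (A∗)** | **Alleles: e** | **Hardy-Weinberg (Case): p** | **Hardy-Weinberg (Case): e** | **Hardy-Weinberg (Control): p** | **Hardy-Weinberg (Control): e** |
|---|---|---|---|---|---|---|---|---|
| 0.152 | 0.434 | 0.049 | 0.069 | 0.493 | 0.111 | 0.276 | 0.060 | 0.165 |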
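The text does not state which frequentist test produced the Hardy-Weinberg p-values. Assuming the standard Pearson chi-square goodness-of-fit test with one degree of freedom (an assumption, not confirmed by the source), the sketch below gives values close to the reported 0.111 (cases) and 0.060 (controls) for the case-control genotype counts.

```python
import math


def hwe_chi2_p_value(n_aa, n_ab, n_bb):
    """Pearson chi-square test of Hardy-Weinberg equilibrium (1 df)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # estimated frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square with 1 df equals erfc(sqrt(stat / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


p_hwe_case = hwe_chi2_p_value(55, 83, 50)
p_hwe_control = hwe_chi2_p_value(24, 42, 39)
print(round(p_hwe_case, 2), round(p_hwe_control, 2))
```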

Results of the simulations under three different scenarios: genotypic homogeneity, allelic (but not genotypic) homogeneity and no homogeneity at all. Bold p values indicate incoherence.

| | **Genotypes: p** | **Genotypes: e** | **Alleles: p** | **Alleles: p (A∗)** | **Alleles: e** | **Hardy-Weinberg (Case): p** | **Hardy-Weinberg (Case): e** | **Hardy-Weinberg (Control): p** | **Hardy-Weinberg (Control): e** |
|---|---|---|---|---|---|---|---|---|---|
| **Genotypic Homogeneity** | | | | | | | | | |
| 1 | **0.408** | 0.773 | **0.197** | **0.189** | 0.786 | 0.540 | 0.832 | 0.819 | 0.971 |
| 2 | 0.588 | 0.897 | 0.648 | 0.684 | 0.997 | 0.030 | 0.090 | 0.001 | 0.002 |
| 3 | 0.478 | 0.826 | 0.483 | 0.510 | 0.980 | 0.496 | 0.793 | 0.035 | 0.119 |
| 4 | **0.912** | 0.996 | **0.709** | **0.689** | 0.997 | 0.172 | 0.377 | 0.122 | 0.287 |
| 5 | **0.836** | 0.985 | **0.578** | **0.554** | 0.985 | 0.224 | 0.464 | 0.170 | 0.378 |
| 6 | **0.989** | 1.000 | **0.926** | **0.903** | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| 7 | **0.187** | 0.494 | **0.100** | **0.068** | 0.498 | 0.027 | 0.081 | 0.044 | 0.124 |
| 8 | **0.652** | 0.929 | **0.444** | **0.416** | 0.953 | 0.338 | 0.626 | 0.104 | 0.257 |
| 9 | **0.620** | 0.916 | **0.510** | **0.494** | 0.976 | 0.192 | 0.422 | 0.761 | 0.955 |
| 10 | 0.565 | 0.888 | 0.923 | 0.912 | 1.000 | 0.001 | 0.003 | 0.057 | 0.153 |
| **Allelic Homogeneity** | | | | | | | | | |
| 11 | 0.008 | 0.034 | 0.325 | 0.291 | 0.893 | 0.494 | 0.790 | 0.001 | 0.003 |
| 12 | 0.000 | 0.000 | 0.067 | 0.057 | 0.442 | 0.068 | 0.190 | 0.000 | 0.000 |
| 13 | 0.002 | 0.013 | 0.151 | 0.114 | 0.629 | 0.989 | 1.000 | 0.000 | 0.000 |
| 14 | 0.001 | 0.003 | 0.923 | 0.918 | 1.000 | 0.174 | 0.400 | 0.000 | 0.000 |
| 15 | 0.113 | 0.342 | 0.844 | 0.833 | 1.000 | 0.989 | 1.000 | 0.006 | 0.014 |
| 16 | 0.020 | 0.086 | 0.559 | 0.547 | 0.985 | 0.174 | 0.395 | 0.015 | 0.040 |
| 17 | 0.001 | 0.006 | 0.147 | 0.129 | 0.683 | 0.129 | 0.319 | 0.002 | 0.005 |
| 18 | 0.040 | 0.149 | 0.501 | 0.462 | 0.970 | 0.871 | 0.986 | 0.001 | 0.002 |
| 19 | 0.026 | 0.106 | 1.000 | 1.000 | 1.000 | 0.760 | 0.955 | 0.000 | 0.000 |
| 20 | 0.001 | 0.002 | 0.446 | 0.379 | 0.939 | 0.733 | 0.938 | 0.000 | 0.000 |
| **No Homogeneity** | | | | | | | | | |
| 21 | 0.000 | 0.000 | 0.925 | 0.928 | 1.000 | 0.000 | 0.000 | 0.015 | 0.045 |
| 22 | **0.843** | 0.987 | **0.646** | **0.618** | 0.993 | 0.055 | 0.153 | 0.141 | 0.333 |
| 23 | 0.062 | 0.219 | 0.104 | 0.124 | 0.661 | 0.989 | 1.000 | 0.007 | 0.028 |
| 24 | **0.669** | 0.939 | **0.403** | **0.408** | 0.955 | 0.994 | 1.000 | 0.621 | 0.882 |
| 25 | 0.000 | 0.000 | 0.000 | 0.000 | 0.003 | 0.000 | 0.000 | 0.771 | 0.958 |
| 26 | 0.105 | 0.331 | 0.017 | **0.047** | 0.403 | 0.001 | 0.001 | 0.001 | 0.001 |
| 27 | 0.000 | 0.000 | 0.000 | 0.000 | 0.012 | 0.072 | 0.197 | 0.010 | 0.033 |
| 28 | 0.180 | 0.485 | 0.230 | 0.233 | 0.835 | 0.310 | 0.598 | 0.324 | 0.602 |
| 29 | **0.134** | 0.387 | **0.068** | **0.045** | 0.389 | 0.045 | 0.128 | 0.063 | 0.170 |
| 30 | **0.807** | 0.980 | **0.522** | **0.517** | 0.980 | 0.806 | 0.971 | 0.713 | 0.933 |

Hence, it is reasonable to expect that

Even though this logical coherence is desirable, the analysis of data presented by

On the other hand,

For the problem considered here, this means that

In Table

As mentioned before

An important question is why we use the FBST methodology rather than standard Bayes factors, the traditional Bayesian procedure to test sharp hypotheses

We end this Section by addressing whether the FBST procedure has good power properties. Even though this is not of primary interest in this work and is not a relevant question for most orthodox Bayesians, we indicate that this Bayesian procedure has good frequency properties. In order to do this, we fix different values for $\theta_{AA}$, $\theta_{AB}$ and $\gamma_{AA}$. We then set $\gamma_{AB}$ to be $2(\theta_{AA} + \tfrac{1}{2}\theta_{AB} - \gamma_{AA} -$
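A power study of this kind can be sketched by simulation. The code below is an illustration under stated assumptions, not the paper's procedure: it estimates the rejection rate of the traditional allelic chi-square test only, the gap between the allele frequencies of the two groups is called `delta` (a hypothetical name for the quantity subtracted in the expression above), and the sample size of 200 per group is arbitrary.

```python
import math
import random


def sample_counts(n, probs, rng):
    """Draw multinomial genotype counts (AA, AB, BB) for n individuals."""
    draws = rng.choices((0, 1, 2), weights=probs, k=n)
    return [draws.count(g) for g in (0, 1, 2)]


def allelic_chi2_p_value(case, control):
    """Traditional 2x2 allelic chi-square test (1 df, no correction)."""
    a, b = 2 * case[0] + case[1], 2 * case[2] + case[1]
    c, d = 2 * control[0] + control[1], 2 * control[2] + control[1]
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 1.0
    stat = (a + b + c + d) * (a * d - b * c) ** 2 / denom
    return math.erfc(math.sqrt(stat / 2.0))


def control_probs(theta_aa, theta_ab, gamma_aa, delta):
    """Control genotype probabilities with allele-frequency gap delta."""
    gamma_ab = 2 * (theta_aa + theta_ab / 2 - gamma_aa - delta)
    gamma_bb = 1 - gamma_aa - gamma_ab
    if min(gamma_ab, gamma_bb) < 0:
        raise ValueError("parameters fall outside the simplex")
    return (gamma_aa, gamma_ab, gamma_bb)


def empirical_power(theta, gamma, n, level=0.05, n_sims=500, seed=0):
    """Fraction of simulated data sets in which the test rejects."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        case = sample_counts(n, theta, rng)
        control = sample_counts(n, gamma, rng)
        if allelic_chi2_p_value(case, control) < level:
            rejections += 1
    return rejections / n_sims


theta = (0.2, 0.4, 0.4)  # theta_AA = 1/5, theta_AB = 2/5
power_null = empirical_power(theta, control_probs(0.2, 0.4, 0.25, 0.0), n=200)
power_alt = empirical_power(theta, control_probs(0.2, 0.4, 0.25, 0.1), n=200)
print(power_null, power_alt)
```

With `delta = 0` the rejection rate estimates the actual type I error of the test, which need not equal the nominal level when the groups are not in Hardy-Weinberg equilibrium. The same loop can be reused with any other test by swapping the p-value function.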

**Power analysis.** Comparison of power of different tests for allelic homogeneity. Horizontal lines show the level of significance. Parameter settings ($\theta_{AA}$, $\theta_{AB}$, $\gamma_{AA}$) of the four panels, starting from the top left: (1/5, 2/5, 1/4); (1/5, 2/5, 1/4); (1/3, 1/5, 1/3); (1/3, 1/5, 1/3).

Conclusions

Although the traditional approach of doubling the sample size to test the allelic homogeneity hypothesis was already shown to be incorrect when Hardy-Weinberg equilibrium is not met, many recent articles in biology still use it. As Figure

Similar incoherences of $\mu_{1}$, $\mu_{2}$ and $\mu_{3}$. If we assume their distribution is normal with variance 1 and the sample means in each group (sufficient statistics) are −0.192, 0.015 and 0.017, the likelihood ratio $\mu_{1} = \mu_{2}$ is 0.037. On the other hand, when testing $\mu_{1} = \mu_{2} = \mu_{3}$ we get a $\mu_{1}, \mu_{2}, \mu_{3}) \propto 1$, the

As probabilities are monotonic, traditional Bayesian tests based on posterior probability calculations do enjoy the monotonicity property. However, using them here may be problematic because the hypotheses of interest are sharp: mixed continuous-discrete distributions are needed in this case. Bayes factors, on the other hand, were shown not to be monotonic. This does not invalidate their use: in fact, as pointed out by

The FBST computation is always performed in the full parameter space, which has dimension 4. Hence subhypotheses should coherently follow the orientation of the main hypothesis. Moreover, there is no need to specify special priors for each of the null hypotheses, only for the whole parametric space Θ. It can also be easily implemented. The problem with the FBST is that the values of the significance index, “e”, are related to the dimension and increase as the dimension increases. However, in

A routine written in R that performs all the tests considered in this paper can be downloaded from

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

RI wrote the original manuscript. AGH was responsible for finding real data for the problem, as well as discussing the methods from a biological point of view. AGH and RI did the literature survey. EYN, RI and VF worked on the mathematics of the methods as well as implemented them. CAdeBP first noticed the lack of monotonicity of the previous approaches and introduced the

Acknowledgements

The authors are grateful to Luís Gustavo Esteves, Julio Stern, Marcelo Lauretto, Rafael Bassi Stern and Sergio Wechsler for having discussed all the methods of FBST used in this paper. We also thank them for their patience and painstaking reading. We thank the anonymous referees for their comments, which much improved the quality of the paper. This work was supported by Coordenação de Aperfeiçoamento de Pessoal de Nível Superior; Conselho Nacional de Desenvolvimento Científico e Tecnológico; and Fundação de Amparo à Pesquisa do Estado de São Paulo.