On the use of haplotype phylogeny to detect disease susceptibility loci
1 Unité de recherche en Génétique Épidémiologique et structure des populations humaines, INSERM U535, Villejuif, France
2 Laboratoire Bordelais de Recherche en Informatique, UMR 5800, Bordeaux, France
3 Programme Avenir, INSERM U458, hôpital Robert Debré, AP-HP, Paris, France
4 Fondation Jean Dausset, Paris, France
BMC Genetics 2005, 6:24 doi:10.1186/1471-2156-6-24
Published: 18 May 2005
The cladistic approach proposed by Templeton has been presented as promising for the study of the genetic factors involved in common diseases. This approach allows the joint study of multiple markers within a gene by considering haplotypes and grouping them in nested clades. The idea is to search for clades with an excess of cases as compared to the whole sample and to identify the mutations defining these clades as potential candidate disease susceptibility sites. However, the performance of this approach for the study of the genetic factors involved in complex diseases has never been studied.
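The core idea of testing a clade for an excess of cases can be sketched as follows. This is a minimal illustration, not the method proposed in this paper: it assumes that a clade is simply a set of haplotype labels, that a clade is compared with the rest of the sample through a 2x2 Pearson chi-square test with one degree of freedom, and that the haplotype names and counts are hypothetical.

```python
from math import erfc, sqrt


def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 df, no continuity correction)
    for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table: an empty row or column
    return n * (a * d - b * c) ** 2 / denom


def p_value_1df(stat):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return erfc(sqrt(stat / 2.0))


def clade_excess_test(counts, clade):
    """Compare case/control counts inside a clade with the rest of the sample.

    counts: dict mapping haplotype label -> (n_cases, n_controls)
    clade:  set of haplotype labels grouped in the clade (assumed given)
    """
    case_in = sum(counts[h][0] for h in clade)
    ctrl_in = sum(counts[h][1] for h in clade)
    case_out = sum(c for h, (c, _) in counts.items() if h not in clade)
    ctrl_out = sum(k for h, (_, k) in counts.items() if h not in clade)
    stat = chi2_2x2(case_in, ctrl_in, case_out, ctrl_out)
    return stat, p_value_1df(stat)


# Hypothetical haplotype counts: (cases, controls)
counts = {"H1": (30, 10), "H2": (20, 10), "H3": (25, 40), "H4": (25, 40)}
stat, p = clade_excess_test(counts, {"H1", "H2"})
print(f"chi2 = {stat:.2f}, p = {p:.2g}")
```

In a nested design, such a test would be repeated for each clade at each level of the tree, so some correction for multiple testing would be needed; that step is left out of this sketch.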
In this paper, we propose a new method to perform such a cladistic analysis and we estimate its power through simulations. We show that under models where susceptibility to the disease is caused by a single genetic variant, the cladistic test is neither substantially more powerful for detecting an association nor substantially more efficient for localizing the susceptibility site than individual SNP testing. However, when two interacting sites are responsible for the disease, the cladistic analysis greatly improves the probability of finding both susceptibility sites. The impact of linkage disequilibrium and of the tree characteristics on the efficiency of the cladistic analysis is also discussed. An application to a real data set concerning the CARD15 gene and Crohn disease shows that the method can successfully identify the three variant sites that are involved in disease susceptibility.
The use of phylogenies to group haplotypes is especially useful for pinpointing, among the different markers identified within a gene, the sites that are likely to be involved in disease susceptibility.