This article is part of the supplement: Symposium of Computations in Bioinformatics and Bioscience (SCBB06)
Estimate haplotype frequencies in pedigrees
Department of Computer Science and Technology, University of Science and Technology of China, Hefei, Anhui, 230027, China
BMC Bioinformatics 2006, 7(Suppl 4):S5 doi:10.1186/1471-2105-7-S4-S5Published: 12 December 2006
Haplotype analysis has gained increasing attention in the context of association studies of disease genes and drug responsivities over the last years. The potential use of haplotypes has led to the initiation of the HapMap project which is to investigate haplotype patterns in the human genome in different populations. Haplotype inference and frequency estimation are essential components of this endeavour.
We present a two-stage method to estimate haplotype frequencies in pedigrees, which includes haplotyping stage and estimation stage. In the haplotyping stage, we propose a linear time algorithm to determine all zero-recombinant haplotype configurations for each pedigree. In the estimation stage, we use the expectation-maximization (EM) algorithm to estimate haplotype frequencies based on these haplotype configurations. The experiments demonstrate that our method runs much faster and gives more credible estimates than other popular haplotype analysis software that discards the pedigree information.
Our method suggests that pedigree information is of great importance in haplotype analysis. It can be used to speedup estimation process, and to improve estimation accuracy as well. The result also demonstrates that the whole haplotype configuration space can be substituted by the space of zero-recombinant haplotype configurations in haplotype frequency estimation, especially when the considered haplotype block is relatively short.