This article is part of the supplement: NIPS workshop on New Problems and Methods in Computational Biology
The Cluster Variation Method for Efficient Linkage Analysis on Extended Pedigrees
Department of Medical Physics and Biophysics, Radboud University, Nijmegen, The Netherlands
BMC Bioinformatics 2006, 7(Suppl 1):S1 doi:10.1186/1471-2105-7-S1-S1
Published: 20 March 2006
Computing exact multipoint LOD scores for extended pedigrees rapidly becomes infeasible as the number of markers and untyped individuals increases. When markers are excluded from the computation, significant power may be lost. Accurate approximate methods that take all markers into account are therefore desirable.
We present a novel method for efficient estimation of LOD scores on extended pedigrees. Our approach is based on the Cluster Variation Method, which deterministically estimates likelihoods by performing exact computations on tractable subsets of variables (clusters) of a Bayesian network. First, a distribution over inheritances at the marker loci is approximated with the Cluster Variation Method. This distribution is then used to estimate the LOD score for each candidate location of the trait locus.
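For readers unfamiliar with the quantity being estimated, the following minimal sketch shows what a LOD score is in the simplest possible setting. It is not the Cluster Variation Method and not the multipoint computation described above: it assumes phase-known, fully informative meioses and a single marker, so the likelihood reduces to a binomial in the recombination fraction theta. The function names `two_point_lod` and `lod_curve` and the example counts are hypothetical and chosen only for illustration.

```python
import math


def two_point_lod(recombinants: int, meioses: int, theta: float) -> float:
    """Two-point LOD score for phase-known meioses (toy setting, an assumption).

    LOD(theta) = log10( L(theta) / L(1/2) ), where
    L(theta) = theta^k * (1 - theta)^(n - k) for k recombinants in n meioses.
    """
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    non_recombinants = meioses - recombinants
    # Work in log10 space to avoid underflow for large pedigrees.
    log_linked = (recombinants * math.log10(theta)
                  + non_recombinants * math.log10(1.0 - theta))
    log_unlinked = meioses * math.log10(0.5)
    return log_linked - log_unlinked


def lod_curve(recombinants: int, meioses: int, thetas: list[float]) -> list[tuple[float, float]]:
    """Evaluate the LOD score on a grid of recombination fractions."""
    return [(t, two_point_lod(recombinants, meioses, t)) for t in thetas]


if __name__ == "__main__":
    # Hypothetical data: 1 recombinant observed in 10 informative meioses.
    for theta, lod in lod_curve(1, 10, [0.05, 0.1, 0.2, 0.3, 0.4, 0.5]):
        print(f"theta={theta:.2f}  LOD={lod:.3f}")
```

In the multipoint setting addressed here, the same log-likelihood ratio is evaluated for each candidate position of the trait locus, but the likelihoods require summing over all inheritance patterns consistent with the marker data, which is what makes exact computation expensive.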
First, we demonstrate that significant power may be lost if markers are ignored in the multipoint analysis. On a set of pedigrees where exact computation is possible, we compare the LOD score estimates obtained with our method to the exact LOD scores. Second, we compare our method to a state-of-the-art MCMC sampler. When both methods are given equal computation time, our method is more efficient. Finally, we show that the Cluster Variation Method scales to large problem instances.
We conclude that the Cluster Variation Method is as accurate as MCMC and is generally more efficient. Our method is a promising alternative to approaches based on MCMC sampling.