This article is part of the supplement: Symposium of Computations in Bioinformatics and Bioscience (SCBB07)
MLIP: using multiple processors to compute the posterior probability of linkage
1 Department of Oral Biology and Center for Craniofacial and Dental Genetics, School of Dental Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
2 Department of Computer Science, The University of Iowa, Iowa City, Iowa, USA
3 Battelle Center for Mathematical Medicine, The Research Institute at Nationwide Children's Hospital and The Ohio State University, Columbus, Ohio, USA
BMC Bioinformatics 2008, 9(Suppl 6):S2 doi:10.1186/1471-2105-9-S6-S2
Published: 28 May 2008
Localization of complex traits by genetic linkage analysis may involve exploration of a vast multidimensional parameter space. The posterior probability of linkage (PPL), a class of statistics for complex trait genetic mapping in humans, is designed to accommodate, in a mathematically rigorous fashion, the trait-model complexity that this parameter space represents. However, the method requires the evaluation of integrals with no functional form, which makes it difficult to compute and therefore difficult to test, develop further, and apply. This paper describes MLIP, a multiprocessor two-point genetic linkage analysis system that supports statistical calculations, such as the PPL, based on the full parameter space implicit in the linkage likelihood.
The fundamental question we address here is whether the use of additional processors effectively reduces total computation time for a PPL calculation. We use a variety of data, both simulated and real, to explore how close MLIP can get to linear speedup. Our empirical results show that MLIP significantly speeds up two-point log-likelihood ratio calculations over a grid of model parameters.
Observed performance of the program depends on characteristics of the data, including the granularity of the parameter grid being explored and the size and structure of the pedigrees. While work continues to further optimize performance, the current version of the program can already be used to compute the PPL efficiently. Thanks to MLIP, full multidimensional genome scans are now routinely completed at our centers with runtimes on the order of days, not months or years.
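
To make the notions of a parameter grid and of linear speedup concrete, the following is a minimal illustrative sketch in Python. It is not MLIP's implementation or scheduling scheme; the function names (`partition_grid`, `speedup`, `efficiency`) and the two-parameter grid point are hypothetical and chosen only for illustration. It shows one simple way to divide grid points evenly among processors, and how speedup and parallel efficiency are conventionally computed from measured runtimes. Linear speedup corresponds to an efficiency of 1.0.

```python
from typing import List, Sequence, Tuple

# Hypothetical grid point: (recombination fraction, one trait-model parameter).
# A real trait model would have more dimensions.
GridPoint = Tuple[float, float]


def partition_grid(points: Sequence[GridPoint], n_workers: int) -> List[List[GridPoint]]:
    """Split grid points into n_workers contiguous chunks whose sizes differ by at most one."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    base, extra = divmod(len(points), n_workers)
    chunks: List[List[GridPoint]] = []
    start = 0
    for worker in range(n_workers):
        # The first `extra` workers each take one additional point.
        size = base + (1 if worker < extra else 0)
        chunks.append(list(points[start:start + size]))
        start += size
    return chunks


def speedup(t_serial: float, t_parallel: float) -> float:
    """Ratio of single-processor runtime to multiprocessor runtime."""
    return t_serial / t_parallel


def efficiency(t_serial: float, t_parallel: float, n_workers: int) -> float:
    """Speedup divided by processor count; 1.0 means linear speedup."""
    return speedup(t_serial, t_parallel) / n_workers
```

With equal-sized chunks, the processors finish together only if every grid point costs about the same to evaluate. When the cost per point varies, for example with pedigree size and structure, some processors sit idle while others finish, and efficiency falls below 1.0.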