Open Access Research article

Maximum parsimony xor haplotyping by sparse dictionary selection

Abdulkadir Elmas1, Guido H Jajamovich2 and Xiaodong Wang1*

Author Affiliations

1 Department of Electrical Engineering, Columbia University, 500 W 120th St, New York, 10027 NY, USA

2 Department of Radiology, Icahn School of Medicine at Mount Sinai, New York, 10029 NY, USA

BMC Genomics 2013, 14:645  doi:10.1186/1471-2164-14-645

Published: 23 September 2013

Abstract

Background

The xor-genotype is a cost-effective alternative to the genotype sequence of an individual. Recent haplotype inference methods have aimed at finding the solution from xor-genotype data. Given the xor-genotypes of a group of unrelated individuals, it is possible to infer the haplotype pair of each individual with the aid of a small number of regular genotypes.
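
As an illustration only, the following minimal Python sketch shows how a xor-genotype relates to a haplotype pair and why some regular genotypes are needed to resolve it. The 0/1 allele coding, the use of 2 for heterozygous sites in the regular genotype, and the function names are assumptions made for this example; they are not taken from the article.

```python
from typing import List

def xor_genotype(h1: List[int], h2: List[int]) -> List[int]:
    """Per-SNP XOR of two binary haplotypes: 1 = heterozygous, 0 = homozygous."""
    return [a ^ b for a, b in zip(h1, h2)]

def regular_genotype(h1: List[int], h2: List[int]) -> List[int]:
    """Assumed coding: 0 or 1 for a homozygous site (the shared allele), 2 for heterozygous."""
    return [a if a == b else 2 for a, b in zip(h1, h2)]

# A hypothetical haplotype pair over four SNPs.
h1, h2 = [0, 1, 1, 0], [0, 0, 1, 1]

# Flipping every allele in both haplotypes gives a different pair...
f1 = [1 - a for a in h1]
f2 = [1 - a for a in h2]

# ...with the same xor-genotype but a different regular genotype.
same_xor = xor_genotype(h1, h2) == xor_genotype(f1, f2)
same_regular = regular_genotype(h1, h2) == regular_genotype(f1, f2)
print(xor_genotype(h1, h2), regular_genotype(h1, h2), same_xor, same_regular)
```

The xor-genotype records only which sites are heterozygous, so distinct haplotype pairs can share it; the regular genotype also identifies the allele at homozygous sites, which is the extra information the small set of regular genotypes contributes.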

Results

We propose a framework for maximum parsimony inference of haplotypes based on the search for a sparse dictionary, and we present a greedy method that can effectively infer the haplotype pairs given a set of xor-genotypes augmented by a small number of regular genotypes. We test the performance of the proposed approach on synthetic data sets with different numbers of individuals and SNPs, and compare it with the state-of-the-art xor-haplotyping methods PPXH and XOR-HAPLOGEN.

Conclusions

Experimental results show good inference quality for the proposed method in all tested settings, especially on large data sets. Results on a real database, CFTR, also demonstrate significantly better performance than the compared methods. The proposed algorithm is also capable of finding accurate solutions in the presence of missing data and/or typing errors.