This article is part of the supplement: Proceedings of the Second Annual RECOMB Satellite Workshop on Massively Parallel Sequencing (RECOMB-seq 2012)
Haplotype reconstruction using perfect phylogeny and sequence data
1 The Blavatnik School of Computer Science, Tel-Aviv University, Tel-Aviv, Israel
2 International Computer Science Institute, Berkeley, California, USA
3 Department of Molecular Microbiology and Biotechnology, Tel-Aviv University, Tel-Aviv, Israel
BMC Bioinformatics 2012, 13(Suppl 6):S3 doi:10.1186/1471-2105-13-S6-S3
Published: 19 April 2012
Haplotype phasing is a well-studied problem in the context of genotype data. With recent developments in high-throughput sequencing, new algorithms are needed for haplotype phasing when the number of sequenced samples is small and the sequencing coverage is low. High-throughput sequencing technologies enable new possibilities for the inference of haplotypes. Since each read originates from a single chromosome, all the variant sites it covers must derive from the same haplotype. Moreover, the sequencing process yields a much higher SNP density than previous methods, resulting in a higher correlation between neighboring SNPs. We offer a new approach for haplotype phasing that leverages these two properties. Our suggested algorithm, called Perfect Phylogeny Haplotypes from Sequencing (PPHS), uses a perfect phylogeny model and models the sequencing errors explicitly. We evaluated our method on real and simulated data, and we demonstrate that the algorithm outperforms previous methods when the sequencing error rate is high or when coverage is low.
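The two properties the abstract relies on can be illustrated with a minimal Python sketch. This is not the PPHS algorithm: it ignores sequencing errors, which PPHS models explicitly, and the function names and data layout (haplotypes as strings of '0'/'1', a read as a mapping from site index to observed allele) are assumptions made purely for illustration. The first function applies the standard four-gamete test, which checks whether a set of binary haplotypes is compatible with a perfect phylogeny; the second checks whether an error-free read could have come from a given haplotype.

```python
from itertools import combinations


def admits_perfect_phylogeny(haplotypes):
    """Four-gamete test on binary haplotypes (strings of '0'/'1').

    A set of binary haplotypes fits a perfect phylogeny exactly when
    no pair of sites shows all four allele combinations 00, 01, 10, 11.
    """
    if not haplotypes:
        return True
    n_sites = len(haplotypes[0])
    for i, j in combinations(range(n_sites), 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            return False
    return True


def read_consistent_with(read, haplotype):
    """Check an error-free read against one haplotype.

    `read` maps site index -> observed allele (hypothetical layout).
    Because a read comes from a single chromosome, every site it
    covers must match the same haplotype.
    """
    return all(haplotype[site] == allele for site, allele in read.items())


if __name__ == "__main__":
    haps = ["000", "010", "011"]
    print(admits_perfect_phylogeny(haps))
    print(read_consistent_with({0: "0", 2: "1"}, "011"))
```

In practice reads contain errors, so a hard consistency check like the one above would have to be replaced by a probabilistic score; the sketch only shows the constraints in their idealized form.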