YHap: a population model for probabilistic assignment of Y haplogroups from re-sequencing data
- Equal contributors
1 BGI-shenzhen, Shenzhen, China
2 Institute for Molecular Bioscience, University of Queensland, Queensland, Australia
3 Department of Genomics of Complex Disease, School of Public Health, Imperial College, London, UK
4 Department of Computational Medicine and Bioinformatics, Medical School, University of Michigan, Ann Arbor, USA
BMC Bioinformatics 2013, 14:331 doi:10.1186/1471-2105-14-331
Published: 19 November 2013
Y haplogroup analyses are an important component of genealogical reconstruction, population genetic analyses, medical genetics and forensics. These fields are increasingly moving towards the use of low-coverage, high-throughput sequencing. Although methods have recently been proposed for assigning Y haplogroups from high-coverage sequence data, assignment from low-coverage data remains challenging.
We developed a new algorithm, YHap, which uses an imputation framework to jointly predict Y chromosome genotypes and assign Y haplogroups using low-coverage population sequence data. We use data from the 1000 Genomes Project to demonstrate that YHap provides accurate Y haplogroup assignment at less than 2x coverage.
Borrowing information across multiple samples within a population using an imputation framework enables accurate Y haplogroup assignment.
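
The idea of borrowing information across samples can be illustrated with a small toy sketch. It is not the YHap algorithm: it swaps the imputation framework for a simple empirical-Bayes scheme in which haplogroup frequencies are pooled across samples by expectation-maximisation and then used as a prior for each sample. The three-marker panel, the haplogroup labels, the read counts and the error rate are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical toy panel: expected allele (0 = ancestral, 1 = derived) at
# three marker SNPs for three made-up haplogroups. Not a real Y phylogeny.
PANEL = {
    "A": (0, 0, 0),
    "B": (1, 0, 0),
    "C": (1, 1, 0),
}
ERROR_RATE = 0.01  # assumed per-read sequencing error rate


def log_likelihood(reads, states, err=ERROR_RATE):
    """Log P(reads | haplogroup) for one sample.

    reads: one (ancestral_count, derived_count) pair per marker;
    (0, 0) means the marker was not covered and contributes nothing.
    """
    total = 0.0
    for (n_anc, n_der), state in zip(reads, states):
        match, mismatch = (n_der, n_anc) if state == 1 else (n_anc, n_der)
        total += match * math.log(1 - err) + mismatch * math.log(err)
    return total


def posteriors(reads, freqs):
    """Posterior over haplogroups for one sample given prior frequencies."""
    logs = {h: math.log(freqs[h]) + log_likelihood(reads, s)
            for h, s in PANEL.items()}
    top = max(logs.values())  # subtract the maximum for numerical stability
    weights = {h: math.exp(v - top) for h, v in logs.items()}
    norm = sum(weights.values())
    return {h: w / norm for h, w in weights.items()}


def assign_population(samples, n_iter=20, pseudo=0.01):
    """EM over pooled haplogroup frequencies, then per-sample posteriors."""
    freqs = {h: 1.0 / len(PANEL) for h in PANEL}
    for _ in range(n_iter):
        post = [posteriors(r, freqs) for r in samples]
        denom = len(samples) + pseudo * len(PANEL)
        # The pseudocount keeps every frequency strictly positive.
        freqs = {h: (sum(p[h] for p in post) + pseudo) / denom for h in PANEL}
    return freqs, [posteriors(r, freqs) for r in samples]


# Hypothetical low-coverage read counts for three samples.
SAMPLES = [
    [(0, 2), (0, 1), (1, 0)],  # derived at markers 1 and 2
    [(0, 1), (0, 2), (2, 0)],  # derived at markers 1 and 2
    [(0, 1), (0, 0), (0, 0)],  # one read only: B and C fit equally well
]

flat_prior = {h: 1.0 / len(PANEL) for h in PANEL}
alone = posteriors(SAMPLES[2], flat_prior)
pooled_freqs, pooled = assign_population(SAMPLES)
print("third sample alone: ", alone)
print("third sample pooled:", pooled[2])
```

Analysed on its own, the third sample cannot distinguish haplogroups B and C, because its single read covers only the marker they share. Once frequencies are pooled across the population, the better-covered samples shift the prior and the ambiguity is resolved in favour of the haplogroup that is common in that population. A real method would also need to respect the tree structure of the Y phylogeny and handle far more markers.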