Open Access Methodology article

YHap: a population model for probabilistic assignment of Y haplogroups from re-sequencing data

Fan Zhang (1,4), Ruoyan Chen (1), Dongbing Liu (1), Xiaotian Yao (1), Guoqing Li (1), Yabin Jin (1), Chang Yu (1)*, Yingrui Li (1)* and Lachlan JM Coin (1,2,3)*

Author Affiliations

1 BGI-shenzhen, Shenzhen, China

2 Institute for Molecular Bioscience, University of Queensland, Queensland, Australia

3 Department of Genomics of Complex Disease, School of Public Health, Imperial College, London, UK

4 Department of Computational Medicine and Bioinformatics, Medical School, University of Michigan, Ann Arbor, USA


BMC Bioinformatics 2013, 14:331  doi:10.1186/1471-2105-14-331

Published: 19 November 2013



Background

Y haplogroup analyses are an important component of genealogical reconstruction, population genetics, medical genetics and forensics. These fields are increasingly moving towards low-coverage, high-throughput sequencing. Although methods have recently been proposed for assigning Y haplogroups from high-coverage sequence data, assignment from low-coverage data remains challenging.


Results

We developed a new algorithm, YHap, which uses an imputation framework to jointly predict Y chromosome genotypes and assign Y haplogroups from low-coverage population sequence data. Using data from the 1000 Genomes Project, we demonstrate that YHap assigns Y haplogroups accurately at less than 2x coverage.
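To make the idea of borrowing information across samples concrete, the following is a minimal toy sketch in Python. It is not YHap's actual model: the three-marker haplogroup panel, the per-read error rate, the pseudocount and all function names are hypothetical, chosen only for illustration. Each sample gets a posterior over haplogroups from its allele read counts, and a simple expectation-maximisation loop re-estimates population haplogroup frequencies, which then serve as the prior for every sample.

```python
import math

# Hypothetical toy panel: allele per marker site (0 = ancestral, 1 = derived).
HAPLOGROUPS = {
    "A": (0, 0, 0),
    "B": (1, 0, 0),
    "C": (1, 1, 0),
    "D": (1, 1, 1),
}
ERROR_RATE = 0.01   # assumed per-read sequencing error rate
PSEUDOCOUNT = 0.01  # assumed; keeps every frequency strictly positive


def log_likelihood(reads, alleles, err=ERROR_RATE):
    """Log P(reads | haplogroup); reads holds (ancestral, derived) counts per site."""
    total = 0.0
    for (n_anc, n_der), allele in zip(reads, alleles):
        p_der = 1.0 - err if allele == 1 else err
        total += n_der * math.log(p_der) + n_anc * math.log(1.0 - p_der)
    return total


def posterior(reads, prior):
    """Posterior over haplogroups for one sample, normalised in log space."""
    logs = {h: math.log(prior[h]) + log_likelihood(reads, a)
            for h, a in HAPLOGROUPS.items()}
    top = max(logs.values())
    weights = {h: math.exp(v - top) for h, v in logs.items()}
    norm = sum(weights.values())
    return {h: w / norm for h, w in weights.items()}


def fit_population(samples, n_iter=20):
    """EM: alternate per-sample posteriors with frequency re-estimation."""
    prior = {h: 1.0 / len(HAPLOGROUPS) for h in HAPLOGROUPS}
    for _ in range(n_iter):
        posts = [posterior(reads, prior) for reads in samples]
        denom = len(samples) + PSEUDOCOUNT * len(HAPLOGROUPS)
        prior = {h: (sum(p[h] for p in posts) + PSEUDOCOUNT) / denom
                 for h in HAPLOGROUPS}
    return prior, [posterior(reads, prior) for reads in samples]


# Four low-coverage samples; the first has no reads at the third marker,
# so on its own it cannot distinguish haplogroup C from D.
SAMPLES = [
    [(0, 2), (0, 1), (0, 0)],
    [(0, 1), (0, 2), (0, 3)],
    [(0, 2), (0, 1), (0, 2)],
    [(0, 1), (0, 1), (0, 1)],
]

if __name__ == "__main__":
    uniform = {h: 1.0 / len(HAPLOGROUPS) for h in HAPLOGROUPS}
    print("alone: ", posterior(SAMPLES[0], uniform))
    freqs, posts = fit_population(SAMPLES)
    print("pooled:", posts[0])
```

Under a uniform prior the first sample is split evenly between C and D because it has no reads at the distinguishing marker. Once the frequencies are estimated from the whole set, the prior favours D and resolves the ambiguity. A full imputation framework would also model the haplogroup tree and genotype uncertainty at each site, which this sketch leaves out.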


Conclusions

Borrowing information across multiple samples within a population using an imputation framework enables accurate Y haplogroup assignment.