This article is part of the supplement: Beyond the Genome 2012

Open Access Poster presentation

Dirichlet process model for joint haplotype inference and GWAS

Avinash Das Sahu1,2* and Sridhar Hannenhalli1,2

* Corresponding author: Avinash D Sahu

Author affiliations

1 University of Maryland, College Park, MD 20740, USA

2 University of Maryland Institute for Advanced Computer Studies, MD, USA

Citation and License

BMC Proceedings 2012, 6(Suppl 6):P49  doi:10.1186/1753-6561-6-S6-P49

The electronic version of this article is the complete one and can be found online at: http://www.biomedcentral.com/1753-6561/6/S6/P49


Published: 1 October 2012

© 2012 Sahu and Hannenhalli; licensee BioMed Central Ltd.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Poster presentation

Identifying the causal genomic mutations that underlie disease phenotypes remains a key problem in medical informatics. With the advent of new sequencing technologies and the decreasing cost of human genotyping, it is now possible to study genotype-phenotype associations at the population level, for example through genome-wide association studies (GWAS). However, because of large genomic variation and linkage disequilibrium, the genetic diversity of a complete human population cannot be captured by a limited number of clusters. Furthermore, current haplotype inference (phasing) methods are not reliable when applied to rare genomic variants, such as disease-related alleles. Hence, a satisfactory method for identifying deleterious mutations remains elusive. Here we present a non-parametric Bayesian model that jointly infers haplotypes and identifies deleterious mutations while accounting for genomic variation in the human population. The model is based on the Dirichlet process, which can capture this variation by modeling it with an unbounded number of clusters.
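
The abstract does not describe how the model is sampled or fitted, so the following is only an illustrative sketch of why a Dirichlet process does not need a fixed number of clusters. It uses the Chinese restaurant process, a standard representation of the Dirichlet process, and is not the authors' model. The function name `crp_assignments`, the concentration parameter `alpha`, and the idea of treating each item as one individual's haplotype are assumptions made for illustration.

```python
import random

def crp_assignments(n_items, alpha, seed=0):
    """Assign n_items to clusters by a Chinese restaurant process.

    Hypothetical helper for illustration only. alpha must be > 0;
    larger values make new clusters more likely.
    """
    rng = random.Random(seed)
    counts = []   # counts[k] = number of items currently in cluster k
    labels = []   # labels[i] = cluster index assigned to item i
    for _ in range(n_items):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is opened with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)      # open a new cluster
        else:
            counts[k] += 1        # join an existing cluster
        labels.append(k)
    return labels, counts

if __name__ == "__main__":
    # The number of clusters is not fixed in advance; it grows with the data.
    for n in (10, 100, 1000):
        _, counts = crp_assignments(n, alpha=1.0, seed=42)
        print(n, "items ->", len(counts), "clusters")
```

In this sketch the number of occupied clusters is an outcome of the process and not an input, which is the property the abstract relies on when it says genetic diversity cannot be captured by a limited number of clusters.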