Bayesian Multi-Population Haplotype Inference via a Hierarchical Dirichlet Process Mixture

Abstract

Uncovering the haplotypes of single nucleotide polymorphisms and their population demography is essential for many biological and medical applications. Methods for haplotype inference developed thus far, including methods based on coalescence, finite and infinite mixtures, and maximal parsimony, ignore the underlying population structure in the genotype data. As noted by Pritchard (2001), different populations can share a certain portion of their genetic ancestors, as well as have their own genetic components through migration and diversification. In this paper, we address the problem of multi-population haplotype inference. We capture cross-population structure using a nonparametric Bayesian prior known as the hierarchical Dirichlet process (HDP) (Teh et al., 2006), combining this prior with a recently developed Bayesian methodology for haplotype phasing known as DP-Haplotyper (Xing et al., 2004). We also develop an efficient sampling algorithm for the HDP based on a two-level nested Pólya urn scheme. We show that our model outperforms extant algorithms on both simulated and real biological data.

Cite

Text

Xing et al. "Bayesian Multi-Population Haplotype Inference via a Hierarchical Dirichlet Process Mixture." International Conference on Machine Learning, 2006. doi:10.1145/1143844.1143976

Markdown

[Xing et al. "Bayesian Multi-Population Haplotype Inference via a Hierarchical Dirichlet Process Mixture." International Conference on Machine Learning, 2006.](https://mlanthology.org/icml/2006/xing2006icml-bayesian/) doi:10.1145/1143844.1143976

BibTeX

@inproceedings{xing2006icml-bayesian,
  title     = {{Bayesian Multi-Population Haplotype Inference via a Hierarchical Dirichlet Process Mixture}},
  author    = {Xing, Eric P. and Sohn, Kyung-Ah and Jordan, Michael I. and Teh, Yee Whye},
  booktitle = {International Conference on Machine Learning},
  year      = {2006},
  pages     = {1049--1056},
  doi       = {10.1145/1143844.1143976},
  url       = {https://mlanthology.org/icml/2006/xing2006icml-bayesian/}
}