PhyloGFN: Phylogenetic Inference with Generative Flow Networks

Abstract

Phylogenetics is a branch of computational biology that studies the evolutionary relationships among biological entities. Its long history and numerous applications notwithstanding, inference of phylogenetic trees from sequence data remains challenging: the high complexity of tree space poses a significant obstacle for current combinatorial and probabilistic techniques. In this paper, we adopt the framework of generative flow networks (GFlowNets) to tackle two core problems in phylogenetics: parsimony-based and Bayesian phylogenetic inference. Because GFlowNets are well-suited for sampling complex combinatorial structures, they are a natural choice for exploring and sampling from the multimodal posterior distribution over tree topologies and evolutionary distances. We demonstrate that our amortized posterior sampler, PhyloGFN, produces diverse and high-quality evolutionary hypotheses on real benchmark datasets. PhyloGFN is competitive with prior work in marginal likelihood estimation and achieves a closer fit to the target distribution than state-of-the-art variational inference methods.

Cite

Text

Zhou et al. "PhyloGFN: Phylogenetic Inference with Generative Flow Networks." International Conference on Learning Representations, 2024.

Markdown

[Zhou et al. "PhyloGFN: Phylogenetic Inference with Generative Flow Networks." International Conference on Learning Representations, 2024.](https://mlanthology.org/iclr/2024/zhou2024iclr-phylogfn/)

BibTeX

@inproceedings{zhou2024iclr-phylogfn,
  title     = {{PhyloGFN: Phylogenetic Inference with Generative Flow Networks}},
  author    = {Zhou, Ming Yang and Yan, Zichao and Layne, Elliot and Malkin, Nikolay and Zhang, Dinghuai and Jain, Moksh and Blanchette, Mathieu and Bengio, Yoshua},
  booktitle = {International Conference on Learning Representations},
  year      = {2024},
  url       = {https://mlanthology.org/iclr/2024/zhou2024iclr-phylogfn/}
}