Learning Proposals for Sequential Importance Samplers Using Reinforced Variational Inference
Abstract
The problem of inferring unobserved values in a partially observed trajectory from a stochastic process can be considered a structured prediction problem. Traditionally, inference is conducted using heuristic-based Monte Carlo methods. This work considers learning such heuristics by leveraging a connection between policy-optimization reinforcement learning and approximate inference. In particular, we learn the proposal distributions used in importance samplers by casting proposal learning as a variational inference problem. We then rewrite the variational lower bound as a policy optimization problem, similar to Weber et al. (2015), allowing us to transfer techniques from reinforcement learning. We apply this technique to a simple stochastic process as a proof of concept and show that, while the approach is viable, it will require more engineering effort to scale inference for rare observations.
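The key step described in the abstract, reading the variational lower bound as a policy-optimization objective, can be sketched in standard notation (this sketch is not taken from the paper; here $x$ denotes the partial observations, $z$ the unobserved trajectory, and $q_\phi$ the learned proposal):

\[
\mathcal{L}(\phi) \;=\; \mathbb{E}_{q_\phi(z \mid x)}\big[\log p(x, z) - \log q_\phi(z \mid x)\big]
\;=\; \mathbb{E}_{q_\phi(z \mid x)}\big[r(z)\big],
\qquad r(z) := \log p(x, z) - \log q_\phi(z \mid x),
\]
\[
\nabla_\phi \mathcal{L}(\phi) \;=\; \mathbb{E}_{q_\phi(z \mid x)}\big[r(z)\, \nabla_\phi \log q_\phi(z \mid x)\big].
\]

That is, the lower bound is an expected "reward" under the proposal, and its gradient takes the score-function (REINFORCE) form, which is what allows reinforcement-learning techniques such as baselines and other variance-reduction methods to be transferred to proposal learning.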
Cite

Text
Ahmed et al. "Learning Proposals for Sequential Importance Samplers Using Reinforced Variational Inference." ICLR 2019 Workshops: drlStructPred, 2019.

Markdown
[Ahmed et al. "Learning Proposals for Sequential Importance Samplers Using Reinforced Variational Inference." ICLR 2019 Workshops: drlStructPred, 2019.](https://mlanthology.org/iclrw/2019/ahmed2019iclrw-learning/)

BibTeX
@inproceedings{ahmed2019iclrw-learning,
title = {{Learning Proposals for Sequential Importance Samplers Using Reinforced Variational Inference}},
author = {Ahmed, Zafarali and Karuvally, Arjun and Precup, Doina and Gravel, Simon},
booktitle = {ICLR 2019 Workshops: drlStructPred},
year = {2019},
url = {https://mlanthology.org/iclrw/2019/ahmed2019iclrw-learning/}
}