VI$^2$N: A Network for Planning Under Uncertainty Based on Value of Information

Abstract

Despite great success in recent years, deep reinforcement learning architectures still face a tremendous challenge in many real-world scenarios due to perceptual ambiguity. Similarly, value iteration networks, differentiable networks that perform well in novel situations by extracting the environment model from training setups, are mostly limited to fully observable tasks. In this paper, we propose a new architecture, the VI$^2$N (Value Iteration with Value of Information Network), that can learn to act in novel environments with a high degree of uncertainty. Specifically, this architecture uses a heuristic that over-emphasizes reducing uncertainty before exploiting the reward. Our network outperforms the state-of-the-art differentiable architecture for partially observable environments, especially when long-term planning is needed to resolve the uncertainty.
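
For readers unfamiliar with the planning computation that value iteration networks are built around, the following is a minimal sketch of classical tabular value iteration in a fully observable setting. It is not the VI$^2$N architecture and does not model partial observability or value of information; the corridor environment, the function names, and the discount factor are assumptions chosen only for illustration.

```python
def value_iteration(transitions, rewards, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Tabular value iteration for a deterministic MDP.

    transitions[s][a] is the next state reached by taking action a in state s.
    rewards[s][a] is the immediate reward for that transition.
    """
    n_states = len(transitions)
    values = [0.0] * n_states
    for _ in range(max_iters):
        # Bellman optimality backup applied to every state.
        new_values = [
            max(
                rewards[s][a] + gamma * values[transitions[s][a]]
                for a in range(len(transitions[s]))
            )
            for s in range(n_states)
        ]
        delta = max(abs(n - v) for n, v in zip(new_values, values))
        values = new_values
        if delta < tol:
            break
    return values


def greedy_policy(transitions, rewards, values, gamma=0.9):
    """Pick, in each state, the action with the highest one-step lookahead value."""
    return [
        max(
            range(len(transitions[s])),
            key=lambda a: rewards[s][a] + gamma * values[transitions[s][a]],
        )
        for s in range(len(transitions))
    ]


# Hypothetical 4-state corridor (an assumption for illustration only):
# action 0 moves left, action 1 moves right. Moving right from state 2
# into the goal state 3 pays reward 1; state 3 is absorbing with reward 0.
TRANSITIONS = [[0, 1], [0, 2], [1, 3], [3, 3]]
REWARDS = [[0.0, 0.0], [0.0, 0.0], [0.0, 1.0], [0.0, 0.0]]

VALUES = value_iteration(TRANSITIONS, REWARDS)
POLICY = greedy_policy(TRANSITIONS, REWARDS, VALUES)
```

In this toy corridor the agent always knows its state, so the backup can be applied directly to states. Under partial observability, the setting the abstract targets, the agent must plan over uncertainty about its state instead.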

Cite

Text

Johnson et al. "VI$^2$N: A Network for Planning Under Uncertainty Based on Value of Information." NeurIPS 2022 Workshops: DeepRL, 2022.

Markdown

[Johnson et al. "VI$^2$N: A Network for Planning Under Uncertainty Based on Value of Information." NeurIPS 2022 Workshops: DeepRL, 2022.](https://mlanthology.org/neuripsw/2022/johnson2022neuripsw-vi-a/)

BibTeX

@inproceedings{johnson2022neuripsw-vi-a,
  title     = {{VI$^2$N: A Network for Planning Under Uncertainty Based on Value of Information}},
  author    = {Johnson, Samantha and Buice, Michael A and Khalvati, Koosha},
  booktitle = {NeurIPS 2022 Workshops: DeepRL},
  year      = {2022},
  url       = {https://mlanthology.org/neuripsw/2022/johnson2022neuripsw-vi-a/}
}