VIPer: Iterative Value-Aware Model Learning on the Value Improvement Path

Abstract

We propose a practical and generalizable Decision-Aware Model-Based Reinforcement Learning algorithm. We extend the frameworks of VAML (Farahmand et al., 2017) and IterVAML (Farahmand, 2018), which have been shown to be difficult to scale to high-dimensional and continuous environments (Lovatto et al., 2020a; Modhe et al., 2021; Voelcker et al., 2022). We propose to use the notion of the Value Improvement Path (Dabney et al., 2020) to improve the generalization of VAML-like model learning. We show theoretically for linear and tabular spaces that our proposed algorithm is sensible, justifying extension to non-linear and continuous spaces. We also present a detailed implementation proposal based on these ideas.
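As a rough illustration of the VAML-like objective the abstract refers to, the sketch below computes an IterVAML-style loss in a small tabular setting: the squared gap between the expected next-state value under the true dynamics and under the learned model. The `path_loss` function, which averages that loss over several value functions taken from successive iterations, is only one possible reading of "model learning on the Value Improvement Path" and is an assumption, not the authors' stated formulation. All function names, the tabular representation, and access to the true transition probabilities are hypothetical choices made for this example.

```python
def expected_value(p_row, v):
    """Expected next-state value under one transition distribution."""
    return sum(p * x for p, x in zip(p_row, v))


def itervaml_loss(p_true, p_model, v):
    """IterVAML-style loss for a single value function v.

    p_true and p_model hold one row per (state, action) pair; each row is
    a probability distribution over next states. Knowing p_true exactly is
    an assumption for illustration; in practice samples would be used.
    """
    gaps = [
        (expected_value(pt, v) - expected_value(pm, v)) ** 2
        for pt, pm in zip(p_true, p_model)
    ]
    return sum(gaps) / len(gaps)


def path_loss(p_true, p_model, value_path):
    """Average the loss over several value functions (hypothetical reading
    of using the value improvement path; not taken from the paper)."""
    losses = [itervaml_loss(p_true, p_model, v) for v in value_path]
    return sum(losses) / len(losses)


# Toy problem: two (state, action) rows, two next states.
p_true = [[0.5, 0.5], [0.0, 1.0]]
p_model = [[1.0, 0.0], [0.0, 1.0]]
value_path = [[0.0, 0.0], [0.0, 2.0]]  # two successive value estimates

single = itervaml_loss(p_true, p_model, value_path[-1])
averaged = path_loss(p_true, p_model, value_path)
```

Note that a model error only contributes to the loss when it changes the expected value: with the all-zero value function the mismatched first row costs nothing, which is the sense in which such losses are decision-aware rather than purely predictive.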

Cite

Text

Abachi et al. "VIPer: Iterative Value-Aware Model Learning on the Value Improvement Path." ICML 2022 Workshops: DARL, 2022.

Markdown

[Abachi et al. "VIPer: Iterative Value-Aware Model Learning on the Value Improvement Path." ICML 2022 Workshops: DARL, 2022.](https://mlanthology.org/icmlw/2022/abachi2022icmlw-viper/)

BibTeX

@inproceedings{abachi2022icmlw-viper,
  title     = {{VIPer: Iterative Value-Aware Model Learning on the Value Improvement Path}},
  author    = {Abachi, Romina and Voelcker, Claas A and Garg, Animesh and Farahmand, Amir-massoud},
  booktitle = {ICML 2022 Workshops: DARL},
  year      = {2022},
  url       = {https://mlanthology.org/icmlw/2022/abachi2022icmlw-viper/}
}