An Analysis of Categorical Distributional Reinforcement Learning

Abstract

Distributional approaches to value-based reinforcement learning model the entire distribution of returns, rather than just their expected values, and have recently been shown to yield state-of-the-art empirical performance, as demonstrated by the recently proposed C51 algorithm, based on categorical distributional reinforcement learning (CDRL) [Bellemare et al., 2017]. However, the theoretical properties of CDRL algorithms are not yet well understood. In this paper, we introduce a framework to analyse CDRL algorithms, establish the importance of the projected distributional Bellman operator in distributional RL, draw fundamental connections between CDRL and the Cramér distance, and give a proof of convergence for sample-based categorical distributional reinforcement learning algorithms.
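The two objects named in the abstract are concrete enough to sketch: the projected distributional Bellman operator applies the distributional Bellman update and then projects the resulting distribution back onto a fixed categorical support, and the Cramér distance is the ℓ2 distance between cumulative distribution functions, with respect to which the paper relates this projection. Below is a minimal NumPy sketch of both, assuming an evenly spaced support; the function names (`categorical_projection`, `cramer_distance`) and the 51-atom support are illustrative assumptions, not the authors' code.

```python
import numpy as np

def categorical_projection(atoms, target_values, target_probs):
    """Project a discrete distribution (target_values, target_probs) onto the
    fixed support `atoms`, splitting each atom's mass between its two nearest
    support points in proportion to distance. Assumes evenly spaced atoms."""
    v_min, v_max = atoms[0], atoms[-1]
    delta = atoms[1] - atoms[0]
    projected = np.zeros_like(atoms, dtype=float)
    b = (np.clip(target_values, v_min, v_max) - v_min) / delta  # fractional index
    lower = np.floor(b).astype(int)
    upper = np.ceil(b).astype(int)
    # When b lands exactly on an atom (lower == upper), all mass goes to it.
    np.add.at(projected, lower, target_probs * (upper - b + (lower == upper)))
    np.add.at(projected, upper, target_probs * (b - lower))
    return projected

def cramer_distance(p, q, atoms):
    """Cramér (ℓ2) distance between two distributions on the same evenly
    spaced support: the ℓ2 norm of the difference of their CDFs."""
    delta = atoms[1] - atoms[0]
    return np.sqrt(delta * np.sum((np.cumsum(p) - np.cumsum(q)) ** 2))

# Hypothetical usage: one distributional Bellman backup for reward r and
# discount gamma, with a uniform next-state return distribution.
atoms = np.linspace(-10.0, 10.0, 51)        # a C51-style 51-atom support
next_probs = np.full(51, 1.0 / 51)
r, gamma = 1.0, 0.99
target = categorical_projection(atoms, r + gamma * atoms, next_probs)
assert np.isclose(target.sum(), 1.0)        # projection preserves total mass
print(cramer_distance(next_probs, target, atoms))
```

The projection step is what keeps the iterates representable on the fixed support after the Bellman update shifts and scales the atoms; the sketch's mass-splitting rule is the standard one from Bellemare et al. [2017].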

Cite

Text

Rowland et al. "An Analysis of Categorical Distributional Reinforcement Learning." International Conference on Artificial Intelligence and Statistics, 2018.

Markdown

[Rowland et al. "An Analysis of Categorical Distributional Reinforcement Learning." International Conference on Artificial Intelligence and Statistics, 2018.](https://mlanthology.org/aistats/2018/rowland2018aistats-analysis/)

BibTeX

@inproceedings{rowland2018aistats-analysis,
  title     = {{An Analysis of Categorical Distributional Reinforcement Learning}},
  author    = {Rowland, Mark and Bellemare, Marc G. and Dabney, Will and Munos, Rémi and Teh, Yee Whye},
  booktitle = {International Conference on Artificial Intelligence and Statistics},
  year      = {2018},
  pages     = {29--37},
  url       = {https://mlanthology.org/aistats/2018/rowland2018aistats-analysis/}
}