Distributional Bellman Operators over Mean Embeddings

Abstract

We propose a novel algorithmic framework for distributional reinforcement learning, based on learning finite-dimensional mean embeddings of return distributions. The framework reveals a wide variety of new algorithms for dynamic programming and temporal-difference learning that rely on the sketch Bellman operator, which updates mean embeddings with simple linear-algebraic computations. We provide asymptotic convergence theory, and examine the empirical performance of the algorithms on a suite of tabular tasks. Further, we show that this approach can be straightforwardly combined with deep reinforcement learning.
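To make the "simple linear-algebraic computations" concrete, below is a minimal sketch of a mean-embedding dynamic-programming sweep for tabular policy evaluation. It is not the paper's exact construction: the cosine feature map, the anchor grid, the deterministic per-state rewards, and the least-squares recipe for the per-reward coefficient matrices are all illustrative assumptions.

```python
import numpy as np

# Illustrative sketch only: feature map, anchors, and the least-squares
# construction of the coefficient matrices are placeholder choices.
D = 32                                     # embedding dimension
rng = np.random.default_rng(0)
FREQS = rng.normal(scale=0.1, size=D)
PHASES = rng.uniform(0.0, 2.0 * np.pi, size=D)

def phi(z):
    """Map returns z (shape (n,)) to cosine features of shape (n, D)."""
    z = np.atleast_1d(z)
    return np.cos(np.outer(z, FREQS) + PHASES)

def bellman_coeff_matrix(anchors, r, gamma, ridge=1e-6):
    """Least-squares matrix B with phi(r + gamma * z) ~= B @ phi(z) on anchor returns z."""
    Phi = phi(anchors)                     # (n_anchors, D)
    Phi_target = phi(r + gamma * anchors)  # (n_anchors, D)
    A = Phi.T @ Phi + ridge * np.eye(D)
    return np.linalg.solve(A, Phi.T @ Phi_target).T

def sketch_dp_sweep(U, P, R, gamma, anchors):
    """One sweep over mean embeddings U (shape (num_states, D)).

    P[x, y] is the transition probability x -> y under the evaluated policy,
    R[x] a deterministic per-state reward (a simplifying assumption). The
    backup is linear: U(x) <- B_{R[x], gamma} @ sum_y P[x, y] U(y).
    """
    U_new = np.zeros_like(U)
    for x in range(U.shape[0]):
        B = bellman_coeff_matrix(anchors, R[x], gamma)
        U_new[x] = B @ (P[x] @ U)
    return U_new

# Tiny usage example on a random 3-state chain.
num_states = 3
P = np.full((num_states, num_states), 1.0 / num_states)
R = np.array([0.0, 0.5, 1.0])
anchors = np.linspace(-2.0, 12.0, 101)     # grid covering plausible returns
U = phi(np.zeros(num_states))              # initialise embeddings at the zero return
for _ in range(200):
    U = sketch_dp_sweep(U, P, R, gamma=0.9, anchors=anchors)
```

Under these assumptions, each sweep only requires matrix-vector products and one small ridge-regression solve per distinct reward, which is the sense in which the sketch Bellman operator reduces distributional backups to linear algebra.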

Cite

Text

Wenliang et al. "Distributional Bellman Operators over Mean Embeddings." International Conference on Machine Learning, 2024.

Markdown

[Wenliang et al. "Distributional Bellman Operators over Mean Embeddings." International Conference on Machine Learning, 2024.](https://mlanthology.org/icml/2024/wenliang2024icml-distributional/)

BibTeX

@inproceedings{wenliang2024icml-distributional,
  title     = {{Distributional Bellman Operators over Mean Embeddings}},
  author    = {Wenliang, Li Kevin and Deletang, Gregoire and Aitchison, Matthew and Hutter, Marcus and Ruoss, Anian and Gretton, Arthur and Rowland, Mark},
  booktitle = {International Conference on Machine Learning},
  year      = {2024},
  pages     = {52839--52868},
  volume    = {235},
  url       = {https://mlanthology.org/icml/2024/wenliang2024icml-distributional/}
}