DIME: Diffusion-Based Maximum Entropy Reinforcement Learning

Abstract

Maximum entropy reinforcement learning (MaxEnt-RL) has become the standard approach to RL due to its beneficial exploration properties. Traditionally, policies are parameterized using Gaussian distributions, which significantly limits their representational capacity. Diffusion-based policies offer a more expressive alternative, yet integrating them into MaxEnt-RL poses challenges, primarily because their marginal entropy is intractable to compute. To overcome this, we propose Diffusion-Based Maximum Entropy RL (DIME). DIME leverages recent advances in approximate inference with diffusion models to derive a lower bound on the maximum entropy objective. Additionally, we propose a policy iteration scheme that provably converges to the optimal diffusion policy. Our method enables the use of expressive diffusion-based policies while retaining the principled exploration benefits of MaxEnt-RL, significantly outperforming other diffusion-based methods on challenging high-dimensional control benchmarks. It is also competitive with state-of-the-art non-diffusion-based RL methods while requiring fewer algorithmic design choices and smaller update-to-data ratios, reducing computational complexity.
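
For orientation, the maximum entropy objective that the abstract refers to is commonly written as below. This is a sketch in standard MaxEnt-RL notation, not the paper's own formulation: the symbols $\gamma$ (discount factor), $\alpha$ (entropy temperature), $r$ (reward), and $N$ (number of diffusion steps) are assumptions introduced here. The last line shows why the entropy is hard to evaluate for a diffusion policy: the action distribution is a marginal over the whole denoising chain $a^{0:N}$, so $\log \pi(a^0 \mid s)$ has no closed form, unlike the Gaussian case.

```latex
% Standard MaxEnt-RL objective (assumed notation, not taken from the paper)
J(\pi) = \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}
         \Big( r(s_t, a_t) + \alpha\, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \Big) \right],
\qquad
\mathcal{H}\big(\pi(\cdot \mid s)\big) = -\,\mathbb{E}_{a \sim \pi(\cdot \mid s)}\big[ \log \pi(a \mid s) \big]

% For a diffusion policy, the action a^0 is the end point of a denoising chain,
% so its density is a marginal over the intermediate variables a^{1:N}:
\pi(a^{0} \mid s) = \int \pi(a^{0:N} \mid s)\, \mathrm{d}a^{1:N}
```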

Cite

Text

Celik et al. "DIME: Diffusion-Based Maximum Entropy Reinforcement Learning." Proceedings of the 42nd International Conference on Machine Learning, 2025.

Markdown

[Celik et al. "DIME: Diffusion-Based Maximum Entropy Reinforcement Learning." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/celik2025icml-dime/)

BibTeX

@inproceedings{celik2025icml-dime,
  title     = {{DIME: Diffusion-Based Maximum Entropy Reinforcement Learning}},
  author    = {Celik, Onur and Li, Zechu and Blessing, Denis and Li, Ge and Palenicek, Daniel and Peters, Jan and Chalvatzaki, Georgia and Neumann, Gerhard},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  year      = {2025},
  pages     = {6958--6977},
  volume    = {267},
  url       = {https://mlanthology.org/icml/2025/celik2025icml-dime/}
}