Planning with Consistency Models for Model-Based Offline Reinforcement Learning

Abstract

This paper introduces consistency models to sequential decision-making. Previous work applying diffusion models to planning within a model-based reinforcement learning framework often suffers from high computational cost at inference time, primarily because sampling requires an iterative reverse diffusion process. Consistency models, known for their computational efficiency, have already shown promise in reinforcement learning within actor-critic algorithms. We therefore combine guided consistency distillation with a continuous-time diffusion model in the framework of Decision Diffuser. Our approach, named Consistency Planning, pairs the robust planning capabilities of diffusion models with the sampling speed of consistency models. We validate our method on Gym tasks from the D4RL benchmark, demonstrating a more than 12-fold speedup over its diffusion-model counterpart with no loss in performance.
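
To make the speedup concrete, below is a minimal sketch (not the authors' code) of why consistency models are cheap at plan time: a trained consistency function maps a fully noised trajectory directly to a clean trajectory estimate in one network evaluation, replacing the many-step reverse diffusion chain. The names sample_plan, consistency_fn, horizon, state_dim, and sigma_max are hypothetical, and the Karras-style continuous-time noise parameterization is an assumption.

import torch

def sample_plan(consistency_fn, horizon, state_dim, sigma_max=80.0):
    """One-step trajectory generation with a trained consistency model.

    consistency_fn(x, sigma) is assumed to map a noisy trajectory at
    noise level sigma directly to an estimate of the clean trajectory,
    so a single forward pass replaces the whole reverse diffusion chain.
    """
    # Start from pure Gaussian noise at the maximum noise level,
    # following a continuous-time (Karras-style) parameterization.
    x = sigma_max * torch.randn(1, horizon, state_dim)
    sigma = torch.full((1,), sigma_max)
    return consistency_fn(x, sigma)  # shape: (1, horizon, state_dim)

# Toy stand-in for a trained network, just to make the sketch runnable.
if __name__ == "__main__":
    dummy_model = lambda x, sigma: x / (1.0 + sigma.view(-1, 1, 1))
    plan = sample_plan(dummy_model, horizon=32, state_dim=11)
    print(plan.shape)  # torch.Size([1, 32, 11])

In a planning loop of this kind, the sampled trajectory would typically be conditioned (e.g., on return, as in Decision Diffuser) and the first action extracted from it; only the one-step sampling step shown here is what distinguishes the consistency-model approach from its diffusion counterpart.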

Cite

Text

Wang et al. "Planning with Consistency Models for Model-Based Offline Reinforcement Learning." Transactions on Machine Learning Research, 2024.

Markdown

[Wang et al. "Planning with Consistency Models for Model-Based Offline Reinforcement Learning." Transactions on Machine Learning Research, 2024.](https://mlanthology.org/tmlr/2024/wang2024tmlr-planning/)

BibTeX

@article{wang2024tmlr-planning,
  title     = {{Planning with Consistency Models for Model-Based Offline Reinforcement Learning}},
  author    = {Wang, Guanquan and Hiraoka, Takuya and Tsuruoka, Yoshimasa},
  journal   = {Transactions on Machine Learning Research},
  year      = {2024},
  url       = {https://mlanthology.org/tmlr/2024/wang2024tmlr-planning/}
}