Consistency Trajectory Planning: High-Quality and Efficient Trajectory Optimization for Offline Model-Based Reinforcement Learning

Abstract

This paper introduces Consistency Trajectory Planning (CTP), a novel offline model-based reinforcement learning method that leverages the recently proposed Consistency Trajectory Model (CTM) for efficient trajectory optimization. While prior work applying diffusion models to planning has demonstrated strong performance, it often suffers from high computational costs due to iterative sampling procedures. In contrast, CTP supports few-step trajectory generation without significant degradation in policy quality. We evaluate CTP on the D4RL benchmark and show that it consistently outperforms existing diffusion-based planning methods on long-horizon, goal-conditioned tasks. Notably, CTP attains comparable or even superior normalized returns while using fewer denoising steps, reducing inference cost and highlighting its practicality and effectiveness for high-performance, low-latency offline planning.
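
The abstract's central mechanism is that a consistency trajectory model can collapse the iterative denoising loop of diffusion-based planners into a handful of long jumps along the noise schedule. Below is a minimal sketch of that few-step sampling loop, assuming a learned function g(x_t, t, s) that moves a noisy trajectory from noise level t directly to level s; the class and function names here are hypothetical illustrations, not the paper's implementation.

```python
import torch

# Hypothetical stand-in for a trained consistency trajectory model:
# g(x_t, t, s) is assumed to map a noisy trajectory at noise level t
# directly to noise level s in a single forward pass.
class ConsistencyTrajectoryModel(torch.nn.Module):
    def __init__(self, dim: int, hidden: int = 256):
        super().__init__()
        # Trajectory features plus the two noise levels (t, s) as inputs.
        self.net = torch.nn.Sequential(
            torch.nn.Linear(dim + 2, hidden),
            torch.nn.ReLU(),
            torch.nn.Linear(hidden, dim),
        )

    def forward(self, x, t, s):
        # Append the source and target noise levels to every timestep.
        ts = torch.stack([t, s]).expand(*x.shape[:-1], 2)
        return self.net(torch.cat([x, ts], dim=-1))


@torch.no_grad()
def plan(model, horizon, dim, n_steps=2, sigma_max=80.0):
    """Denoise a trajectory in only n_steps jumps along the noise
    schedule, instead of the tens or hundreds of iterations an
    ordinary diffusion sampler would need."""
    x = sigma_max * torch.randn(horizon, dim)      # start from pure noise
    levels = torch.linspace(sigma_max, 0.0, n_steps + 1)
    for t, s in zip(levels[:-1], levels[1:]):
        x = model(x, t, s)                         # jump from level t to s
    return x                                       # planned trajectory


# Usage: plan a 32-step trajectory over a 10-dimensional
# (state, action) space with an untrained stand-in network.
model = ConsistencyTrajectoryModel(dim=10)
trajectory = plan(model, horizon=32, dim=10, n_steps=2)
print(trajectory.shape)  # torch.Size([32, 10])
```

With n_steps=1 this reduces to single-step generation; raising n_steps trades a little latency for sample quality, which is the knob behind the abstract's "fewer denoising steps" claim.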

Cite

Text

Wang et al. "Consistency Trajectory Planning: High-Quality and Efficient Trajectory Optimization for Offline Model-Based Reinforcement Learning." Transactions on Machine Learning Research, 2026.

Markdown

[Wang et al. "Consistency Trajectory Planning: High-Quality and Efficient Trajectory Optimization for Offline Model-Based Reinforcement Learning." Transactions on Machine Learning Research, 2026.](https://mlanthology.org/tmlr/2026/wang2026tmlr-consistency/)

BibTeX

@article{wang2026tmlr-consistency,
  title     = {{Consistency Trajectory Planning: High-Quality and Efficient Trajectory Optimization for Offline Model-Based Reinforcement Learning}},
  author    = {Wang, Guanquan and Hiraoka, Takuya and Tsuruoka, Yoshimasa},
  journal   = {Transactions on Machine Learning Research},
  year      = {2026},
  url       = {https://mlanthology.org/tmlr/2026/wang2026tmlr-consistency/}
}