Chain-of-Action: Trajectory Autoregressive Modeling for Robotic Manipulation

Abstract

We present Chain-of-Action (CoA), a novel visuomotor policy paradigm built upon Trajectory Autoregressive Modeling. Unlike conventional approaches that predict the next action(s) forward in time, CoA generates an entire trajectory by explicit backward reasoning from task-specific goals through an action-level Chain-of-Thought (CoT) process. This process is unified within a single autoregressive structure: (1) the first token corresponds to a stable keyframe action that encodes the task-specific goals; and (2) subsequent action tokens are generated autoregressively, conditioned on the initial keyframe and previously predicted actions. This backward action reasoning enforces a global-to-local structure, so that each local action is tightly constrained by the final goal. To further realize the action reasoning structure, CoA incorporates four complementary designs: continuous action token representation; dynamic stopping for variable-length trajectory generation; reverse temporal ensemble; and multi-token prediction to balance action chunk modeling with global structure. As a result, CoA exhibits strong spatial generalization while preserving the flexibility and simplicity of a visuomotor policy. Empirically, CoA outperforms representative imitation learning algorithms such as ACT and Diffusion Policy across 60 RLBench tasks and 8 real-world tasks.
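
The backward generation loop described above can be illustrated with a minimal sketch. This is not the authors' implementation: the names `generate_backward_trajectory`, `step_fn`, and `stop_dist` are hypothetical, the learned autoregressive decoder is replaced by a toy function that moves halfway toward the current pose, and the distance-based stopping rule is only one assumed way to realize dynamic stopping. It shows the ordering only: the goal keyframe comes first, later actions are conditioned on everything generated so far, and the chain is reversed for execution.

```python
import math
from typing import Callable, List, Sequence

Action = List[float]  # continuous action, e.g. an end-effector position


def generate_backward_trajectory(
    keyframe: Action,
    step_fn: Callable[[Sequence[Action]], Action],
    current_pose: Action,
    stop_dist: float = 0.05,   # assumed stopping threshold (hypothetical)
    max_len: int = 100,        # hard cap so the loop always terminates
) -> List[Action]:
    """Generate actions from the goal keyframe backward, return in execution order."""
    chain = [list(keyframe)]   # first token: the goal-encoding keyframe action
    while len(chain) < max_len:
        nxt = step_fn(chain)   # conditioned on the keyframe and all prior actions
        chain.append(nxt)
        # Assumed dynamic stopping: halt once the chain reaches the current pose.
        if math.dist(nxt, current_pose) <= stop_dist:
            break
    return chain[::-1]         # reverse so execution runs from current pose to goal


def make_toy_step(current_pose: Action) -> Callable[[Sequence[Action]], Action]:
    """Stand-in for a learned decoder: move halfway toward the current pose."""
    def step_fn(chain: Sequence[Action]) -> Action:
        last = chain[-1]
        return [a + 0.5 * (c - a) for a, c in zip(last, current_pose)]
    return step_fn


current = [0.0, 0.0]
goal_keyframe = [1.0, 0.0]
trajectory = generate_backward_trajectory(goal_keyframe, make_toy_step(current), current)
print(len(trajectory), trajectory[0], trajectory[-1])
```

Because the chain starts at the keyframe, the trajectory length is not fixed in advance; it depends on how many steps are needed before the stopping condition is met, which is the variable-length behavior the abstract attributes to dynamic stopping.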

Cite

Text

Zhang et al. "Chain-of-Action: Trajectory Autoregressive Modeling for Robotic Manipulation." Advances in Neural Information Processing Systems, 2025.

Markdown

[Zhang et al. "Chain-of-Action: Trajectory Autoregressive Modeling for Robotic Manipulation." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/zhang2025neurips-chainofaction/)

BibTeX

@inproceedings{zhang2025neurips-chainofaction,
  title     = {{Chain-of-Action: Trajectory Autoregressive Modeling for Robotic Manipulation}},
  author    = {Zhang, Wenbo and Hu, Tianrun and Zhang, Hanbo and Qiao, Yanyuan and Qin, Yuchu and Li, Yang and Liu, Jiajun and Kong, Tao and Liu, Lingqiao and Ma, Xiao},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/zhang2025neurips-chainofaction/}
}