ChainedDiffuser: Unifying Trajectory Diffusion and Keypose Prediction for Robotic Manipulation

Abstract

We present ChainedDiffuser, a policy architecture that unifies action keypose prediction and trajectory diffusion generation for learning robot manipulation from demonstrations. Our main innovation is to use a global transformer-based action predictor to predict actions at keyframes, a task that requires multimodal semantic scene understanding, and to use a local trajectory diffuser to predict trajectory segments that connect predicted macro-actions. ChainedDiffuser sets a new record on established manipulation benchmarks, and outperforms both state-of-the-art keypose (macro-action) prediction models that use motion planners for trajectory prediction, and trajectory diffusion policies that do not predict keyframe macro-actions. We conduct experiments in both simulated and real-world environments and demonstrate ChainedDiffuser’s ability to solve a wide range of manipulation tasks involving interactions with diverse objects.
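
The two-level structure described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: `predict_keyposes`, `denoise_segment`, and `chained_rollout` are hypothetical names, poses are reduced to 3D positions, the keypose predictor is a fixed rule rather than a transformer, all keyposes are produced in one call for brevity, and the "diffuser" is a hand-written loop that pulls noisy waypoints toward a straight line instead of a learned denoising network. It only shows how locally generated segments might be chained between predicted keyposes.

```python
import random

# Assumption: a pose is simplified to a 3D position (x, y, z).

def predict_keyposes(start, goal):
    """Hypothetical stand-in for the global keypose predictor.

    Returns a via-point raised 0.1 above the midpoint, then the goal.
    A real predictor would condition on scene observations.
    """
    mid = tuple((s + g) / 2 for s, g in zip(start, goal))
    above = (mid[0], mid[1], mid[2] + 0.1)
    return [above, goal]

def denoise_segment(a, b, n_points=5, n_steps=20, seed=0):
    """Toy stand-in for the local trajectory diffuser.

    Starts from Gaussian noise and iteratively moves each waypoint
    halfway toward a straight line from a to b. A real diffuser would
    replace this update with a learned denoising step.
    """
    rng = random.Random(seed)
    target = [
        tuple(x + (y - x) * i / (n_points - 1) for x, y in zip(a, b))
        for i in range(n_points)
    ]
    traj = [tuple(rng.gauss(0, 1) for _ in range(3)) for _ in range(n_points)]
    for _ in range(n_steps):
        traj = [
            tuple(p + 0.5 * (t - p) for p, t in zip(point, goal_point))
            for point, goal_point in zip(traj, target)
        ]
    # Pin the endpoints so the segment connects the two poses exactly.
    traj[0], traj[-1] = a, b
    return traj

def chained_rollout(start, goal):
    """Chain local segments between consecutive keyposes."""
    trajectory = [start]
    current = start
    for keypose in predict_keyposes(start, goal):
        segment = denoise_segment(current, keypose)
        trajectory.extend(segment[1:])  # skip the duplicated first point
        current = keypose
    return trajectory
```

Calling `chained_rollout((0.0, 0.0, 0.0), (1.0, 0.0, 0.0))` yields a single waypoint list that passes through the raised via-point and ends at the goal. Pinning the segment endpoints is a design choice of this sketch that keeps consecutive segments continuous.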

Cite

Text

Xian et al. "ChainedDiffuser: Unifying Trajectory Diffusion and Keypose Prediction for Robotic Manipulation." Conference on Robot Learning, 2023.

Markdown

[Xian et al. "ChainedDiffuser: Unifying Trajectory Diffusion and Keypose Prediction for Robotic Manipulation." Conference on Robot Learning, 2023.](https://mlanthology.org/corl/2023/xian2023corl-chaineddiffuser/)

BibTeX

@inproceedings{xian2023corl-chaineddiffuser,
  title     = {{ChainedDiffuser: Unifying Trajectory Diffusion and Keypose Prediction for Robotic Manipulation}},
  author    = {Xian, Zhou and Gkanatsios, Nikolaos and Gervet, Theophile and Ke, Tsung-Wei and Fragkiadaki, Katerina},
  booktitle = {Conference on Robot Learning},
  year      = {2023},
  pages     = {2323-2339},
  volume    = {229},
  url       = {https://mlanthology.org/corl/2023/xian2023corl-chaineddiffuser/}
}