ChainedDiffuser: Unifying Trajectory Diffusion and Keypose Prediction for Robotic Manipulation
Abstract
We present ChainedDiffuser, a policy architecture that unifies action keypose prediction and trajectory diffusion generation for learning robot manipulation from demonstrations. Our main innovation is to use a global transformer-based action predictor to predict actions at keyframes, a task that requires multimodal semantic scene understanding, and to use a local trajectory diffuser to predict trajectory segments that connect predicted macro-actions. ChainedDiffuser sets a new record on established manipulation benchmarks, and outperforms both state-of-the-art keypose (macro-action) prediction models that use motion planners for trajectory prediction, and trajectory diffusion policies that do not predict keyframe macro-actions. We conduct experiments in both simulated and real-world environments and demonstrate ChainedDiffuser’s ability to solve a wide range of manipulation tasks involving interactions with diverse objects.
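The chaining described in the abstract, in which a global predictor proposes the next keypose and a local diffuser fills in the trajectory segment that reaches it, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: `predict_keypose`, `diffuse_segment`, and `rollout` are hypothetical names, the keypose "predictor" simply reads from a fixed list, and the "diffuser" refines Gaussian noise toward a straight line where a real model would apply a learned denoising network conditioned on the scene.

```python
import numpy as np


def predict_keypose(keyposes, step):
    """Hypothetical stand-in for the global keypose predictor.

    A real predictor would infer the next macro-action from observations;
    here it just looks up a precomputed 3D target.
    """
    return np.asarray(keyposes[step], dtype=float)


def diffuse_segment(start, goal, num_waypoints=8, num_steps=20, rng=None):
    """Toy stand-in for the local trajectory diffuser.

    Starts from Gaussian noise and iteratively refines it into a segment
    connecting `start` to `goal`. The refinement target is a straight line
    (an assumption); a learned denoiser would replace this rule.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    alphas = np.linspace(0.0, 1.0, num_waypoints)[:, None]
    target = (1.0 - alphas) * start + alphas * goal
    traj = rng.normal(size=target.shape)  # pure noise at the first step
    for k in range(num_steps):
        # Close a growing fraction of the remaining gap; the final
        # iteration (fraction 1) lands on the target segment.
        traj = traj + (target - traj) / (num_steps - k)
    return traj


def rollout(start_pose, keyposes):
    """Chain keypose prediction and segment generation into one path."""
    pose = np.asarray(start_pose, dtype=float)
    segments = []
    for step in range(len(keyposes)):
        keypose = predict_keypose(keyposes, step)
        segment = diffuse_segment(pose, keypose)
        segments.append(segment)
        pose = segment[-1]  # the next segment starts where this one ended
    return np.concatenate(segments, axis=0)


path = rollout([0.0, 0.0, 0.0], [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]])
```

The point of the sketch is the division of labor: the keypose step decides where to go next, and the segment step decides how to get there, with each segment starting from the end of the previous one.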
Cite
Text
Xian et al. "ChainedDiffuser: Unifying Trajectory Diffusion and Keypose Prediction for Robotic Manipulation." Conference on Robot Learning, 2023.
Markdown
[Xian et al. "ChainedDiffuser: Unifying Trajectory Diffusion and Keypose Prediction for Robotic Manipulation." Conference on Robot Learning, 2023.](https://mlanthology.org/corl/2023/xian2023corl-chaineddiffuser/)
BibTeX
@inproceedings{xian2023corl-chaineddiffuser,
title = {{ChainedDiffuser: Unifying Trajectory Diffusion and Keypose Prediction for Robotic Manipulation}},
author = {Xian, Zhou and Gkanatsios, Nikolaos and Gervet, Theophile and Ke, Tsung-Wei and Fragkiadaki, Katerina},
booktitle = {Conference on Robot Learning},
year = {2023},
pages = {2323-2339},
volume = {229},
url = {https://mlanthology.org/corl/2023/xian2023corl-chaineddiffuser/}
}