Dynamics Learning with Cascaded Variational Inference for Multi-Step Manipulation

Abstract

The fundamental challenge of planning for multi-step manipulation is to find effective and plausible action sequences that lead to the task goal. We present Cascaded Variational Inference Planner (CAVIN), a model-based method that hierarchically generates plans by sampling from latent spaces. To facilitate planning over long time horizons, our method learns latent representations that decouple the prediction of high-level effects from the generation of low-level motions through cascaded variational inference. This enables us to model dynamics at two different levels of temporal resolution for hierarchical planning. We evaluate our approach on three multi-step robotic manipulation tasks in cluttered tabletop environments given raw visual observations. Empirical results demonstrate that the proposed method outperforms state-of-the-art model-based approaches by strategically planning for interactions with multiple objects. See more details at pair.stanford.edu/cavin.

Cite

Text

Fang et al. "Dynamics Learning with Cascaded Variational Inference for Multi-Step Manipulation." Conference on Robot Learning, 2019.

Markdown

[Fang et al. "Dynamics Learning with Cascaded Variational Inference for Multi-Step Manipulation." Conference on Robot Learning, 2019.](https://mlanthology.org/corl/2019/fang2019corl-dynamics/)

BibTeX

@inproceedings{fang2019corl-dynamics,
  title     = {{Dynamics Learning with Cascaded Variational Inference for Multi-Step Manipulation}},
  author    = {Fang, Kuan and Zhu, Yuke and Garg, Animesh and Savarese, Silvio and Fei-Fei, Li},
  booktitle = {Conference on Robot Learning},
  year      = {2019},
  pages     = {42--52},
  volume    = {100},
  url       = {https://mlanthology.org/corl/2019/fang2019corl-dynamics/}
}