Hierarchical Policy Blending as Optimal Transport

Abstract

We present hierarchical policy blending as optimal transport (HiPBOT). HiPBOT hierarchically adjusts the weights of low-level reactive expert policies of different agents by adding a look-ahead planning layer on the parameter space. The high-level planner renders policy blending as unbalanced optimal transport consolidating the scaling of the underlying Riemannian motion policies. As a result, HiPBOT effectively decides the priorities between expert policies and agents, ensuring the task’s success and guaranteeing safety. Experimental results in several application scenarios, from low-dimensional navigation to high-dimensional whole-body control, show the efficacy and efficiency of HiPBOT. Our method outperforms state-of-the-art baselines – either adopting probabilistic inference or defining a tree structure of experts – paving the way for new applications of optimal transport to robot control. More material at https://sites.google.com/view/hipobot
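The abstract's central idea, choosing expert weights by solving an unbalanced optimal transport problem, can be illustrated with a minimal sketch. This is not the authors' implementation. It uses generic entropic unbalanced Sinkhorn scaling iterations with KL-relaxed marginals. The names `unbalanced_sinkhorn` and `blending_weights`, the toy cost matrix, and the choice to read weights off the row marginals of the transport plan are all assumptions made for illustration.

```python
import numpy as np

def unbalanced_sinkhorn(a, b, C, eps=0.1, rho=1.0, n_iters=500):
    """Generic entropic unbalanced OT with KL-relaxed marginals.

    a: source masses (here: hypothetical prior mass per expert policy)
    b: target masses (here: hypothetical mass per candidate outcome)
    C: cost matrix of shape (len(a), len(b))
    eps: entropic regularization; rho: marginal relaxation strength
    """
    K = np.exp(-C / eps)           # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    fi = rho / (rho + eps)         # exponent from the KL marginal penalty
    for _ in range(n_iters):
        u = (a / (K @ v)) ** fi
        v = (b / (K.T @ u)) ** fi
    return u[:, None] * K * v[None, :]   # transport plan

def blending_weights(plan):
    """Assumed readout: normalized mass each expert actually transports."""
    mass = plan.sum(axis=1)
    return mass / mass.sum()

# Hypothetical toy problem: 3 experts, 4 candidate outcomes.
# Expert 2 is costly everywhere, so relaxed marginals let it ship less mass.
a = np.full(3, 1.0 / 3.0)
b = np.full(4, 0.25)
C = np.array([[0.1, 0.5, 0.9, 1.0],
              [1.0, 0.9, 0.5, 0.1],
              [2.0, 2.0, 2.0, 2.0]])

plan = unbalanced_sinkhorn(a, b, C)
w = blending_weights(plan)
print(w)
```

Because the marginal constraints are only penalized rather than enforced, an expert whose costs are uniformly high ends up with a smaller share of the total mass. That is the behavior that makes the unbalanced formulation a natural fit for prioritizing among policies.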

Cite

Text

Le et al. "Hierarchical Policy Blending as Optimal Transport." Proceedings of The 5th Annual Learning for Dynamics and Control Conference, 2023.

Markdown

[Le et al. "Hierarchical Policy Blending as Optimal Transport." Proceedings of The 5th Annual Learning for Dynamics and Control Conference, 2023.](https://mlanthology.org/l4dc/2023/le2023l4dc-hierarchical/)

BibTeX

@inproceedings{le2023l4dc-hierarchical,
  title     = {{Hierarchical Policy Blending as Optimal Transport}},
  author    = {Le, An Thai and Hansel, Kay and Peters, Jan and Chalvatzaki, Georgia},
  booktitle = {Proceedings of The 5th Annual Learning for Dynamics and Control Conference},
  year      = {2023},
  pages     = {797--812},
  volume    = {211},
  url       = {https://mlanthology.org/l4dc/2023/le2023l4dc-hierarchical/}
}