Inherently Robust Control Through Maximum-Entropy Learning-Based Rollout

Abstract

Reinforcement learning has recently proven extremely successful in robot control. One major reason is massively parallel simulation combined with controlling for the so-called ``sim to real'' gap: training on a distribution of environments that is assumed to contain the real one is sufficient for finding neural policies that transfer successfully from computer simulations to real robots. Often, this is accompanied by a layer of system identification during deployment to close the gap further. Still, the efficacy of these approaches hinges on reasonable simulation capabilities and an adequately rich task distribution that contains the real environment. This work aims to provide a complementary solution for cases where these criteria are challenging to satisfy. We combine two approaches, $\textit{maximum-entropy reinforcement learning}$ (MaxEntRL) and $\textit{rollout}$, into an inherently robust control method called $\textbf{Maximum-Entropy Learning-Based Rollout (MELRO)}$. Each promises increased robustness and adaptability on its own: MaxEntRL has been shown to be an adversarially robust approach in disguise, while rollout greatly improves over parametric models through an implicit Newton step on a model of the environment. We find that our approach performs excellently in the vast majority of cases, both on the Real World Reinforcement Learning (RWRL) benchmark and on our own environment perturbations of the popular DeepMind Control (DMC) suite, which move beyond simple parametric noise. We also show its success in ``sim to real'' transfer with the Franka Panda robot arm.
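
For orientation, the two ingredients named above can be written in their standard textbook forms. This is only a sketch under assumed notation and is not necessarily the exact formulation used in the paper: the reward $r$, discount $\gamma$, entropy temperature $\alpha$, learned model $\hat{p}$, and value estimate $\hat{V}^{\pi}$ are all symbols introduced here for illustration.

```latex
% Maximum-entropy RL: maximize expected return plus an entropy bonus
% (alpha is an assumed temperature weighting the entropy term).
J_{\mathrm{MaxEnt}}(\pi)
  = \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}
      \Big( r(s_t, a_t) + \alpha\, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \Big) \right]

% Rollout with one-step lookahead: improve on a base policy pi by
% choosing the action that maximizes the immediate reward plus the
% base policy's estimated value on an (assumed) environment model.
\tilde{\pi}(s) \in \arg\max_{a}\;
  \mathbb{E}_{s' \sim \hat{p}(\cdot \mid s, a)}
  \Big[ r(s, a) + \gamma\, \hat{V}^{\pi}(s') \Big]
```

In this reading, $\hat{V}^{\pi}(s')$ would typically be estimated by simulating the base policy $\pi$ on the model for a finite horizon, so the improved policy $\tilde{\pi}$ is computed online at each state rather than stored as a separate set of parameters.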

Cite

Text

Bok et al. "Inherently Robust Control Through Maximum-Entropy Learning-Based Rollout." Transactions on Machine Learning Research, 2025.

Markdown

[Bok et al. "Inherently Robust Control Through Maximum-Entropy Learning-Based Rollout." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/bok2025tmlr-inherently/)

BibTeX

@article{bok2025tmlr-inherently,
  title     = {{Inherently Robust Control Through Maximum-Entropy Learning-Based Rollout}},
  author    = {Bok, Felix and Mirchev, Atanas and Kayalibay, Baris and Wenzel, Ole Jonas and van der Smagt, Patrick and Bayer, Justin},
  journal   = {Transactions on Machine Learning Research},
  year      = {2025},
  url       = {https://mlanthology.org/tmlr/2025/bok2025tmlr-inherently/}
}