Exploring Under Constraints with Model-Based Actor-Critic and Safety Filters

Abstract

Applying reinforcement learning (RL) to learn effective policies on physical robots without supervision remains challenging for tasks where safe exploration is critical. Constrained model-based RL (CMBRL) offers a promising approach to this problem: such methods learn constraint-adhering policies through constrained optimization. Yet the resulting policies often fail to meet stringent safety requirements during learning and exploration. Our solution, CASE, aims to reduce constraint violations during the learning phase. Specifically, CASE combines constrained policy optimization with planning-based safety filters that serve as backup policies, lowering constraint violations during learning and making it more reliable than other recent constrained model-based policy optimization methods.
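The core idea the abstract describes, a planning-based safety filter that falls back to a backup policy when an exploratory action looks unsafe under the learned model, can be illustrated with a minimal sketch. Everything below (function names, the toy dynamics, the zero violation budget) is an illustrative assumption, not the paper's actual implementation.

```python
import numpy as np

# Illustrative stand-ins for learned components; names are hypothetical.
def learned_dynamics(state, action):
    """Placeholder for a learned dynamics model (here: simple linear dynamics)."""
    return 0.9 * state + 0.1 * action

def constraint_cost(state):
    """Per-step constraint cost; positive means the constraint is violated."""
    return float(np.linalg.norm(state) > 1.0)

def backup_policy(state):
    """Conservative fallback action, e.g. steering back toward the origin."""
    return -np.clip(state, -0.5, 0.5)

def safety_filter(state, candidate_action, horizon=5, budget=0.0):
    """Accept the candidate action only if a short model rollout, with the
    backup policy taking over afterwards, predicts no constraint violations."""
    s, a = state, candidate_action
    total_cost = 0.0
    for _ in range(horizon):
        s = learned_dynamics(s, a)
        total_cost += constraint_cost(s)
        a = backup_policy(s)  # after the first step, assume the backup acts
    return candidate_action if total_cost <= budget else backup_policy(state)

# Example: filter an exploratory action proposed by the learning policy.
state = np.array([0.8, -0.2])
exploratory_action = np.array([1.0, 1.0])
safe_action = safety_filter(state, exploratory_action)
```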

Cite

Text

Agha et al. "Exploring Under Constraints with Model-Based Actor-Critic and Safety Filters." Proceedings of The 8th Conference on Robot Learning, 2024.

Markdown

[Agha et al. "Exploring Under Constraints with Model-Based Actor-Critic and Safety Filters." Proceedings of The 8th Conference on Robot Learning, 2024.](https://mlanthology.org/corl/2024/agha2024corl-exploring/)

BibTeX

@inproceedings{agha2024corl-exploring,
  title     = {{Exploring Under Constraints with Model-Based Actor-Critic and Safety Filters}},
  author    = {Agha, Ahmed and Kayalibay, Baris and Mirchev, Atanas and van der Smagt, Patrick and Bayer, Justin},
  booktitle = {Proceedings of The 8th Conference on Robot Learning},
  year      = {2024},
  pages     = {1216--1230},
  volume    = {270},
  url       = {https://mlanthology.org/corl/2024/agha2024corl-exploring/}
}