Augmented Proximal Policy Optimization for Safe Reinforcement Learning

Abstract

Safe reinforcement learning addresses practical scenarios in which an agent must maximize return while satisfying safety constraints. Existing algorithms suffer from training oscillations or approximation errors, and still struggle to update the policy efficiently while satisfying the constraints precisely. In this article, we propose Augmented Proximal Policy Optimization (APPO), which augments the Lagrangian function of the primal constrained problem by attaching a quadratic deviation term. The resulting multiplier-penalty function dampens cost oscillation for stable convergence while remaining equivalent to the primal constrained problem, which allows precise control of safety costs. APPO alternately updates the policy and the Lagrangian multiplier by solving the constructed augmented primal-dual problem, which can be easily implemented with any first-order optimizer. We apply APPO to diverse safety-constrained tasks, setting a new state of the art compared with a comprehensive list of safe RL baselines. Extensive experiments verify the merits of our method: easy implementation, stable convergence, and precise cost control.
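
The alternating scheme described in the abstract can be illustrated with a minimal sketch on a one-dimensional toy problem. This is not the paper's APPO objective: it assumes the textbook augmented Lagrangian for an inequality constraint, and the function name, the scalar "policy parameter" `theta`, the penalty weight `sigma`, and the toy return and cost are all hypothetical choices made for illustration. The toy task minimizes the negative return `(theta - 2)^2` subject to the cost `theta <= cost_limit`.

```python
def augmented_lagrangian_demo(sigma=10.0, outer_steps=50, inner_steps=200,
                              lr=0.01, cost_limit=1.0):
    """Toy augmented-Lagrangian loop (illustrative assumption, not APPO itself).

    Minimizes f(theta) = (theta - 2)^2 subject to g(theta) = theta - cost_limit <= 0
    using the multiplier-penalty function
        L(theta, lam) = f(theta) + (max(0, lam + sigma * g)^2 - lam^2) / (2 * sigma).
    """
    theta, lam = 0.0, 0.0
    for _ in range(outer_steps):
        # Primal step: plain gradient descent on L(theta, lam) with lam fixed.
        for _ in range(inner_steps):
            g = theta - cost_limit                 # constraint violation
            active = max(0.0, lam + sigma * g)     # shifted, clipped multiplier
            grad = 2.0 * (theta - 2.0) + active    # dL/dtheta, since dg/dtheta = 1
            theta -= lr * grad
        # Dual step: move the multiplier by the penalty-weighted violation.
        lam = max(0.0, lam + sigma * (theta - cost_limit))
    return theta, lam
```

Calling `augmented_lagrangian_demo()` with the default arguments drives `theta` toward the constraint boundary at 1 and the multiplier toward 2, the values that satisfy the KKT conditions of this toy problem. The quadratic term is what keeps the inner problem well conditioned near the boundary; in this sketch only first-order gradient steps are used, in line with the abstract's remark that any first-order optimizer suffices.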

Cite

Text

Dai et al. "Augmented Proximal Policy Optimization for Safe Reinforcement Learning." AAAI Conference on Artificial Intelligence, 2023. doi:10.1609/AAAI.V37I6.25888

Markdown

[Dai et al. "Augmented Proximal Policy Optimization for Safe Reinforcement Learning." AAAI Conference on Artificial Intelligence, 2023.](https://mlanthology.org/aaai/2023/dai2023aaai-augmented/) doi:10.1609/AAAI.V37I6.25888

BibTeX

@inproceedings{dai2023aaai-augmented,
  title     = {{Augmented Proximal Policy Optimization for Safe Reinforcement Learning}},
  author    = {Dai, Juntao and Ji, Jiaming and Yang, Long and Zheng, Qian and Pan, Gang},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2023},
  pages     = {7288-7295},
  doi       = {10.1609/AAAI.V37I6.25888},
  url       = {https://mlanthology.org/aaai/2023/dai2023aaai-augmented/}
}