On Robust Reinforcement Learning with Lipschitz-Bounded Policy Networks

Abstract

This paper presents a study of robust policy networks in deep reinforcement learning. We investigate the benefits of policy parameterizations that naturally satisfy constraints on their Lipschitz bound, analyzing their empirical performance and robustness on two representative problems: pendulum swing-up and Atari Pong. We show that policy networks with small Lipschitz bounds are significantly more robust to disturbances, random noise, and targeted adversarial attacks than unconstrained policies composed of vanilla multi-layer perceptrons or convolutional neural networks. Moreover, we find that choosing a policy parameterization with a non-conservative Lipschitz bound and an expressive, nonlinear layer architecture gives the user much finer control over the performance-robustness trade-off than existing state-of-the-art methods based on spectral normalization.
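For context on the spectral-normalization baseline mentioned above, the sketch below shows the standard way such a method bounds a policy's Lipschitz constant: normalize each linear layer to spectral norm 1, so that with 1-Lipschitz activations the composition is 1-Lipschitz, then apply an output gain `gamma` to set the overall bound. This is a minimal illustration of the baseline idea only, not the paper's Lipschitz-bounded parameterization; the layer sizes, `gamma`, and the `SpectralNormPolicy` name are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torch.nn.utils.parametrizations import spectral_norm

class SpectralNormPolicy(nn.Module):
    """Vanilla MLP policy with spectral-normalized linear layers.

    Each normalized weight matrix has spectral norm <= 1 and ReLU is
    1-Lipschitz, so the composition is 1-Lipschitz; the output gain
    `gamma` then makes gamma an upper bound on the overall Lipschitz
    constant of the network.
    """

    def __init__(self, sizes=(3, 64, 64, 1), gamma=5.0):
        super().__init__()
        layers = []
        for i in range(len(sizes) - 1):
            # Constrain each linear map to spectral norm 1.
            layers.append(spectral_norm(nn.Linear(sizes[i], sizes[i + 1])))
            if i < len(sizes) - 2:
                layers.append(nn.ReLU())
        self.net = nn.Sequential(*layers)
        self.gamma = gamma

    def forward(self, obs):
        return self.gamma * self.net(obs)

# Example: pendulum-like observation (cos(theta), sin(theta), theta_dot).
policy = SpectralNormPolicy()
action = policy(torch.randn(1, 3))
```

Note that this product-of-layer-norms bound is generally conservative: the true Lipschitz constant of the trained network can be far below `gamma`, which is the limitation that motivates the non-conservative, expressive parameterizations studied in the paper.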

Cite

Text

Barbara et al. "On Robust Reinforcement Learning with Lipschitz-Bounded Policy Networks." ICML 2024 Workshops: RLControlTheory, 2024.

Markdown

[Barbara et al. "On Robust Reinforcement Learning with Lipschitz-Bounded Policy Networks." ICML 2024 Workshops: RLControlTheory, 2024.](https://mlanthology.org/icmlw/2024/barbara2024icmlw-robust/)

BibTeX

@inproceedings{barbara2024icmlw-robust,
  title     = {{On Robust Reinforcement Learning with Lipschitz-Bounded Policy Networks}},
  author    = {Barbara, Nicholas H. and Wang, Ruigang and Manchester, Ian},
  booktitle = {ICML 2024 Workshops: RLControlTheory},
  year      = {2024},
  url       = {https://mlanthology.org/icmlw/2024/barbara2024icmlw-robust/}
}