On the Tension Between Optimality and Adversarial Robustness in Policy Optimization

Abstract

Optimality and adversarial robustness in deep reinforcement learning have long been regarded as conflicting goals. Nonetheless, recent theoretical insights presented in CAR suggest a potential alignment, raising the important question of how to realize this alignment in practice. This paper first identifies a key gap between theory and practice by comparing standard policy optimization (SPO) and adversarially robust policy optimization (ARPO). Although the two share theoretical consistency, *a fundamental tension between robustness and optimality arises in practical policy gradient methods*. SPO tends to converge to vulnerable first-order stationary policies (FOSPs) with strong natural performance, whereas ARPO typically favors more robust FOSPs at the expense of reduced returns. Furthermore, we attribute this tradeoff to the *reshaping effect of the strongest adversaries* in ARPO, which significantly complicates the global landscape by inducing *deceptive sticky FOSPs*. This reshaping improves robustness but makes the landscape harder to navigate. To alleviate this difficulty, we develop *BARPO*, a bilevel framework that unifies SPO and ARPO by modulating adversary strength, thereby facilitating navigability while preserving global optima. Extensive empirical results demonstrate that BARPO consistently outperforms vanilla ARPO, providing a practical approach to reconciling theoretical and empirical performance.

Cite

Text

Li et al. "On the Tension Between Optimality and Adversarial Robustness in Policy Optimization." International Conference on Learning Representations, 2026.

Markdown

[Li et al. "On the Tension Between Optimality and Adversarial Robustness in Policy Optimization." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/li2026iclr-tension/)

BibTeX

@inproceedings{li2026iclr-tension,
  title     = {{On the Tension Between Optimality and Adversarial Robustness in Policy Optimization}},
  author    = {Li, Haoran and Lv, Jiayu and Han, Congying and Zhang, Zicheng and Li, Anqi and Liu, Yan and Guo, Tiande and Jiang, Nan},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/li2026iclr-tension/}
}