On Corruption-Robustness in Performative Reinforcement Learning

Vasilis Pollatos, Debmalya Mandal, Goran Radanovic

AAAI 2025 pp. 19939-19947

doi:10.1609/AAAI.V39I19.34196

Abstract

In performative Reinforcement Learning (RL), an agent faces a policy-dependent environment: the reward and transition functions depend on the agent's policy. Prior work on performative RL has studied the convergence of repeated retraining approaches to a performatively stable policy. In the finite sample regime, these approaches repeatedly solve for a saddle point of a convex-concave objective, which estimates the Lagrangian of a regularized version of the reinforcement learning problem. In this paper, we aim to extend such repeated retraining approaches, enabling them to operate under corrupted data. More specifically, we consider Huber's ε-contamination model, where an ε fraction of data points is corrupted by arbitrary adversarial noise. We propose a repeated retraining approach based on convex-concave optimization under corrupted gradients and a novel problem-specific robust mean estimator for the gradients. We prove that our approach exhibits last-iterate convergence to an approximately stable policy, with the approximation error linear in √ε. We experimentally demonstrate the importance of accounting for corruption in performative reinforcement learning.
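To make the contamination setting in the abstract concrete, the following is a minimal sketch of Huber-style ε-contamination applied to scalar samples, together with a generic trimmed mean as a robust estimator. It is not the problem-specific estimator proposed in the paper, which the abstract does not specify. The function names `huber_contaminate` and `trimmed_mean`, the point-mass outlier, and all numeric settings are illustrative assumptions.

```python
import random


def huber_contaminate(clean, eps, outlier, rng):
    """Huber-style contamination: each sample is independently replaced
    with probability eps. The adversarial distribution here is a point
    mass at `outlier` (an illustrative assumption)."""
    return [outlier if rng.random() < eps else x for x in clean]


def trimmed_mean(samples, trim):
    """Drop the int(trim * n) smallest and largest values, average the rest.
    Requires trim < 0.5 so that at least one sample survives."""
    n = len(samples)
    k = int(trim * n)
    kept = sorted(samples)[k:n - k]
    return sum(kept) / len(kept)


if __name__ == "__main__":
    rng = random.Random(0)
    clean = [rng.gauss(1.0, 1.0) for _ in range(2000)]  # true mean is 1.0
    data = huber_contaminate(clean, eps=0.1, outlier=1000.0, rng=rng)
    print("plain mean:  ", sum(data) / len(data))
    print("trimmed mean:", trimmed_mean(data, trim=0.2))
```

In this toy setup the plain mean is pulled far toward the outlier value, while the trimmed mean stays close to the true mean, with a small residual bias because the trimming also removes clean samples unevenly. For gradient vectors, the same idea could be applied coordinate by coordinate.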


Cite

Text

Pollatos et al. "On Corruption-Robustness in Performative Reinforcement Learning." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I19.34196

Markdown

[Pollatos et al. "On Corruption-Robustness in Performative Reinforcement Learning." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/pollatos2025aaai-corruption/) doi:10.1609/AAAI.V39I19.34196

BibTeX

@inproceedings{pollatos2025aaai-corruption,
  title     = {{On Corruption-Robustness in Performative Reinforcement Learning}},
  author    = {Pollatos, Vasilis and Mandal, Debmalya and Radanovic, Goran},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {19939--19947},
  doi       = {10.1609/AAAI.V39I19.34196},
  url       = {https://mlanthology.org/aaai/2025/pollatos2025aaai-corruption/}
}