DR-SAC: Distributionally Robust Soft Actor-Critic for Reinforcement Learning Under Uncertainty

Cui, Mingxuan; Zhou, Duo; Han, Yuxuan; Hanasusanto, Grani A.; Wang, Qiong; Zhang, Huan; Zhou, Zhengyuan

ICLR 2026

Abstract

Deep reinforcement learning (RL) has achieved remarkable success, yet its deployment in real-world scenarios is often limited by vulnerability to environmental uncertainties. Distributionally robust RL (DR-RL) algorithms have been proposed to address this challenge, but existing approaches are largely restricted to value-based methods in tabular settings. In this work, we introduce Distributionally Robust Soft Actor-Critic (DR-SAC), the first actor-critic-based DR-RL algorithm for offline learning in continuous action spaces. DR-SAC maximizes the entropy-regularized reward against the worst-case transition model within a KL-divergence-constrained uncertainty set. We derive a distributionally robust version of soft policy iteration with a convergence guarantee and incorporate a generative modeling approach to estimate the unknown nominal transition models. Experimental results on five continuous RL tasks show that our algorithm achieves up to $9.8\times$ higher average reward than the SAC baseline under common perturbations. Additionally, DR-SAC significantly improves computational efficiency and applicability to large-scale problems compared with existing DR-RL algorithms.
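To make the "worst-case transition model within a KL ball" idea concrete, here is a minimal sketch of the standard dual form for a KL-constrained worst-case expectation, $\inf_{P:\, D_{\mathrm{KL}}(P \,\|\, P_0) \le \delta} \mathbb{E}_P[V] = \sup_{\beta \ge 0} \{ -\beta \log \mathbb{E}_{P_0}[e^{-V/\beta}] - \beta\delta \}$, evaluated on samples from a nominal model $P_0$. It is an illustration under stated assumptions, not the paper's exact operator: the function name `kl_robust_expectation`, the radius `delta`, and the grid search over `beta` are hypothetical choices, and the entropy term and generative model mentioned in the abstract are not shown.

```python
import numpy as np


def kl_robust_expectation(values, delta, betas=None):
    """Approximate the worst-case mean of `values` over a KL ball of radius `delta`.

    Uses the dual form sup_beta { -beta * log E[exp(-V / beta)] - beta * delta },
    with the expectation taken over the empirical (nominal) sample distribution.
    The supremum is approximated by a grid search (an assumption of this sketch),
    so the result is a lower bound on the exact dual value.
    """
    values = np.asarray(values, dtype=float)
    if betas is None:
        betas = np.logspace(-3, 3, 601)  # hypothetical search grid for beta
    best = -np.inf
    for beta in betas:
        x = -values / beta
        m = x.max()  # shift by the max for a numerically stable log-mean-exp
        log_mean_exp = m + np.log(np.mean(np.exp(x - m)))
        best = max(best, -beta * log_mean_exp - beta * delta)
    return best


# Hypothetical next-state values sampled from a nominal transition model.
next_values = np.array([1.0, 2.0, 3.0, 4.0])
print(kl_robust_expectation(next_values, delta=0.1))
```

With `delta` near zero the estimate approaches the plain sample mean, and it decreases toward the smallest sampled value as `delta` grows, which is the pessimism a robust critic target would inherit.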

Cite

Text

Cui et al. "DR-SAC: Distributionally Robust Soft Actor-Critic for Reinforcement Learning Under Uncertainty." International Conference on Learning Representations, 2026.

Markdown

[Cui et al. "DR-SAC: Distributionally Robust Soft Actor-Critic for Reinforcement Learning Under Uncertainty." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/cui2026iclr-drsac/)

BibTeX

@inproceedings{cui2026iclr-drsac,
  title     = {{DR-SAC: Distributionally Robust Soft Actor-Critic for Reinforcement Learning Under Uncertainty}},
  author    = {Cui, Mingxuan and Zhou, Duo and Han, Yuxuan and Hanasusanto, Grani A. and Wang, Qiong and Zhang, Huan and Zhou, Zhengyuan},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/cui2026iclr-drsac/}
}