Risk-Sensitive Variational Actor-Critic: A Model-Based Approach

Abstract

Risk-sensitive reinforcement learning (RL) with an entropic risk measure typically requires knowledge of the transition kernel or performs unstable updates with respect to exponential Bellman equations. As a consequence, algorithms that optimize this objective have been restricted to tabular or low-dimensional continuous environments. In this work we leverage the connection between the entropic risk measure and the RL-as-inference framework to develop a risk-sensitive variational actor-critic algorithm (rsVAC). Our work extends the variational framework to incorporate stochastic rewards and proposes a variational model-based actor-critic approach that modulates policy risk via a risk parameter. We consider both the risk-seeking and risk-averse regimes and present rsVAC learning variants for each setting. Our experiments demonstrate that this approach produces risk-sensitive policies and yields improvements in both tabular and risk-aware variants of complex continuous control tasks in MuJoCo.
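For reference, the entropic risk measure mentioned in the abstract is conventionally defined as below; this is the standard textbook form with risk parameter $\beta$ and (random) return $R$, using generic notation rather than the paper's own:

\[
\rho_\beta(R) \;=\; \frac{1}{\beta} \log \mathbb{E}\!\left[ e^{\beta R} \right],
\qquad
\rho_\beta(R) \;\approx\; \mathbb{E}[R] + \frac{\beta}{2}\,\mathrm{Var}(R) \;\; \text{for small } |\beta|.
\]

Under this convention, $\beta > 0$ yields risk-seeking behavior (return variance is rewarded), $\beta < 0$ yields risk-averse behavior (variance is penalized), and $\beta \to 0$ recovers the standard risk-neutral expected return.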

Cite

Text

Granados et al. "Risk-Sensitive Variational Actor-Critic: A Model-Based Approach." International Conference on Learning Representations, 2025.

Markdown

[Granados et al. "Risk-Sensitive Variational Actor-Critic: A Model-Based Approach." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/granados2025iclr-risksensitive/)

BibTeX

@inproceedings{granados2025iclr-risksensitive,
  title     = {{Risk-Sensitive Variational Actor-Critic: A Model-Based Approach}},
  author    = {Granados, Alonso and Ebrahimi, Reza and Pacheco, Jason},
  booktitle = {International Conference on Learning Representations},
  year      = {2025},
  url       = {https://mlanthology.org/iclr/2025/granados2025iclr-risksensitive/}
}