Risk-Sensitive Bandits: Arm Mixture Optimality and Regret-Efficient Algorithms
Abstract
This paper introduces a general framework for risk-sensitive bandits that integrates the notions of risk-sensitive objectives by adopting a rich class of {\em distortion riskmetrics}. The introduced framework subsumes the various existing risk-sensitive models. An important and hitherto unknown observation is that for a wide range of riskmetrics, the optimal bandit policy involves selecting a \emph{mixture} of arms. This is in sharp contrast to the convention in multi-armed bandit algorithms that there is generally a \emph{solitary} arm that maximizes the utility, whether purely reward-centric or risk-sensitive. This creates a major departure from the established principles for designing bandit algorithms, since there are uncountably many possible mixtures. The contributions of the paper are as follows: (i) it formalizes a general framework for risk-sensitive bandits, (ii) it identifies standard risk-sensitive bandit models for which solitary arm selection is not optimal, and (iii) it designs regret-efficient algorithms whose sampling strategies can accurately track optimal arm mixtures (when a mixture is optimal) or the solitary arms (when a solitary arm is optimal). The algorithms are shown to achieve a regret that scales according to $O((\log T/T )^{\nu})$, where $T$ is the horizon and $\nu>0$ is a riskmetric-specific constant.
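To make the mixture claim concrete, the following is a minimal sketch, not the paper's algorithm. It assumes the common form of a distortion riskmetric for a nonnegative reward $X$, namely $U_h(X) = \int_0^\infty h(P(X > x))\,dx$, and an illustrative non-monotone distortion $h(u) = u(1-u)$; neither the formula nor this choice of $h$ is stated in the abstract. The two Bernoulli arms and all function names are hypothetical. For a Bernoulli reward with mean $p$ the integral reduces to $h(p)$, and sampling the two arms with fractions $\alpha$ and $1-\alpha$ yields a Bernoulli reward with mean $\alpha p_1 + (1-\alpha) p_2$.

```python
def distortion(u):
    """Illustrative distortion h(u) = u * (1 - u); an assumption, not from the paper."""
    return u * (1.0 - u)


def bernoulli_riskmetric(p):
    """Distortion riskmetric of a Bernoulli(p) reward.

    For X in {0, 1}, P(X > x) = p on [0, 1) and 0 afterwards, so the
    integral of h(P(X > x)) over x >= 0 equals h(p) (using h(0) = 0).
    """
    return distortion(p)


def mixture_riskmetric(alpha, p_first, p_second):
    """Riskmetric when the first arm is pulled a fraction alpha of the time.

    The mixture of two Bernoulli arms is itself Bernoulli with mean
    alpha * p_first + (1 - alpha) * p_second.
    """
    p_mix = alpha * p_first + (1.0 - alpha) * p_second
    return bernoulli_riskmetric(p_mix)


# Hypothetical two-armed instance.
P_FIRST, P_SECOND = 0.1, 0.9

# Value of committing to a single (solitary) arm.
solitary_values = [bernoulli_riskmetric(P_FIRST), bernoulli_riskmetric(P_SECOND)]

# Grid search over mixing fractions alpha in [0, 1].
grid = [i / 100 for i in range(101)]
mixture_values = [mixture_riskmetric(a, P_FIRST, P_SECOND) for a in grid]
best_index = max(range(len(grid)), key=lambda i: mixture_values[i])
best_alpha = grid[best_index]
best_value = mixture_values[best_index]

if __name__ == "__main__":
    print("solitary arm values:", solitary_values)
    print("best mixing fraction:", best_alpha)
    print("best mixture value:", best_value)
```

Under these assumptions, each solitary arm attains a value of about 0.09, while an even mixture attains about 0.25, so no single arm is optimal. With a monotone distortion the same Bernoulli instance would be maximized at an endpoint, which is why a non-monotone $h$ is used for this illustration.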
Cite
Text
Tatlı et al. "Risk-Sensitive Bandits: Arm Mixture Optimality and Regret-Efficient Algorithms." Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, 2025.
Markdown
[Tatlı et al. "Risk-Sensitive Bandits: Arm Mixture Optimality and Regret-Efficient Algorithms." Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, 2025.](https://mlanthology.org/aistats/2025/tatl2025aistats-risksensitive/)
BibTeX
@inproceedings{tatl2025aistats-risksensitive,
title = {{Risk-Sensitive Bandits: Arm Mixture Optimality and Regret-Efficient Algorithms}},
author = {Tatlı, Meltem and Mukherjee, Arpan and Prashanth, L. A. and Shanmugam, Karthikeyan and Tajer, Ali},
booktitle = {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
year = {2025},
pages = {3871-3879},
volume = {258},
url = {https://mlanthology.org/aistats/2025/tatl2025aistats-risksensitive/}
}