Improved Algorithms for Adversarial Bandits with Unbounded Losses
Abstract
We consider the Adversarial Multi-Armed Bandits (MAB) problem with unbounded losses, where the algorithms have no prior knowledge of the sizes of the losses. We present \texttt{UMAB-NN} and \texttt{UMAB-G}, two algorithms for non-negative and general unbounded losses, respectively. For non-negative unbounded losses, \texttt{UMAB-NN} achieves the first fully adaptive and scale-free regret bound. Building on that, we further develop \texttt{UMAB-G}, which can learn from arbitrary unbounded losses. Complementing the algorithms is a new lower bound, demonstrating a fundamental trade-off between adaptivity and minimax optimality in scale-free MAB. Finally, we perform extensive empirical evaluations, showing that our algorithms consistently outperform all existing algorithms that handle unbounded losses.
Cite
Chen and Zhang. "Improved Algorithms for Adversarial Bandits with Unbounded Losses." ICML 2024 Workshops: ARLET, 2024.
BibTeX
@inproceedings{chen2024icmlw-improved,
title = {{Improved Algorithms for Adversarial Bandits with Unbounded Losses}},
author = {Chen, Mingyu and Zhang, Xuezhou},
booktitle = {ICML 2024 Workshops: ARLET},
year = {2024},
url = {https://mlanthology.org/icmlw/2024/chen2024icmlw-improved/}
}