Gaussian Randomized Exploration for Semi-Bandits with Sleeping Arms
Abstract
This paper provides theoretical analyses of worst-case regret upper and lower bounds for Gaussian randomized algorithms in semi-bandits with sleeping arms, In this setting, base arms may be unavailable in certain rounds, and only available base arms satisfying combinatorial constraints can be played simultaneously. We first introduce CTS-G, a randomized algorithm that achieves a $\tilde{O}(m\sqrt{NT})$ regret upper bound, where $T$ is the number of rounds, $N$ is the number of base arms, and up to $m$ base arms can be played per round. Next, we present CL-SG, a randomized algorithm that achieves a $\tilde{O}(\sqrt{mNT})$ regret bound. In addition to regret upper bounds, we also establish lower bounds showing that both of our proposed algorithms are near-optimal.
Cite
Text
Huang et al. "Gaussian Randomized Exploration for Semi-Bandits with Sleeping Arms." NeurIPS 2024 Workshops: BDU, 2024.Markdown
[Huang et al. "Gaussian Randomized Exploration for Semi-Bandits with Sleeping Arms." NeurIPS 2024 Workshops: BDU, 2024.](https://mlanthology.org/neuripsw/2024/huang2024neuripsw-gaussian/)BibTeX
@inproceedings{huang2024neuripsw-gaussian,
title = {{Gaussian Randomized Exploration for Semi-Bandits with Sleeping Arms}},
author = {Huang, Zhiming and Hu, Bingshan and Pan, Jianping},
booktitle = {NeurIPS 2024 Workshops: BDU},
year = {2024},
url = {https://mlanthology.org/neuripsw/2024/huang2024neuripsw-gaussian/}
}