On Multi-Armed Bandit with Impatient Arms

Abstract

In this paper, we investigate a Multi-Armed Bandit (MAB) setting where an arm exits the game if the algorithm continuously neglects it. This setup is motivated by real-world scenarios, such as online advertising and crowdsourcing, where arms gain benefits only when pulled by the algorithm. We identify the intrinsic hardness of this problem and the limitations of existing approaches. We propose the FC-SE algorithm and prove expected regret upper bounds for it. As an extension, we further allow new arms to enter after the game starts and design the FC-Entry algorithm with performance guarantees for this setting. Finally, we conduct experiments to validate our theoretical results.
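To make the setting concrete, here is a minimal toy simulation of a bandit with impatient arms. The departure rule (an arm leaves permanently once it has been neglected for `patience` consecutive rounds) and all names here are illustrative assumptions for exposition; the paper's precise model and the FC-SE algorithm itself may differ.

```python
import random


class ImpatientBandit:
    """Toy MAB environment where an arm exits permanently if it is not
    pulled for `patience` consecutive rounds (hypothetical departure
    rule, assumed for illustration)."""

    def __init__(self, means, patience, seed=0):
        self.means = list(means)          # Bernoulli reward means
        self.patience = patience          # rounds of neglect before exit
        self.rng = random.Random(seed)
        self.active = set(range(len(means)))
        self.idle = {i: 0 for i in self.active}  # consecutive neglected rounds

    def pull(self, arm):
        if arm not in self.active:
            raise ValueError("arm has already left the game")
        reward = 1.0 if self.rng.random() < self.means[arm] else 0.0
        for i in list(self.active):
            self.idle[i] = 0 if i == arm else self.idle[i] + 1
            if self.idle[i] >= self.patience:
                self.active.remove(i)     # neglected arm departs
        return reward


# A round-robin policy keeps every arm's idle count below K - 1,
# so with patience >= K no arm ever departs.
env = ImpatientBandit(means=[0.9, 0.5, 0.2], patience=3)
total = 0.0
for t in range(300):
    arm = sorted(env.active)[t % len(env.active)]
    total += env.pull(arm)
print(len(env.active))  # prints 3: all arms remain active
```

By contrast, a policy that greedily commits to a single arm loses the others after `patience` rounds, which illustrates the tension the paper studies: exploiting the best arm conflicts with keeping the other arms in the game.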

Cite

Text

Shao and Fang. "On Multi-Armed Bandit with Impatient Arms." International Conference on Machine Learning, 2024.

Markdown

[Shao and Fang. "On Multi-Armed Bandit with Impatient Arms." International Conference on Machine Learning, 2024.](https://mlanthology.org/icml/2024/shao2024icml-multiarmed/)

BibTeX

@inproceedings{shao2024icml-multiarmed,
  title     = {{On Multi-Armed Bandit with Impatient Arms}},
  author    = {Shao, Yuming and Fang, Zhixuan},
  booktitle = {International Conference on Machine Learning},
  year      = {2024},
  pages     = {44429--44473},
  volume    = {235},
  url       = {https://mlanthology.org/icml/2024/shao2024icml-multiarmed/}
}