Budgeted Bandit Problems with Continuous Random Costs

Abstract

We study the budgeted bandit problem, where each arm is associated with both a reward and a cost, and the objective is to design an arm-pulling algorithm that maximizes the total reward before the budget runs out. We consider both multi-armed bandits and linear bandits, and focus on the setting with continuous random costs. We propose an upper confidence bound based algorithm for multi-armed bandits and a confidence ball based algorithm for linear bandits, and prove logarithmic regret bounds for both. Simulations verify the effectiveness of the proposed algorithms.
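For intuition, below is a minimal Python sketch of a generic budgeted UCB-style policy for the multi-armed case. It is not the paper's exact algorithm: the ratio index (optimistic reward over pessimistic cost), the confidence radius, and the environment in the demo are illustrative assumptions; the authors' algorithm and its regret analysis are in the full text.

```python
import math
import random

def budgeted_ucb(pull, n_arms, budget):
    """Generic budgeted UCB sketch (illustrative, not the paper's algorithm).

    `pull(arm)` returns a (reward, cost) pair with rewards in [0, 1] and
    strictly positive continuous costs. Arms are pulled until the budget
    is exhausted; each arm's index is its empirical reward-to-cost ratio,
    with the reward inflated and the cost deflated by a confidence term,
    in the spirit of UCB-style budgeted policies.
    """
    pulls = [0] * n_arms
    reward_sum = [0.0] * n_arms
    cost_sum = [0.0] * n_arms
    total_reward, t = 0.0, 0

    while budget > 0:
        t += 1
        if t <= n_arms:                      # pull each arm once to initialize
            arm = t - 1
        else:
            def index(i):
                bonus = math.sqrt(2.0 * math.log(t) / pulls[i])
                r_hat = reward_sum[i] / pulls[i]
                c_hat = cost_sum[i] / pulls[i]
                # optimistic reward over pessimistic cost (floored away from 0)
                return min(r_hat + bonus, 1.0) / max(c_hat - bonus, 1e-6)
            arm = max(range(n_arms), key=index)

        reward, cost = pull(arm)
        pulls[arm] += 1
        reward_sum[arm] += reward
        cost_sum[arm] += cost
        budget -= cost
        if budget >= 0:                      # only count fully-paid-for pulls
            total_reward += reward
    return total_reward

# Hypothetical environment: Bernoulli rewards, uniform continuous costs.
means = [0.9, 0.6, 0.3]
def pull(arm):
    reward = 1.0 if random.random() < means[arm] else 0.0
    cost = random.uniform(0.5, 1.5)          # continuous random cost
    return reward, cost

print(budgeted_ucb(pull, n_arms=3, budget=100.0))
```

The key design choice this sketch illustrates is that, with random costs, exploration must be optimistic about the reward and pessimistic about the cost simultaneously, so the index upper-bounds the per-unit-budget value of each arm.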

Cite

Text

Xia et al. "Budgeted Bandit Problems with Continuous Random Costs." Proceedings of The 7th Asian Conference on Machine Learning, 2015.

Markdown

[Xia et al. "Budgeted Bandit Problems with Continuous Random Costs." Proceedings of The 7th Asian Conference on Machine Learning, 2015.](https://mlanthology.org/acml/2015/xia2015acml-budgeted/)

BibTeX

@inproceedings{xia2015acml-budgeted,
  title     = {{Budgeted Bandit Problems with Continuous Random Costs}},
  author    = {Xia, Yingce and Ding, Wenkui and Zhang, Xu-Dong and Yu, Nenghai and Qin, Tao},
  booktitle = {Proceedings of The 7th Asian Conference on Machine Learning},
  year      = {2015},
  pages     = {317--332},
  volume    = {45},
  url       = {https://mlanthology.org/acml/2015/xia2015acml-budgeted/}
}