Combinatorial Semi-Bandits with Knapsacks

Abstract

We unify two prominent lines of work on multi-armed bandits: bandits with knapsacks (BwK) and combinatorial semi-bandits. The former concerns limited "resources" consumed by the algorithm, e.g., limited supply in dynamic pricing. The latter allows a huge number of actions but assumes combinatorial structure and additional feedback to make the problem tractable. We define a common generalization, support it with several motivating examples, and design an algorithm for it. Our regret bounds are comparable with those for BwK and combinatorial semi- bandits.

Cite

Text

Sankararaman and Slivkins. "Combinatorial Semi-Bandits with Knapsacks." International Conference on Artificial Intelligence and Statistics, 2018.

Markdown

[Sankararaman and Slivkins. "Combinatorial Semi-Bandits with Knapsacks." International Conference on Artificial Intelligence and Statistics, 2018.](https://mlanthology.org/aistats/2018/sankararaman2018aistats-combinatorial/)

BibTeX

@inproceedings{sankararaman2018aistats-combinatorial,
  title     = {{Combinatorial Semi-Bandits with Knapsacks}},
  author    = {Sankararaman, Karthik Abinav and Slivkins, Aleksandrs},
  booktitle = {International Conference on Artificial Intelligence and Statistics},
  year      = {2018},
  pages     = {1760-1770},
  url       = {https://mlanthology.org/aistats/2018/sankararaman2018aistats-combinatorial/}
}