Bandit Multi-Linear DR-Submodular Maximization and Its Applications on Adversarial Submodular Bandits

Abstract

We investigate the online bandit learning of monotone multi-linear DR-submodular functions and design an algorithm $\mathtt{BanditMLSM}$ that attains $O(T^{2/3}\log T)$ $(1-1/e)$-regret. We then reduce the submodular bandit with partition matroid constraint and bandit sequential monotone maximization to online bandit learning of monotone multi-linear DR-submodular functions, attaining $O(T^{2/3}\log T)$ $(1-1/e)$-regret in both problems, which improves on the existing results. To the best of our knowledge, we are the first to give a sublinear-regret algorithm for the submodular bandit with partition matroid constraint; a special case of this problem was studied by Streeter et al. (2009), who prove an $O(T^{4/5})$ $(1-1/e)$-regret upper bound. For bandit sequential submodular maximization, the existing work achieves $O(T^{2/3})$ regret with a suboptimal $1/2$ approximation ratio (Niazadeh et al. 2021).
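To make the problem setup concrete, the sketch below simulates the bandit interaction protocol the abstract refers to: at each round the learner plays a fractional point $x_t \in [0,1]^n$, observes only the scalar value $f(x_t)$ (bandit feedback), and is measured by $(1-1/e)$-regret. The toy coverage-style multilinear objective, the coarse grid with an $\ell_1$ budget standing in for a matroid-type constraint, and the epsilon-greedy learner are all illustrative assumptions for brevity; none of this is the paper's $\mathtt{BanditMLSM}$ algorithm or its reductions.

```python
# Illustrative sketch of the bandit protocol only; the objective, constraint,
# and epsilon-greedy learner are stand-ins, NOT the paper's BanditMLSM method.
import numpy as np

rng = np.random.default_rng(0)
n, T = 4, 2000
n_groups = 3  # coverage groups defining the toy objective

# Random coverage structure: membership[j, i] = 1 if item i covers group j.
membership = rng.integers(0, 2, size=(n_groups, n))
weights = rng.uniform(0.5, 1.5, size=n_groups)

def f(x):
    """Multilinear extension of a weighted coverage function:
    monotone and DR-submodular on [0,1]^n."""
    uncovered = np.prod(1.0 - membership * x[None, :], axis=1)
    return float(weights @ (1.0 - uncovered))

# Candidate fractional actions: a coarse grid on [0,1]^n intersected with a
# budget ||x||_1 <= 2 (a crude stand-in for a matroid-type constraint).
grid = [np.array(p) / 2.0 for p in np.ndindex(*(3,) * n)]
candidates = [x for x in grid if x.sum() <= 2.0]

# Epsilon-greedy learner under bandit feedback: only f(x_t) is observed per round.
counts = np.zeros(len(candidates))
means = np.zeros(len(candidates))
total_reward = 0.0
for t in range(T):
    eps = min(1.0, (len(candidates) / (t + 1)) ** (1 / 3))  # decaying exploration
    if rng.random() < eps:
        k = rng.integers(len(candidates))
    else:
        k = int(np.argmax(means))
    reward = f(candidates[k])  # bandit feedback: a single scalar
    counts[k] += 1
    means[k] += (reward - means[k]) / counts[k]
    total_reward += reward

best = max(f(x) for x in candidates)
print(f"(1-1/e)-regret vs. grid optimum: {(1 - 1 / np.e) * best * T - total_reward:.1f}")
```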

Cite

Text

Wan et al. "Bandit Multi-Linear DR-Submodular Maximization and Its Applications on Adversarial Submodular Bandits." International Conference on Machine Learning, 2023.

Markdown

[Wan et al. "Bandit Multi-Linear DR-Submodular Maximization and Its Applications on Adversarial Submodular Bandits." International Conference on Machine Learning, 2023.](https://mlanthology.org/icml/2023/wan2023icml-bandit/)

BibTeX

@inproceedings{wan2023icml-bandit,
  title     = {{Bandit Multi-Linear DR-Submodular Maximization and Its Applications on Adversarial Submodular Bandits}},
  author    = {Wan, Zongqi and Zhang, Jialin and Chen, Wei and Sun, Xiaoming and Zhang, Zhijie},
  booktitle = {International Conference on Machine Learning},
  year      = {2023},
  pages     = {35491--35524},
  volume    = {202},
  url       = {https://mlanthology.org/icml/2023/wan2023icml-bandit/}
}