Efficient Best-of-Both-Worlds Algorithms for Contextual Combinatorial Semi-Bandits

Abstract

We introduce the first best-of-both-worlds algorithm for contextual combinatorial semi-bandits that simultaneously guarantees $\widetilde{\mathcal{O}}(\sqrt{T})$ regret in the adversarial regime and $\widetilde{\mathcal{O}}(\ln T)$ regret in the corrupted stochastic regime. Our approach builds on the Follow-the-Regularized-Leader (FTRL) framework equipped with a Shannon entropy regularizer, yielding a flexible method that admits efficient implementations. Beyond regret bounds, we tackle the practical bottleneck in FTRL (or, equivalently, Online Stochastic Mirror Descent) arising from the high-dimensional projection step encountered in each round of interaction. By leveraging the Karush-Kuhn-Tucker conditions, we transform the $K$-dimensional convex projection problem into a single-variable root-finding problem, dramatically accelerating each round. Empirical evaluations demonstrate that this combined strategy not only attains the attractive regret bounds of best-of-both-worlds algorithms but also delivers substantial per-round speed-ups, making it well-suited for large-scale, real-time applications.

Cite

Text

Li et al. "Efficient Best-of-Both-Worlds Algorithms for Contextual Combinatorial Semi-Bandits." International Conference on Learning Representations, 2026.

Markdown

[Li et al. "Efficient Best-of-Both-Worlds Algorithms for Contextual Combinatorial Semi-Bandits." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/li2026iclr-efficient-a/)

BibTeX

@inproceedings{li2026iclr-efficient-a,
  title     = {{Efficient Best-of-Both-Worlds Algorithms for Contextual Combinatorial Semi-Bandits}},
  author    = {Li, Mengmeng and Schneider, Philipp J. and Aleksic, Jelisaveta and Kuhn, Daniel},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/li2026iclr-efficient-a/}
}