Optimal and Practical Batched Linear Bandit Algorithm

Abstract

We study the linear bandit problem under limited adaptivity, known as the batched linear bandit. While existing approaches can achieve near-optimal regret in theory, they are often computationally prohibitive or underperform in practice. We propose BLAE, a novel batched algorithm that integrates arm elimination with regularized G-optimal design, achieving the minimax optimal regret (up to logarithmic factors in $T$) in both large-$K$ and small-$K$ regimes for the first time, while using only $O(\log\log T)$ batches. Our analysis introduces new techniques for batch-wise optimal design and refined concentration bounds. Crucially, BLAE demonstrates low computational overhead and strong empirical performance, outperforming state-of-the-art methods in extensive numerical evaluations. Thus, BLAE is the first algorithm to combine provable minimax-optimality in all regimes and practical superiority in batched linear bandits.

Cite

Text

Yu and Oh. "Optimal and Practical Batched Linear Bandit Algorithm." Proceedings of the 42nd International Conference on Machine Learning, 2025.

Markdown

[Yu and Oh. "Optimal and Practical Batched Linear Bandit Algorithm." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/yu2025icml-optimal/)

BibTeX

@inproceedings{yu2025icml-optimal,
  title     = {{Optimal and Practical Batched Linear Bandit Algorithm}},
  author    = {Yu, Sanghoon and Oh, Min-Hwan},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  year      = {2025},
  pages     = {73262-73285},
  volume    = {267},
  url       = {https://mlanthology.org/icml/2025/yu2025icml-optimal/}
}