Optimal Streaming Algorithms for Multi-Armed Bandits

Tianyuan Jin, Keke Huang, Jing Tang, Xiaokui Xiao

ICML 2021 pp. 5045-5054

/icml/2021/jin2021icml-optimal/

Abstract

This paper studies two variants of the best arm identification (BAI) problem under the streaming model, where we have a stream of n arms with reward distributions supported on [0,1] with unknown means. The arms in the stream are arriving one by one, and the algorithm cannot access an arm unless it is stored in a limited size memory. We first study the streaming \epslion-topk-arms identification problem, which asks for k arms whose reward means are lower than that of the k-th best arm by at most \epsilon with probability at least 1-\delta. For general \epsilon \in (0,1), the existing solution for this problem assumes k = 1 and achieves the optimal sample complexity O(\frac{n}{\epsilon^2} \log \frac{1}{\delta}) using O(\log^*(n)) memory and a single pass of the stream. We propose an algorithm that works for any k and achieves the optimal sample complexity O(\frac{n}{\epsilon^2} \log\frac{k}{\delta}) using a single-arm memory and a single pass of the stream. Second, we study the streaming BAI problem, where the objective is to identify the arm with the maximum reward mean with at least 1-\delta probability, using a single-arm memory and as few passes of the input stream as possible. We present a single-arm-memory algorithm that achieves a near instance-dependent optimal sample complexity within O(\log \Delta_2^-1) passes, where \Delta_2 is the gap between the mean of the best arm and that of the second best arm.

PDF ICML Semantic Scholar

Cite

Text

Jin et al. "Optimal Streaming Algorithms for Multi-Armed Bandits." International Conference on Machine Learning, 2021.

Markdown

[Jin et al. "Optimal Streaming Algorithms for Multi-Armed Bandits." International Conference on Machine Learning, 2021.](https://mlanthology.org/icml/2021/jin2021icml-optimal/)

BibTeX

@inproceedings{jin2021icml-optimal,
  title     = {{Optimal Streaming Algorithms for Multi-Armed Bandits}},
  author    = {Jin, Tianyuan and Huang, Keke and Tang, Jing and Xiao, Xiaokui},
  booktitle = {International Conference on Machine Learning},
  year      = {2021},
  pages     = {5045-5054},
  volume    = {139},
  url       = {https://mlanthology.org/icml/2021/jin2021icml-optimal/}
}