Delay and Cooperation in Nonstochastic Bandits
Abstract
We study networks of communicating learning agents that cooperate to solve a common nonstochastic bandit problem. Agents use an underlying communication network to receive messages about the actions selected by other agents, and drop messages that took more than $d$ hops to arrive, where $d$ is a delay parameter. We introduce Exp3-Coop, a cooperative version of the Exp3 algorithm, and prove that with $K$ actions and $N$ agents the average per-agent regret after $T$ rounds is at most of order $\sqrt{\bigl(d+1 + \tfrac{K}{N}\alpha_{\le d}\bigr)(T\ln K)}$, where $\alpha_{\le d}$ is the independence number of the $d$-th power of the communication graph $G$. We then show that, for any connected graph, choosing $d=\sqrt{K}$ yields a regret bound of order $K^{1/4}\sqrt{T}$, strictly better than the minimax regret $\sqrt{KT}$ for noncooperating agents. More informed choices of $d$ lead to bounds that are arbitrarily close to the full-information minimax regret $\sqrt{T\ln K}$ when $G$ is dense. When $G$ has sparse components, we show that a variant of Exp3-Coop, in which agents choose their parameters according to their centrality in $G$, strictly improves the regret. Finally, as a by-product of our analysis, we provide the first characterization of the minimax regret for bandit learning with delay.
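To illustrate the delayed-feedback mechanism at the core of Exp3-Coop, here is a minimal single-agent sketch of Exp3 in which each importance-weighted loss estimate only becomes usable $d$ rounds after it is observed (mimicking a message that took $d$ hops to arrive). This is an illustrative simplification, not the paper's algorithm: the cooperative averaging over neighbors is omitted, and `eta` and `loss_fn` are hypothetical parameters chosen for the example, not the paper's tuning.

```python
import math
import random

def exp3_delayed(K, T, d, eta, loss_fn, rng=random.Random(0)):
    """Exp3-style learner whose loss estimates arrive with delay d.

    Hypothetical single-agent simplification of Exp3-Coop:
    no cooperation, just delayed importance-weighted feedback.
    """
    cum_est = [0.0] * K   # cumulative importance-weighted loss estimates
    pending = []          # queue of (arrival_round, arm, estimate)
    total_loss = 0.0
    for t in range(T):
        # incorporate estimates whose delay has elapsed
        while pending and pending[0][0] <= t:
            _, i, est = pending.pop(0)
            cum_est[i] += est
        # exponential-weights distribution (shift by the min for stability)
        mx = min(cum_est)
        w = [math.exp(-eta * (L - mx)) for L in cum_est]
        s = sum(w)
        p = [x / s for x in w]
        # play an arm and observe its loss
        arm = rng.choices(range(K), weights=p)[0]
        loss = loss_fn(t, arm)
        total_loss += loss
        # the importance-weighted estimate becomes usable after d rounds
        pending.append((t + d + 1, arm, loss / p[arm]))
    return total_loss
```

With a fixed best arm (e.g. arm 0 always has loss 0 and the others loss 1), the learner concentrates on the best arm despite the delay, and its total loss stays far below the uniform-play baseline of $T(K-1)/K$; the paper's analysis quantifies how the delay $d$ inflates the regret.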
Cite
Cesa-Bianchi et al. "Delay and Cooperation in Nonstochastic Bandits." Journal of Machine Learning Research, 2019.

BibTeX
@article{cesabianchi2019jmlr-delay,
title = {{Delay and Cooperation in Nonstochastic Bandits}},
author = {Cesa-Bianchi, Nicolò and Gentile, Claudio and Mansour, Yishay},
journal = {Journal of Machine Learning Research},
year = {2019},
pages = {1-38},
volume = {20},
url = {https://mlanthology.org/jmlr/2019/cesabianchi2019jmlr-delay/}
}