One More Step Towards Reality: Cooperative Bandits with Imperfect Communication

Abstract

The cooperative bandit problem is increasingly relevant due to its applications in large-scale decision-making. However, most research on this problem focuses exclusively on the setting with perfect communication, whereas in real-world distributed settings communication often takes place over stochastic networks, with arbitrary corruptions and delays. In this paper, we study cooperative bandit learning under three typical real-world communication scenarios: (a) message-passing over stochastic time-varying networks, (b) instantaneous reward-sharing over a network with random delays, and (c) message-passing with adversarially corrupted rewards, including Byzantine communication. For each of these environments, we propose decentralized algorithms that achieve competitive performance together with near-optimal guarantees on the incurred group regret. Furthermore, in the setting with perfect communication, we present an improved delayed-update algorithm that outperforms the existing state of the art on various network topologies. Finally, we present tight network-dependent minimax lower bounds on the group regret. Our proposed algorithms are straightforward to implement and obtain competitive empirical performance.

Cite

Text

Madhushani et al. "One More Step Towards Reality: Cooperative Bandits with Imperfect Communication." Neural Information Processing Systems, 2021.

Markdown

[Madhushani et al. "One More Step Towards Reality: Cooperative Bandits with Imperfect Communication." Neural Information Processing Systems, 2021.](https://mlanthology.org/neurips/2021/madhushani2021neurips-one/)

BibTeX

@inproceedings{madhushani2021neurips-one,
  title     = {{One More Step Towards Reality: Cooperative Bandits with Imperfect Communication}},
  author    = {Madhushani, Udari and Dubey, Abhimanyu and Leonard, Naomi and Pentland, Alex},
  booktitle = {Neural Information Processing Systems},
  year      = {2021},
  url       = {https://mlanthology.org/neurips/2021/madhushani2021neurips-one/}
}