Communication Efficient Distributed Learning for Kernelized Contextual Bandits

Abstract

We tackle the communication efficiency challenge of learning kernelized contextual bandits in a distributed setting. Despite the recent advances in communication-efficient distributed bandit learning, existing solutions are restricted to simple models like multi-armed bandits and linear bandits, which hamper their practical utility. In this paper, instead of assuming the existence of a linear reward mapping from the features to the expected rewards, we consider non-linear reward mappings, by letting agents collaboratively search in a reproducing kernel Hilbert space (RKHS). This introduces significant challenges in communication efficiency as distributed kernel learning requires the transfer of raw data, leading to a communication cost that grows linearly w.r.t. time horizon $T$. We addresses this issue by equipping all agents to communicate via a common Nystr\"om embedding that gets updated adaptively as more data points are collected. We rigorously proved that our algorithm can attain sub-linear rate in both regret and communication cost.

Cite

Text

Li et al. "Communication Efficient Distributed Learning for Kernelized Contextual Bandits." Neural Information Processing Systems, 2022.

Markdown

[Li et al. "Communication Efficient Distributed Learning for Kernelized Contextual Bandits." Neural Information Processing Systems, 2022.](https://mlanthology.org/neurips/2022/li2022neurips-communication/)

BibTeX

@inproceedings{li2022neurips-communication,
  title     = {{Communication Efficient Distributed Learning for Kernelized Contextual Bandits}},
  author    = {Li, Chuanhao and Wang, Huazheng and Wang, Mengdi and Wang, Hongning},
  booktitle = {Neural Information Processing Systems},
  year      = {2022},
  url       = {https://mlanthology.org/neurips/2022/li2022neurips-communication/}
}