Federated Linear Contextual Bandits
Abstract
This paper presents a novel federated linear contextual bandits model, where individual clients face different $K$-armed stochastic bandits coupled through common global parameters. By leveraging the geometric structure of the linear rewards, a collaborative algorithm called Fed-PE is proposed to cope with the heterogeneity across clients without exchanging local feature vectors or raw data. Fed-PE relies on a novel multi-client G-optimal design, and achieves near-optimal regrets for both disjoint and shared parameter cases with logarithmic communication costs. In addition, a new concept called collinearly-dependent policies is introduced, based on which a tight minimax regret lower bound for the disjoint parameter case is derived. Experiments demonstrate the effectiveness of the proposed algorithms on both synthetic and real-world datasets.
Cite
Text
Huang et al. "Federated Linear Contextual Bandits." Neural Information Processing Systems, 2021.Markdown
[Huang et al. "Federated Linear Contextual Bandits." Neural Information Processing Systems, 2021.](https://mlanthology.org/neurips/2021/huang2021neurips-federated/)BibTeX
@inproceedings{huang2021neurips-federated,
title = {{Federated Linear Contextual Bandits}},
author = {Huang, Ruiquan and Wu, Weiqiang and Yang, Jing and Shen, Cong},
booktitle = {Neural Information Processing Systems},
year = {2021},
url = {https://mlanthology.org/neurips/2021/huang2021neurips-federated/}
}