Towards Scalable and Robust Structured Bandits: A Meta-Learning Framework
Abstract
Online learning in large-scale structured bandits is known to be challenging due to the curse of dimensionality. In this paper, we propose a unified meta-learning framework for a wide class of structured bandit problems where the parameter space can be factorized to the item level, which covers many popular tasks. Compared with existing approaches, the proposed solution is both scalable to large systems and robust, thanks to a more flexible model. At the core of this framework is a Bayesian hierarchical model that allows information sharing among items via their features, upon which we design a meta Thompson sampling algorithm. Three representative examples are discussed in detail. Theoretical analysis and extensive numerical results both support the usefulness of the proposed method.
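The abstract only sketches the idea; below is a minimal, self-contained illustration (not the authors' code) of a meta Thompson sampling loop under a Gaussian hierarchical model, where each item's mean reward is shrunk toward a feature-based regression x_i^T gamma shared across items. All priors, noise levels, and variable names are assumptions made for illustration only.

```python
# Hedged sketch: meta Thompson sampling with a Gaussian hierarchical model.
# Item i has unknown mean reward theta_i ~ N(x_i^T gamma, sigma1^2), with a
# shared meta-parameter gamma learned across items from their features x_i.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic problem: K items with d-dimensional features (assumed setup).
K, d = 20, 5
X = rng.normal(size=(K, d))                  # item features x_i
gamma_true = rng.normal(size=d)              # true meta-parameter
sigma1 = 0.3                                 # item-level deviation
sigma = 0.5                                  # observation noise
theta_true = X @ gamma_true + sigma1 * rng.normal(size=K)

# Prior on gamma: N(0, tau^2 I).
tau = 1.0
prior_cov_gamma = tau ** 2 * np.eye(d)

# Sufficient statistics per item: pull counts and reward sums.
counts = np.zeros(K)
sums = np.zeros(K)

def posterior_sample_theta():
    """Draw (gamma, theta) from the joint posterior given the data so far.

    Everything is Gaussian-conjugate: first sample gamma from its marginal
    posterior, then each theta_i given gamma and that item's observations.
    """
    # Marginal variance of each item's average reward around x_i^T gamma.
    var_item = sigma1 ** 2 + sigma ** 2 / np.maximum(counts, 1)
    observed = counts > 0
    if observed.any():
        Xo = X[observed]
        y = sums[observed] / counts[observed]
        prec = np.linalg.inv(prior_cov_gamma) + Xo.T @ (Xo / var_item[observed, None])
        cov_gamma = np.linalg.inv(prec)
        mean_gamma = cov_gamma @ (Xo.T @ (y / var_item[observed]))
    else:
        cov_gamma, mean_gamma = prior_cov_gamma, np.zeros(d)
    gamma = rng.multivariate_normal(mean_gamma, cov_gamma)

    # theta_i | gamma, data: Gaussian with precision-weighted mean.
    prior_mean = X @ gamma
    post_prec = 1.0 / sigma1 ** 2 + counts / sigma ** 2
    post_mean = (prior_mean / sigma1 ** 2 + sums / sigma ** 2) / post_prec
    return post_mean + rng.normal(size=K) / np.sqrt(post_prec)

# Meta Thompson sampling loop.
T, total_regret = 2000, 0.0
for t in range(T):
    theta_sample = posterior_sample_theta()
    a = int(np.argmax(theta_sample))         # act greedily on the sample
    r = theta_true[a] + sigma * rng.normal()
    counts[a] += 1
    sums[a] += r
    total_regret += theta_true.max() - theta_true[a]

print(f"cumulative regret after {T} rounds: {total_regret:.1f}")
```

The shrinkage toward x_i^T gamma is what lets sparsely observed items borrow strength from the rest of the catalog; the paper's framework generalizes this idea beyond the Gaussian case assumed here.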
Cite
Text
Wan et al. "Towards Scalable and Robust Structured Bandits: A Meta-Learning Framework." Artificial Intelligence and Statistics, 2023.

Markdown

[Wan et al. "Towards Scalable and Robust Structured Bandits: A Meta-Learning Framework." Artificial Intelligence and Statistics, 2023.](https://mlanthology.org/aistats/2023/wan2023aistats-scalable/)

BibTeX
@inproceedings{wan2023aistats-scalable,
  title     = {{Towards Scalable and Robust Structured Bandits: A Meta-Learning Framework}},
  author    = {Wan, Runzhe and Ge, Lin and Song, Rui},
  booktitle = {Artificial Intelligence and Statistics},
  year      = {2023},
  pages     = {1144--1173},
  volume    = {206},
  url       = {https://mlanthology.org/aistats/2023/wan2023aistats-scalable/}
}