A Practical Semi-Parametric Contextual Bandit
Abstract
Classic multi-armed bandit algorithms are inefficient for a large number of arms. On the other hand, contextual bandit algorithms are more efficient, but they suffer from a large regret due to the bias of reward estimation with finite dimensional features. Although recent studies proposed semi-parametric bandits to overcome these defects, they assume arms' features are constant over time. However, this assumption rarely holds in practice, since real-world problems often involve underlying processes that are dynamically evolving over time especially for the special promotions like Singles' Day sales. In this paper, we formulate a novel Semi-Parametric Contextual Bandit Problem to relax this assumption. For this problem, a novel Two-Steps Upper-Confidence Bound framework, called Semi-Parametric UCB (SPUCB), is presented. It can be flexibly applied to linear parametric function problem with a satisfied gap-free bound on the n-step regret. Moreover, to make our method more practical in online system, an optimization is proposed for dealing with high dimensional features of a linear function. Extensive experiments on synthetic data as well as a real dataset from one of the largest e-commercial platforms demonstrate the superior performance of our algorithm.
Cite
Text
Peng et al. "A Practical Semi-Parametric Contextual Bandit." International Joint Conference on Artificial Intelligence, 2019. doi:10.24963/IJCAI.2019/450Markdown
[Peng et al. "A Practical Semi-Parametric Contextual Bandit." International Joint Conference on Artificial Intelligence, 2019.](https://mlanthology.org/ijcai/2019/peng2019ijcai-practical/) doi:10.24963/IJCAI.2019/450BibTeX
@inproceedings{peng2019ijcai-practical,
title = {{A Practical Semi-Parametric Contextual Bandit}},
author = {Peng, Yi and Xie, Miao and Liu, Jiahao and Meng, Xuying and Li, Nan and Yang, Cheng and Yao, Tao and Jin, Rong},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2019},
pages = {3246-3252},
doi = {10.24963/IJCAI.2019/450},
url = {https://mlanthology.org/ijcai/2019/peng2019ijcai-practical/}
}