When Are Linear Stochastic Bandits Attackable?

Abstract

We study adversarial attacks on linear stochastic bandits: by manipulating the rewards, an adversary aims to control the behavior of the bandit algorithm. Perhaps surprisingly, we first show that some attack goals can never be achieved. This is in sharp contrast to context-free stochastic bandits, and is intrinsically due to the correlation among arms in linear stochastic bandits. Motivated by this finding, this paper studies the attackability of a $k$-armed linear bandit environment. We first provide a complete necessary and sufficient characterization of attackability based on the geometry of the arms’ context vectors. We then propose a two-stage attack method against LinUCB and Robust Phase Elimination. The method first determines whether the given environment is attackable; if so, it poisons the rewards to force the algorithm to pull a target arm a linear number of times using only a sublinear cost. Numerical experiments further validate the effectiveness and cost-efficiency of the proposed attack method.
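The sketch below is a minimal, illustrative simulation of the attack setting described above, not the paper's certified two-stage method: a toy LinUCB learner faces an adversary that shifts observed rewards so a chosen target arm looks optimal. The arm features, `theta_star`, noise level, and the naive poisoning rule are all assumptions made for illustration. Because of the correlation among arms' context vectors, such a naive rule can fail in the linear setting, which is exactly the attackability question the paper characterizes before deciding how (or whether) to attack.

```python
# Minimal sketch (illustrative only; not the paper's two-stage attack).
import numpy as np

rng = np.random.default_rng(0)
d, k, T, lam, alpha = 3, 5, 5000, 1.0, 1.0
arms = rng.normal(size=(k, d))
arms /= np.linalg.norm(arms, axis=1, keepdims=True)   # unit-norm context vectors
theta_star = rng.normal(size=d)                        # unknown true parameter
target = int(np.argmin(arms @ theta_star))             # attack the worst arm

A = lam * np.eye(d)                                    # ridge covariance matrix
b = np.zeros(d)
pulls = np.zeros(k, dtype=int)
attack_cost = 0.0

for t in range(T):
    theta_hat = np.linalg.solve(A, b)                  # ridge-regression estimate
    width = np.sqrt(np.einsum("ij,jk,ik->i", arms, np.linalg.inv(A), arms))
    a = int(np.argmax(arms @ theta_hat + alpha * width))  # LinUCB arm choice
    reward = arms[a] @ theta_star + 0.1 * rng.normal()
    # Naive poisoning rule (assumed for illustration): push non-target rewards
    # down so the target arm's estimate dominates. The paper instead first
    # checks a geometric attackability condition on the arms' context vectors.
    if a != target:
        poisoned = min(reward, arms[target] @ theta_star - 0.5)
        attack_cost += abs(reward - poisoned)
        reward = poisoned
    A += np.outer(arms[a], arms[a])
    b += reward * arms[a]
    pulls[a] += 1

print(f"target-arm pulls: {pulls[target]}/{T}, attack cost: {attack_cost:.1f}")
```

In attackable environments, a successful attack of this flavor drives the target-arm pull count to grow linearly in `T` while the accumulated `attack_cost` stays sublinear; the printout above gives a rough sense of both quantities for this toy instance.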

Cite

Text

Wang et al. "When Are Linear Stochastic Bandits Attackable?." International Conference on Machine Learning, 2022.

Markdown

[Wang et al. "When Are Linear Stochastic Bandits Attackable?." International Conference on Machine Learning, 2022.](https://mlanthology.org/icml/2022/wang2022icml-linear/)

BibTeX

@inproceedings{wang2022icml-linear,
  title     = {{When Are Linear Stochastic Bandits Attackable?}},
  author    = {Wang, Huazheng and Xu, Haifeng and Wang, Hongning},
  booktitle = {International Conference on Machine Learning},
  year      = {2022},
  pages     = {23254--23273},
  volume    = {162},
  url       = {https://mlanthology.org/icml/2022/wang2022icml-linear/}
}