Influential Bandits: Pulling an Arm May Change the Environment

Abstract

While classical formulations of multi-armed bandit problems assume that each arm's reward is independent and stationary, real-world applications often involve non-stationary environments and interdependencies between arms. In particular, selecting one arm may influence the future rewards of other arms, a scenario not adequately captured by existing models such as rotting bandits or restless bandits. To address this limitation, we propose the influential bandit problem, which models inter-arm interactions through an unknown, symmetric, positive semi-definite interaction matrix that governs the dynamics of arm losses. We formally define this problem and establish two regret lower bounds: a superlinear $\Omega(T^2 / \log^2 T)$ bound for the standard LCB algorithm (the loss-minimization version of UCB) and an algorithm-independent $\Omega(T)$ bound. Together, these bounds highlight the inherent difficulty of the setting. We then introduce a new algorithm based on a lower confidence bound (LCB) estimator tailored to the structure of the loss dynamics. Under mild assumptions, our algorithm achieves a regret of $O(KT \log T)$, which is nearly optimal in terms of its dependence on the time horizon. The algorithm is simple to implement and computationally efficient. Empirical evaluations on both synthetic and real-world datasets demonstrate the presence of inter-arm influence and confirm the superior performance of our method compared to conventional bandit algorithms.
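
To make the setting concrete, the following is a minimal Python sketch of a toy environment in which pulling an arm changes the losses of all arms through a symmetric positive semi-definite matrix, together with the standard LCB baseline mentioned above. The abstract does not state the exact loss dynamics, so the additive update used here (each pull of arm $a$ adds column $a$ of the matrix to the mean-loss vector) is an assumption made for illustration. The names `make_psd_matrix`, `InfluentialEnv`, and `standard_lcb` are hypothetical, and the sketch is not the algorithm proposed in the paper.

```python
import math
import random


def make_psd_matrix(k, rng):
    """Build a random symmetric PSD k x k matrix as B^T B (illustrative choice)."""
    b = [[rng.gauss(0.0, 1.0) for _ in range(k)] for _ in range(k)]
    return [[sum(b[r][i] * b[r][j] for r in range(k)) for j in range(k)]
            for i in range(k)]


class InfluentialEnv:
    """Toy environment with ASSUMED dynamics: losses <- losses + A[:, arm]."""

    def __init__(self, A, init_losses, noise_std=0.1, seed=0):
        self.A = A
        self.losses = list(init_losses)  # current mean loss of each arm
        self.noise_std = noise_std
        self.rng = random.Random(seed)

    def pull(self, arm):
        # Observe the pulled arm's current mean loss plus Gaussian noise.
        observed = self.losses[arm] + self.rng.gauss(0.0, self.noise_std)
        # Assumption: the pull shifts every arm's mean loss by column `arm` of A.
        for i in range(len(self.losses)):
            self.losses[i] += self.A[i][arm]
        return observed


def standard_lcb(env, k, horizon):
    """Classical LCB baseline: pull the arm minimizing (empirical mean - bonus)."""
    counts = [0] * k
    sums = [0.0] * k
    total_loss = 0.0
    for t in range(horizon):
        if t < k:
            arm = t  # pull each arm once to initialize the estimates
        else:
            arm = min(
                range(k),
                key=lambda i: sums[i] / counts[i]
                - math.sqrt(2.0 * math.log(t + 1) / counts[i]),
            )
        loss = env.pull(arm)
        counts[arm] += 1
        sums[arm] += loss
        total_loss += loss
    return total_loss, counts


if __name__ == "__main__":
    rng = random.Random(0)
    K, T = 3, 200
    A = make_psd_matrix(K, rng)
    env = InfluentialEnv(A, init_losses=[0.0] * K, seed=1)
    total, counts = standard_lcb(env, K, T)
    print("cumulative observed loss:", total)
    print("pull counts:", counts)
```

Because the baseline averages all past observations of an arm, its estimates ignore how earlier pulls have shifted the losses. This kind of mismatch is what an estimator tailored to the loss dynamics is meant to address.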

Cite

Text

Sato and Ito. "Influential Bandits: Pulling an Arm May Change the Environment." Transactions on Machine Learning Research, 2025.

Markdown

[Sato and Ito. "Influential Bandits: Pulling an Arm May Change the Environment." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/sato2025tmlr-influential/)

BibTeX

@article{sato2025tmlr-influential,
  title     = {{Influential Bandits: Pulling an Arm May Change the Environment}},
  author    = {Sato, Ryoma and Ito, Shinji},
  journal   = {Transactions on Machine Learning Research},
  year      = {2025},
  url       = {https://mlanthology.org/tmlr/2025/sato2025tmlr-influential/}
}