Near-Optimal Algorithm for Non-Stationary Kernelized Bandits
Abstract
This paper studies the non-stationary kernelized bandit (KB) problem, also called time-varying Bayesian optimization, in which one seeks to minimize regret under an unknown reward function that varies over time. In particular, we aim for a near-optimal algorithm whose regret upper bound matches the regret lower bound. To this end, we prove the first algorithm-independent regret lower bound for non-stationary KB with squared exponential and Matérn kernels, which reveals that an existing optimization-based KB algorithm, with a slight modification, is near-optimal. However, this existing algorithm suffers from feasibility issues due to its huge computational cost. We therefore propose a novel near-optimal algorithm, restarting phased elimination with random permutation (R-PERP), which bypasses this computational cost. A key technical point is a simple random-permutation procedure over the query candidates, which enables us to derive a novel, tighter confidence bound tailored to non-stationary problems.
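The abstract combines three ingredients: periodic restarts that discard stale data, phased elimination of suboptimal candidates via confidence bounds, and random permutation of the query order within each phase. The sketch below illustrates that combination on a finite candidate set with a squared exponential kernel. It is only a rough illustration based on the abstract, not the paper's actual algorithm: the block length, confidence width `beta`, elimination rule, and helper functions (`rbf`, `gp_posterior`, `r_perp_sketch`) are all assumptions chosen for readability.

```python
import numpy as np

def rbf(x, y, ls=0.2):
    """Squared exponential kernel on 1-D inputs."""
    d2 = (x[:, None] - y[None, :]) ** 2
    return np.exp(-d2 / (2 * ls ** 2))

def gp_posterior(x_obs, y_obs, x_cand, noise=0.1):
    """GP posterior mean and std at x_cand given noisy observations."""
    K = rbf(x_obs, x_obs) + noise ** 2 * np.eye(len(x_obs))
    Ks = rbf(x_cand, x_obs)
    mu = Ks @ np.linalg.solve(K, y_obs)
    cov = rbf(x_cand, x_cand) - Ks @ np.linalg.solve(K, Ks.T)
    return mu, np.sqrt(np.clip(np.diag(cov), 1e-12, None))

def r_perp_sketch(f, x_cand, horizon, block_len, beta=2.0, seed=0):
    """Illustrative restart + phased-elimination + random-permutation loop.

    f(x, t) is the time-varying reward; block_len controls restarts.
    """
    rng = np.random.default_rng(seed)
    picks, t = [], 0
    while t < horizon:
        # Restart: a fresh block forgets old data (handles non-stationarity).
        active = np.arange(len(x_cand))
        x_obs, y_obs = [], []
        block_end = min(t + block_len, horizon)
        while t < block_end:
            # Query the active candidates in random-permutation order.
            for i in rng.permutation(active):
                if t >= block_end:
                    break
                x_obs.append(x_cand[i])
                y_obs.append(f(x_cand[i], t))
                picks.append(x_cand[i])
                t += 1
            # Phased elimination: drop arms whose UCB falls below the
            # best LCB (the empirical maximizer always survives).
            mu, sd = gp_posterior(np.array(x_obs), np.array(y_obs),
                                  x_cand[active])
            active = active[mu + beta * sd >= np.max(mu - beta * sd)]
    return np.array(picks)
```

A typical usage, with a slowly drifting quadratic reward: `r_perp_sketch(lambda x, t: -(x - 0.5) ** 2 + 0.01 * np.sin(t), np.linspace(0, 1, 20), horizon=60, block_len=30)` returns the 60 queried points, which concentrate on the surviving candidates as elimination proceeds.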
Cite
Text
Iwazaki and Takeno. "Near-Optimal Algorithm for Non-Stationary Kernelized Bandits." Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, 2025.
Markdown
[Iwazaki and Takeno. "Near-Optimal Algorithm for Non-Stationary Kernelized Bandits." Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, 2025.](https://mlanthology.org/aistats/2025/iwazaki2025aistats-nearoptimal/)
BibTeX
@inproceedings{iwazaki2025aistats-nearoptimal,
title = {{Near-Optimal Algorithm for Non-Stationary Kernelized Bandits}},
author = {Iwazaki, Shogo and Takeno, Shion},
booktitle = {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
year = {2025},
pages = {406-414},
volume = {258},
url = {https://mlanthology.org/aistats/2025/iwazaki2025aistats-nearoptimal/}
}