Time-Varying Gaussian Process Bandit Optimization with Experts: No-Regret in Logarithmically-Many Side Queries
Abstract
We study a time-varying Bayesian optimization problem with bandit feedback, where the reward function belongs to a Reproducing Kernel Hilbert Space (RKHS). We approach the problem via an upper-confidence bound Gaussian Process algorithm, which has been proven to yield no-regret in the stationary case. The time-varying case is more challenging, and no-regret results are in general out of reach in the standard setting. We therefore ask how many additional observations must be requested from an expert to regain a no-regret property. To do so, we model the presence of past observations via an uncertainty injection procedure, and we reframe the problem as a heteroscedastic Gaussian Process regression. In addition, to achieve a no-regret result, we discard long-outdated observations and replace them with updated (possibly very noisy) ones obtained by querying an external expert. By leveraging and extending sparse inference to the heteroscedastic case, we secure a no-regret result in a challenging time-varying setting with only logarithmically-many side queries per time step. Our method demonstrates that minimal additional information suffices to counteract temporal drift, ensuring efficient optimization despite time variation.
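To make the ideas in the abstract concrete, here is a minimal sketch of a heteroscedastic GP posterior combined with an upper-confidence bound rule. It is an illustration under stated assumptions, not the authors' algorithm: the RBF kernel, the linear growth of noise variance with observation age (`injected_noise`), the hard age cutoff `max_age`, and the parameters `beta`, `base_var`, and `drift_var` are all hypothetical choices. The expert side queries and the sparse inference step are omitted.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2):
    # Squared-exponential kernel on 1-D inputs (assumed kernel choice).
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / lengthscale**2)

def heteroscedastic_posterior(x_obs, y_obs, noise_var, x_grid, lengthscale=0.2):
    # GP posterior where each observation has its own noise variance.
    if len(x_obs) == 0:
        # No data: fall back to the zero-mean, unit-variance prior.
        return np.zeros(len(x_grid)), np.ones(len(x_grid))
    K = rbf_kernel(x_obs, x_obs, lengthscale) + np.diag(noise_var)
    K_s = rbf_kernel(x_grid, x_obs, lengthscale)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    var = 1.0 - np.sum(v**2, axis=0)  # prior variance of the RBF kernel is 1
    return mean, np.maximum(var, 0.0)

def injected_noise(ages, base_var=0.01, drift_var=0.05):
    # Hypothetical uncertainty injection: older observations are treated
    # as noisier, with variance growing linearly in their age.
    return base_var + drift_var * ages

def ucb_choice(x_obs, y_obs, t_obs, t_now, x_grid, beta=2.0, max_age=10):
    # Discard observations older than max_age, then maximize mean + sqrt(beta * var).
    ages = t_now - t_obs
    keep = ages <= max_age
    mean, var = heteroscedastic_posterior(
        x_obs[keep], y_obs[keep], injected_noise(ages[keep]), x_grid
    )
    return x_grid[np.argmax(mean + np.sqrt(beta * var))]
```

In this sketch, an observation's influence on the posterior shrinks as it ages, so the confidence bound widens around points that have not been sampled recently.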
Cite
Text
Mauduit et al. "Time-Varying Gaussian Process Bandit Optimization with Experts: No-Regret in Logarithmically-Many Side Queries." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2025. doi:10.1007/978-3-032-06096-9_10
Markdown
[Mauduit et al. "Time-Varying Gaussian Process Bandit Optimization with Experts: No-Regret in Logarithmically-Many Side Queries." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2025.](https://mlanthology.org/ecmlpkdd/2025/mauduit2025ecmlpkdd-timevarying/) doi:10.1007/978-3-032-06096-9_10
BibTeX
@inproceedings{mauduit2025ecmlpkdd-timevarying,
title = {{Time-Varying Gaussian Process Bandit Optimization with Experts: No-Regret in Logarithmically-Many Side Queries}},
author = {Mauduit, Eliabelle and Berthier, Eloïse and Simonetto, Andrea},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2025},
pages = {164-182},
doi = {10.1007/978-3-032-06096-9_10},
url = {https://mlanthology.org/ecmlpkdd/2025/mauduit2025ecmlpkdd-timevarying/}
}