Safe Exploration in Reproducing Kernel Hilbert Spaces
Abstract
Popular safe Bayesian optimization (BO) algorithms successfully control safety-critical systems in unknown environments. However, most algorithms require smoothness assumptions, which are encoded by a norm in a reproducing kernel Hilbert space (RKHS). The RKHS is a potentially infinite-dimensional space, and it remains unclear how to reliably obtain the RKHS norm of an unknown function. In this work, we propose a safe BO algorithm capable of estimating the RKHS norm from data. We provide statistical guarantees on the RKHS norm estimation, derive novel confidence intervals for, and prove safety of, the resulting safe BO algorithm. We apply our algorithm to safely optimize reinforcement learning policies on physics simulators and on a real Furuta pendulum, demonstrating improved performance, safety, and scalability compared to the state of the art.
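To make the idea of obtaining an RKHS norm from data concrete, here is a minimal sketch. It is not the estimator proposed in the paper, whose details the abstract does not give. It relies only on a standard fact: the minimum-norm kernel interpolant of samples (x_i, y_i) has squared RKHS norm y^T K^{-1} y, which lower-bounds the RKHS norm of any function in the space that passes through those samples. The squared-exponential kernel, the lengthscale, the jitter term, and all function names are assumptions made for illustration.

```python
import math

def rbf_kernel(x, y, lengthscale=1.0):
    """Squared-exponential kernel on scalars (assumed kernel choice)."""
    return math.exp(-((x - y) ** 2) / (2.0 * lengthscale ** 2))

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z

def interpolant_rkhs_norm(xs, ys, lengthscale=1.0, jitter=1e-10):
    """RKHS norm of the minimum-norm interpolant: sqrt(y^T K^{-1} y)."""
    K = [[rbf_kernel(a, b, lengthscale) + (jitter if i == j else 0.0)
          for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    alpha = solve(K, ys)
    return math.sqrt(max(0.0, sum(a * y for a, y in zip(alpha, ys))))

# Hypothetical test function f = 2 * k(., 0.5); its RKHS norm is exactly 2.
def f(x):
    return 2.0 * rbf_kernel(x, 0.5)

coarse = interpolant_rkhs_norm([0.0, 1.0], [f(0.0), f(1.0)])
fine = interpolant_rkhs_norm([0.0, 0.5, 1.0], [f(0.0), f(0.5), f(1.0)])
print(coarse, fine)
```

The lower bound can only grow as samples are added, so with finitely many samples it may underestimate the true norm. That gap is why a data-driven estimate needs statistical guarantees of the kind the abstract describes before it can be used in safety-critical confidence intervals.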
Cite
Text
Tokmak et al. "Safe Exploration in Reproducing Kernel Hilbert Spaces." ICML 2024 Workshops: ARLET, 2024.
Markdown
[Tokmak et al. "Safe Exploration in Reproducing Kernel Hilbert Spaces." ICML 2024 Workshops: ARLET, 2024.](https://mlanthology.org/icmlw/2024/tokmak2024icmlw-safe/)
BibTeX
@inproceedings{tokmak2024icmlw-safe,
title = {{Safe Exploration in Reproducing Kernel Hilbert Spaces}},
author = {Tokmak, Abdullah and Krishnan, Kiran G. and Schön, Thomas B. and Baumann, Dominik},
booktitle = {ICML 2024 Workshops: ARLET},
year = {2024},
url = {https://mlanthology.org/icmlw/2024/tokmak2024icmlw-safe/}
}