Gray-Box Gaussian Processes for Automated Reinforcement Learning

Abstract

Despite having achieved spectacular milestones in an array of important real-world applications, most Reinforcement Learning (RL) methods remain very brittle with respect to their hyperparameters. Notwithstanding the crucial importance of hyperparameter settings for training state-of-the-art agents, the task of hyperparameter optimization (HPO) in RL is understudied. In this paper, we propose a novel gray-box Bayesian Optimization technique for HPO in RL, which enriches Gaussian Processes with reward-curve estimations based on generalized logistic functions. In a large-scale experimental protocol comprising 5 popular RL methods (DDPG, A2C, PPO, SAC, TD3), dozens of environments (Atari, MuJoCo), and 7 HPO baselines, we demonstrate that our method significantly outperforms current HPO practices in RL.
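
The abstract names the reward-curve model but does not spell it out; as background, the generalized logistic (Richards) family it refers to is a standard parametric form for monotone, saturating learning curves. Below is a minimal, hypothetical Python sketch (not the authors' code) of fitting such a curve to a partially observed reward curve and extrapolating performance at the full training budget, which is the kind of gray-box signal a GP surrogate could condition on. The function names (`generalized_logistic`, `extrapolate_reward`), initialization heuristics, and all constants are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

def generalized_logistic(t, A, K, B, M, nu):
    """Generalized logistic (Richards) curve.
    A: lower asymptote, K: upper asymptote, B: growth rate,
    M: inflection location, nu: asymmetry of the approach to K."""
    return A + (K - A) / (1.0 + np.exp(-B * (t - M))) ** (1.0 / nu)

def extrapolate_reward(steps, rewards, horizon):
    """Fit a Richards curve to a partial reward curve and predict
    the reward at the full training horizon (least-squares fit)."""
    # Crude but scale-aware initialization from the observed data.
    p0 = [rewards.min(), rewards.max(),
          1.0 / (steps.ptp() + 1e-9), steps.mean(), 1.0]
    params, _ = curve_fit(generalized_logistic, steps, rewards,
                          p0=p0, maxfev=10_000)
    return generalized_logistic(horizon, *params)

# Toy usage: a noisy saturating reward curve observed for the first
# 30% of a 1M-step budget, extrapolated to the final step count.
rng = np.random.default_rng(0)
t = np.linspace(0, 3e5, 60)
r = generalized_logistic(t, -100, 200, 3e-5, 1e5, 1.0) + rng.normal(0, 5, t.size)
print(extrapolate_reward(t, r, horizon=1e6))
```

In a gray-box HPO loop, such extrapolations from partial training runs would augment the observations the Gaussian Process is fit to, letting the optimizer discard poor hyperparameter configurations before they consume the full training budget.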

Cite

Text

Shala et al. "Gray-Box Gaussian Processes for Automated Reinforcement Learning." International Conference on Learning Representations, 2023.

Markdown

[Shala et al. "Gray-Box Gaussian Processes for Automated Reinforcement Learning." International Conference on Learning Representations, 2023.](https://mlanthology.org/iclr/2023/shala2023iclr-graybox/)

BibTeX

@inproceedings{shala2023iclr-graybox,
  title     = {{Gray-Box Gaussian Processes for Automated Reinforcement Learning}},
  author    = {Shala, Gresa and Biedenkapp, André and Hutter, Frank and Grabocka, Josif},
  booktitle = {International Conference on Learning Representations},
  year      = {2023},
  url       = {https://mlanthology.org/iclr/2023/shala2023iclr-graybox/}
}