ULTHO: Ultra-Lightweight yet Efficient Hyperparameter Optimization in Deep Reinforcement Learning

Abstract

Hyperparameter optimization (HPO) is a billion-dollar problem in machine learning that significantly affects training efficiency and model performance. Achieving efficient and robust HPO in deep reinforcement learning (RL) remains challenging, however, because of the high non-stationarity and computational cost of RL training. Existing approaches try to address this by adapting common HPO techniques (e.g., population-based training or Bayesian optimization) to the RL setting, but they remain sample-inefficient and computationally expensive, which limits their use across a wide range of applications. In this paper, we propose ULTHO, an ultra-lightweight yet powerful framework for fast HPO in deep RL within single runs. Specifically, we formulate the HPO process as a multi-armed bandit with clustered arms (MABC) and link it directly to long-term return optimization. ULTHO also provides a quantitative, statistical perspective for filtering hyperparameters (HPs) efficiently. We test ULTHO on benchmarks including ALE, Procgen, MiniGrid, and PyBullet. Extensive experiments demonstrate that ULTHO achieves superior performance with a simple architecture, contributing to the development of advanced and automated RL systems.
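To make the bandit-with-clustered-arms idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes each hyperparameter forms one cluster, each candidate value is an arm, and both levels are chosen with a plain UCB1 rule; the abstract specifies none of these details. The names `ClusteredUCB`, `toy_return`, and `run_demo`, the candidate values, and the exploration constant `c` are all hypothetical.

```python
import math


class ClusteredUCB:
    """Two-level UCB1 sketch: pick a cluster (a hyperparameter), then an arm (a value)."""

    def __init__(self, clusters, c=1.0):
        # clusters: dict mapping a hyperparameter name to its candidate values (assumed layout)
        self.clusters = {name: list(values) for name, values in clusters.items()}
        self.c = c  # exploration constant (assumed, not taken from the paper)
        self.t = 0  # total number of updates so far
        self.cluster_n = {name: 0 for name in self.clusters}
        self.cluster_mean = {name: 0.0 for name in self.clusters}
        self.arm_n = {name: [0] * len(v) for name, v in self.clusters.items()}
        self.arm_mean = {name: [0.0] * len(v) for name, v in self.clusters.items()}

    def _ucb(self, mean, n, total):
        # Untried options get priority; otherwise mean plus the UCB1 bonus.
        if n == 0:
            return math.inf
        return mean + self.c * math.sqrt(math.log(total) / n)

    def select(self):
        # Level 1: choose the cluster with the highest upper confidence bound.
        total = max(self.t, 1)
        name = max(
            self.clusters,
            key=lambda k: self._ucb(self.cluster_mean[k], self.cluster_n[k], total),
        )
        # Level 2: choose an arm inside that cluster, using the cluster's own pull count.
        pulls = max(self.cluster_n[name], 1)
        idx = max(
            range(len(self.clusters[name])),
            key=lambda i: self._ucb(self.arm_mean[name][i], self.arm_n[name][i], pulls),
        )
        return name, idx, self.clusters[name][idx]

    def update(self, name, idx, reward):
        # Incremental mean updates at both the cluster level and the arm level.
        self.t += 1
        self.cluster_n[name] += 1
        self.cluster_mean[name] += (reward - self.cluster_mean[name]) / self.cluster_n[name]
        self.arm_n[name][idx] += 1
        self.arm_mean[name][idx] += (reward - self.arm_mean[name][idx]) / self.arm_n[name][idx]


def toy_return(name, idx):
    # Hypothetical stand-in for the return observed after one training segment.
    return 1.0 if (name, idx) == ("learning_rate", 1) else 0.0


def run_demo(rounds=500):
    # Candidate values are illustrative only.
    bandit = ClusteredUCB({"learning_rate": [1e-4, 3e-4, 1e-3], "clip_range": [0.1, 0.2]})
    for _ in range(rounds):
        name, idx, _value = bandit.select()
        bandit.update(name, idx, toy_return(name, idx))
    return bandit
```

In an actual single-run setting, `toy_return` would be replaced by a return estimate gathered after training for a while with the selected value, so the bandit adapts hyperparameters during training rather than across separate runs.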

Cite

Text

Yuan et al. "ULTHO: Ultra-Lightweight yet Efficient Hyperparameter Optimization in Deep Reinforcement Learning." International Conference on Computer Vision, 2025.

Markdown

[Yuan et al. "ULTHO: Ultra-Lightweight yet Efficient Hyperparameter Optimization in Deep Reinforcement Learning." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/yuan2025iccv-ultho/)

BibTeX

@inproceedings{yuan2025iccv-ultho,
  title     = {{ULTHO: Ultra-Lightweight yet Efficient Hyperparameter Optimization in Deep Reinforcement Learning}},
  author    = {Yuan, Mingqi and Li, Bo and Jin, Xin and Zeng, Wenjun},
  booktitle = {International Conference on Computer Vision},
  year      = {2025},
  pages     = {2620--2630},
  url       = {https://mlanthology.org/iccv/2025/yuan2025iccv-ultho/}
}