Reinforced In-Context Black-Box Optimization
Abstract
Black-Box Optimization (BBO) has found successful applications in many fields of science and engineering. Recently, there has been a growing interest in meta-learning particular components of BBO algorithms to speed up optimization and get rid of tedious hand-crafted heuristics. As an extension, learning the entire algorithm from data requires the least labor from experts and can provide the most flexibility. In this paper, we propose RIBBO, a method to reinforce-learn a BBO algorithm from offline data in an end-to-end fashion. RIBBO employs expressive sequence models to learn the optimization histories produced by multiple behavior algorithms and tasks, leveraging the in-context learning ability of large models to extract task information and make decisions accordingly. Central to our method is to augment the optimization histories with regret-to-go tokens, which are designed to represent the performance of an algorithm based on cumulative regret over the future part of the histories. The integration of regret-to-go tokens enables RIBBO to automatically generate sequences of query points that are positively correlated to the user-desired regret, verified by its universally good empirical performance on diverse problems, including BBO benchmark, hyper-parameter optimization, and robot control problems.
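The regret-to-go idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a maximization task, a known optimum `y_star`, and that the token at step t is the sum of the instantaneous regrets `y_star - y_i` over all later steps i > t. The function name `regret_to_go` and the exact indexing are hypothetical choices made for illustration.

```python
from typing import List


def regret_to_go(values: List[float], y_star: float) -> List[float]:
    """Return, for each step t, the cumulative regret over the steps after t.

    Assumptions (not taken from the paper): the task is maximization,
    `y_star` is the known optimal value, and the regret at step i is
    `y_star - values[i]`.
    """
    rtg = [0.0] * len(values)
    running = 0.0  # regret accumulated over steps strictly after t
    # Walk the history backwards so each suffix sum is built in O(1).
    for t in range(len(values) - 1, -1, -1):
        rtg[t] = running
        running += y_star - values[t]
    return rtg


# A three-step history whose observed values approach the optimum 5.0.
history = [1.0, 3.0, 4.0]
tokens = regret_to_go(history, y_star=5.0)
print(tokens)  # [3.0, 1.0, 0.0]
```

Under these assumptions, a history that reaches good values quickly has small regret-to-go tokens early on, so conditioning a sequence model on a small token asks it to produce query points resembling those of a well-performing algorithm.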
Cite
Text
Song et al. "Reinforced In-Context Black-Box Optimization." International Joint Conference on Artificial Intelligence, 2025. doi:10.24963/IJCAI.2025/994
Markdown
[Song et al. "Reinforced In-Context Black-Box Optimization." International Joint Conference on Artificial Intelligence, 2025.](https://mlanthology.org/ijcai/2025/song2025ijcai-reinforced/) doi:10.24963/IJCAI.2025/994
BibTeX
@inproceedings{song2025ijcai-reinforced,
title = {{Reinforced In-Context Black-Box Optimization}},
author = {Song, Lei and Gao, Chen-Xiao and Xue, Ke and Wu, Chenyang and Li, Dong and Hao, Jianye and Zhang, Zongzhang and Qian, Chao},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2025},
pages = {8939--8947},
doi = {10.24963/IJCAI.2025/994},
url = {https://mlanthology.org/ijcai/2025/song2025ijcai-reinforced/}
}