Reinforced In-Context Black-Box Optimization

Song, Lei; Gao, Chen-Xiao; Xue, Ke; Wu, Chenyang; Li, Dong; Hao, Jianye; Zhang, Zongzhang; Qian, Chao

doi:10.24963/IJCAI.2025/994

IJCAI 2025 pp. 8939-8947

Abstract

Black-Box Optimization (BBO) has found successful applications in many fields of science and engineering. Recently, there has been a growing interest in meta-learning particular components of BBO algorithms to speed up optimization and get rid of tedious hand-crafted heuristics. As an extension, learning the entire algorithm from data requires the least labor from experts and can provide the most flexibility. In this paper, we propose RIBBO, a method to reinforce-learn a BBO algorithm from offline data in an end-to-end fashion. RIBBO employs expressive sequence models to learn the optimization histories produced by multiple behavior algorithms and tasks, leveraging the in-context learning ability of large models to extract task information and make decisions accordingly. Central to our method is to augment the optimization histories with regret-to-go tokens, which are designed to represent the performance of an algorithm based on cumulative regret over the future part of the histories. The integration of regret-to-go tokens enables RIBBO to automatically generate sequences of query points that are positively correlated to the user-desired regret, verified by its universally good empirical performance on diverse problems, including BBO benchmark, hyper-parameter optimization, and robot control problems.
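The abstract describes regret-to-go tokens as a measure of cumulative regret over the future part of an optimization history. The snippet below is a minimal sketch of that idea and not the authors' implementation. It assumes a maximization task with a known optimal value, defines per-step regret as `optimum - y_t`, and sums only over steps strictly after the current one; the function name `regret_to_go` and the exclusive indexing are illustrative assumptions.

```python
from typing import List


def regret_to_go(values: List[float], optimum: float) -> List[float]:
    """Return, for each step t, the summed regret of all later steps.

    Assumptions (not confirmed by the abstract): maximization, a known
    optimum, and a sum that excludes the current step.
    """
    rtg = [0.0] * len(values)
    running = 0.0  # regret accumulated over the steps after index t
    # Walk the history backwards so each token is filled in O(1).
    for t in range(len(values) - 1, -1, -1):
        rtg[t] = running
        running += optimum - values[t]
    return rtg


# Hypothetical history of observed function values with optimum 5.0.
history = [1.0, 3.0, 4.0]
tokens = regret_to_go(history, optimum=5.0)
print(tokens)  # [3.0, 1.0, 0.0]
```

Under these assumptions a history that approaches the optimum quickly has small tokens early on, so conditioning a sequence model on a small initial token corresponds to asking for a low-regret trajectory.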

Cite

Text

Song et al. "Reinforced In-Context Black-Box Optimization." International Joint Conference on Artificial Intelligence, 2025. doi:10.24963/IJCAI.2025/994

Markdown

[Song et al. "Reinforced In-Context Black-Box Optimization." International Joint Conference on Artificial Intelligence, 2025.](https://mlanthology.org/ijcai/2025/song2025ijcai-reinforced/) doi:10.24963/IJCAI.2025/994

BibTeX

@inproceedings{song2025ijcai-reinforced,
  title     = {{Reinforced In-Context Black-Box Optimization}},
  author    = {Song, Lei and Gao, Chen-Xiao and Xue, Ke and Wu, Chenyang and Li, Dong and Hao, Jianye and Zhang, Zongzhang and Qian, Chao},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {8939-8947},
  doi       = {10.24963/IJCAI.2025/994},
  url       = {https://mlanthology.org/ijcai/2025/song2025ijcai-reinforced/}
}