In-Context Compositional Q-Learning for Offline Reinforcement Learning
Abstract
Accurate estimation of the Q-function is a central challenge in offline reinforcement learning. Existing approaches often rely on a single shared global Q-function, which is inadequate for capturing the compositional structure of tasks that consist of diverse subtasks. We propose In-context Compositional Q-Learning (ICQL), an offline RL framework that formulates Q-learning as a contextual inference problem and uses linear Transformers to adaptively infer local Q-functions from retrieved transitions, without explicit subtask labels. Theoretically, we show that under two assumptions (linear approximability of the local Q-function and accurate inference of weights from the retrieved context), ICQL achieves a bounded approximation error for the Q-function and enables near-optimal policy extraction. Empirically, ICQL substantially improves performance in offline settings, achieving gains of up to 16.4% on kitchen tasks and up to 8.8% and 6.3% on MuJoCo and Adroit tasks, respectively. These results highlight the underexplored potential of in-context learning for robust and compositional value estimation and establish ICQL as a principled and effective framework for offline RL.
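To make the idea of inferring a local Q-function from retrieved transitions concrete, the following is a minimal sketch and not the authors' implementation. It assumes that retrieval is a Euclidean nearest-neighbour lookup over state-action features, and it swaps the linear Transformer for closed-form ridge regression as a stand-in for in-context weight inference. All names (`retrieve_context`, `infer_local_weights`, `local_q`) and the toy two-subtask data are hypothetical.

```python
import numpy as np

def retrieve_context(query, keys, k):
    """Indices of the k stored feature vectors closest to the query (Euclidean)."""
    dists = np.linalg.norm(keys - query, axis=1)
    return np.argsort(dists)[:k]

def infer_local_weights(phi_ctx, targets, ridge=1e-3):
    """Ridge regression on the context; an assumed stand-in for the
    weight inference that ICQL performs with a linear Transformer."""
    d = phi_ctx.shape[1]
    gram = phi_ctx.T @ phi_ctx + ridge * np.eye(d)
    return np.linalg.solve(gram, phi_ctx.T @ targets)

def local_q(query, keys, targets, k=16):
    """Local linear Q estimate: retrieve neighbours, fit weights, evaluate."""
    idx = retrieve_context(query, keys, k)
    w = infer_local_weights(keys[idx], targets[idx])
    return float(query @ w)

# Toy data (hypothetical): two "subtasks", each with its own linear Q-function.
rng = np.random.default_rng(0)
center = np.array([5.0, 5.0, 5.0])
w_a = np.array([1.0, 2.0, 3.0])    # weights governing subtask A
w_b = np.array([-1.0, 0.5, 2.0])   # weights governing subtask B
x_a = center + rng.normal(size=(50, 3))
x_b = -center + rng.normal(size=(50, 3))
keys = np.vstack([x_a, x_b])
targets = np.concatenate([x_a @ w_a, x_b @ w_b])

query = center.copy()              # a query that lies in subtask A
true_value = float(query @ w_a)
local_estimate = local_q(query, keys, targets, k=16)

# Baseline: one global linear fit over all transitions.
global_estimate = float(query @ infer_local_weights(keys, targets))
```

In this toy setting the retrieved context comes entirely from subtask A, so the local fit tracks that subtask's weights, whereas the single global fit has to compromise between the two subtasks. This is only meant to illustrate the motivation stated in the abstract; the paper's retrieval scheme, Transformer architecture, and training objective are not specified here.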
Cite
Text
Xu et al. "In-Context Compositional Q-Learning for Offline Reinforcement Learning." International Conference on Learning Representations, 2026.
Markdown
[Xu et al. "In-Context Compositional Q-Learning for Offline Reinforcement Learning." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/xu2026iclr-incontext/)
BibTeX
@inproceedings{xu2026iclr-incontext,
title = {{In-Context Compositional Q-Learning for Offline Reinforcement Learning}},
author = {Xu, Qiushui and Huang, Yu-Hao and Jiang, Yushu and Zheng, Wenliang and Song, Lei and Wang, Jinyu and Bian, Jiang},
booktitle = {International Conference on Learning Representations},
year = {2026},
url = {https://mlanthology.org/iclr/2026/xu2026iclr-incontext/}
}