Unveiling Markov Heads in Pretrained Language Models for Offline Reinforcement Learning
Abstract
Recently, incorporating knowledge from pretrained language models (PLMs) into decision transformers (DTs) has attracted significant attention in offline reinforcement learning (RL). These PLMs perform well in RL tasks, raising an intriguing question: what kind of knowledge from PLMs has been transferred to RL to achieve such good results? This work investigates this problem by analyzing each attention head quantitatively and identifies the Markov head, a crucial component present in the attention heads of PLMs. It produces extreme attention on the last input token and performs well only in short-term environments. Furthermore, we prove that this extreme attention cannot be eliminated by re-training the embedding layer or by fine-tuning. Inspired by our analysis, we propose a general method, GPT-DTMA, which equips a pretrained DT with Mixture of Attention (MoA) to enable adaptive learning and accommodate diverse attention requirements during fine-tuning. Extensive experiments demonstrate the effectiveness of GPT-DTMA: it achieves superior performance in short-term environments compared to baselines, significantly reduces the performance gap of PLMs in long-term scenarios, and the experimental results also validate our theorems.
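The abstract describes MoA as a way to let a pretrained DT adaptively balance pretrained attention against freshly learned attention. Below is a minimal, hypothetical sketch (not the authors' code) of one plausible reading: a gate mixes a frozen pretrained attention branch with a newly initialized one, plus a simple diagnostic for the extreme last-token attention attributed to the Markov head. All module names, the gating design, and the diagnostic are illustrative assumptions.

```python
# Hypothetical Mixture-of-Attention (MoA) sketch in PyTorch; names and design
# choices here are assumptions, not the paper's implementation.
import torch
import torch.nn as nn


class MixtureOfAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        # Attention branch standing in for pretrained PLM heads -- kept frozen.
        self.pretrained_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        for p in self.pretrained_attn.parameters():
            p.requires_grad = False
        # Randomly initialized attention branch, trained during fine-tuning.
        self.new_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Token-wise gate deciding how much weight each branch receives.
        self.gate = nn.Sequential(nn.Linear(d_model, 2), nn.Softmax(dim=-1))

    def forward(self, x: torch.Tensor, attn_mask=None):
        out_pre, w_pre = self.pretrained_attn(x, x, x, attn_mask=attn_mask)
        out_new, _ = self.new_attn(x, x, x, attn_mask=attn_mask)
        g = self.gate(x)  # (batch, seq, 2): per-token mixing coefficients
        mixed = g[..., :1] * out_pre + g[..., 1:] * out_new
        return mixed, w_pre


def last_token_attention(weights: torch.Tensor) -> torch.Tensor:
    """Average attention mass each query places on its most recent input position.

    A head whose queries concentrate nearly all mass here behaves like the
    'Markov head' described in the abstract (illustrative metric only).
    """
    # weights: (batch, tgt_len, src_len), averaged over heads by nn.MultiheadAttention
    diag = torch.diagonal(weights, dim1=-2, dim2=-1)
    return diag.mean()


if __name__ == "__main__":
    moa = MixtureOfAttention(d_model=64, n_heads=4)
    x = torch.randn(2, 10, 64)
    out, w = moa(x)
    print(out.shape, float(last_token_attention(w)))
```

The gate is the adaptive piece: in short-horizon tasks it can lean on the frozen, Markov-like pretrained branch, while in long-horizon tasks it can shift weight to the trainable branch.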
Cite
Text

Zhao et al. "Unveiling Markov Heads in Pretrained Language Models for Offline Reinforcement Learning." Proceedings of the 42nd International Conference on Machine Learning, 2025.

Markdown

[Zhao et al. "Unveiling Markov Heads in Pretrained Language Models for Offline Reinforcement Learning." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/zhao2025icml-unveiling/)

BibTeX
@inproceedings{zhao2025icml-unveiling,
  title     = {{Unveiling Markov Heads in Pretrained Language Models for Offline Reinforcement Learning}},
  author    = {Zhao, Wenhao and Xu, Qiushui and Xu, Linjie and Song, Lei and Wang, Jinyu and Zhou, Chunlai and Bian, Jiang},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  year      = {2025},
  pages     = {77807--77821},
  volume    = {267},
  url       = {https://mlanthology.org/icml/2025/zhao2025icml-unveiling/}
}