Pretraining a Shared Q-Network for Data-Efficient Offline Reinforcement Learning
Abstract
Offline reinforcement learning (RL) aims to learn a policy from a fixed dataset without additional environment interaction. However, effective offline policy learning often requires a large and diverse dataset to mitigate epistemic uncertainty. Collecting such data demands substantial online interaction, which is costly or infeasible in many real-world domains. Therefore, improving policy learning from limited offline data (i.e., achieving high data efficiency) is critical for practical offline RL. In this paper, we propose a simple yet effective plug-and-play pretraining framework that initializes the feature representation of a $Q$-network to enhance data efficiency in offline RL. Our approach employs a shared $Q$-network architecture trained in two stages: first, pretraining a backbone feature extractor with a transition prediction head; second, training a $Q$-network, which combines the pretrained backbone with a $Q$-value head, under *any* offline RL objective. Extensive experiments on the D4RL, Robomimic, V-D4RL, and ExoRL benchmarks show that our method substantially improves both performance and data efficiency across diverse datasets and domains. Remarkably, with only **10%** of the dataset, our approach outperforms standard offline RL baselines trained on the full data.
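The two-stage recipe above is concrete enough to sketch in code. Below is a minimal PyTorch sketch of the shared $Q$-network idea for vector observations (the paper also covers pixel benchmarks such as V-D4RL); the class and function names, the MLP sizes, and the MSE transition-prediction loss are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn


class SharedQNetwork(nn.Module):
    """Shared backbone with a transition-prediction head (stage 1)
    and a Q-value head (stage 2). Architecture details are assumptions."""

    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 256):
        super().__init__()
        # Backbone feature extractor shared by both training stages.
        self.backbone = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # Stage-1 head: predicts the next observation from (s, a) features.
        self.transition_head = nn.Linear(hidden, obs_dim)
        # Stage-2 head: maps the same features to a scalar Q-value.
        self.q_head = nn.Linear(hidden, 1)

    def features(self, obs: torch.Tensor, act: torch.Tensor) -> torch.Tensor:
        return self.backbone(torch.cat([obs, act], dim=-1))

    def predict_next_obs(self, obs, act):
        return self.transition_head(self.features(obs, act))

    def q_value(self, obs, act):
        return self.q_head(self.features(obs, act))


def pretrain_step(net, opt, batch):
    """Stage 1: fit the backbone via transition prediction (MSE assumed)."""
    pred = net.predict_next_obs(batch["obs"], batch["act"])
    loss = nn.functional.mse_loss(pred, batch["next_obs"])
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()


if __name__ == "__main__":
    net = SharedQNetwork(obs_dim=17, act_dim=6)
    opt = torch.optim.Adam(net.parameters(), lr=3e-4)
    batch = {
        "obs": torch.randn(32, 17),
        "act": torch.randn(32, 6),
        "next_obs": torch.randn(32, 17),
    }
    print("stage-1 loss:", pretrain_step(net, opt, batch))
    # Stage 2: keep the pretrained backbone and train net.q_value(obs, act)
    # under any offline RL objective (e.g., a TD loss with a behavior-
    # regularization or conservatism term).
```

Because stage 2 only swaps the head and the training objective, the pretrained backbone can be dropped into any offline RL learner (e.g., CQL, TD3+BC, or IQL), which is what makes the scheme plug-and-play.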
Cite
Text
Park et al. "Pretraining a Shared Q-Network for Data-Efficient Offline Reinforcement Learning." Advances in Neural Information Processing Systems, 2025.
Markdown
[Park et al. "Pretraining a Shared Q-Network for Data-Efficient Offline Reinforcement Learning." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/park2025neurips-pretraining/)
BibTeX
@inproceedings{park2025neurips-pretraining,
title = {{Pretraining a Shared Q-Network for Data-Efficient Offline Reinforcement Learning}},
author = {Park, Jongchan and Park, Mingyu and Lee, Donghwan},
booktitle = {Advances in Neural Information Processing Systems},
year = {2025},
url = {https://mlanthology.org/neurips/2025/park2025neurips-pretraining/}
}