Improving Generalization in Offline Reinforcement Learning via Latent Distribution Representation Learning
Abstract
Dealing with distribution shift is a significant challenge when building offline reinforcement learning (RL) models that must generalize from a static dataset to out-of-distribution (OOD) scenarios. Previous approaches have relied on pessimism or conservatism strategies. More recently, data-driven work has taken a distributional perspective, framing offline RL as a domain adaptation problem. However, these methods use heuristic techniques to simulate distribution shifts, resulting in limited diversity among the artificially created distribution gaps. In this paper, we propose a novel perspective: offline datasets inherently contain multiple latent distributions; behavior data collected by diverse policies may follow different distributions, and data from the same policy across different time phases can also exhibit distributional variation. We introduce the Latent Distribution Representation Learning (LAD) framework, which characterizes the multiple latent distributions within offline data and reduces the distribution gap between any pair of them. LAD consists of a min-max adversarial process: it first identifies the "worst-case" distributions to enlarge the diversity of distribution gaps, and then reduces these gaps to learn invariant representations for generalization. We derive a generalization error bound to support LAD theoretically and verify its effectiveness through extensive experiments.
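To make the min-max adversarial process concrete, below is a minimal PyTorch sketch. It is a hypothetical illustration, not the authors' released code: the identifier network that softly assigns transitions to K latent distributions, the simple group-mean gap measure standing in for the paper's distribution-gap objective, and all dimensions and hyperparameters are assumptions made for illustration. The identifier is trained to maximize the pairwise gaps (the "worst-case" partition), while the encoder is trained to shrink them, yielding distribution-invariant representations.

```python
# Hypothetical sketch of a LAD-style min-max objective (not the authors' code).
# An "identifier" softly assigns transitions to K latent distributions to
# MAXIMIZE the pairwise gap between groups (worst-case partition), while an
# encoder MINIMIZES those gaps to learn distribution-invariant features.
import torch
import torch.nn as nn

K = 3          # assumed number of latent distributions
OBS_DIM = 11   # placeholder observation dimension
REP_DIM = 32   # representation size

encoder = nn.Sequential(nn.Linear(OBS_DIM, 64), nn.ReLU(), nn.Linear(64, REP_DIM))
identifier = nn.Sequential(nn.Linear(OBS_DIM, 64), nn.ReLU(), nn.Linear(64, K))

enc_opt = torch.optim.Adam(encoder.parameters(), lr=3e-4)
idf_opt = torch.optim.Adam(identifier.parameters(), lr=3e-4)

def pairwise_gap(z, w):
    """Sum of squared distances between the soft group means of every pair of
    latent distributions -- a simple stand-in for the paper's gap measure."""
    # z: (B, D) representations; w: (B, K) soft assignments
    means = (w.t() @ z) / (w.sum(0, keepdim=True).t() + 1e-8)  # (K, D) group means
    gap = 0.0
    for i in range(K):
        for j in range(i + 1, K):
            gap = gap + ((means[i] - means[j]) ** 2).sum()
    return gap

for step in range(1000):
    obs = torch.randn(256, OBS_DIM)  # stand-in for a batch of offline data

    # Max step: identifier seeks the "worst-case" partition (largest gaps).
    w = identifier(obs).softmax(dim=-1)
    idf_loss = -pairwise_gap(encoder(obs).detach(), w)
    idf_opt.zero_grad()
    idf_loss.backward()
    idf_opt.step()

    # Min step: encoder shrinks the gaps to learn invariant representations.
    with torch.no_grad():
        w = identifier(obs).softmax(dim=-1)
    enc_loss = pairwise_gap(encoder(obs), w)
    enc_opt.zero_grad()
    enc_loss.backward()
    enc_opt.step()
```

In a full offline RL pipeline, the invariant representation `encoder(obs)` would presumably feed the policy and value networks of a base offline RL algorithm, with this adversarial term added alongside the usual training losses.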
Cite
Text
Wang et al. "Improving Generalization in Offline Reinforcement Learning via Latent Distribution Representation Learning." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I20.35402
Markdown
[Wang et al. "Improving Generalization in Offline Reinforcement Learning via Latent Distribution Representation Learning." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/wang2025aaai-improving/) doi:10.1609/AAAI.V39I20.35402
BibTeX
@inproceedings{wang2025aaai-improving,
title = {{Improving Generalization in Offline Reinforcement Learning via Latent Distribution Representation Learning}},
author = {Wang, Da and Li, Lin and Wei, Wei and Yu, Qixian and Hao, Jianye and Liang, Jiye},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2025},
pages = {21053--21061},
doi = {10.1609/AAAI.V39I20.35402},
url = {https://mlanthology.org/aaai/2025/wang2025aaai-improving/}
}