A Clean Slate for Offline Reinforcement Learning

Abstract

Progress in offline reinforcement learning (RL) has been impeded by ambiguous problem definitions and entangled algorithmic designs, resulting in inconsistent implementations, insufficient ablations, and unfair evaluations. Although offline RL explicitly avoids environment interaction, prior methods frequently employ extensive, undocumented online evaluation for hyperparameter tuning, complicating method comparisons. Moreover, existing reference implementations differ significantly in boilerplate code, obscuring their core algorithmic contributions. We address these challenges by first introducing a rigorous taxonomy and a transparent evaluation protocol that explicitly quantifies online tuning budgets. To resolve opaque algorithmic design, we provide clean, minimalistic, single-file implementations of various model-free and model-based offline RL methods, significantly enhancing clarity and achieving substantial speed-ups. Leveraging these streamlined implementations, we propose Unifloral, a unified algorithm that encapsulates diverse prior approaches and enables development within a single, comprehensive hyperparameter space. Using Unifloral with our rigorous evaluation protocol, we develop two novel algorithms, TD3-AWR (model-free) and MoBRAC (model-based), which substantially outperform established baselines. Our implementation is publicly available at https://github.com/EmptyJackson/unifloral.

Cite

Text

Jackson et al. "A Clean Slate for Offline Reinforcement Learning." Advances in Neural Information Processing Systems, 2025.

Markdown

[Jackson et al. "A Clean Slate for Offline Reinforcement Learning." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/jackson2025neurips-clean/)

BibTeX

@inproceedings{jackson2025neurips-clean,
  title     = {{A Clean Slate for Offline Reinforcement Learning}},
  author    = {Jackson, Matthew Thomas and Berdica, Uljad and Liesen, Jarek Luca and Whiteson, Shimon and Foerster, Jakob Nicolaus},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/jackson2025neurips-clean/}
}