Replay Across Experiments: A Natural Extension of Off-Policy RL
Abstract
Replaying data is a principal mechanism underlying the stability and data efficiency of off-policy reinforcement learning (RL). We present an effective yet simple framework to extend the use of replays across multiple experiments, minimally adapting the RL workflow for sizeable improvements in controller performance and research iteration times. At its core, Replay across Experiments (RaE) involves reusing experience from previous experiments to improve exploration and bootstrap learning while reducing required changes to a minimum in comparison to prior work. We empirically show benefits across a number of RL algorithms and challenging control domains spanning both locomotion and manipulation, including hard exploration tasks from egocentric vision. Through comprehensive ablations, we demonstrate robustness to the quality and amount of data available and various hyperparameter choices. Finally, we discuss how our approach can be applied more broadly across research life cycles and can increase resilience by reloading data across random seeds or hyperparameter variations.
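The core mechanism described in the abstract amounts to preloading an off-policy agent's replay buffer with transitions saved from earlier experiments and mixing them with freshly collected experience. The sketch below illustrates that idea only; the class name, `old_fraction` mixing ratio, and loading step are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch: seed a replay buffer with transitions reloaded from
# previous experiments and mix them with new data when sampling batches.
# All names and the fixed mixing ratio are assumptions for illustration.
import random
from collections import deque

class MixedReplayBuffer:
    def __init__(self, capacity, old_transitions, old_fraction=0.5):
        self.new = deque(maxlen=capacity)   # experience from the current run
        self.old = list(old_transitions)    # experience reloaded from prior runs
        self.old_fraction = old_fraction    # share of each batch drawn from old data

    def add(self, transition):
        self.new.append(transition)

    def sample(self, batch_size):
        n_old = min(int(batch_size * self.old_fraction), len(self.old))
        n_new = min(batch_size - n_old, len(self.new))
        batch = random.sample(self.old, n_old) + random.sample(list(self.new), n_new)
        random.shuffle(batch)
        return batch

# Hypothetical usage: load transitions dumped by an earlier experiment,
# then train the off-policy learner on batches from this buffer as usual.
# buffer = MixedReplayBuffer(capacity=1_000_000,
#                            old_transitions=load_transitions("prev_run.pkl"))
```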
Cite
Text
Tirumala et al. "Replay Across Experiments: A Natural Extension of Off-Policy RL." International Conference on Learning Representations, 2024.
Markdown
[Tirumala et al. "Replay Across Experiments: A Natural Extension of Off-Policy RL." International Conference on Learning Representations, 2024.](https://mlanthology.org/iclr/2024/tirumala2024iclr-replay/)
BibTeX
@inproceedings{tirumala2024iclr-replay,
  title = {{Replay Across Experiments: A Natural Extension of Off-Policy RL}},
  author = {Tirumala, Dhruva and Lampe, Thomas and Chen, Jose Enrique and Haarnoja, Tuomas and Huang, Sandy and Lever, Guy and Moran, Ben and Hertweck, Tim and Hasenclever, Leonard and Riedmiller, Martin and Heess, Nicolas and Wulfmeier, Markus},
  booktitle = {International Conference on Learning Representations},
  year = {2024},
  url = {https://mlanthology.org/iclr/2024/tirumala2024iclr-replay/}
}