A Study of Plasticity Loss in On-Policy Deep Reinforcement Learning
Abstract
Continual learning with deep neural networks presents challenges distinct from both the fixed-dataset and convex continual learning regimes. One such challenge is plasticity loss, wherein a neural network trained in an online fashion displays a degraded ability to fit new tasks. This problem has been extensively studied in both supervised learning and off-policy reinforcement learning (RL), where a number of remedies have been proposed. Still, plasticity loss has received less attention in the on-policy deep RL setting. Here we perform an extensive set of experiments examining plasticity loss and a variety of mitigation methods in on-policy deep RL. We demonstrate that plasticity loss is pervasive under domain shift in this regime, and that a number of methods developed to resolve it in other settings fail, sometimes even performing worse than applying no intervention at all. In contrast, we find that a class of "regenerative" methods is able to consistently mitigate plasticity loss in a variety of contexts, including in gridworld tasks and more challenging environments like Montezuma's Revenge and ProcGen.
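The abstract does not enumerate the specific regenerative interventions studied, but a canonical member of this family is shrink-and-perturb, which periodically interpolates the current parameters toward a freshly initialized copy of the network. The sketch below is illustrative only: the interpolation coefficient, the network constructor, and the reset schedule are assumptions, not details taken from the paper.

import torch
import torch.nn as nn

def shrink_and_perturb(model: nn.Module, make_fresh_model, lam: float = 0.8) -> None:
    # Interpolate each parameter toward a freshly initialized network:
    #   theta <- lam * theta + (1 - lam) * theta_init
    fresh = make_fresh_model()  # new, randomly initialized network with the same architecture
    with torch.no_grad():
        for p, p_init in zip(model.parameters(), fresh.parameters()):
            p.mul_(lam).add_((1.0 - lam) * p_init)

A hypothetical usage pattern (make_fresh_model, train_on_policy, and task_sequence are placeholder names) would apply the reset at each domain shift:

# policy = make_fresh_model()
# for task in task_sequence:
#     train_on_policy(policy, task)                           # e.g., PPO updates on the current task
#     shrink_and_perturb(policy, make_fresh_model, lam=0.8)   # regenerate before the next task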
Cite
Text
Juliani and Ash. "A Study of Plasticity Loss in On-Policy Deep Reinforcement Learning." Neural Information Processing Systems, 2024. doi:10.52202/079017-3616Markdown
[Juliani and Ash. "A Study of Plasticity Loss in On-Policy Deep Reinforcement Learning." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/juliani2024neurips-study/) doi:10.52202/079017-3616BibTeX
@inproceedings{juliani2024neurips-study,
title = {{A Study of Plasticity Loss in On-Policy Deep Reinforcement Learning}},
author = {Juliani, Arthur and Ash, Jordan T.},
booktitle = {Neural Information Processing Systems},
year = {2024},
doi = {10.52202/079017-3616},
url = {https://mlanthology.org/neurips/2024/juliani2024neurips-study/}
}