A Study of Plasticity Loss in On-Policy Deep Reinforcement Learning

Abstract

Continual learning with deep neural networks presents challenges distinct from both the fixed-dataset and convex continual learning regimes. One such challenge is plasticity loss, wherein a neural network trained in an online fashion displays a degraded ability to fit new tasks. This problem has been extensively studied in both supervised learning and off-policy reinforcement learning (RL), where a number of remedies have been proposed. Still, plasticity loss has received less attention in the on-policy deep RL setting. Here we perform an extensive set of experiments examining plasticity loss and a variety of mitigation methods in on-policy deep RL. We demonstrate that plasticity loss is pervasive under domain shift in this regime, and that a number of methods developed to resolve it in other settings fail, sometimes even performing worse than applying no intervention at all. In contrast, we find that a class of "regenerative" methods is able to consistently mitigate plasticity loss in a variety of contexts, including in gridworld tasks and more challenging environments like Montezuma's Revenge and ProcGen.
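For intuition only, the sketch below shows one commonly discussed regenerative-style intervention: an L2 penalty that pulls the network's parameters back toward their values at initialization, added to the usual on-policy loss. This is an illustrative assumption about what such a method can look like, not the paper's exact recipe; the class name, coefficient, and usage line are hypothetical.

```python
# Illustrative sketch (not the paper's exact method): a "regenerative"-style
# regularizer that penalizes drift of parameters away from their values at
# initialization, intended to be added to an on-policy (e.g. PPO) loss.
import torch
import torch.nn as nn


class RegenerativeRegularizer:
    """L2 penalty toward a snapshot of the network's initial parameters."""

    def __init__(self, model: nn.Module, coef: float = 1e-2):
        self.coef = coef
        # Frozen copy of the parameters at initialization.
        self.init_params = {
            name: p.detach().clone() for name, p in model.named_parameters()
        }

    def penalty(self, model: nn.Module) -> torch.Tensor:
        device = next(model.parameters()).device
        reg = torch.zeros((), device=device)
        for name, p in model.named_parameters():
            reg = reg + (p - self.init_params[name]).pow(2).sum()
        return self.coef * reg


# Hypothetical usage inside a training loop:
#   regularizer = RegenerativeRegularizer(policy, coef=1e-2)
#   total_loss = ppo_loss + regularizer.penalty(policy)
```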

Cite

Text

Juliani and Ash. "A Study of Plasticity Loss in On-Policy Deep Reinforcement Learning." Neural Information Processing Systems, 2024. doi:10.52202/079017-3616

Markdown

[Juliani and Ash. "A Study of Plasticity Loss in On-Policy Deep Reinforcement Learning." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/juliani2024neurips-study/) doi:10.52202/079017-3616

BibTeX

@inproceedings{juliani2024neurips-study,
  title     = {{A Study of Plasticity Loss in On-Policy Deep Reinforcement Learning}},
  author    = {Juliani, Arthur and Ash, Jordan T.},
  booktitle = {Neural Information Processing Systems},
  year      = {2024},
  doi       = {10.52202/079017-3616},
  url       = {https://mlanthology.org/neurips/2024/juliani2024neurips-study/}
}