A Study of Plasticity Loss in On-Policy Deep Reinforcement Learning

Abstract

Continual learning with deep neural networks presents challenges distinct from both the fixed-dataset and convex continual learning regimes. One such challenge is plasticity loss, wherein a neural network trained in an online fashion displays a degraded ability to fit new tasks. This problem has been extensively studied in both supervised learning and off-policy reinforcement learning (RL), where a number of remedies have been proposed. Still, plasticity loss has received less attention in the on-policy deep RL setting. Here we perform an extensive set of experiments examining plasticity loss and a variety of mitigation methods in on-policy deep RL. We demonstrate that plasticity loss is pervasive under domain shift in this regime, and that a number of methods developed to resolve it in other settings fail, sometimes even performing worse than applying no intervention at all. In contrast, we find that a class of "regenerative" methods is able to consistently mitigate plasticity loss in a variety of contexts, including in gridworld tasks and more challenging environments like Montezuma's Revenge and ProcGen.
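For intuition only, the sketch below shows one commonly discussed regenerative-style intervention: an L2 penalty that pulls the network's parameters back toward their values at initialization, added to the usual on-policy loss. This is an illustrative assumption about what such a method can look like, not the paper's exact recipe; the class name, coefficient, and usage line are hypothetical.

```python
# Illustrative sketch (not the paper's exact method): a "regenerative"-style
# regularizer that penalizes drift of parameters away from their values at
# initialization, intended to be added to an on-policy (e.g. PPO) loss.
import torch
import torch.nn as nn


class RegenerativeRegularizer:
    """L2 penalty toward a snapshot of the network's initial parameters."""

    def __init__(self, model: nn.Module, coef: float = 1e-2):
        self.coef = coef
        # Frozen copy of the parameters at initialization.
        self.init_params = {
            name: p.detach().clone() for name, p in model.named_parameters()
        }

    def penalty(self, model: nn.Module) -> torch.Tensor:
        device = next(model.parameters()).device
        reg = torch.zeros((), device=device)
        for name, p in model.named_parameters():
            reg = reg + (p - self.init_params[name]).pow(2).sum()
        return self.coef * reg


# Hypothetical usage inside a training loop:
#   regularizer = RegenerativeRegularizer(policy, coef=1e-2)
#   total_loss = ppo_loss + regularizer.penalty(policy)
```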

Cite

Text

Juliani and Ash. "A Study of Plasticity Loss in On-Policy Deep Reinforcement Learning." Neural Information Processing Systems, 2024. doi:10.52202/079017-3616

Markdown

[Juliani and Ash. "A Study of Plasticity Loss in On-Policy Deep Reinforcement Learning." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/juliani2024neurips-study/) doi:10.52202/079017-3616

BibTeX

@inproceedings{juliani2024neurips-study,
  title     = {{A Study of Plasticity Loss in On-Policy Deep Reinforcement Learning}},
  author    = {Juliani, Arthur and Ash, Jordan T.},
  booktitle = {Neural Information Processing Systems},
  year      = {2024},
  doi       = {10.52202/079017-3616},
  url       = {https://mlanthology.org/neurips/2024/juliani2024neurips-study/}
}