Hypernetwork-PPO for Continual Reinforcement Learning
Abstract
Continually learning new capabilities in different environments, and being able to solve multiple complex tasks, is of great importance for many robotics applications. Modern reinforcement learning algorithms such as Proximal Policy Optimization can successfully handle surprisingly difficult tasks, but are generally not suited for multi-task or continual learning. Hypernetworks are a promising approach for avoiding catastrophic forgetting, and have previously been used successfully for continual model-learning in model-based RL. We propose HN-PPO, a continual model-free RL method employing a hypernetwork to learn multiple policies in a continual manner using PPO. We demonstrate our method on DoorGym, and show that it is suitable for solving tasks involving complex dynamics such as door opening, while effectively protecting against catastrophic forgetting.
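The core idea described in the abstract is a hypernetwork that maps a per-task embedding to the weights of a policy network, with the hypernetwork and embeddings trained via PPO. The snippet below is a minimal illustrative sketch of that weight-generation pattern, not the authors' implementation; all names, layer sizes, and the two-layer target policy are assumptions made for the example.

```python
# Minimal sketch (not the authors' code): a hypernetwork maps a learned task
# embedding to the weights of a small MLP policy. In an HN-PPO-style setup,
# only the hypernetwork and the task embeddings are optimized with PPO's
# surrogate loss; a regularizer on the weights generated for earlier tasks is
# what protects against catastrophic forgetting. Sizes and names are illustrative.
import math
import torch
import torch.nn as nn


class PolicyHypernetwork(nn.Module):
    def __init__(self, emb_dim=16, obs_dim=8, act_dim=2, hidden=64):
        super().__init__()
        # Shapes of the target policy MLP: obs -> hidden -> action mean
        self.shapes = [
            (hidden, obs_dim), (hidden,),   # layer 1 weight, bias
            (act_dim, hidden), (act_dim,),  # layer 2 weight, bias
        ]
        n_params = sum(math.prod(s) for s in self.shapes)
        self.net = nn.Sequential(
            nn.Linear(emb_dim, 128), nn.ReLU(),
            nn.Linear(128, n_params),
        )

    def forward(self, task_emb):
        # Generate a flat parameter vector and split it into the policy tensors.
        flat = self.net(task_emb)
        params, i = [], 0
        for shape in self.shapes:
            n = math.prod(shape)
            params.append(flat[i:i + n].view(shape))
            i += n
        return params

    def policy_mean(self, task_emb, obs):
        # Functional forward pass of the generated policy (action mean of a
        # Gaussian policy, as is typical for PPO in continuous control).
        w1, b1, w2, b2 = self.forward(task_emb)
        h = torch.tanh(obs @ w1.t() + b1)
        return h @ w2.t() + b2


# Usage: one learnable embedding per task; PPO updates hnet and the embedding.
hnet = PolicyHypernetwork()
task_emb = nn.Parameter(torch.randn(16))
obs = torch.randn(4, 8)                       # batch of observations
action_mean = hnet.policy_mean(task_emb, obs)
print(action_mean.shape)                      # torch.Size([4, 2])
```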
Cite
Text
Schöpf et al. "Hypernetwork-PPO for Continual Reinforcement Learning." NeurIPS 2022 Workshops: DeepRL, 2022.
Markdown
[Schöpf et al. "Hypernetwork-PPO for Continual Reinforcement Learning." NeurIPS 2022 Workshops: DeepRL, 2022.](https://mlanthology.org/neuripsw/2022/schopf2022neuripsw-hypernetworkppo/)
BibTeX
@inproceedings{schopf2022neuripsw-hypernetworkppo,
title = {{Hypernetwork-PPO for Continual Reinforcement Learning}},
author = {Schöpf, Philemon and Auddy, Sayantan and Hollenstein, Jakob and Rodriguez-Sanchez, Antonio},
booktitle = {NeurIPS 2022 Workshops: DeepRL},
year = {2022},
url = {https://mlanthology.org/neuripsw/2022/schopf2022neuripsw-hypernetworkppo/}
}