Hypernetwork-PPO for Continual Reinforcement Learning

Abstract

Continually learning new capabilities in different environments, and being able to solve multiple complex tasks, is of great importance for many robotics applications. Modern reinforcement learning algorithms such as Proximal Policy Optimization can successfully handle surprisingly difficult tasks, but are generally not suited for multi-task or continual learning. Hypernetworks are a promising approach for avoiding catastrophic forgetting, and have previously been used successfully for continual model-learning in model-based RL. We propose HN-PPO, a continual model-free RL method employing a hypernetwork to learn multiple policies in a continual manner using PPO. We demonstrate our method on DoorGym, and show that it is suitable for solving tasks involving complex dynamics such as door opening, while effectively protecting against catastrophic forgetting.
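
Below is a minimal sketch of the task-conditioned hypernetwork idea described in the abstract: a shared network that, given a learned per-task embedding, generates the parameters of a small policy network. The class names, layer sizes, and overall structure are illustrative assumptions and not the authors' implementation; the PPO training loop and the mechanism for protecting earlier tasks are omitted.

# Illustrative sketch (assumptions, not the authors' code): a hypernetwork
# that maps a per-task embedding to the flat parameter vector of a policy MLP.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PolicyHypernetwork(nn.Module):
    def __init__(self, num_tasks, emb_dim=32, obs_dim=8, act_dim=2, hidden=64):
        super().__init__()
        self.obs_dim, self.act_dim, self.hidden = obs_dim, act_dim, hidden
        # One learned embedding per task; a new embedding is added for each
        # new task while the hypernetwork body is shared across tasks.
        self.task_emb = nn.Embedding(num_tasks, emb_dim)
        # Total parameter count of the target policy MLP
        # (obs->hidden and hidden->act layers, weights plus biases).
        n_params = (obs_dim * hidden + hidden) + (hidden * act_dim + act_dim)
        self.generator = nn.Sequential(
            nn.Linear(emb_dim, 256), nn.ReLU(), nn.Linear(256, n_params)
        )

    def policy_params(self, task_id):
        # Generate the flat parameter vector of the policy for one task.
        emb = self.task_emb(torch.tensor([task_id]))
        return self.generator(emb).squeeze(0)

    def act(self, task_id, obs):
        # Evaluate the generated policy on an observation (returns action logits).
        p = self.policy_params(task_id)
        o, h, a = self.obs_dim, self.hidden, self.act_dim
        w1 = p[: o * h].view(h, o)
        b1 = p[o * h : o * h + h]
        w2 = p[o * h + h : o * h + h + h * a].view(a, h)
        b2 = p[-a:]
        x = F.relu(F.linear(obs, w1, b1))
        return F.linear(x, w2, b2)

if __name__ == "__main__":
    hnet = PolicyHypernetwork(num_tasks=3)
    logits = hnet.act(task_id=0, obs=torch.randn(1, 8))
    print(logits.shape)  # torch.Size([1, 2])

In such a setup only the embedding of the current task and the shared generator are updated during training, which is what makes hypernetworks attractive for continual learning; how HN-PPO regularizes the generated parameters of earlier tasks is described in the paper itself.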

Cite

Text

Schöpf et al. "Hypernetwork-PPO for Continual Reinforcement Learning." NeurIPS 2022 Workshops: DeepRL, 2022.

Markdown

[Schöpf et al. "Hypernetwork-PPO for Continual Reinforcement Learning." NeurIPS 2022 Workshops: DeepRL, 2022.](https://mlanthology.org/neuripsw/2022/schopf2022neuripsw-hypernetworkppo/)

BibTeX

@inproceedings{schopf2022neuripsw-hypernetworkppo,
  title     = {{Hypernetwork-PPO for Continual Reinforcement Learning}},
  author    = {Schöpf, Philemon and Auddy, Sayantan and Hollenstein, Jakob and Rodriguez-Sanchez, Antonio},
  booktitle = {NeurIPS 2022 Workshops: DeepRL},
  year      = {2022},
  url       = {https://mlanthology.org/neuripsw/2022/schopf2022neuripsw-hypernetworkppo/}
}