Hypernetwork-PPO for Continual Reinforcement Learning
Abstract
Continually learning new capabilities in different environments, and being able to solve multiple complex tasks, is of great importance for many robotics applications. Modern reinforcement learning algorithms such as Proximal Policy Optimization can successfully handle surprisingly difficult tasks, but are generally not suited for multi-task or continual learning. Hypernetworks are a promising approach for avoiding catastrophic forgetting, and have previously been used successfully for continual model-learning in model-based RL. We propose HN-PPO, a continual model-free RL method employing a hypernetwork to learn multiple policies in a continual manner using PPO. We demonstrate our method on DoorGym, and show that it is suitable for solving tasks involving complex dynamics such as door opening, while effectively protecting against catastrophic forgetting.
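The core idea described in the abstract is a hypernetwork that maps a per-task embedding to the weights of a policy network, with the hypernetwork and embeddings trained via PPO. The snippet below is a minimal illustrative sketch of that weight-generation pattern, not the authors' implementation; all names, layer sizes, and the two-layer target policy are assumptions made for the example.

```python
# Minimal sketch (not the authors' code): a hypernetwork maps a learned task
# embedding to the weights of a small MLP policy. In an HN-PPO-style setup,
# only the hypernetwork and the task embeddings are optimized with PPO's
# surrogate loss; a regularizer on the weights generated for earlier tasks is
# what protects against catastrophic forgetting. Sizes and names are illustrative.
import math
import torch
import torch.nn as nn


class PolicyHypernetwork(nn.Module):
    def __init__(self, emb_dim=16, obs_dim=8, act_dim=2, hidden=64):
        super().__init__()
        # Shapes of the target policy MLP: obs -> hidden -> action mean
        self.shapes = [
            (hidden, obs_dim), (hidden,),   # layer 1 weight, bias
            (act_dim, hidden), (act_dim,),  # layer 2 weight, bias
        ]
        n_params = sum(math.prod(s) for s in self.shapes)
        self.net = nn.Sequential(
            nn.Linear(emb_dim, 128), nn.ReLU(),
            nn.Linear(128, n_params),
        )

    def forward(self, task_emb):
        # Generate a flat parameter vector and split it into the policy tensors.
        flat = self.net(task_emb)
        params, i = [], 0
        for shape in self.shapes:
            n = math.prod(shape)
            params.append(flat[i:i + n].view(shape))
            i += n
        return params

    def policy_mean(self, task_emb, obs):
        # Functional forward pass of the generated policy (action mean of a
        # Gaussian policy, as is typical for PPO in continuous control).
        w1, b1, w2, b2 = self.forward(task_emb)
        h = torch.tanh(obs @ w1.t() + b1)
        return h @ w2.t() + b2


# Usage: one learnable embedding per task; PPO updates hnet and the embedding.
hnet = PolicyHypernetwork()
task_emb = nn.Parameter(torch.randn(16))
obs = torch.randn(4, 8)                       # batch of observations
action_mean = hnet.policy_mean(task_emb, obs)
print(action_mean.shape)                      # torch.Size([4, 2])
```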
Cite
Text
Schöpf et al. "Hypernetwork-PPO for Continual Reinforcement Learning." NeurIPS 2022 Workshops: DeepRL, 2022.
Markdown
[Schöpf et al. "Hypernetwork-PPO for Continual Reinforcement Learning." NeurIPS 2022 Workshops: DeepRL, 2022.](https://mlanthology.org/neuripsw/2022/schopf2022neuripsw-hypernetworkppo/)
BibTeX
@inproceedings{schopf2022neuripsw-hypernetworkppo,
title = {{Hypernetwork-PPO for Continual Reinforcement Learning}},
author = {Schöpf, Philemon and Auddy, Sayantan and Hollenstein, Jakob and Rodriguez-Sanchez, Antonio},
booktitle = {NeurIPS 2022 Workshops: DeepRL},
year = {2022},
url = {https://mlanthology.org/neuripsw/2022/schopf2022neuripsw-hypernetworkppo/}
}