Hypernetworks for Zero-Shot Transfer in Reinforcement Learning
Abstract
In this paper, hypernetworks are trained to generate behaviors across a range of unseen task conditions, via a novel TD-based training objective and data from a set of near-optimal RL solutions for training tasks. This work relates to meta RL, contextual RL, and transfer learning, with a particular focus on zero-shot performance at test time, enabled by knowledge of the task parameters (also known as context). Our technical approach views each RL algorithm as a mapping from the MDP specifics to the near-optimal value function and policy, and we seek to approximate this mapping with a hypernetwork that can generate near-optimal value functions and policies, given the parameters of the MDP. We show that, under certain conditions, learning this mapping can be treated as a supervised learning problem. We empirically evaluate the effectiveness of our method for zero-shot transfer to new reward and transition dynamics on a series of continuous control tasks from the DeepMind Control Suite. Our method demonstrates significant improvements over baselines from multitask and meta RL approaches.
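To make the central idea concrete, the following is a minimal sketch of a hypernetwork that maps a task context vector to the parameters of a small policy. It is an illustration only and not the authors' architecture: the layer sizes, the linear-tanh policy, and every name (`init_hypernet`, `generate_policy_params`, `policy`) are assumptions introduced here. The hypernetwork is left untrained, and the TD-based objective mentioned in the abstract is not shown.

```python
import numpy as np

def init_hypernet(rng, context_dim, hidden_dim, n_target_params):
    """Random weights for a two-layer hypernetwork (sizes are illustrative)."""
    return {
        "W1": rng.normal(0.0, 0.1, (context_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "W2": rng.normal(0.0, 0.1, (hidden_dim, n_target_params)),
        "b2": np.zeros(n_target_params),
    }

def generate_policy_params(hyper, context, obs_dim, act_dim):
    """Map a task context vector to the weights and bias of a linear policy."""
    hidden = np.tanh(context @ hyper["W1"] + hyper["b1"])
    flat = hidden @ hyper["W2"] + hyper["b2"]
    # Split the flat output into the policy's weight matrix and bias vector.
    W = flat[: obs_dim * act_dim].reshape(obs_dim, act_dim)
    b = flat[obs_dim * act_dim:]
    return W, b

def policy(W, b, obs):
    """Hypothetical generated policy: a linear map squashed into [-1, 1]."""
    return np.tanh(obs @ W + b)

obs_dim, act_dim, context_dim = 4, 2, 3
rng = np.random.default_rng(0)
hyper = init_hypernet(rng, context_dim, 16, obs_dim * act_dim + act_dim)

# A made-up context, standing in for reward or dynamics parameters of a task.
context = np.array([0.5, -1.0, 2.0])
W, b = generate_policy_params(hyper, context, obs_dim, act_dim)
action = policy(W, b, np.ones(obs_dim))
```

In this sketch, zero-shot transfer would amount to calling `generate_policy_params` with the context of an unseen task and using the resulting policy directly, with no further gradient steps on that task.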
Cite
Text
Rezaei-Shoshtari et al. "Hypernetworks for Zero-Shot Transfer in Reinforcement Learning." AAAI Conference on Artificial Intelligence, 2023. doi:10.1609/AAAI.V37I8.26146
Markdown
[Rezaei-Shoshtari et al. "Hypernetworks for Zero-Shot Transfer in Reinforcement Learning." AAAI Conference on Artificial Intelligence, 2023.](https://mlanthology.org/aaai/2023/rezaeishoshtari2023aaai-hypernetworks/) doi:10.1609/AAAI.V37I8.26146
BibTeX
@inproceedings{rezaeishoshtari2023aaai-hypernetworks,
title = {{Hypernetworks for Zero-Shot Transfer in Reinforcement Learning}},
author = {Rezaei-Shoshtari, Sahand and Morissette, Charlotte and Hogan, François Robert and Dudek, Gregory and Meger, David},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2023},
pages = {9579-9587},
doi = {10.1609/AAAI.V37I8.26146},
url = {https://mlanthology.org/aaai/2023/rezaeishoshtari2023aaai-hypernetworks/}
}