Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning

Abstract

As a step towards developing zero-shot task generalization capabilities in reinforcement learning (RL), we introduce a new RL problem in which the agent must learn to execute sequences of instructions after acquiring useful skills that solve subtasks. We consider two types of generalization: to previously unseen instructions and to longer sequences of instructions. For generalization to unseen instructions, we propose a new objective that encourages the agent to learn correspondences between similar subtasks by making analogies. For generalization to longer sequences, we present a hierarchical architecture in which a meta controller learns to use the acquired skills to execute the instructions. To deal with delayed reward, the meta controller uses a new neural architecture that learns when to update the subtask, which makes learning more efficient. Experiments on a stochastic 3D domain show that the proposed ideas are crucial for generalization to longer instruction sequences as well as to unseen instructions.
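
The analogy-making objective can be made concrete with a small sketch. The following is a minimal illustration under stated assumptions, not the authors' implementation: `embed` is a toy stand-in for a learned subtask-embedding network, and the margin-based hinge form of the loss is an assumption. For an analogous quadruple [A : B :: C : D], the difference between the embeddings of A and B is pulled toward the difference between the embeddings of C and D; non-analogous quadruples are pushed at least a margin apart.

```python
import numpy as np

rng = np.random.default_rng(0)
EMBED_DIM = 16
_table = {}  # toy lookup table standing in for a learned embedding network


def embed(subtask):
    """Hypothetical subtask embedding: random vectors keyed by description."""
    if subtask not in _table:
        _table[subtask] = rng.normal(size=EMBED_DIM)
    return _table[subtask]


def analogy_loss(a, b, c, d, analogous, margin=1.0):
    """Sketch of an analogy regularizer for quadruples [a : b :: c : d].

    If the quadruple is analogous, penalize the distance between the two
    embedding differences; otherwise, apply a hinge that pushes the
    differences at least `margin` apart (the margin value is assumed).
    """
    diff = np.linalg.norm((embed(a) - embed(b)) - (embed(c) - embed(d)))
    if analogous:
        return diff ** 2
    return max(0.0, margin - diff) ** 2


# Example: "pick up A" relates to "pick up B" as "visit A" relates to "visit B".
loss = analogy_loss("pick up A", "pick up B", "visit A", "visit B", analogous=True)
print(f"analogy loss: {loss:.3f}")
```

In a training setup, a term like this would be added to the usual RL loss so that subtasks sharing an action or an object end up in corresponding regions of the embedding space, which is what lets the policy execute instruction combinations it never saw during training.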

Cite

Text

Oh et al. "Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning." International Conference on Machine Learning, 2017.

Markdown

[Oh et al. "Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning." International Conference on Machine Learning, 2017.](https://mlanthology.org/icml/2017/oh2017icml-zeroshot/)

BibTeX

@inproceedings{oh2017icml-zeroshot,
  title     = {{Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning}},
  author    = {Oh, Junhyuk and Singh, Satinder and Lee, Honglak and Kohli, Pushmeet},
  booktitle = {International Conference on Machine Learning},
  year      = {2017},
  pages     = {2661--2670},
  volume    = {70},
  url       = {https://mlanthology.org/icml/2017/oh2017icml-zeroshot/}
}