Disagreement Options: Task Adaptation Through Temporally Extended Actions
Abstract
Embodied AI, which learns through interaction with a physical environment, typically requires a large number of environment interactions to learn how to solve new tasks. Training can be done in parallel using simulated environments. However, once an agent is deployed, e.g., in a real-world setting, it is not yet clear how it can quickly adapt its knowledge to solve new tasks. In this paper, we propose a novel Hierarchical Reinforcement Learning (HRL) method that allows an agent, when confronted with a novel task, to switch between exploiting prior knowledge through temporally extended actions and exploring the environment. We resolve this trade-off by using the disagreement between the action distributions of selected previously acquired policies. Relevant prior tasks are selected by measuring the cosine similarity of their attached natural language goals in a pre-trained word embedding. We analyze the resulting temporal abstractions and experimentally demonstrate their effectiveness in different environments. We show that our method is capable of solving new tasks using only a fraction of the environment interactions required when learning the task from scratch.
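The two mechanisms named in the abstract, selecting prior tasks by cosine similarity of goal embeddings and switching behaviour based on policy disagreement, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function names, the toy 3-dimensional "embeddings", the use of mean pairwise total variation distance as the disagreement measure, and the fixed threshold.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention for zero vectors (assumption)
    return dot / (norm_u * norm_v)


def select_prior_tasks(new_goal, prior_goals, k=2):
    """Return the names of the k prior tasks whose goal embeddings
    are most similar to the new goal embedding."""
    ranked = sorted(
        prior_goals,
        key=lambda name: cosine_similarity(new_goal, prior_goals[name]),
        reverse=True,
    )
    return ranked[:k]


def disagreement(action_dists):
    """Mean pairwise total variation distance between action
    distributions, in [0, 1]. Hypothetical choice of measure."""
    pairs = list(combinations(action_dists, 2))
    if not pairs:
        return 0.0
    total = sum(
        0.5 * sum(abs(p - q) for p, q in zip(d1, d2)) for d1, d2 in pairs
    )
    return total / len(pairs)


def should_exploit(action_dists, threshold=0.2):
    """Exploit prior policies when they roughly agree, otherwise explore.
    The threshold value is an illustrative assumption."""
    return disagreement(action_dists) <= threshold


# Toy goal embeddings standing in for a pre-trained word embedding.
prior_goals = {
    "go to the kitchen": [0.9, 0.1, 0.0],
    "go to the bedroom": [0.8, 0.3, 0.1],
    "pick up the ball": [0.0, 0.2, 0.9],
}
new_goal = [0.85, 0.2, 0.05]
selected = select_prior_tasks(new_goal, prior_goals, k=2)

# Action distributions of the selected policies in two hypothetical states.
agreeing = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1]]
conflicting = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
```

In this sketch, states where the selected policies agree would be handled by following prior knowledge, while states where they conflict would trigger exploration; how the paper actually turns disagreement into temporally extended actions is not specified in the abstract.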
Cite
Text
Hutsebaut-Buysse et al. "Disagreement Options: Task Adaptation Through Temporally Extended Actions." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2021. doi:10.1007/978-3-030-86486-6_12
Markdown
[Hutsebaut-Buysse et al. "Disagreement Options: Task Adaptation Through Temporally Extended Actions." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2021.](https://mlanthology.org/ecmlpkdd/2021/hutsebautbuysse2021ecmlpkdd-disagreement/) doi:10.1007/978-3-030-86486-6_12
BibTeX
@inproceedings{hutsebautbuysse2021ecmlpkdd-disagreement,
title = {{Disagreement Options: Task Adaptation Through Temporally Extended Actions}},
author = {Hutsebaut-Buysse, Matthias and De Schepper, Tom and Mets, Kevin and Latré, Steven},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2021},
pages = {190-205},
doi = {10.1007/978-3-030-86486-6_12},
url = {https://mlanthology.org/ecmlpkdd/2021/hutsebautbuysse2021ecmlpkdd-disagreement/}
}