Landmark-Guided Subgoal Generation in Hierarchical Reinforcement Learning
Abstract
Goal-conditioned hierarchical reinforcement learning (HRL) has shown promising results for solving complex and long-horizon RL tasks. However, the action space of the high-level policy in goal-conditioned HRL is often large, which results in poor exploration and inefficient training. In this paper, we present HIerarchical reinforcement learning Guided by Landmarks (HIGL), a novel framework for training a high-level policy with a reduced action space guided by landmarks, i.e., promising states to explore. HIGL has two key components: (a) sampling landmarks that are informative for exploration and (b) encouraging the high-level policy to generate a subgoal towards a selected landmark. For (a), we consider two criteria: coverage of the entire visited state space (i.e., dispersion of states) and novelty of states (i.e., prediction error of a state). For (b), we select the very first landmark on the shortest path in a graph whose nodes are landmarks. Our experiments demonstrate that our framework outperforms prior methods across a variety of control tasks, thanks to efficient exploration guided by landmarks.
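The two components described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that coverage is approximated by greedy farthest-point sampling and that graph edges are weighted by Euclidean distance between states, and it omits the novelty criterion entirely. The names `farthest_point_sampling`, `first_landmark_on_shortest_path`, and the `max_edge` threshold are hypothetical and chosen only for illustration.

```python
import heapq
import math


def farthest_point_sampling(states, k):
    """Greedily pick up to k states that are far apart (a coverage heuristic)."""
    if not states or k <= 0:
        return []
    chosen = [0]  # assumption: start from the first state; the choice is arbitrary
    # min_dist[i] = distance from state i to its nearest chosen landmark
    min_dist = [math.dist(s, states[0]) for s in states]
    while len(chosen) < min(k, len(states)):
        nxt = max(range(len(states)), key=lambda i: min_dist[i])
        chosen.append(nxt)
        for i, s in enumerate(states):
            min_dist[i] = min(min_dist[i], math.dist(s, states[nxt]))
    return [states[i] for i in chosen]


def first_landmark_on_shortest_path(state, goal, landmarks, max_edge):
    """Return the first landmark on the shortest state-to-goal path, or None.

    None is returned when the goal is unreachable or directly connected.
    Assumption: an edge exists when the Euclidean distance is <= max_edge.
    """
    nodes = [state] + list(landmarks) + [goal]  # node 0 = state, last = goal
    n = len(nodes)
    goal_idx = n - 1
    dist = [math.inf] * n
    prev = [None] * n
    dist[0] = 0.0
    heap = [(0.0, 0)]
    while heap:  # Dijkstra's algorithm over the implicit graph
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        if u == goal_idx:
            break
        for v in range(n):
            if v == u:
                continue
            w = math.dist(nodes[u], nodes[v])
            if w > max_edge:
                continue  # too far apart to be connected
            if d + w < dist[v]:
                dist[v] = d + w
                prev[v] = u
                heapq.heappush(heap, (d + w, v))
    if math.isinf(dist[goal_idx]):
        return None
    # Walk back from the goal to the node that directly follows the start.
    node = goal_idx
    while prev[node] != 0:
        node = prev[node]
    return None if node == goal_idx else nodes[node]


if __name__ == "__main__":
    visited = [(0, 0), (1, 0), (10, 0), (5, 0), (0.5, 0)]
    print(farthest_point_sampling(visited, 3))
    print(first_landmark_on_shortest_path((0, 0), (10, 0), [(4, 0), (8, 0), (4, 3)], 5))
```

In this sketch the returned landmark would serve as the target toward which the high-level policy's subgoal is encouraged; how that encouragement enters the training objective is not specified in the abstract and is left out here.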
Cite
Text
Kim et al. "Landmark-Guided Subgoal Generation in Hierarchical Reinforcement Learning." Neural Information Processing Systems, 2021.
Markdown
[Kim et al. "Landmark-Guided Subgoal Generation in Hierarchical Reinforcement Learning." Neural Information Processing Systems, 2021.](https://mlanthology.org/neurips/2021/kim2021neurips-landmarkguided/)
BibTeX
@inproceedings{kim2021neurips-landmarkguided,
title = {{Landmark-Guided Subgoal Generation in Hierarchical Reinforcement Learning}},
author = {Kim, Junsu and Seo, Younggyo and Shin, Jinwoo},
booktitle = {Neural Information Processing Systems},
year = {2021},
url = {https://mlanthology.org/neurips/2021/kim2021neurips-landmarkguided/}
}