State-Conditioned Adversarial Subgoal Generation

Abstract

Hierarchical reinforcement learning (HRL) proposes to solve difficult tasks by performing decision-making and control at successively higher levels of temporal abstraction. However, off-policy HRL often suffers from a non-stationary high-level learning problem, since the low-level policy is constantly changing. In this paper, we propose a novel HRL approach for mitigating this non-stationarity by adversarially constraining the high-level policy to generate subgoals compatible with the current instantiation of the low-level policy. In practice, the adversarial learning is implemented by training a simple state-conditioned discriminator network concurrently with the high-level policy; the discriminator determines the compatibility of generated subgoals. Comparison to state-of-the-art algorithms shows that our approach improves both learning efficiency and performance in challenging continuous control tasks.
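To make the mechanism concrete, below is a minimal sketch (not the authors' reference implementation) of a state-conditioned discriminator that scores (state, subgoal) pairs. The assumptions here are illustrative: PyTorch is used, subgoals actually achieved by the current low-level policy serve as "real" examples, subgoals proposed by the high-level policy serve as "fake" examples, and the discriminator's score is added to the high-level objective as a compatibility bonus.

import torch
import torch.nn as nn

class StateConditionedDiscriminator(nn.Module):
    """Scores how compatible a subgoal is with the current state (and, implicitly,
    with the current low-level policy that produced the 'real' training examples)."""
    def __init__(self, state_dim: int, subgoal_dim: int, hidden_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + subgoal_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),  # logit of subgoal compatibility
        )

    def forward(self, state: torch.Tensor, subgoal: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([state, subgoal], dim=-1))


def discriminator_step(disc, optimizer, states, reached_subgoals, proposed_subgoals):
    """One adversarial update: subgoals the low-level policy actually reached are
    treated as positives, subgoals proposed by the high-level policy as negatives."""
    bce = nn.BCEWithLogitsLoss()
    real_logits = disc(states, reached_subgoals)
    fake_logits = disc(states, proposed_subgoals.detach())
    loss = bce(real_logits, torch.ones_like(real_logits)) + \
           bce(fake_logits, torch.zeros_like(fake_logits))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


def compatibility_bonus(disc, states, proposed_subgoals):
    """Shaping term for the high-level policy: higher for subgoals the discriminator
    judges compatible with the current low-level policy."""
    with torch.no_grad():
        return torch.sigmoid(disc(states, proposed_subgoals)).squeeze(-1)

In this sketch, the high-level policy would maximize its environment return plus the compatibility bonus, while the discriminator is updated concurrently from fresh low-level rollouts; the exact losses and data used as positives are assumptions for illustration only.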

Cite

Text

Wang et al. "State-Conditioned Adversarial Subgoal Generation." AAAI Conference on Artificial Intelligence, 2023. doi:10.1609/AAAI.V37I8.26213

Markdown

[Wang et al. "State-Conditioned Adversarial Subgoal Generation." AAAI Conference on Artificial Intelligence, 2023.](https://mlanthology.org/aaai/2023/wang2023aaai-state/) doi:10.1609/AAAI.V37I8.26213

BibTeX

@inproceedings{wang2023aaai-state,
  title     = {{State-Conditioned Adversarial Subgoal Generation}},
  author    = {Wang, Vivienne Huiling and Pajarinen, Joni and Wang, Tinghuai and Kämäräinen, Joni-Kristian},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2023},
  pages     = {10184--10191},
  doi       = {10.1609/AAAI.V37I8.26213},
  url       = {https://mlanthology.org/aaai/2023/wang2023aaai-state/}
}