Learning Two-Step Hybrid Policy for Graph-Based Interpretable Reinforcement Learning

Abstract

We present a two-step hybrid reinforcement learning (RL) policy designed to generate interpretable and robust hierarchical policies for RL problems with graph-based input. Unlike prior deep RL policies parameterized by an end-to-end black-box graph neural network, our approach disentangles the decision-making process into two steps. The first step is a simplified classification problem that maps the graph input to an action group in which all actions share a similar semantic meaning. The second step implements a sophisticated rule-miner that conducts explicit one-hop reasoning over the graph and identifies decisive edges in the graph input without requiring heavy domain knowledge. This two-step hybrid policy yields human-friendly interpretations and achieves better generalization and robustness. Extensive experiments on four levels of complex text-based games demonstrate the superiority of the proposed method over the state-of-the-art.
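The two-step structure described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the toy graph, the hand-written group classifier standing in for the learned one, and the fixed rule table standing in for the learned rule miner are all assumptions made purely for illustration.

```python
# Sketch of a two-step hybrid policy over a graph observation.
# A graph observation is a set of (head, relation, tail) edges.
# All names and rules below are illustrative, not the paper's.

def step1_action_group(edges):
    """Step 1 (sketch): classify the graph input into an action group.
    A trivial hand-written rule stands in for the learned classifier."""
    if any(rel == "needs_repair" for _, rel, _ in edges):
        return "repair"
    return "navigate"

# Step 2 (sketch): per-group rules mapping a one-hop edge pattern to a
# concrete action; a learned rule miner would induce these from data.
RULES = {
    "repair": [(("player", "has", "hammer"), "fix with hammer")],
    "navigate": [(("kitchen", "east_of", "hallway"), "go east")],
}

def step2_rule_action(group, edges):
    """Return the first rule action whose decisive edge is present,
    along with that edge (the interpretable evidence)."""
    for pattern, action in RULES.get(group, []):
        if pattern in edges:
            return action, pattern
    return "wait", None

def two_step_policy(edges):
    group = step1_action_group(edges)
    action, decisive_edge = step2_rule_action(group, edges)
    return group, action, decisive_edge

obs = {("player", "has", "hammer"), ("door", "needs_repair", "frame")}
print(two_step_policy(obs))
```

The interpretability claim corresponds to the returned `decisive_edge`: the policy can point to the exact edge in the input graph that triggered its action choice.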

Cite

Text

Mu et al. "Learning Two-Step Hybrid Policy for Graph-Based Interpretable Reinforcement Learning." Transactions on Machine Learning Research, 2022.

Markdown

[Mu et al. "Learning Two-Step Hybrid Policy for Graph-Based Interpretable Reinforcement Learning." Transactions on Machine Learning Research, 2022.](https://mlanthology.org/tmlr/2022/mu2022tmlr-learning/)

BibTeX

@article{mu2022tmlr-learning,
  title     = {{Learning Two-Step Hybrid Policy for Graph-Based Interpretable Reinforcement Learning}},
  author    = {Mu, Tongzhou and Lin, Kaixiang and Niu, Feiyang and Thattai, Govind},
  journal   = {Transactions on Machine Learning Research},
  year      = {2022},
  url       = {https://mlanthology.org/tmlr/2022/mu2022tmlr-learning/}
}