Efficient Dialog Policy Learning by Reasoning with Contextual Knowledge

Zhang, Haodi; Zeng, Zhichao; Lu, Keting; Wu, Kaishun; Zhang, Shiqi

doi:10.1609/AAAI.V36I10.21421

AAAI 2022 pp. 11667-11675

Abstract

Goal-oriented dialog policy learning algorithms aim to learn a dialog policy for selecting language actions based on the current dialog state. Deep reinforcement learning methods have been used for dialog policy learning. This work is motivated by the observation that, although dialog is a domain with rich contextual knowledge, reinforcement learning methods are ill-equipped to incorporate such knowledge into the dialog policy learning process. In this paper, we develop a deep reinforcement learning framework for goal-oriented dialog policy learning that learns user preferences from user goal data, while leveraging commonsense knowledge from people. The developed framework has been evaluated using a realistic dialog simulation platform. Compared with baselines from the literature and the ablations of our approach, we see significant improvements in learning efficiency and the quality of the computed action policies.

Cite

Text

Zhang et al. "Efficient Dialog Policy Learning by Reasoning with Contextual Knowledge." AAAI Conference on Artificial Intelligence, 2022. doi:10.1609/AAAI.V36I10.21421

Markdown

[Zhang et al. "Efficient Dialog Policy Learning by Reasoning with Contextual Knowledge." AAAI Conference on Artificial Intelligence, 2022.](https://mlanthology.org/aaai/2022/zhang2022aaai-efficient/) doi:10.1609/AAAI.V36I10.21421

BibTeX

@inproceedings{zhang2022aaai-efficient,
  title     = {{Efficient Dialog Policy Learning by Reasoning with Contextual Knowledge}},
  author    = {Zhang, Haodi and Zeng, Zhichao and Lu, Keting and Wu, Kaishun and Zhang, Shiqi},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2022},
  pages     = {11667--11675},
  doi       = {10.1609/AAAI.V36I10.21421},
  url       = {https://mlanthology.org/aaai/2022/zhang2022aaai-efficient/}
}