Exploring Implicit Feedback for Open Domain Conversation Generation

Abstract

User feedback can be an effective indicator of the success of a human-robot conversation. However, to avoid interrupting the online, real-time conversation process, explicit feedback is usually collected only at the end of a conversation. Alternatively, users' responses usually contain implicit feedback, such as stance, sentiment, and emotion, towards the conversation content or the interlocutors. Exploring this implicit feedback is therefore a natural way to optimize the conversation generation process. In this paper, we propose a novel reward function that exploits implicit feedback to optimize the future reward of a reinforcement-learning-based neural conversation model. A simulation strategy is applied to explore the state-action space during training and testing. Experimental results show that the proposed approach outperforms the Seq2Seq model and the state-of-the-art reinforcement learning model for conversation generation in automatic and human evaluations on the OpenSubtitles and Twitter datasets.
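The sketch below is a minimal illustration, not the authors' implementation, of the idea the abstract describes: scoring the user's reply for implicit feedback signals (stance, sentiment, emotion) to form a per-turn reward, then accumulating discounted future reward for a reinforcement-learning update. The signal extractors, weights, and discount factor are hypothetical placeholders.

# Hypothetical sketch of an implicit-feedback reward for an RL dialogue policy.
# sentiment_fn / stance_fn / emotion_fn are assumed scorers returning values in [-1, 1].

def implicit_feedback_reward(reply, sentiment_fn, stance_fn, emotion_fn,
                             weights=(0.4, 0.3, 0.3)):
    """Combine implicit feedback signals from the user's reply into one scalar reward."""
    w_sent, w_stance, w_emo = weights
    return (w_sent * sentiment_fn(reply)     # polarity of the user's reply
            + w_stance * stance_fn(reply)    # agreement vs. disagreement with the system
            + w_emo * emotion_fn(reply))     # engaged vs. bored / annoyed

def discounted_returns(rewards, gamma=0.9):
    """Discounted future rewards over a (possibly simulated) conversation rollout."""
    returns, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    return list(reversed(returns))

In such a setup, the per-turn rewards would come from the user's (or a user simulator's) replies, and the discounted returns would drive a policy-gradient update of the neural conversation model.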

Cite

Text

Zhang et al. "Exploring Implicit Feedback for Open Domain Conversation Generation." AAAI Conference on Artificial Intelligence, 2018. doi:10.1609/AAAI.V32I1.11253

Markdown

[Zhang et al. "Exploring Implicit Feedback for Open Domain Conversation Generation." AAAI Conference on Artificial Intelligence, 2018.](https://mlanthology.org/aaai/2018/zhang2018aaai-exploring/) doi:10.1609/AAAI.V32I1.11253

BibTeX

@inproceedings{zhang2018aaai-exploring,
  title     = {{Exploring Implicit Feedback for Open Domain Conversation Generation}},
  author    = {Zhang, Weinan and Li, Lingzhi and Cao, Dongyan and Liu, Ting},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2018},
  pages     = {547-554},
  doi       = {10.1609/AAAI.V32I1.11253},
  url       = {https://mlanthology.org/aaai/2018/zhang2018aaai-exploring/}
}