Optimizing Chatbot Fallback Intent Selections with Reinforcement Learning

Abstract

Large language models such as those behind GPT-4 and Alexa are limited in their ability to assess the validity of their own answers, i.e., to fall back on a clarification intent when needed. Reinforcement learning can address this fallback selection problem directly, by adapting to the semantic pitfalls of a given language model in a given environment. This is demonstrated in a simplified environment where the chatbot learns when best to ask for clarification. After training, it identifies correct intents in fewer than 2 interactions on average in over 99% of dialogues. In multi-agent simulations where the user cooperates, the chatbot identifies correct intents in 1.3 interactions on average in 100% of dialogues.
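The fallback-selection idea in the abstract — learning *when* to ask for clarification versus committing to an intent — can be sketched as a small tabular Q-learning loop. The environment below is a hypothetical stand-in, not the paper's actual setup: the bot's top-intent confidence is discretized into a few levels, each clarification raises confidence at a small turn cost, and answering succeeds with probability proportional to confidence.

```python
import random

# Hypothetical toy environment (illustrative only, not the paper's setup):
# confidence in the top intent is discretized into LEVELS states; "clarify"
# moves confidence up one level at a small cost, "answer" ends the dialogue
# and succeeds with probability proportional to the confidence level.
LEVELS = 5
ACTIONS = ["answer", "clarify"]
CLARIFY_COST = -0.1  # assumed per-turn penalty for asking a clarification

def step(level, action, rng):
    """Return (next_level, reward, done) for one dialogue turn."""
    if action == "clarify":
        return min(level + 1, LEVELS - 1), CLARIFY_COST, False
    # Answering ends the dialogue; success probability grows with confidence.
    success = rng.random() < (level + 1) / LEVELS
    return level, (1.0 if success else -1.0), True

def train(episodes=20000, alpha=0.1, gamma=0.95, eps=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(LEVELS) for a in ACTIONS}
    for _ in range(episodes):
        s = rng.randrange(2)  # dialogues start at low confidence
        done = False
        while not done:
            a = (rng.choice(ACTIONS) if rng.random() < eps
                 else max(ACTIONS, key=lambda x: q[(s, x)]))
            s2, r, done = step(s, a, rng)
            target = r if done else r + gamma * max(q[(s2, x)] for x in ACTIONS)
            q[(s, a)] += alpha * (target - q[(s, a)])
            s = s2
    return q

q = train()
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(LEVELS)}
```

Under these assumed rewards, the learned policy asks for clarification while confidence is low and commits to an answer once confidence is high — the fallback behavior the abstract describes, in miniature.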

Cite

Text

Curuksu. "Optimizing Chatbot Fallback Intent Selections with Reinforcement Learning." ICML 2023 Workshops: MFPL, 2023.

Markdown

[Curuksu. "Optimizing Chatbot Fallback Intent Selections with Reinforcement Learning." ICML 2023 Workshops: MFPL, 2023.](https://mlanthology.org/icmlw/2023/curuksu2023icmlw-optimizing/)

BibTeX

@inproceedings{curuksu2023icmlw-optimizing,
  title     = {{Optimizing Chatbot Fallback Intent Selections with Reinforcement Learning}},
  author    = {Curuksu, Jeremy},
  booktitle = {ICML 2023 Workshops: MFPL},
  year      = {2023},
  url       = {https://mlanthology.org/icmlw/2023/curuksu2023icmlw-optimizing/}
}