Optimizing Chatbot Fallback Intent Selections with Reinforcement Learning
Abstract
Large language models such as those used in GPT-4 and Alexa are limited in their ability to assess the validity of their own answers, i.e., to fall back on a clarification intent when needed. Reinforcement learning can be used specifically to address this fallback selection problem, by adapting to the semantic pitfalls of a given language model in a given environment. This is demonstrated in a simplified environment where the chatbot learns when it is best to ask for clarification. After training, the chatbot identifies correct intents in fewer than 2 interactions on average in over 99% of dialogues. In multi-agent simulations where the user cooperates, it identifies correct intents in 1.3 interactions on average in 100% of dialogues.
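The fallback selection problem described in the abstract can be caricatured as a small tabular RL (contextual-bandit) sketch. Everything below is an illustrative assumption, not the paper's actual environment or hyperparameters: states are discretized confidence buckets of a hypothetical intent classifier, and the agent learns per bucket whether to commit to the top intent or fall back and ask a clarification question.

```python
import random

random.seed(0)

# Illustrative assumptions only (not from the paper): confidence buckets of a
# hypothetical intent classifier, and the accuracy of committing in each bucket.
STATES = ["low", "mid", "high"]                      # discretized classifier confidence
P_CORRECT = {"low": 0.3, "mid": 0.65, "high": 0.95}  # assumed accuracy if we commit now
ACTIONS = ["commit", "clarify"]

CLARIFY_COST = -0.2     # assumed penalty for the extra dialogue turn
P_AFTER_CLARIFY = 0.9   # assumed accuracy after one clarification question

Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}   # action-value estimates
N = {(s, a): 0 for s in STATES for a in ACTIONS}     # visit counts
EPS = 0.1                                            # exploration rate

def step(state, action):
    """Simulate one dialogue outcome and return the terminal reward."""
    if action == "commit":
        return 1.0 if random.random() < P_CORRECT[state] else -1.0
    # Fallback: pay a turn penalty, then answer with improved accuracy.
    final = 1.0 if random.random() < P_AFTER_CLARIFY else -1.0
    return CLARIFY_COST + final

for _ in range(30000):
    s = random.choice(STATES)
    if random.random() < EPS:
        a = random.choice(ACTIONS)                   # explore
    else:
        a = max(ACTIONS, key=lambda x: Q[(s, x)])    # exploit current estimate
    r = step(s, a)
    N[(s, a)] += 1
    Q[(s, a)] += (r - Q[(s, a)]) / N[(s, a)]         # incremental mean update

policy = {s: max(ACTIONS, key=lambda x: Q[(s, x)]) for s in STATES}
print(policy)  # learns to clarify at low confidence and commit at high confidence
```

The learned policy mirrors the behavior the abstract reports: an extra clarification turn is worth its cost exactly when the model's own answer is likely to be wrong, which is what the trained agent discovers from reward alone.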
Cite

Text

Curuksu. "Optimizing Chatbot Fallback Intent Selections with Reinforcement Learning." ICML 2023 Workshops: MFPL, 2023. https://mlanthology.org/icmlw/2023/curuksu2023icmlw-optimizing/

BibTeX
@inproceedings{curuksu2023icmlw-optimizing,
title = {{Optimizing Chatbot Fallback Intent Selections with Reinforcement Learning}},
author = {Curuksu, Jeremy},
booktitle = {ICML 2023 Workshops: MFPL},
year = {2023},
url = {https://mlanthology.org/icmlw/2023/curuksu2023icmlw-optimizing/}
}