Generative Visual Dialogue System via Weighted Likelihood Estimation
Abstract
The key challenge of generative Visual Dialogue (VD) systems is to respond to human queries with informative answers in a natural and contiguous conversation flow. Traditional Maximum Likelihood Estimation-based methods learn only from positive responses and ignore negative ones, and consequently tend to yield safe or generic responses. To address this issue, we propose a novel training scheme in conjunction with a weighted likelihood estimation method. Furthermore, an adaptive multi-modal reasoning module is designed to accommodate various dialogue scenarios automatically and select relevant information accordingly. Experimental results on the VisDial benchmark demonstrate the superiority of our proposed algorithm over other state-of-the-art approaches, with an improvement of 5.81% on recall@10.
Cite
Text
Zhang et al. "Generative Visual Dialogue System via Weighted Likelihood Estimation." International Joint Conference on Artificial Intelligence, 2019. doi:10.24963/IJCAI.2019/144
Markdown
[Zhang et al. "Generative Visual Dialogue System via Weighted Likelihood Estimation." International Joint Conference on Artificial Intelligence, 2019.](https://mlanthology.org/ijcai/2019/zhang2019ijcai-generative/) doi:10.24963/IJCAI.2019/144
BibTeX
@inproceedings{zhang2019ijcai-generative,
title = {{Generative Visual Dialogue System via Weighted Likelihood Estimation}},
author = {Zhang, Heming and Ghosh, Shalini and Heck, Larry P. and Walsh, Stephen and Zhang, Junting and Zhang, Jie and Kuo, C.-C. Jay},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2019},
pages = {1025-1031},
doi = {10.24963/IJCAI.2019/144},
url = {https://mlanthology.org/ijcai/2019/zhang2019ijcai-generative/}
}