A Question Type Driven Framework to Diversify Visual Question Generation

Abstract

Visual question generation aims to ask questions about an image automatically. Existing work on this topic usually generates a single question per image and does not address diversity. In this paper, we propose a question type driven framework that produces multiple questions with different focuses for a given image. In our framework, each question is constructed in a sequence-to-sequence fashion under the guidance of a sampled question type. To diversify the generated questions, a novel conditional variational auto-encoder is introduced to generate multiple questions for a specific question type. Moreover, we design a strategy for learning a question type distribution for each image, which is used to select the final questions. Experimental results on three benchmark datasets show that our framework outperforms state-of-the-art approaches in terms of both relevance and diversity.
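The sampling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the example question types and their probabilities, the latent dimension, and the standard-normal prior are all assumptions made for illustration, and the sequence-to-sequence decoder that would turn each (type, latent) pair into a question is omitted.

```python
import math
import random


def sample_question_types(type_probs, k, rng):
    """Draw k question types from a per-image categorical distribution."""
    types = list(type_probs)
    weights = [type_probs[t] for t in types]
    return rng.choices(types, weights=weights, k=k)


def sample_latent(mu, log_var, rng):
    """Reparameterized Gaussian sample: z = mu + sigma * eps, eps ~ N(0, 1)."""
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mu, log_var)
    ]


def generate_plan(type_probs, k, latent_dim, rng):
    """Pair each sampled question type with its own latent vector."""
    plan = []
    for q_type in sample_question_types(type_probs, k, rng):
        # Assumption: a standard-normal prior (mu = 0, log_var = 0).
        # A trained conditional VAE would instead predict these values
        # from the image features and the question type.
        z = sample_latent([0.0] * latent_dim, [0.0] * latent_dim, rng)
        plan.append((q_type, z))
    return plan


rng = random.Random(0)
# Hypothetical question type distribution for one image.
type_probs = {"what": 0.5, "how many": 0.2, "is": 0.3}
plan = generate_plan(type_probs, k=4, latent_dim=3, rng=rng)
for q_type, z in plan:
    print(q_type, [round(v, 3) for v in z])
```

Drawing a fresh latent vector for each sampled type is what would let a decoder produce several different questions of the same type for one image.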

Cite

Text

Fan et al. "A Question Type Driven Framework to Diversify Visual Question Generation." International Joint Conference on Artificial Intelligence, 2018. doi:10.24963/IJCAI.2018/563

Markdown

[Fan et al. "A Question Type Driven Framework to Diversify Visual Question Generation." International Joint Conference on Artificial Intelligence, 2018.](https://mlanthology.org/ijcai/2018/fan2018ijcai-question/) doi:10.24963/IJCAI.2018/563

BibTeX

@inproceedings{fan2018ijcai-question,
  title     = {{A Question Type Driven Framework to Diversify Visual Question Generation}},
  author    = {Fan, Zhihao and Wei, Zhongyu and Li, Piji and Lan, Yanyan and Huang, Xuanjing},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2018},
  pages     = {4048--4054},
  doi       = {10.24963/IJCAI.2018/563},
  url       = {https://mlanthology.org/ijcai/2018/fan2018ijcai-question/}
}