Dealing with Missing Modalities in the Visual Question Answer-Difference Prediction Task Through Knowledge Distillation

Abstract

In this work, we address the missing modalities that arise in the Visual Question Answer-Difference prediction task and propose a novel method for solving the task. The missing modality is the ground-truth answers, which are not available at test time, and we handle it with a privileged knowledge distillation scheme. To do so efficiently, we first introduce a model, the "Big" Teacher, that takes the image/question/answer triplet as its input and outperforms the baseline. We then use a combination of models to distill knowledge into a target network (student) that takes only the image/question pair as its input. We evaluate our models on the VizWiz and VQA-V2 Answer Difference datasets and show, through extensive experiments and ablations, the performance of our method and a range of possibilities for future research.
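
The abstract does not spell out the training objective, so the following is only a minimal sketch of a generic teacher-to-student distillation loss, not the authors' formulation. It assumes a single-label, softmax-based setup in the style of standard knowledge distillation. The names `distillation_loss`, `temperature`, and `alpha` are hypothetical, and the logits stand in for the outputs of a teacher that sees the image/question/answer triplet and a student that sees only the image/question pair.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def cross_entropy(probs, target_index, eps=1e-12):
    """Negative log-likelihood of the ground-truth class."""
    return -math.log(probs[target_index] + eps)


def distillation_loss(student_logits, teacher_logits, target_index,
                      temperature=2.0, alpha=0.5):
    """Hypothetical combined loss (an assumption, not the paper's objective).

    The soft term pulls the student's softened predictions toward the
    teacher's; the hard term is ordinary cross-entropy on the label.
    """
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    # Scaling by temperature**2 keeps the soft-term gradients comparable
    # in magnitude across temperatures.
    soft_term = kl_divergence(soft_teacher, soft_student) * temperature ** 2
    hard_term = cross_entropy(softmax(student_logits), target_index)
    return alpha * soft_term + (1.0 - alpha) * hard_term


# Teacher logits would come from the image/question/answer model,
# student logits from the image/question-only model (made-up values here).
teacher_logits = [3.0, 0.5, -1.0]
student_logits = [1.0, 0.8, 0.2]
loss = distillation_loss(student_logits, teacher_logits, target_index=0)
```

At test time only the student would be run, so the ground-truth answers are needed only while training against the teacher.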

Cite

Text

Cho et al. "Dealing with Missing Modalities in the Visual Question Answer-Difference Prediction Task Through Knowledge Distillation." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2021. doi:10.1109/CVPRW53098.2021.00175

Markdown

[Cho et al. "Dealing with Missing Modalities in the Visual Question Answer-Difference Prediction Task Through Knowledge Distillation." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2021.](https://mlanthology.org/cvprw/2021/cho2021cvprw-dealing/) doi:10.1109/CVPRW53098.2021.00175

BibTeX

@inproceedings{cho2021cvprw-dealing,
  title     = {{Dealing with Missing Modalities in the Visual Question Answer-Difference Prediction Task Through Knowledge Distillation}},
  author    = {Cho, Jae-Won and Kim, Dong-Jin and Choi, Jinsoo and Jung, Yunjae and Kweon, In So},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2021},
  pages     = {1592-1601},
  doi       = {10.1109/CVPRW53098.2021.00175},
  url       = {https://mlanthology.org/cvprw/2021/cho2021cvprw-dealing/}
}