Counterfactual Vision and Language Learning

Abbasnejad, Ehsan; Teney, Damien; Parvaneh, Amin; Shi, Javen; van den Hengel, Anton

doi:10.1109/CVPR42600.2020.01006

CVPR 2020

Abstract

The ongoing success of visual question answering methods has been somewhat surprising given that, at its most general, the problem requires understanding the entire variety of both visual and language stimuli. It is particularly remarkable that this success has been achieved on the basis of comparatively small datasets, given the scale of the problem. One explanation is that this has been accomplished partly by exploiting bias in the datasets rather than developing deeper multi-modal reasoning. This fundamentally limits the generalization of the method, and thus its practical applicability. We propose a method that addresses this problem by introducing counterfactuals in the training. In doing so we leverage structural causal models for counterfactual evaluation to formulate alternatives, for instance, questions that could be asked of the same image set. We show that simulating plausible alternative training data through this process results in better generalization.

Cite

Text

Abbasnejad et al. "Counterfactual Vision and Language Learning." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020. doi:10.1109/CVPR42600.2020.01006

Markdown

[Abbasnejad et al. "Counterfactual Vision and Language Learning." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020.](https://mlanthology.org/cvpr/2020/abbasnejad2020cvpr-counterfactual/) doi:10.1109/CVPR42600.2020.01006

BibTeX

@inproceedings{abbasnejad2020cvpr-counterfactual,
  title     = {{Counterfactual Vision and Language Learning}},
  author    = {Abbasnejad, Ehsan and Teney, Damien and Parvaneh, Amin and Shi, Javen and van den Hengel, Anton},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2020},
  doi       = {10.1109/CVPR42600.2020.01006},
  url       = {https://mlanthology.org/cvpr/2020/abbasnejad2020cvpr-counterfactual/}
}