Hadamard Product for Low-Rank Bilinear Pooling

Abstract

Bilinear models provide richer representations than linear models. They have been applied to various visual tasks, such as object recognition, segmentation, and visual question answering, achieving state-of-the-art performance by taking advantage of the expanded representations. However, bilinear representations tend to be high-dimensional, which limits their applicability to computationally complex tasks. We propose low-rank bilinear pooling using the Hadamard product for an efficient attention mechanism in multimodal learning. We show that our model outperforms compact bilinear pooling on visual question-answering tasks, achieving state-of-the-art results on the VQA dataset while being more parsimonious.
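
As a rough illustration of the idea named in the abstract, the sketch below shows how a bilinear form whose weight matrix is factored as a low-rank product reduces to a Hadamard (element-wise) product of two linear projections. It is a minimal NumPy sketch, not the authors' implementation: the function name `low_rank_bilinear_pool`, the matrices `U`, `V`, `P`, and the dimensions are all assumptions chosen for the example, and any nonlinearities or bias terms a full model might use are left out.

```python
import numpy as np


def low_rank_bilinear_pool(x, y, U, V, P):
    """Pool x (n,) and y (m,) into a c-dimensional vector.

    Hypothetical shapes for this sketch: U is (n, d), V is (m, d), P is (d, c),
    where d is the assumed rank of the factorization.
    """
    # Project both inputs into the shared rank-d space, then combine them
    # with an element-wise (Hadamard) product.
    joint = (U.T @ x) * (V.T @ y)
    # Project the joint representation down to the output dimension.
    return P.T @ joint


rng = np.random.default_rng(0)
n, m, d, c = 8, 6, 4, 3  # illustrative sizes only
x = rng.standard_normal(n)
y = rng.standard_normal(m)
U = rng.standard_normal((n, d))
V = rng.standard_normal((m, d))
P = rng.standard_normal((d, c))

pooled = low_rank_bilinear_pool(x, y, U, V, P)

# Reference computation: output k is the full bilinear form x^T W_k y,
# with W_k = U diag(P[:, k]) V^T, a matrix of rank at most d.
full = np.array([x @ (U @ np.diag(P[:, k]) @ V.T) @ y for k in range(c)])

# Parameter counts for an unconstrained bilinear map versus the factored one.
full_params = n * m * c
low_rank_params = n * d + m * d + d * c
```

With these sizes, the two computations agree up to floating-point error, while the factored form stores `n*d + m*d + d*c` weights instead of the `n*m*c` weights of an unconstrained bilinear map. That gap is the sense in which the factored form is the more parsimonious of the two.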

Cite

Text

Kim et al. "Hadamard Product for Low-Rank Bilinear Pooling." International Conference on Learning Representations, 2017.

Markdown

[Kim et al. "Hadamard Product for Low-Rank Bilinear Pooling." International Conference on Learning Representations, 2017.](https://mlanthology.org/iclr/2017/kim2017iclr-hadamard/)

BibTeX

@inproceedings{kim2017iclr-hadamard,
  title     = {{Hadamard Product for Low-Rank Bilinear Pooling}},
  author    = {Kim, Jin-Hwa and On, Kyoung Woon and Lim, Woosang and Kim, Jeonghee and Ha, Jung-Woo and Zhang, Byoung-Tak},
  booktitle = {International Conference on Learning Representations},
  year      = {2017},
  url       = {https://mlanthology.org/iclr/2017/kim2017iclr-hadamard/}
}