Hadamard Product for Low-Rank Bilinear Pooling
Abstract
Bilinear models provide richer representations than linear models. They have been applied to various visual tasks, such as object recognition, segmentation, and visual question-answering, achieving state-of-the-art performance by taking advantage of the expanded representations. However, bilinear representations tend to be high-dimensional, which limits their applicability to computationally complex tasks. We propose low-rank bilinear pooling using the Hadamard product for an efficient attention mechanism in multimodal learning. We show that our model outperforms compact bilinear pooling on visual question-answering tasks, achieving state-of-the-art results on the VQA dataset while being more parsimonious.
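The abstract names the technique without giving its form. As a minimal sketch, assuming the usual low-rank factorization in which a bilinear weight matrix is written as W = U V^T, the bilinear form x^T W y reduces to a sum over the Hadamard (element-wise) product of two linear projections, and a further projection P pools that product into an output vector. The dimensions `n`, `m`, `d`, `c`, the `tanh` nonlinearity, and the function name `low_rank_bilinear_pool` are illustrative assumptions, not values or identifiers taken from this page.

```python
import numpy as np

def low_rank_bilinear_pool(x, y, U, V, P, b):
    """Sketch: f = P^T (tanh(U^T x) * tanh(V^T y)) + b, where * is element-wise."""
    hx = np.tanh(U.T @ x)        # project first modality to rank-d space
    hy = np.tanh(V.T @ y)        # project second modality to rank-d space
    return P.T @ (hx * hy) + b   # Hadamard product, then pool to c outputs

rng = np.random.default_rng(0)
n, m, d, c = 8, 6, 4, 3          # assumed input dims, rank, and output dim
x = rng.standard_normal(n)
y = rng.standard_normal(m)
U = rng.standard_normal((n, d))
V = rng.standard_normal((m, d))
P = rng.standard_normal((d, c))
b = np.zeros(c)

f = low_rank_bilinear_pool(x, y, U, V, P, b)

# Identity behind the method (no nonlinearity): with W = U V^T,
# x^T W y equals the sum of the Hadamard product of the two projections.
full = x @ (U @ V.T) @ y
low = np.sum((U.T @ x) * (V.T @ y))
```

In this sketch the pooled output is computed from `U`, `V`, and `P` (n·d + m·d + d·c weights) and never materializes an n × m matrix per output, which is one way to read the parsimony the abstract refers to.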
Cite
Text
Kim et al. "Hadamard Product for Low-Rank Bilinear Pooling." International Conference on Learning Representations, 2017.
Markdown
[Kim et al. "Hadamard Product for Low-Rank Bilinear Pooling." International Conference on Learning Representations, 2017.](https://mlanthology.org/iclr/2017/kim2017iclr-hadamard/)
BibTeX
@inproceedings{kim2017iclr-hadamard,
title = {{Hadamard Product for Low-Rank Bilinear Pooling}},
author = {Kim, Jin-Hwa and On, Kyoung Woon and Lim, Woosang and Kim, Jeonghee and Ha, Jung-Woo and Zhang, Byoung-Tak},
booktitle = {International Conference on Learning Representations},
year = {2017},
url = {https://mlanthology.org/iclr/2017/kim2017iclr-hadamard/}
}