Discrete-Continuous Action Space Policy Gradient-Based Attention for Image-Text Matching

Abstract

Image-text matching is an important multi-modal task with many applications. It aims to match images and text that carry similar semantic information. Existing approaches do not explicitly transform the different modalities into a common space. Meanwhile, the attention mechanism, which is widely used in image-text matching models, receives no supervision. We propose a novel attention scheme that projects the image and text embeddings into a common space and optimises the attention weights directly towards the evaluation metrics. The proposed attention scheme can be considered a kind of supervised attention that requires no additional annotations. It is trained via a novel discrete-continuous action space policy gradient algorithm, which is more effective at modelling complex action spaces than previous continuous action space policy gradient methods. We evaluate the proposed method on two widely used benchmark datasets, Flickr30k and MS-COCO, where it outperforms previous approaches by a large margin.

Cite

Text

Yan et al. "Discrete-Continuous Action Space Policy Gradient-Based Attention for Image-Text Matching." Conference on Computer Vision and Pattern Recognition, 2021. doi:10.1109/CVPR46437.2021.00800

Markdown

[Yan et al. "Discrete-Continuous Action Space Policy Gradient-Based Attention for Image-Text Matching." Conference on Computer Vision and Pattern Recognition, 2021.](https://mlanthology.org/cvpr/2021/yan2021cvpr-discretecontinuous/) doi:10.1109/CVPR46437.2021.00800

BibTeX

@inproceedings{yan2021cvpr-discretecontinuous,
  title     = {{Discrete-Continuous Action Space Policy Gradient-Based Attention for Image-Text Matching}},
  author    = {Yan, Shiyang and Yu, Li and Xie, Yuan},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2021},
  pages     = {8096-8105},
  doi       = {10.1109/CVPR46437.2021.00800},
  url       = {https://mlanthology.org/cvpr/2021/yan2021cvpr-discretecontinuous/}
}