Bi-Directional Relationship Inferring Network for Referring Image Segmentation

Abstract

Most existing methods for referring image segmentation do not explicitly formulate the mutual guidance between vision and language. In this work, we propose a bi-directional relationship inferring network (BRINet) to model the dependencies of cross-modal information. Specifically, vision-guided linguistic attention is used to learn the adaptive linguistic context corresponding to each visual region. Combined with language-guided visual attention, it forms a bi-directional cross-modal attention module (BCAM) that learns the relationship between multi-modal features. As a result, the semantic context of the target object and of the referring expression can be represented accurately and consistently. Moreover, a gated bi-directional fusion module (GBFM) is designed to integrate multi-level features, with a gate function guiding the bi-directional flow of multi-level information. Extensive experiments on four benchmark datasets demonstrate that the proposed method outperforms other state-of-the-art methods under different evaluation metrics.
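
The two attention directions and the gated fusion described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain dot-product attention over small feature vectors, uses no learned weights, and replaces the learned gate with a sigmoid of the element-wise sum. All function names (`vision_guided_linguistic_attention`, `language_guided_visual_attention`, `gated_fuse`) are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def weighted_sum(weights, vectors):
    """Combine vectors using the given attention weights."""
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]


def vision_guided_linguistic_attention(regions, words):
    """For each visual region, attend over the words to obtain a
    region-specific linguistic context (assumed dot-product scoring)."""
    contexts = []
    for region in regions:
        weights = softmax([dot(region, word) for word in words])
        contexts.append(weighted_sum(weights, words))
    return contexts


def language_guided_visual_attention(regions, contexts):
    """Use each region's linguistic context as a query over all regions,
    so language decides which visual features are aggregated."""
    attended = []
    for context in contexts:
        weights = softmax([dot(context, region) for region in regions])
        attended.append(weighted_sum(weights, regions))
    return attended


def gated_fuse(low_level, high_level):
    """Element-wise gated blend of two feature vectors. The gate here is an
    illustrative stand-in (sigmoid of the sum), not a learned function."""
    fused = []
    for lo, hi in zip(low_level, high_level):
        gate = 1.0 / (1.0 + math.exp(-(lo + hi)))
        fused.append(gate * lo + (1.0 - gate) * hi)
    return fused


if __name__ == "__main__":
    # Toy inputs: two visual regions and three word embeddings, all 2-D.
    regions = [[1.0, 0.0], [0.0, 1.0]]
    words = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    contexts = vision_guided_linguistic_attention(regions, words)
    attended = language_guided_visual_attention(regions, contexts)
    print(gated_fuse(regions[0], attended[0]))
```

In this sketch the first function plays the role of the vision-to-language direction and the second the language-to-vision direction; chaining them is one simple way to read the "bi-directional" description, under the stated assumptions.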

Cite

Text

Hu et al. "Bi-Directional Relationship Inferring Network for Referring Image Segmentation." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020. doi:10.1109/CVPR42600.2020.00448

Markdown

[Hu et al. "Bi-Directional Relationship Inferring Network for Referring Image Segmentation." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020.](https://mlanthology.org/cvpr/2020/hu2020cvpr-bidirectional/) doi:10.1109/CVPR42600.2020.00448

BibTeX

@inproceedings{hu2020cvpr-bidirectional,
  title     = {{Bi-Directional Relationship Inferring Network for Referring Image Segmentation}},
  author    = {Hu, Zhiwei and Feng, Guang and Sun, Jiayu and Zhang, Lihe and Lu, Huchuan},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2020},
  doi       = {10.1109/CVPR42600.2020.00448},
  url       = {https://mlanthology.org/cvpr/2020/hu2020cvpr-bidirectional/}
}