Mind Your Neighbours: Image Annotation with Metadata Neighbourhood Graph Co-Attention Networks

Abstract

As visual reflections of our daily lives, images are frequently shared on social networks, generating abundant 'metadata' that records user interactions with images. Because of their diverse content and complex styles, some images are challenging to recognise when their context is neglected. Images with similar metadata, such as 'relevant topics and textual descriptions', 'common friends of users' and 'nearby locations', form a neighbourhood for each image, which can be used to assist annotation. In this paper, we propose a Metadata Neighbourhood Graph Co-Attention Network (MangoNet) to model the correlations between each target image and its neighbours. To accurately capture visual clues from the neighbourhood, a co-attention mechanism is introduced to embed the target image and its neighbours as graph nodes, while the graph edges capture the correlations between node pairs. By reasoning on the neighbourhood graph, we obtain a graph representation that helps annotate the target image. Experimental results on three benchmark datasets indicate that the proposed model outperforms state-of-the-art methods.
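The neighbourhood idea in the abstract can be illustrated with a small sketch. It is not MangoNet's architecture, which the abstract does not specify in detail. It assumes that metadata is a set of tags per image, that similarity is measured with Jaccard overlap, and that neighbour features are pooled with plain dot-product attention as a simplified stand-in for the co-attention mechanism. All names (`jaccard`, `metadata_neighbours`, `attention_pool`) and the toy data are hypothetical.

```python
import math


def jaccard(a, b):
    """Jaccard similarity between two tag sets (0.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def metadata_neighbours(target_id, tags_by_image, k=2):
    """Return up to k image ids whose tags overlap most with the target's.

    Assumption: tag overlap stands in for the richer metadata (topics,
    shared friends, locations) mentioned in the abstract.
    """
    target_tags = tags_by_image[target_id]
    scored = [
        (jaccard(target_tags, tags), image_id)
        for image_id, tags in tags_by_image.items()
        if image_id != target_id
    ]
    # Highest similarity first; ties broken by id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [image_id for score, image_id in scored[:k] if score > 0.0]


def attention_pool(target_vec, neighbour_vecs):
    """Pool neighbour features with softmax weights from dot products.

    A simplified stand-in for co-attention, not the paper's formulation.
    Falls back to the target feature when there are no neighbours.
    """
    if not neighbour_vecs:
        return list(target_vec)
    scores = [sum(t * n for t, n in zip(target_vec, vec)) for vec in neighbour_vecs]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    return [
        sum(w * vec[i] for w, vec in zip(weights, neighbour_vecs))
        for i in range(len(target_vec))
    ]


# Toy metadata: hypothetical image ids and tags.
tags_by_image = {
    "img_a": {"beach", "sunset", "sea"},
    "img_b": {"beach", "sea", "surf"},
    "img_c": {"sunset", "sky"},
    "img_d": {"city", "night"},
}
neighbours = metadata_neighbours("img_a", tags_by_image, k=2)
```

In this toy setting, images that share no tags with the target are never selected, so the neighbourhood can be smaller than `k`. A real system would replace the tag sets and hand-written features with learned embeddings.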

Cite

Text

Zhang et al. "Mind Your Neighbours: Image Annotation with Metadata Neighbourhood Graph Co-Attention Networks." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019. doi:10.1109/CVPR.2019.00307

Markdown

[Zhang et al. "Mind Your Neighbours: Image Annotation with Metadata Neighbourhood Graph Co-Attention Networks." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019.](https://mlanthology.org/cvpr/2019/zhang2019cvpr-mind/) doi:10.1109/CVPR.2019.00307

BibTeX

@inproceedings{zhang2019cvpr-mind,
  title     = {{Mind Your Neighbours: Image Annotation with Metadata Neighbourhood Graph Co-Attention Networks}},
  author    = {Zhang, Junjie and Wu, Qi and Zhang, Jian and Shen, Chunhua and Lu, Jianfeng},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2019},
  doi       = {10.1109/CVPR.2019.00307},
  url       = {https://mlanthology.org/cvpr/2019/zhang2019cvpr-mind/}
}