Linking People Across Text and Images Based on Social Relation Reasoning

Abstract

As a sub-task of visual grounding, linking people across text and images aims to localize target people in images using the corresponding sentences. Existing approaches tend to capture superficial features of people (e.g., dress and location), which suffer from incomplete information across text and images. We observe that humans are adept at exploring social relations to assist in identifying people. Therefore, we propose a Social Relation Reasoning (SRR) model to address this issue. First, we design a Social Relation Extraction (SRE) module to extract social relations between people in the input sentence. Specifically, the SRE module, which is based on zero-shot learning, is able to extract social relations even when they are not defined in existing datasets. A Reasoning based Cross-modal Matching (RCM) module is then used to generate matching matrices by reasoning over the social relations and visual features. Experimental results show that our proposed SRR model outperforms state-of-the-art models in accuracy on the challenging datasets Who's Waldo and FL: MSRE by more than 5% and 7%, respectively. Our source code is available at https://github.com/VILAN-Lab/SRR.
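To make the notion of a matching matrix concrete, the following is a minimal sketch and not the authors' RCM module: it scores each person mention against each detected person box by cosine similarity and links every mention to its best-scoring box. The function names (`cosine`, `matching_matrix`, `link_people`) and the toy feature vectors are hypothetical, and the sketch leaves out the social relation reasoning that the abstract describes.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Guard against zero vectors, which have no defined direction.
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def matching_matrix(mention_feats, box_feats):
    """Rows are person mentions in the sentence, columns are person boxes."""
    return [[cosine(m, b) for b in box_feats] for m in mention_feats]


def link_people(matrix):
    """Link each mention to the index of its highest-scoring box."""
    return [max(range(len(row)), key=row.__getitem__) for row in matrix]


# Toy features (assumed for illustration): 2 mentions, 3 detected boxes.
mentions = [[1.0, 0.0], [0.0, 1.0]]
boxes = [[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]]

M = matching_matrix(mentions, boxes)
links = link_people(M)
print(links)  # [1, 0]
```

Here the first mention is linked to box 1 and the second to box 0. A relation-aware model would presumably adjust these scores using the extracted social relations before making the assignment; how SRR does so is specified in the paper and the linked repository, not in this sketch.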

Cite

Text

Lei et al. "Linking People Across Text and Images Based on Social Relation Reasoning." AAAI Conference on Artificial Intelligence, 2023. doi:10.1609/AAAI.V37I1.25209

Markdown

[Lei et al. "Linking People Across Text and Images Based on Social Relation Reasoning." AAAI Conference on Artificial Intelligence, 2023.](https://mlanthology.org/aaai/2023/lei2023aaai-linking/) doi:10.1609/AAAI.V37I1.25209

BibTeX

@inproceedings{lei2023aaai-linking,
  title     = {{Linking People Across Text and Images Based on Social Relation Reasoning}},
  author    = {Lei, Yang and Zhao, Peizhi and Li, Pijian and Cai, Yi and Huang, Qingbao},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2023},
  pages     = {1260-1268},
  doi       = {10.1609/AAAI.V37I1.25209},
  url       = {https://mlanthology.org/aaai/2023/lei2023aaai-linking/}
}