Symmetric Network with Spatial Relationship Modeling for Natural Language-Based Vehicle Retrieval

Abstract

Natural language (NL) based vehicle retrieval aims to find a specific vehicle given a text description. Unlike image-based vehicle retrieval, NL-based vehicle retrieval requires considering not only vehicle appearance but also the surrounding environment and temporal relations. In this paper, we propose a Symmetric Network with Spatial Relationship Modeling (SSM) method for NL-based vehicle retrieval. Specifically, we design a symmetric network to learn unified cross-modal representations of text descriptions and vehicle images, in which vehicle appearance details and global vehicle trajectory information are preserved. In addition, to make better use of location information, we propose a spatial relationship modeling method that takes the surrounding environment and the mutual relationships between vehicles into consideration. Qualitative and quantitative experiments verify the effectiveness of the proposed method. We achieve 43.92% MRR accuracy on the test set of the natural language-based vehicle retrieval track of the 6th AI City Challenge, placing 4th on the public leaderboard. The code will be available at https://github.com/hbchen121/AICITY2022_Track2_SSM.
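For readers unfamiliar with the reported metric, the following is a minimal sketch of how mean reciprocal rank (MRR) is commonly computed for a retrieval task. It is not the challenge's evaluation script or the authors' code: the function name `mean_reciprocal_rank`, the dictionary layout, and the toy query and track IDs are all assumptions made for illustration.

```python
from typing import Dict, List


def mean_reciprocal_rank(
    rankings: Dict[str, List[str]],
    targets: Dict[str, str],
) -> float:
    """Average of 1/rank of the correct item over all queries.

    rankings: hypothetical mapping from query ID to candidate IDs, best first.
    targets:  hypothetical mapping from query ID to the single correct candidate ID.
    A query whose target is missing from its ranking contributes 0.
    """
    if not rankings:
        raise ValueError("rankings must contain at least one query")

    total = 0.0
    for query_id, ranked_ids in rankings.items():
        target = targets[query_id]
        if target in ranked_ids:
            # list.index is 0-based, ranks are 1-based
            total += 1.0 / (ranked_ids.index(target) + 1)
    return total / len(rankings)


# Toy data (invented for illustration): three text queries, three vehicle tracks.
rankings = {
    "q1": ["t3", "t1", "t2"],  # correct track t1 at rank 2
    "q2": ["t2", "t1", "t3"],  # correct track t2 at rank 1
    "q3": ["t1", "t3", "t2"],  # correct track t2 at rank 3
}
targets = {"q1": "t1", "q2": "t2", "q3": "t2"}

mrr = mean_reciprocal_rank(rankings, targets)
print(f"MRR = {mrr:.4f}")
```

Under this definition, a score of 43.92% means that the reciprocal rank of the correct vehicle track, averaged over all test queries, is about 0.44.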

Cite

Text

Zhao et al. "Symmetric Network with Spatial Relationship Modeling for Natural Language-Based Vehicle Retrieval." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022. doi:10.1109/CVPRW56347.2022.00364

Markdown

[Zhao et al. "Symmetric Network with Spatial Relationship Modeling for Natural Language-Based Vehicle Retrieval." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022.](https://mlanthology.org/cvprw/2022/zhao2022cvprw-symmetric/) doi:10.1109/CVPRW56347.2022.00364

BibTeX

@inproceedings{zhao2022cvprw-symmetric,
  title     = {{Symmetric Network with Spatial Relationship Modeling for Natural Language-Based Vehicle Retrieval}},
  author    = {Zhao, Chuyang and Chen, Haobo and Zhang, Wenyuan and Chen, Junru and Zhang, Sipeng and Li, Yadong and Li, Boxun},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2022},
  pages     = {3225-3232},
  doi       = {10.1109/CVPRW56347.2022.00364},
  url       = {https://mlanthology.org/cvprw/2022/zhao2022cvprw-symmetric/}
}