Cross Modal Retrieval with Querybank Normalisation

Abstract

Profiting from large-scale training datasets, advances in neural architecture design and efficient inference, joint embeddings have become the dominant approach for tackling cross-modal retrieval. In this work we first show that, despite their effectiveness, state-of-the-art joint embeddings suffer significantly from the longstanding "hubness problem" in which a small number of gallery embeddings form the nearest neighbours of many queries. Drawing inspiration from the NLP literature, we formulate a simple but effective framework called Querybank Normalisation (QB-Norm) that re-normalises query similarities to account for hubs in the embedding space. QB-Norm improves retrieval performance without requiring retraining. Unlike prior work, we show that QB-Norm works effectively without concurrent access to any test set queries. Within the QB-Norm framework, we also propose a novel similarity normalisation method, the Dynamic Inverted Softmax, that is significantly more robust than existing approaches. We showcase QB-Norm across a range of cross-modal retrieval models and benchmarks where it consistently enhances strong baselines beyond the state of the art. Code is available at https://vladbogo.github.io/QB-Norm/.
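
The abstract describes the re-normalisation only in words, so the following is a minimal sketch under stated assumptions rather than the authors' implementation (their code is at the URL above). It assumes the common inverted-softmax formulation, in which each gallery item's similarity to a query is divided by the total exponentiated similarity that item receives from a bank of queries, and it assumes the "dynamic" variant applies this only when the query's top-1 gallery item is also the top-1 match of at least one querybank query. The function names, the plain-list data layout and the temperature `beta=20.0` are all hypothetical choices made for illustration.

```python
import math


def inverted_softmax(query_sims, bank_sims, beta=20.0):
    """Re-normalise one query's similarities using a querybank.

    query_sims: length-G list, similarity of the query to each gallery item.
    bank_sims:  B lists of length G, similarity of each querybank query
                to each gallery item.
    beta:       assumed temperature (hypothetical default).
    """
    normalised = []
    for j, sim in enumerate(query_sims):
        # Total exponentiated similarity gallery item j receives from the
        # querybank; hubs receive a lot, so they are down-weighted.
        denom = sum(math.exp(beta * row[j]) for row in bank_sims)
        normalised.append(math.exp(beta * sim) / denom)
    return normalised


def argmax(values):
    """Index of the largest element in a list."""
    return max(range(len(values)), key=values.__getitem__)


def dynamic_inverted_softmax(query_sims, bank_sims, beta=20.0):
    """Apply the normalisation only if the query's top-1 is a likely hub.

    Assumption: a gallery item counts as a likely hub when it is the
    top-1 match of at least one querybank query.
    """
    likely_hubs = {argmax(row) for row in bank_sims}
    if argmax(query_sims) in likely_hubs:
        return inverted_softmax(query_sims, bank_sims, beta)
    # Otherwise leave the similarities untouched.
    return list(query_sims)


# Toy data: gallery item 0 behaves like a hub for every querybank query.
bank = [[0.9, 0.2, 0.1], [0.8, 0.3, 0.2], [0.85, 0.1, 0.4]]
hub_query = [0.6, 0.55, 0.1]    # raw top-1 is the hub (item 0)
clean_query = [0.1, 0.2, 0.7]   # raw top-1 is item 2, not a hub

rescored = dynamic_inverted_softmax(hub_query, bank)
unchanged = dynamic_inverted_softmax(clean_query, bank)
```

In this toy setup the hub-affected query is re-ranked so that item 1 overtakes the hub, while the second query is returned unchanged because its top match is not retrieved by any querybank query. Note that the querybank here is a fixed set of queries and need not contain any test queries, in line with the abstract's claim.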

Cite

Text

Bogolin et al. "Cross Modal Retrieval with Querybank Normalisation." Conference on Computer Vision and Pattern Recognition, 2022. doi:10.1109/CVPR52688.2022.00513

Markdown

[Bogolin et al. "Cross Modal Retrieval with Querybank Normalisation." Conference on Computer Vision and Pattern Recognition, 2022.](https://mlanthology.org/cvpr/2022/bogolin2022cvpr-cross/) doi:10.1109/CVPR52688.2022.00513

BibTeX

@inproceedings{bogolin2022cvpr-cross,
  title     = {{Cross Modal Retrieval with Querybank Normalisation}},
  author    = {Bogolin, Simion-Vlad and Croitoru, Ioana and Jin, Hailin and Liu, Yang and Albanie, Samuel},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2022},
  pages     = {5194--5205},
  doi       = {10.1109/CVPR52688.2022.00513},
  url       = {https://mlanthology.org/cvpr/2022/bogolin2022cvpr-cross/}
}