Hard-Negatives or Non-Negatives? a Hard-Negative Selection Strategy for Cross-Modal Retrieval Using the Improved Marginal Ranking Loss

Abstract

Cross-modal learning has attracted considerable interest recently, and many applications of it, such as image-text retrieval, cross-modal video search, and video captioning, have been proposed. In this work, we deal with the cross-modal video retrieval problem. The state-of-the-art approaches are based on deep network architectures and rely on mining hard-negative samples during training to optimize the selection of the network's parameters. Starting from a state-of-the-art cross-modal architecture that uses the improved marginal ranking loss function, we propose a simple strategy for hard-negative mining that identifies which training samples are hard-negatives and which, although presently treated as hard-negatives, are likely not negative samples at all and should not be treated as such. Additionally, to take full advantage of network models trained using different design choices for hard-negative mining, we examine model combination strategies, and we design a hybrid one that effectively combines large numbers of trained models.
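The abstract does not state the loss formula, so the following is only a minimal sketch of the improved marginal ranking loss as it is commonly formulated: for each matching video-caption pair, only the hardest negative in the batch contributes a hinge term, in each retrieval direction. The function name, the default margin of 0.2, and the `exclude` mask are assumptions made for illustration. In particular, `exclude` is a hypothetical hook showing where pairs judged to be non-negatives could be dropped from the candidate pool; how the paper decides which pairs those are is not described here and is not reproduced.

```python
def improved_marginal_ranking_loss(sim, margin=0.2, exclude=None):
    """Max-of-hinges ranking loss over one batch (illustrative sketch).

    sim[i][j] is the similarity between video i and caption j; the
    matching pairs sit on the diagonal. exclude[i][j] = True (hypothetical)
    means the pair (video i, caption j) is never used as a negative.
    """
    n = len(sim)

    def allowed(i, j):
        # Diagonal entries are positives; excluded pairs are skipped.
        return i != j and not (exclude is not None and exclude[i][j])

    loss = 0.0
    for i in range(n):
        pos = sim[i][i]
        # Hinge cost of every allowed negative caption for video i.
        cap_costs = [max(0.0, margin + sim[i][j] - pos)
                     for j in range(n) if allowed(i, j)]
        # Hinge cost of every allowed negative video for caption i.
        vid_costs = [max(0.0, margin + sim[j][i] - pos)
                     for j in range(n) if allowed(j, i)]
        # Only the hardest negative in each direction contributes.
        loss += max(cap_costs, default=0.0) + max(vid_costs, default=0.0)
    return loss


# Toy batch: caption 1 is very similar to video 0 (0.8 vs. 0.9 for the match).
sim = [[0.9, 0.8],
       [0.1, 0.7]]
loss_all = improved_marginal_ranking_loss(sim)
# Treat (video 0, caption 1) as a likely non-negative and leave it out.
loss_masked = improved_marginal_ranking_loss(
    sim, exclude=[[False, True], [False, False]])
```

In this toy batch the only pair that violates the margin is (video 0, caption 1), so masking it removes the entire penalty. This illustrates why treating a semantically matching caption as a hard negative can dominate the training signal.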

Cite

Text

Galanopoulos and Mezaris. "Hard-Negatives or Non-Negatives? a Hard-Negative Selection Strategy for Cross-Modal Retrieval Using the Improved Marginal Ranking Loss." IEEE/CVF International Conference on Computer Vision Workshops, 2021. doi:10.1109/ICCVW54120.2021.00261

Markdown

[Galanopoulos and Mezaris. "Hard-Negatives or Non-Negatives? a Hard-Negative Selection Strategy for Cross-Modal Retrieval Using the Improved Marginal Ranking Loss." IEEE/CVF International Conference on Computer Vision Workshops, 2021.](https://mlanthology.org/iccvw/2021/galanopoulos2021iccvw-hardnegatives/) doi:10.1109/ICCVW54120.2021.00261

BibTeX

@inproceedings{galanopoulos2021iccvw-hardnegatives,
  title     = {{Hard-Negatives or Non-Negatives? a Hard-Negative Selection Strategy for Cross-Modal Retrieval Using the Improved Marginal Ranking Loss}},
  author    = {Galanopoulos, Damianos and Mezaris, Vasileios},
  booktitle = {IEEE/CVF International Conference on Computer Vision Workshops},
  year      = {2021},
  pages     = {2312--2316},
  doi       = {10.1109/ICCVW54120.2021.00261},
  url       = {https://mlanthology.org/iccvw/2021/galanopoulos2021iccvw-hardnegatives/}
}