Contrastive Attention Maps for Self-Supervised Co-Localization

Abstract

The goal of unsupervised co-localization is to locate the object in a scene under the assumptions that 1) the dataset consists of only one superclass, e.g., birds, and 2) there are no human-annotated labels in the dataset. The most recent method achieves impressive co-localization performance by employing self-supervised representation learning approaches such as predicting rotation. In this paper, we introduce a new contrastive objective applied directly to the attention maps to enhance co-localization performance. Our contrastive loss function exploits rich localization information, which encourages the model to activate the full extent of the object. In addition, we propose a pixel-wise attention pooling that selectively aggregates the feature map according to its magnitude across channels. Our methods are simple and shown to be effective through extensive qualitative and quantitative evaluation, achieving state-of-the-art co-localization performance by large margins on four datasets: CUB-200-2011, Stanford Cars, FGVC-Aircraft, and Stanford Dogs. Our code will be publicly available online for the research community.
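As a rough illustration of the two ideas described in the abstract, the sketch below shows a magnitude-based pixel-wise pooling of a feature map and a generic InfoNCE-style contrastive loss applied to flattened attention maps of two augmented views. The exact formulations (the spatial softmax over channel magnitudes, the positive/negative pairing, and the temperature) are assumptions for illustration, not the paper's definitions.

```python
import torch
import torch.nn.functional as F


def pixelwise_attention_pool(feat: torch.Tensor) -> torch.Tensor:
    """Aggregate a feature map (B, C, H, W) into an attention map (B, H, W),
    weighting each location by its activation magnitude across channels.
    This exact formulation is an assumption, not the paper's definition."""
    magnitude = feat.norm(dim=1)                                   # (B, H, W) per-pixel channel magnitude
    weights = torch.softmax(magnitude.flatten(1), dim=1)           # normalize over spatial locations
    return weights.view_as(magnitude)


def attention_contrastive_loss(attn_a: torch.Tensor,
                               attn_b: torch.Tensor,
                               temperature: float = 0.1) -> torch.Tensor:
    """Generic InfoNCE-style loss over flattened attention maps (B, H, W) of
    two augmented views; the positive pair is the same image's two views and
    the other images in the batch serve as negatives."""
    za = F.normalize(attn_a.flatten(1), dim=1)                     # (B, H*W)
    zb = F.normalize(attn_b.flatten(1), dim=1)
    logits = za @ zb.t() / temperature                             # (B, B) cosine-similarity matrix
    targets = torch.arange(za.size(0), device=za.device)           # diagonal entries are positives
    return F.cross_entropy(logits, targets)
```

In this sketch, the attention maps from two views of the same image are pulled together while maps from different images are pushed apart, which is one plausible way to impose a contrastive objective directly on attention maps.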

Cite

Text

Ki et al. "Contrastive Attention Maps for Self-Supervised Co-Localization." International Conference on Computer Vision, 2021. doi:10.1109/ICCV48922.2021.00280

Markdown

[Ki et al. "Contrastive Attention Maps for Self-Supervised Co-Localization." International Conference on Computer Vision, 2021.](https://mlanthology.org/iccv/2021/ki2021iccv-contrastive/) doi:10.1109/ICCV48922.2021.00280

BibTeX

@inproceedings{ki2021iccv-contrastive,
  title     = {{Contrastive Attention Maps for Self-Supervised Co-Localization}},
  author    = {Ki, Minsong and Uh, Youngjung and Choe, Junsuk and Byun, Hyeran},
  booktitle = {International Conference on Computer Vision},
  year      = {2021},
  pages     = {2803-2812},
  doi       = {10.1109/ICCV48922.2021.00280},
  url       = {https://mlanthology.org/iccv/2021/ki2021iccv-contrastive/}
}