Semantic Information in Contrastive Learning

Abstract

This work investigates the role of semantic information in contrastive learning (SemCL). An advanced pretext task is designed: each object in a scene is contrasted against its environment. This enables SemCL-pretrained models to separate objects from their surroundings in an image, significantly improving the spatial understanding of the pretrained models. Downstream tasks of semantic/instance segmentation, object detection, and depth estimation are evaluated on PASCAL VOC, Cityscapes, COCO, KITTI, etc. SemCL-pretrained models substantially outperform their ImageNet-pretrained counterparts and are competitive with well-known methods on downstream tasks. The results suggest that a dedicated pretext task leveraging semantic information can be powerful on benchmarks related to spatial understanding. The code is available at https://github.com/sjiang95/semcl.
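The paper's exact loss is not reproduced here, but the object-versus-environment contrast described above is typically built on an InfoNCE-style objective: an object embedding serves as the anchor, another view of the same object as the positive, and environment embeddings as the negatives. The sketch below is a minimal, generic illustration of that idea; the function name, arguments, and temperature value are assumptions, not the authors' implementation.

```python
import math

def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE-style loss for one anchor embedding (a list of floats).

    Illustrative only: in the SemCL setting described in the abstract, `anchor`
    would be an object embedding, `positive` another view of the same object,
    and `negatives` embeddings of the surrounding environment (assumed mapping).
    """
    def cos(a, b):
        # cosine similarity between two vectors
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb)

    # similarity logits: positive at index 0, then all negatives
    logits = [cos(anchor, positive) / temperature]
    logits += [cos(anchor, n) / temperature for n in negatives]

    # softmax cross-entropy with the positive as the target class;
    # subtract the max logit for numerical stability
    m = max(logits)
    denom = sum(math.exp(l - m) for l in logits)
    return -math.log(math.exp(logits[0] - m) / denom)
```

Minimizing this loss pulls the anchor toward its positive view while pushing it away from the negatives, which matches the intuition of extracting objects from their environment.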

Cite

Text

Quan et al. "Semantic Information in Contrastive Learning." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.00523

Markdown

[Quan et al. "Semantic Information in Contrastive Learning." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/quan2023iccv-semantic/) doi:10.1109/ICCV51070.2023.00523

BibTeX

@inproceedings{quan2023iccv-semantic,
  title     = {{Semantic Information in Contrastive Learning}},
  author    = {Quan, Shengjiang and Hirano, Masahiro and Yamakawa, Yuji},
  booktitle = {International Conference on Computer Vision},
  year      = {2023},
  pages     = {5686--5696},
  doi       = {10.1109/ICCV51070.2023.00523},
  url       = {https://mlanthology.org/iccv/2023/quan2023iccv-semantic/}
}