Unifying Visual Contrastive Learning for Object Recognition from a Graph Perspective

Abstract

Recent contrastive-based unsupervised object recognition methods use a Siamese architecture with two branches, each composed of a backbone, a projector layer, and an optional predictor layer. To learn the backbone parameters, existing methods share a similar projector design; the major difference among them lies in the predictor layer. In this paper, we propose to Unify existing unsupervised Visual Contrastive Learning methods by using a GCN layer as the predictor layer (UniVCL), which brings two benefits to unsupervised learning for object recognition. First, by treating the predictor designs of existing methods as special cases of the GCN predictor, our fair and comprehensive experiments reveal the critical importance of neighborhood aggregation in the GCN predictor. Second, by viewing the predictor from a graph perspective, we bridge visual self-supervised learning with graph representation learning, which allows us to introduce augmentations from graph representation learning into unsupervised object recognition and further improves recognition accuracy. Extensive experiments on linear evaluation and semi-supervised learning tasks demonstrate the effectiveness of UniVCL and the introduced graph augmentations. Code will be released upon acceptance.
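
To make the "special cases" idea concrete, here is a minimal sketch of a GCN-style predictor applied to a batch of projected features. It is not the authors' implementation: the abstract does not say how the graph is built, so the k-nearest-neighbour adjacency by cosine similarity, the row normalisation, the omission of a nonlinearity, and all function names (`knn_adjacency`, `gcn_predictor`) are assumptions made for illustration. Under these assumptions, setting `k=0` gives an identity adjacency, so the layer reduces to a plain linear predictor, and additionally using an identity weight matrix corresponds to having no predictor at all.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return sum(x * y for x, y in zip(u, v)) / (nu * nv + 1e-12)


def knn_adjacency(feats, k):
    """Row-normalised adjacency with self-loops (assumed graph construction).

    Each sample is linked to itself and its k most similar samples in the
    batch. With k=0 the result is the identity matrix.
    """
    n = len(feats)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: cosine(feats[i], feats[j]),
            reverse=True,
        )
        neighbours = [i] + others[:k]
        for j in neighbours:
            adj[i][j] = 1.0 / len(neighbours)
    return adj


def gcn_predictor(feats, weight, k):
    """Aggregate over neighbours, then apply a linear map: A_hat @ feats @ weight."""
    return matmul(matmul(knn_adjacency(feats, k), feats), weight)


# Toy batch of three 2-d projected features (made-up values).
feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
weight = [[2.0, 0.0], [0.0, 3.0]]

no_predictor = gcn_predictor(feats, identity, k=0)    # features pass through unchanged
linear_predictor = gcn_predictor(feats, weight, k=0)  # linear predictor, no aggregation
graph_predictor = gcn_predictor(feats, weight, k=1)   # adds neighbourhood aggregation
```

In this toy batch, the first two samples are each other's nearest neighbour, so with `k=1` their predictor outputs coincide, which is the smoothing effect that neighbourhood aggregation is meant to illustrate.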

Cite

Text

Tang et al. "Unifying Visual Contrastive Learning for Object Recognition from a Graph Perspective." Proceedings of the European Conference on Computer Vision (ECCV), 2022. doi:10.1007/978-3-031-19809-0_37

Markdown

[Tang et al. "Unifying Visual Contrastive Learning for Object Recognition from a Graph Perspective." Proceedings of the European Conference on Computer Vision (ECCV), 2022.](https://mlanthology.org/eccv/2022/tang2022eccv-unifying/) doi:10.1007/978-3-031-19809-0_37

BibTeX

@inproceedings{tang2022eccv-unifying,
  title     = {{Unifying Visual Contrastive Learning for Object Recognition from a Graph Perspective}},
  author    = {Tang, Shixiang and Zhu, Feng and Bai, Lei and Zhao, Rui and Wang, Chenyu and Ouyang, Wanli},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2022},
  doi       = {10.1007/978-3-031-19809-0_37},
  url       = {https://mlanthology.org/eccv/2022/tang2022eccv-unifying/}
}