Distilling Visual Priors from Self-Supervised Learning

Abstract

Convolutional Neural Networks (CNNs) are prone to overfitting small training datasets. We present a novel two-phase pipeline that leverages self-supervised learning and knowledge distillation to improve the generalization ability of CNN models for image classification under the data-deficient setting. The first phase learns a teacher model that possesses rich and generalizable visual representations via self-supervised learning; the second phase distills these representations into a student model in a self-distillation manner while simultaneously fine-tuning the student for the image classification task. We also propose a novel margin loss for the self-supervised contrastive learning proxy task to better learn the representation under the data-deficient scenario. Together with other tricks, we achieve competitive performance in the VIPriors image classification challenge.
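
A rough, PyTorch-style sketch of the two phases described above is given below: an InfoNCE-style contrastive loss with an additive margin on the positive pairs (phase 1), and a simple feature-distillation plus cross-entropy objective for the student (phase 2). The exact margin formulation, distillation loss, temperature, and hyperparameter values used by the authors are not specified here, so all of these choices (and the function names) are illustrative assumptions, not the paper's implementation.

import torch
import torch.nn.functional as F


def margin_contrastive_loss(z1, z2, temperature=0.2, margin=0.4):
    # Phase-1 sketch: z1, z2 are embeddings of two augmented views, shape (N, D).
    # Simplified one-directional InfoNCE with an additive margin on positives;
    # the margin value and where it is applied are assumptions.
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t()  # (N, N) cosine similarities; diagonal = positive pairs
    # Subtract the margin from the positive similarities only, making the
    # proxy task harder and encouraging a tighter representation.
    eye = torch.eye(z1.size(0), device=z1.device, dtype=torch.bool)
    logits = torch.where(eye, logits - margin, logits) / temperature
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)


def student_finetune_loss(student_feat, teacher_feat, logits, labels, alpha=1.0):
    # Phase-2 sketch: fine-tune the student for classification while distilling
    # the (frozen) self-supervised teacher's features into it. The L2 feature
    # distillation term and the weight alpha are illustrative choices.
    distill = F.mse_loss(F.normalize(student_feat, dim=1),
                         F.normalize(teacher_feat, dim=1))
    return F.cross_entropy(logits, labels) + alpha * distill

Usage would pair margin_contrastive_loss with a teacher encoder applied to two augmentations of each batch, then apply student_finetune_loss while keeping the teacher fixed during fine-tuning.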

Cite

Text

Zhao and Wen. "Distilling Visual Priors from Self-Supervised Learning." European Conference on Computer Vision Workshops, 2020. doi:10.1007/978-3-030-66096-3_29

Markdown

[Zhao and Wen. "Distilling Visual Priors from Self-Supervised Learning." European Conference on Computer Vision Workshops, 2020.](https://mlanthology.org/eccvw/2020/zhao2020eccvw-distilling/) doi:10.1007/978-3-030-66096-3_29

BibTeX

@inproceedings{zhao2020eccvw-distilling,
  title     = {{Distilling Visual Priors from Self-Supervised Learning}},
  author    = {Zhao, Bingchen and Wen, Xin},
  booktitle = {European Conference on Computer Vision Workshops},
  year      = {2020},
  pages     = {422--429},
  doi       = {10.1007/978-3-030-66096-3_29},
  url       = {https://mlanthology.org/eccvw/2020/zhao2020eccvw-distilling/}
}