MixSKD: Self-Knowledge Distillation from Mixup for Image Recognition
Abstract
Unlike conventional Knowledge Distillation (KD), Self-KD allows a network to learn knowledge from itself without any guidance from extra networks. This paper proposes to perform Self-KD from image Mixture (MixSKD), which integrates these two techniques into a unified framework. MixSKD mutually distills feature maps and probability distributions between a random pair of original images and their mixup image, guiding the network to learn cross-image knowledge by modelling supervisory signals from mixup images. Moreover, we construct a self-teacher network by aggregating multi-stage feature maps to provide soft labels that supervise the backbone classifier, further improving the efficacy of self-boosting. Experiments on image classification and transfer learning to object detection and semantic segmentation demonstrate that MixSKD outperforms other state-of-the-art Self-KD and data augmentation methods. The code is available at https://github.com/winycg/Self-KD-Lib.
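The sketch below is a minimal PyTorch illustration of the probability-distribution part of this idea: the prediction on a mixup image is encouraged to match the same convex combination of the predictions on the two original images, on top of the usual cross-entropy terms. It is not the authors' implementation (see the linked repository for that); the feature-map distillation and the self-teacher branch are omitted, and the names mixskd_style_loss, alpha, and temperature are illustrative.

# Minimal sketch, assuming `model` is any standard PyTorch classifier.
import torch
import torch.nn.functional as F

def mixskd_style_loss(model, x, y, alpha=0.2, temperature=4.0):
    # Sample a mixing coefficient and a random pairing of the batch.
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    index = torch.randperm(x.size(0), device=x.device)
    x_mix = lam * x + (1.0 - lam) * x[index]

    logits = model(x)          # predictions on original images
    logits_mix = model(x_mix)  # predictions on mixup images

    # Supervised losses on original and mixup images.
    ce = F.cross_entropy(logits, y)
    ce_mix = lam * F.cross_entropy(logits_mix, y) + \
             (1.0 - lam) * F.cross_entropy(logits_mix, y[index])

    # Distillation: the mixup prediction should agree with the
    # lam-weighted mixture of the original soft predictions.
    with torch.no_grad():
        p_mixture = lam * F.softmax(logits / temperature, dim=1) + \
                    (1.0 - lam) * F.softmax(logits[index] / temperature, dim=1)
    kd = F.kl_div(F.log_softmax(logits_mix / temperature, dim=1),
                  p_mixture, reduction='batchmean') * temperature ** 2

    return ce + ce_mix + kd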
Cite
Text
Yang et al. "MixSKD: Self-Knowledge Distillation from Mixup for Image Recognition." Proceedings of the European Conference on Computer Vision (ECCV), 2022. doi:10.1007/978-3-031-20053-3_31
Markdown
[Yang et al. "MixSKD: Self-Knowledge Distillation from Mixup for Image Recognition." Proceedings of the European Conference on Computer Vision (ECCV), 2022.](https://mlanthology.org/eccv/2022/yang2022eccv-mixskd/) doi:10.1007/978-3-031-20053-3_31
BibTeX
@inproceedings{yang2022eccv-mixskd,
title = {{MixSKD: Self-Knowledge Distillation from Mixup for Image Recognition}},
author = {Yang, Chuanguang and An, Zhulin and Zhou, Helong and Cai, Linhang and Zhi, Xiang and Wu, Jiwen and Xu, Yongjun and Zhang, Qian},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2022},
doi = {10.1007/978-3-031-20053-3_31},
url = {https://mlanthology.org/eccv/2022/yang2022eccv-mixskd/}
}