Correlation Congruence for Knowledge Distillation
Abstract
Most teacher-student frameworks based on knowledge distillation (KD) impose a strong congruence constraint at the instance level. However, they usually ignore the correlation between multiple instances, which is also valuable for knowledge transfer. In this work, we propose a new framework named correlation congruence for knowledge distillation (CCKD), which transfers not only instance-level information but also the correlation between instances. Furthermore, a generalized kernel method based on Taylor series expansion is proposed to better capture the correlation between instances. Empirical experiments and ablation studies on image classification tasks (CIFAR-100 and ImageNet-1K) and metric learning tasks (ReID and face recognition) show that the proposed CCKD substantially outperforms the original KD and other state-of-the-art KD-based methods. CCKD can be easily deployed in most teacher-student frameworks, such as KD and hint-based learning methods.
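The abstract gives no formulas, so the following is only a minimal sketch of what matching correlations between instances could look like. It assumes a Gaussian RBF kernel on L2-normalized features, approximated by a truncated Taylor series, and a mean squared difference between the student and teacher correlation matrices. The function names, `gamma`, and `order` are hypothetical and not taken from the paper.

```python
import math
import numpy as np

def rbf_taylor_correlation(feats, gamma=0.4, order=2):
    """Pairwise correlation matrix of a batch (assumed form, not the paper's code).

    Uses a truncated Taylor expansion of the Gaussian RBF kernel.
    For unit vectors, ||x - y||^2 = 2 - 2 x.y, so
    exp(-gamma * ||x - y||^2) = exp(-2*gamma) * exp(2*gamma * x.y),
    and exp(2*gamma * x.y) is expanded as sum_p (2*gamma)^p / p! * (x.y)^p.
    """
    f = feats / np.linalg.norm(feats, axis=1, keepdims=True)  # L2-normalize rows
    gram = f @ f.T                                            # pairwise dot products
    corr = np.zeros_like(gram)
    for p in range(order + 1):
        corr += (2.0 * gamma) ** p / math.factorial(p) * gram ** p
    return math.exp(-2.0 * gamma) * corr

def correlation_congruence_loss(student_feats, teacher_feats, gamma=0.4, order=2):
    """Mean squared difference between student and teacher correlation matrices."""
    cs = rbf_taylor_correlation(student_feats, gamma, order)
    ct = rbf_taylor_correlation(teacher_feats, gamma, order)
    n = student_feats.shape[0]
    return float(np.sum((cs - ct) ** 2) / n ** 2)

# Usage: a batch of 8 instances; student and teacher embeddings may differ in width.
rng = np.random.default_rng(0)
teacher = rng.normal(size=(8, 16))
student = rng.normal(size=(8, 4))
loss = correlation_congruence_loss(student, teacher)
```

Because each correlation matrix is batch-by-batch, the student and teacher feature dimensions do not need to match in this sketch. In a real training loop this term would be added to an instance-level loss such as the standard KD loss, with a weighting that this page does not specify.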
Cite
Text
Peng et al. "Correlation Congruence for Knowledge Distillation." Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019. doi:10.1109/ICCV.2019.00511
Markdown
[Peng et al. "Correlation Congruence for Knowledge Distillation." Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019.](https://mlanthology.org/iccv/2019/peng2019iccv-correlation/) doi:10.1109/ICCV.2019.00511
BibTeX
@inproceedings{peng2019iccv-correlation,
title = {{Correlation Congruence for Knowledge Distillation}},
author = {Peng, Baoyun and Jin, Xiao and Liu, Jiaheng and Li, Dongsheng and Wu, Yichao and Liu, Yu and Zhou, Shunfeng and Zhang, Zhaoning},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision},
year = {2019},
doi = {10.1109/ICCV.2019.00511},
url = {https://mlanthology.org/iccv/2019/peng2019iccv-correlation/}
}