DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers

Abstract

Transformers have been successfully applied to computer vision due to their powerful modelling capacity with self-attention. However, the good performance of transformers depends heavily on enormous amounts of training images. Thus, a data-efficient transformer solution is urgently needed. In this work, we propose an early knowledge distillation framework, termed DearKD, to improve the data efficiency of transformers. Our DearKD is a two-stage framework that first distills the inductive biases from the early intermediate layers of a CNN and then gives the transformer full play by training without distillation. Further, our DearKD can also be applied to the extreme data-free case in which no real images are available; for this case, we propose a boundary-preserving intra-divergence loss based on DeepInversion to further close the performance gap against the full-data counterpart. Extensive experiments on ImageNet, partial ImageNet, the data-free setting, and other downstream tasks prove the superiority of DearKD over its baselines and state-of-the-art methods.
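The two-stage recipe described in the abstract can be illustrated with a minimal training-step sketch. This is not the authors' released implementation; the names `cnn_teacher`, `vit_student`, and the learnable projection `align` are assumed for illustration, and the early-layer alignment uses a simple MSE term as a stand-in for the paper's distillation objective.

```python
import torch
import torch.nn.functional as F

def dearkd_train_step(images, labels, cnn_teacher, vit_student, align, stage):
    """Hypothetical sketch of one DearKD training step.

    stage 1: distill inductive biases from the CNN teacher's early
             intermediate features into the transformer's early tokens.
    stage 2: train the transformer on the task loss alone (no distillation).
    """
    # Assumed student API: returns class logits and tokens from early blocks.
    logits, early_tokens = vit_student(images)
    loss = F.cross_entropy(logits, labels)  # ordinary supervised task loss

    if stage == 1:
        with torch.no_grad():
            # Assumed teacher API: early intermediate CNN feature maps.
            early_feats = cnn_teacher.early_features(images)
        # Align projected student tokens to the teacher's early features.
        loss = loss + F.mse_loss(align(early_tokens), early_feats)

    return loss
```

In the data-free variant, the real `images` above would be replaced by samples synthesized via DeepInversion, with the proposed boundary-preserving intra-divergence loss added to the synthesis objective.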

Cite

Text

Chen et al. "DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers." Conference on Computer Vision and Pattern Recognition, 2022. doi:10.1109/CVPR52688.2022.01174

Markdown

[Chen et al. "DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers." Conference on Computer Vision and Pattern Recognition, 2022.](https://mlanthology.org/cvpr/2022/chen2022cvpr-dearkd/) doi:10.1109/CVPR52688.2022.01174

BibTeX

@inproceedings{chen2022cvpr-dearkd,
  title     = {{DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers}},
  author    = {Chen, Xianing and Cao, Qiong and Zhong, Yujie and Zhang, Jing and Gao, Shenghua and Tao, Dacheng},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2022},
  pages     = {12052-12062},
  doi       = {10.1109/CVPR52688.2022.01174},
  url       = {https://mlanthology.org/cvpr/2022/chen2022cvpr-dearkd/}
}