Adaptive Dual Guidance Knowledge Distillation

Abstract

Knowledge distillation (KD) aims to improve the performance of lightweight student networks under the guidance of pre-trained teachers. However, the large capacity gap between teachers and students limits the distillation gains. Previous methods that address this problem have two weaknesses. First, most of them degrade the performance of the pre-trained teacher, preventing students from achieving comparable performance. Second, they fail to dynamically adjust the transferred knowledge to be compatible with the student's representation ability, making them less effective at bridging the capacity gap. In this paper, we propose Adaptive Dual Guidance Knowledge Distillation (ADG-KD), which retains the guidance of the pre-trained teacher and uses the teacher's bidirectional optimization route to guide the student, alleviating the capacity gap problem. Specifically, ADG-KD introduces an initialized teacher, which has the same architecture as the pre-trained teacher and is optimized under bidirectional supervision from both the pre-trained teacher and the student. In this way, we construct the teacher's bidirectional optimization route, which provides the student with an easy-to-hard, compatible knowledge sequence. ADG-KD trains the student under both guidance approaches and automatically determines their importance weights, making the transferred knowledge more compatible with the student's representation ability. Extensive experiments on CIFAR-100, ImageNet, and MS-COCO demonstrate the effectiveness of our method.
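
To make the dual-guidance idea concrete, below is a minimal sketch of a distillation loss that combines supervision from a frozen pre-trained teacher and an auxiliary (initialized) teacher, with learnable importance weights. The class names, temperature, KL-based loss form, and softmax weighting are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, T=4.0):
    # Standard temperature-scaled KL distillation loss (hypothetical choice;
    # the paper may use a different divergence or feature-level objective).
    return F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)

class DualGuidanceLoss(nn.Module):
    # Combines cross-entropy with two distillation terms, one from the frozen
    # pre-trained teacher and one from the auxiliary teacher, weighted by
    # learnable importance weights normalized via softmax (assumed scheme).
    def __init__(self):
        super().__init__()
        self.importance = nn.Parameter(torch.zeros(2))

    def forward(self, student_logits, pretrained_logits, aux_logits, targets):
        w = torch.softmax(self.importance, dim=0)
        loss_ce = F.cross_entropy(student_logits, targets)
        loss_pre = kd_loss(student_logits, pretrained_logits.detach())
        loss_aux = kd_loss(student_logits, aux_logits.detach())
        return loss_ce + w[0] * loss_pre + w[1] * loss_aux

In such a setup, the importance parameters would be optimized jointly with the student, letting the balance between the two teachers shift as training progresses; how ADG-KD actually parameterizes and updates these weights is described in the paper itself.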

Cite

Text

Li et al. "Adaptive Dual Guidance Knowledge Distillation." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I17.34031

Markdown

[Li et al. "Adaptive Dual Guidance Knowledge Distillation." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/li2025aaai-adaptive/) doi:10.1609/AAAI.V39I17.34031

BibTeX

@inproceedings{li2025aaai-adaptive,
  title     = {{Adaptive Dual Guidance Knowledge Distillation}},
  author    = {Li, Tong and Liu, Long and Liu, Kang and Wang, Xin and Zhou, Bo and Yang, Hongguang and Lu, Kai},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {18457--18465},
  doi       = {10.1609/AAAI.V39I17.34031},
  url       = {https://mlanthology.org/aaai/2025/li2025aaai-adaptive/}
}