Performance-Aware Mutual Knowledge Distillation for Improving Neural Architecture Search

Abstract

Knowledge distillation has shown great effectiveness for improving neural architecture search (NAS). Mutual knowledge distillation (MKD), where a group of models mutually generate knowledge to train each other, has achieved promising results in many applications. In existing MKD methods, mutual knowledge distillation is performed between models without scrutiny: a worse-performing model is allowed to generate knowledge to train a better-performing model, which may lead to collective failures. To address this problem, we propose a performance-aware MKD (PAMKD) approach for NAS, where knowledge generated by model A is allowed to train model B only if A performs better than B. We formulate PAMKD as a three-level optimization framework in which three learning stages are performed end-to-end: 1) each model is trained independently to obtain an initial model; 2) the initial models are evaluated on a validation set, and better-performing models generate knowledge to train worse-performing models; 3) architectures are updated by minimizing the validation loss. Experimental results on a variety of datasets demonstrate that our method is effective.
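
For intuition, the sketch below illustrates the performance-aware gating described in stage 2: soft targets from a peer model update a given model only when the peer's validation accuracy is higher. The helper names, temperature, and loss weighting are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of performance-aware mutual distillation (hypothetical helpers).
import torch
import torch.nn.functional as F


@torch.no_grad()
def validation_accuracy(model, val_loader, device="cpu"):
    """Fraction of correctly classified validation examples."""
    model.eval()
    correct, total = 0, 0
    for x, y in val_loader:
        x, y = x.to(device), y.to(device)
        correct += (model(x).argmax(dim=1) == y).sum().item()
        total += y.numel()
    return correct / max(total, 1)


def pamkd_losses(models, val_accs, x, y, temperature=4.0, kd_weight=1.0):
    """Per-model loss = cross-entropy + KD terms from better-performing peers only."""
    logits = [m(x) for m in models]
    losses = []
    for i, student_logits in enumerate(logits):
        loss = F.cross_entropy(student_logits, y)
        for j, teacher_logits in enumerate(logits):
            # Performance-aware gate: peer j teaches model i only if j performs better.
            if j != i and val_accs[j] > val_accs[i]:
                kd = F.kl_div(
                    F.log_softmax(student_logits / temperature, dim=1),
                    F.softmax(teacher_logits.detach() / temperature, dim=1),
                    reduction="batchmean",
                ) * temperature ** 2
                loss = loss + kd_weight * kd
        losses.append(loss)
    return losses
```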

Cite

Text

Xie and Du. "Performance-Aware Mutual Knowledge Distillation for Improving Neural Architecture Search." Conference on Computer Vision and Pattern Recognition, 2022. doi:10.1109/CVPR52688.2022.01162

Markdown

[Xie and Du. "Performance-Aware Mutual Knowledge Distillation for Improving Neural Architecture Search." Conference on Computer Vision and Pattern Recognition, 2022.](https://mlanthology.org/cvpr/2022/xie2022cvpr-performanceaware/) doi:10.1109/CVPR52688.2022.01162

BibTeX

@inproceedings{xie2022cvpr-performanceaware,
  title     = {{Performance-Aware Mutual Knowledge Distillation for Improving Neural Architecture Search}},
  author    = {Xie, Pengtao and Du, Xuefeng},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2022},
  pages     = {11922--11932},
  doi       = {10.1109/CVPR52688.2022.01162},
  url       = {https://mlanthology.org/cvpr/2022/xie2022cvpr-performanceaware/}
}