Task-Oriented Feature Distillation
Abstract
Feature distillation, a primary method in knowledge distillation, consistently leads to significant accuracy improvements. Most existing methods distill features in the teacher network through a manually designed transformation. In this paper, we propose a novel distillation method named task-oriented feature distillation (TOFD), where the transformation consists of convolutional layers trained in a data-driven manner by the task loss. As a result, the task-oriented information in the features can be captured and distilled to the student. Moreover, an orthogonal loss is applied to the feature resizing layer in TOFD to further improve knowledge distillation. Experiments show that TOFD outperforms other distillation methods by a large margin on both image classification and 3D classification tasks. Code has been released on GitHub.
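The two ingredients named in the abstract, a feature transformation trained by the task loss and an orthogonal penalty on the feature resizing layer, can be sketched as follows. This is a minimal PyTorch illustration, not the authors' released code: the `TaskOrientedTransform` module, the `orthogonal_loss` helper, the 1x1-convolution choice, and all shapes and loss weights are assumptions made for exposition.

```python
# Minimal sketch of the TOFD ingredients described in the abstract.
# All names, shapes, and loss weights below are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TaskOrientedTransform(nn.Module):
    """1x1 conv that resizes a student feature map to the teacher's
    channel dimension. Because it is trained jointly with the task loss,
    it learns a data-driven (rather than hand-designed) transformation."""
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.resize = nn.Conv2d(in_channels, out_channels,
                                kernel_size=1, bias=False)

    def forward(self, x):
        return self.resize(x)

def orthogonal_loss(transform):
    """Penalize deviation of the resizing weight from orthogonality:
    || W W^T - I ||_F^2 on the flattened 1x1-conv kernel."""
    w = transform.resize.weight.flatten(1)   # (out_channels, in_channels)
    gram = w @ w.t()
    eye = torch.eye(gram.size(0), device=w.device)
    return F.mse_loss(gram, eye)

# Toy usage: combine task, feature-distillation, and orthogonal losses.
transform = TaskOrientedTransform(in_channels=64, out_channels=256)
student_feat = torch.randn(8, 64, 32, 32)    # assumed student feature shape
teacher_feat = torch.randn(8, 256, 32, 32)   # assumed teacher feature shape
logits = torch.randn(8, 10)                  # stand-in student predictions
labels = torch.randint(0, 10, (8,))

task_loss = F.cross_entropy(logits, labels)
distill_loss = F.mse_loss(transform(student_feat), teacher_feat.detach())
total = task_loss + 1.0 * distill_loss + 0.1 * orthogonal_loss(transform)
total.backward()  # the transform's weights receive task-driven gradients
```

Because the transform's parameters receive gradients from the combined objective, the resized student features are shaped by the task loss rather than by a fixed, manually designed mapping, which is the core idea the title refers to.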
Cite
Text
Zhang et al. "Task-Oriented Feature Distillation." Neural Information Processing Systems, 2020.

Markdown
[Zhang et al. "Task-Oriented Feature Distillation." Neural Information Processing Systems, 2020.](https://mlanthology.org/neurips/2020/zhang2020neurips-taskoriented/)

BibTeX
@inproceedings{zhang2020neurips-taskoriented,
title = {{Task-Oriented Feature Distillation}},
author = {Zhang, Linfeng and Shi, Yukang and Shi, Zuoqiang and Ma, Kaisheng and Bao, Chenglong},
booktitle = {Neural Information Processing Systems},
year = {2020},
url = {https://mlanthology.org/neurips/2020/zhang2020neurips-taskoriented/}
}