Knowledge Distillation for Multi-Task Learning

Abstract

Multi-task learning (MTL) aims to learn a single model that performs multiple tasks, achieving good performance on all tasks at a lower computational cost. Learning such a model requires jointly optimizing the losses of a set of tasks with different difficulty levels, magnitudes, and characteristics (e.g. cross-entropy, Euclidean loss), which leads to an imbalance problem in multi-task learning. To address this imbalance problem, we propose a knowledge distillation based method in this work. We first learn a task-specific model for each task. We then train the multi-task model to minimize the task-specific losses and to produce the same features as the task-specific models. As each task-specific network encodes different features, we introduce small task-specific adaptors that project the multi-task features to the task-specific features. In this way, the adaptors align the task-specific features and the multi-task features, which enables balanced parameter sharing across tasks. Extensive experimental results demonstrate that our method can optimize a multi-task learning model in a more balanced way and achieve better overall performance.
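The abstract describes a two-stage scheme: frozen single-task teachers provide per-task target features, and the multi-task student is trained with its task losses plus a feature-matching (distillation) loss, where a small adaptor per task projects the shared feature into each teacher's feature space. The sketch below is a minimal, hypothetical PyTorch illustration of one such training step; the module names (`shared_backbone`, `heads`, `adaptors`, `teachers`), the 1x1-conv adaptor, the L2 matching loss, and the weight `lam` are assumptions for illustration, not the paper's exact implementation.

```python
import torch
import torch.nn as nn


class Adaptor(nn.Module):
    """Small task-specific adaptor: projects the shared multi-task feature
    into the feature space of one task-specific (teacher) network."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Sequential(nn.Conv2d(dim, dim, kernel_size=1), nn.ReLU())

    def forward(self, x):
        return self.proj(x)


def distillation_step(shared_backbone, heads, adaptors, teachers, criteria,
                      x, targets, lam=1.0):
    """One training step: per-task losses plus feature-matching losses.

    teachers[t] are frozen single-task models (assumed to expose a .backbone);
    adaptors[t] align the shared feature with teacher t's feature before an
    L2 matching loss weighted by lam (hypothetical weighting).
    """
    feat = shared_backbone(x)                        # shared multi-task feature
    total = 0.0
    for t, head in heads.items():
        pred = head(feat)
        total = total + criteria[t](pred, targets[t])          # task-specific loss
        with torch.no_grad():
            teacher_feat = teachers[t].backbone(x)             # frozen teacher feature
        aligned = adaptors[t](feat)                            # project to teacher space
        total = total + lam * ((aligned - teacher_feat) ** 2).mean()  # distillation loss
    return total
```

In this sketch only the adaptors carry task-specific alignment parameters, so the shared backbone receives comparably scaled gradients from every task, which is the balancing effect the abstract refers to.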

Cite

Text

Li and Bilen. "Knowledge Distillation for Multi-Task Learning." European Conference on Computer Vision Workshops, 2020. doi:10.1007/978-3-030-65414-6_13

Markdown

[Li and Bilen. "Knowledge Distillation for Multi-Task Learning." European Conference on Computer Vision Workshops, 2020.](https://mlanthology.org/eccvw/2020/li2020eccvw-knowledge/) doi:10.1007/978-3-030-65414-6_13

BibTeX

@inproceedings{li2020eccvw-knowledge,
  title     = {{Knowledge Distillation for Multi-Task Learning}},
  author    = {Li, Wei-Hong and Bilen, Hakan},
  booktitle = {European Conference on Computer Vision Workshops},
  year      = {2020},
  pages     = {163-176},
  doi       = {10.1007/978-3-030-65414-6_13},
  url       = {https://mlanthology.org/eccvw/2020/li2020eccvw-knowledge/}
}