Knowledge Distillation for Multi-Task Learning
Abstract
Multi-task learning (MTL) aims to learn a single model that performs multiple tasks, achieving good performance on all of them at a lower computational cost. Learning such a model requires jointly optimizing the losses of a set of tasks with different difficulty levels, magnitudes, and characteristics (e.g. cross-entropy, Euclidean loss), which leads to the imbalance problem in multi-task learning. To address the imbalance problem, we propose a knowledge distillation based method in this work. We first learn a task-specific model for each task. We then learn the multi-task model to minimize the task-specific losses and to produce the same features as the task-specific models. As the task-specific networks encode different features, we introduce small task-specific adaptors to project the multi-task features to the task-specific features. In this way, the adaptors align the task-specific features and the multi-task features, which enables balanced parameter sharing across tasks. Extensive experimental results demonstrate that our method can optimize a multi-task learning model in a more balanced way and achieve better overall performance.
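The objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes linear adaptors and a squared Euclidean distance as the feature-matching term, neither of which the abstract specifies. The task names `seg` and `depth`, the `distill_weight` parameter, and all function names are hypothetical. The features of the pre-trained task-specific models are treated as fixed targets.

```python
from typing import Dict, List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def apply_adaptor(weights: Matrix, feature: Sequence[float]) -> Vector:
    """Project a shared feature with a linear adaptor (linearity is an assumption)."""
    return [sum(w * x for w, x in zip(row, feature)) for row in weights]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance, used here as a stand-in feature-matching loss."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def multi_task_distillation_loss(
    shared_feature: Sequence[float],
    task_losses: Dict[str, float],
    adaptors: Dict[str, Matrix],
    teacher_features: Dict[str, Vector],
    distill_weight: float = 1.0,
) -> float:
    """Sum, over tasks, the task loss plus a weighted feature-matching term.

    For each task, the multi-task feature is passed through that task's
    adaptor and compared with the feature of the task-specific model.
    """
    total = 0.0
    for task, task_loss in task_losses.items():
        projected = apply_adaptor(adaptors[task], shared_feature)
        distill = squared_distance(projected, teacher_features[task])
        total += task_loss + distill_weight * distill
    return total


# Toy example with two hypothetical tasks and a 2-dimensional shared feature.
shared = [1.0, 2.0]
adaptors = {"seg": [[1.0, 0.0], [0.0, 1.0]], "depth": [[0.0, 1.0], [1.0, 0.0]]}
teachers = {"seg": [1.0, 2.0], "depth": [2.0, 0.0]}
losses = {"seg": 0.5, "depth": 0.25}

print(multi_task_distillation_loss(shared, losses, adaptors, teachers))
```

In this toy setup the `seg` adaptor already maps the shared feature onto its target, so only the `depth` task contributes a feature-matching penalty. In a real training loop the adaptors and the multi-task network would be learned jointly by gradient descent, which this sketch leaves out.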
Cite
Text
Li and Bilen. "Knowledge Distillation for Multi-Task Learning." European Conference on Computer Vision Workshops, 2020. doi:10.1007/978-3-030-65414-6_13
Markdown
[Li and Bilen. "Knowledge Distillation for Multi-Task Learning." European Conference on Computer Vision Workshops, 2020.](https://mlanthology.org/eccvw/2020/li2020eccvw-knowledge/) doi:10.1007/978-3-030-65414-6_13
BibTeX
@inproceedings{li2020eccvw-knowledge,
title = {{Knowledge Distillation for Multi-Task Learning}},
author = {Li, Wei-Hong and Bilen, Hakan},
booktitle = {European Conference on Computer Vision Workshops},
year = {2020},
pages = {163--176},
doi = {10.1007/978-3-030-65414-6_13},
url = {https://mlanthology.org/eccvw/2020/li2020eccvw-knowledge/}
}