Learning to Branch for Multi-Task Learning

Abstract

Training multiple tasks jointly in one deep network reduces inference latency and, by sharing certain layers of the network, can outperform single-task counterparts. However, over-sharing a network can erroneously force over-generalization, causing negative knowledge transfer across tasks. Prior works rely on human intuition or pre-computed task relatedness scores to design ad hoc branching structures, which yield sub-optimal results and often require substantial trial-and-error effort. In this work, we present an automated multi-task learning algorithm that learns where to share or branch within a network, producing an effective network topology that is directly optimized for multiple objectives across tasks. Specifically, we propose a novel tree-structured design space that casts the tree branching operation as a Gumbel-Softmax sampling procedure. This enables differentiable network splitting that is end-to-end trainable. We validate the proposed method on controlled synthetic data, CelebA, and Taskonomy.
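To make the core idea concrete, below is a minimal PyTorch sketch (not the authors' code) of differentiable branching via Gumbel-Softmax: each child node in one tree level learns a categorical distribution over candidate parent nodes, and the routing decision is sampled with torch.nn.functional.gumbel_softmax so it stays end-to-end trainable. The class name, layer layout, and shapes are illustrative assumptions, not the paper's exact architecture.

import torch
import torch.nn as nn
import torch.nn.functional as F

class BranchingLayer(nn.Module):
    """One tree level: num_children nodes each pick one of num_parents inputs.
    Hypothetical sketch of Gumbel-Softmax branch selection, not the paper's code."""

    def __init__(self, num_parents, num_children, channels):
        super().__init__()
        # Learnable logits over parents, one row per child node.
        self.branch_logits = nn.Parameter(torch.zeros(num_children, num_parents))
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU())
            for _ in range(num_children)
        )

    def forward(self, parent_feats, tau=1.0):
        # parent_feats: list of tensors, each (B, C, H, W), one per parent node.
        stacked = torch.stack(parent_feats, dim=0)  # (P, B, C, H, W)
        outputs = []
        for i, block in enumerate(self.blocks):
            # Differentiable one-hot choice of parent; hard=True uses the
            # straight-through estimator so the forward pass picks exactly one.
            gate = F.gumbel_softmax(self.branch_logits[i], tau=tau, hard=True)
            chosen = (gate.view(-1, 1, 1, 1, 1) * stacked).sum(dim=0)
            outputs.append(block(chosen))
        return outputs

# Usage: two parent feature maps feed three child branches.
layer = BranchingLayer(num_parents=2, num_children=3, channels=16)
feats = [torch.randn(4, 16, 8, 8) for _ in range(2)]
children = layer(feats, tau=1.0)  # list of 3 tensors, each (4, 16, 8, 8)

Annealing tau toward zero over training sharpens the sampled gates toward discrete branch assignments, which is how such a scheme can converge to a fixed tree topology at inference time.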

Cite

Text

Guo et al. "Learning to Branch for Multi-Task Learning." International Conference on Machine Learning, 2020.

Markdown

[Guo et al. "Learning to Branch for Multi-Task Learning." International Conference on Machine Learning, 2020.](https://mlanthology.org/icml/2020/guo2020icml-learning/)

BibTeX

@inproceedings{guo2020icml-learning,
  title     = {{Learning to Branch for Multi-Task Learning}},
  author    = {Guo, Pengsheng and Lee, Chen-Yu and Ulbricht, Daniel},
  booktitle = {International Conference on Machine Learning},
  year      = {2020},
  pages     = {3854--3863},
  volume    = {119},
  url       = {https://mlanthology.org/icml/2020/guo2020icml-learning/}
}