In Defense of the Unitary Scalarization for Deep Multi-Task Learning

Abstract

Recent multi-task learning research argues against unitary scalarization, where training simply minimizes the sum of the task losses. Several ad hoc multi-task optimization algorithms have instead been proposed, inspired by various hypotheses about what makes multi-task settings difficult. The majority of these optimizers require per-task gradients and introduce significant memory, runtime, and implementation overhead. We show that unitary scalarization, coupled with standard regularization and stabilization techniques from single-task learning, matches or improves upon the performance of complex multi-task optimizers in popular supervised and reinforcement learning settings. We then present an analysis suggesting that many specialized multi-task optimizers can be partly interpreted as forms of regularization, potentially explaining our surprising results. We believe our results call for a critical reevaluation of recent research in the area.
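To make the baseline concrete, below is a minimal sketch of a unitary scalarization training step, assuming a PyTorch-style setup; the model signature, per-task batches, and loss functions are hypothetical placeholders, not the authors' code.

import torch

def unitary_scalarization_step(model, optimizer, batches, loss_fns):
    # One optimization step on the plain sum of the task losses.
    # `batches` holds one (inputs, targets) pair per task; `loss_fns`
    # holds the matching per-task loss functions.
    optimizer.zero_grad()
    total_loss = torch.zeros(())
    for task_id, ((inputs, targets), loss_fn) in enumerate(zip(batches, loss_fns)):
        preds = model(inputs, task=task_id)  # hypothetical API: shared trunk, per-task head
        total_loss = total_loss + loss_fn(preds, targets)
    total_loss.backward()  # a single backward pass over the summed loss
    optimizer.step()
    return total_loss.item()

Note the single backward pass: specialized multi-task optimizers such as PCGrad or MGDA instead compute and store one gradient per task before combining them, which is the source of the memory and runtime overhead the abstract refers to. Standard single-task regularization and stabilization techniques (e.g., weight decay or dropout on the model) plug into this loop unchanged.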

Cite

Text

Kurin et al. "In Defense of the Unitary Scalarization for Deep Multi-Task Learning." Neural Information Processing Systems, 2022.

Markdown

[Kurin et al. "In Defense of the Unitary Scalarization for Deep Multi-Task Learning." Neural Information Processing Systems, 2022.](https://mlanthology.org/neurips/2022/kurin2022neurips-defense/)

BibTeX

@inproceedings{kurin2022neurips-defense,
  title     = {{In Defense of the Unitary Scalarization for Deep Multi-Task Learning}},
  author    = {Kurin, Vitaly and De Palma, Alessandro and Kostrikov, Ilya and Whiteson, Shimon and Mudigonda, Pawan K},
  booktitle = {Neural Information Processing Systems},
  year      = {2022},
  url       = {https://mlanthology.org/neurips/2022/kurin2022neurips-defense/}
}