Feature Partitioning for Efficient Multi-Task Architectures
Abstract
Multi-task learning promises to use less data, fewer parameters, and less time than training separate single-task models. Realizing these benefits in practice, however, is challenging. In particular, it is difficult to define a single architecture with enough capacity to support many tasks while not requiring excessive compute for each individual task, and there are difficult trade-offs in deciding how to allocate parameters and layers across a large set of tasks. To address this, we propose a method for automatically searching over multi-task architectures under resource constraints. We define a parameterization of feature-sharing strategies that allows effective coverage and sampling of candidate architectures, and we present a method for quickly evaluating such architectures with feature distillation. Together, these contributions allow us to rapidly optimize for parameter-efficient multi-task models. We benchmark on Visual Decathlon, demonstrating that we can automatically search for and identify architectures that effectively trade off task resource requirements while maintaining a high level of final performance.
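As a toy illustration of the general idea of feature partitioning (this is a hypothetical sketch, not the paper's actual parameterization), one can imagine each task selecting a subset of a shared layer's feature channels via a binary mask: overlapping masks correspond to shared parameters, while disjoint channels are task-specific. The mask values below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
num_channels = 8

# Hypothetical partition: task A uses channels 0-5, task B uses channels 2-7,
# so the two tasks share channels 2-5 of the backbone layer.
mask_a = np.array([1, 1, 1, 1, 1, 1, 0, 0], dtype=bool)
mask_b = np.array([0, 0, 1, 1, 1, 1, 1, 1], dtype=bool)

# Activations from a shared backbone layer (random stand-in).
features = rng.normal(size=(1, num_channels))

# Each task reads only its allotted channels.
feat_a = features[:, mask_a]
feat_b = features[:, mask_b]

shared = int(np.sum(mask_a & mask_b))  # channels (hence parameters) shared
total = int(np.sum(mask_a | mask_b))   # channels used by either task
print(feat_a.shape, feat_b.shape, shared, total)
```

Sweeping over such masks, subject to per-task budgets on the number of active channels, gives one simple search space over sharing strategies of the kind the abstract alludes to.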
Cite
Text
Newell et al. "Feature Partitioning for Efficient Multi-Task Architectures." International Conference on Learning Representations, 2020.

Markdown
[Newell et al. "Feature Partitioning for Efficient Multi-Task Architectures." International Conference on Learning Representations, 2020.](https://mlanthology.org/iclr/2020/newell2020iclr-feature/)

BibTeX
@inproceedings{newell2020iclr-feature,
title = {{Feature Partitioning for Efficient Multi-Task Architectures}},
author = {Newell, Alejandro and Jiang, Lu and Wang, Chong and Li, Li-Jia and Deng, Jia},
booktitle = {International Conference on Learning Representations},
year = {2020},
url = {https://mlanthology.org/iclr/2020/newell2020iclr-feature/}
}