Generative Modeling for Multi-Task Visual Learning

Abstract

Generative modeling has recently shown great promise in computer vision, but it has mostly focused on synthesizing visually realistic images. In this paper, motivated by multi-task learning of shareable feature representations, we consider a novel problem of learning a shared generative model that is useful across various visual perception tasks. Correspondingly, we propose a general multi-task oriented generative modeling (MGM) framework by coupling a discriminative multi-task network with a generative network. While it is challenging to synthesize both RGB images and pixel-level annotations in multi-task scenarios, our framework enables us to use synthesized images paired with only weak annotations (i.e., image-level scene labels) to facilitate multiple visual tasks. Experimental evaluation on challenging multi-task benchmarks, including NYUv2 and Taskonomy, demonstrates that our MGM framework improves the performance of all tasks by large margins, consistently outperforming state-of-the-art multi-task approaches in different sample-size regimes.
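
To make the abstract's core idea concrete, below is a minimal PyTorch-style sketch of coupling a discriminative multi-task network with a conditional generative network, where real images carry dense annotations and synthesized images carry only weak image-level scene labels. All module names, architectures, loss choices, and the loss weighting are illustrative assumptions for this sketch, not the authors' implementation.

# Minimal sketch: multi-task network + generator, trained with dense labels on
# real images and only weak scene labels on synthesized images.
# (Illustrative assumptions throughout; not the paper's actual MGM code.)
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiTaskNet(nn.Module):
    """Shared backbone with pixel-level heads (segmentation, depth) plus an
    image-level scene-classification head used for weakly labeled synthetic images."""
    def __init__(self, num_seg_classes=13, num_scene_classes=27):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        )
        self.seg_head = nn.Conv2d(64, num_seg_classes, 1)   # pixel-level task
        self.depth_head = nn.Conv2d(64, 1, 1)               # pixel-level task
        self.scene_head = nn.Linear(64, num_scene_classes)  # image-level task

    def forward(self, x):
        feat = self.backbone(x)
        pooled = feat.mean(dim=(2, 3))
        return self.seg_head(feat), self.depth_head(feat), self.scene_head(pooled)

class Generator(nn.Module):
    """Toy conditional generator: latent vector + scene label -> RGB image."""
    def __init__(self, z_dim=128, num_scene_classes=27, size=64):
        super().__init__()
        self.size = size
        self.fc = nn.Linear(z_dim + num_scene_classes, 3 * size * size)

    def forward(self, z, scene_onehot):
        out = torch.tanh(self.fc(torch.cat([z, scene_onehot], dim=1)))
        return out.view(-1, 3, self.size, self.size)

def training_step(net, gen, real_img, seg_gt, depth_gt, z_dim=128, n_scenes=27):
    # Supervised multi-task losses on real, densely annotated images.
    seg_pred, depth_pred, _ = net(real_img)
    loss_real = F.cross_entropy(seg_pred, seg_gt) + F.l1_loss(depth_pred, depth_gt)

    # Synthesize images conditioned on sampled scene labels; supervise the
    # multi-task network with only the weak image-level scene labels.
    z = torch.randn(real_img.size(0), z_dim)
    fake_scene = torch.randint(0, n_scenes, (real_img.size(0),))
    fake_img = gen(z, F.one_hot(fake_scene, n_scenes).float())
    _, _, scene_pred = net(fake_img)
    loss_weak = F.cross_entropy(scene_pred, fake_scene)

    return loss_real + 0.1 * loss_weak  # the weighting factor is an assumption

The key design point illustrated here is that the generator never needs to produce pixel-level annotations: the synthesized images contribute supervision only through the cheap image-level scene label they were conditioned on, while dense supervision comes entirely from real data.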

Cite

Text

Bao et al. "Generative Modeling for Multi-Task Visual Learning." International Conference on Machine Learning, 2022.

Markdown

[Bao et al. "Generative Modeling for Multi-Task Visual Learning." International Conference on Machine Learning, 2022.](https://mlanthology.org/icml/2022/bao2022icml-generative/)

BibTeX

@inproceedings{bao2022icml-generative,
  title     = {{Generative Modeling for Multi-Task Visual Learning}},
  author    = {Bao, Zhipeng and Hebert, Martial and Wang, Yu-Xiong},
  booktitle = {International Conference on Machine Learning},
  year      = {2022},
  pages     = {1537--1554},
  volume    = {162},
  url       = {https://mlanthology.org/icml/2022/bao2022icml-generative/}
}