Geometric Unsupervised Domain Adaptation for Semantic Segmentation
Abstract
Simulators can efficiently generate large amounts of labeled synthetic data with perfect supervision for hard-to-label tasks like semantic segmentation. However, they introduce a domain gap that severely hurts real-world performance. We propose to use self-supervised monocular depth estimation as a proxy task to bridge this gap and improve sim-to-real unsupervised domain adaptation (UDA). Our Geometric Unsupervised Domain Adaptation method (GUDA) learns a domain-invariant representation via a multi-task objective combining synthetic semantic supervision with real-world geometric constraints on videos. GUDA establishes a new state of the art in UDA for semantic segmentation on three benchmarks, outperforming methods that use domain adversarial learning, self-training, or other self-supervised proxy tasks. Furthermore, we show that our method scales well with the quality and quantity of synthetic data while also improving depth prediction.
Cite
Text
Guizilini et al. "Geometric Unsupervised Domain Adaptation for Semantic Segmentation." International Conference on Computer Vision, 2021. doi:10.1109/ICCV48922.2021.00842
Markdown
[Guizilini et al. "Geometric Unsupervised Domain Adaptation for Semantic Segmentation." International Conference on Computer Vision, 2021.](https://mlanthology.org/iccv/2021/guizilini2021iccv-geometric/) doi:10.1109/ICCV48922.2021.00842
BibTeX
@inproceedings{guizilini2021iccv-geometric,
title = {{Geometric Unsupervised Domain Adaptation for Semantic Segmentation}},
author = {Guizilini, Vitor and Li, Jie and Ambruș, Rareș and Gaidon, Adrien},
booktitle = {International Conference on Computer Vision},
year = {2021},
pages = {8537--8547},
doi = {10.1109/ICCV48922.2021.00842},
url = {https://mlanthology.org/iccv/2021/guizilini2021iccv-geometric/}
}