OVeNet: Offset Vector Network for Semantic Segmentation
Abstract
Semantic segmentation is a fundamental task in visual scene understanding. We focus on the supervised setting, where ground-truth semantic annotations are available. Based on knowledge about the high regularity of real-world scenes, we propose a method for improving class predictions by learning to selectively exploit information from neighboring pixels. In particular, our method is based on the prior that for each pixel, there is a seed pixel in its close neighborhood that shares the same prediction. Motivated by this prior, we design a novel two-head network, named Offset Vector Network (OVeNet), which generates both standard semantic predictions and a dense 2D offset vector field indicating the offset from each pixel to its respective seed pixel. The offset field is used to compute an alternative, seed-based semantic prediction. The two predictions are adaptively fused at each pixel using a learnt dense confidence map for the predicted offset vector field. We supervise offset vectors indirectly, by optimizing the seed-based prediction and by applying a novel loss on the confidence map. Compared to the state-of-the-art baseline architectures HRNet and HRNet+OCR on which it is built, OVeNet achieves significant performance gains on three prominent benchmarks for semantic segmentation, namely Cityscapes, ACDC and ADE20K. Code is available at https://github.com/stamatisalex/OVeNet.
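The seed-based prediction and the confidence-weighted fusion described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `fuse_with_offsets`, the `(dy, dx)` offset layout, the nearest-neighbor sampling of the seed pixel, and the convex-combination form of the fusion are all assumptions made for illustration. The released code may sample and fuse differently.

```python
import numpy as np

def fuse_with_offsets(logits, offsets, confidence):
    """Hypothetical sketch of seed-based prediction and adaptive fusion.

    logits:     (C, H, W) per-pixel class scores from the standard head
    offsets:    (2, H, W) offset vectors as (dy, dx) in pixels (assumed layout)
    confidence: (H, W) values in [0, 1] for the predicted offsets
    """
    _, height, width = logits.shape
    ys, xs = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")

    # Seed position = pixel position + offset. Nearest-neighbor rounding and
    # clamping to the image border are simplifying assumptions.
    seed_y = np.clip(np.rint(ys + offsets[0]).astype(int), 0, height - 1)
    seed_x = np.clip(np.rint(xs + offsets[1]).astype(int), 0, width - 1)

    # Seed-based prediction: each pixel borrows the scores of its seed pixel.
    seed_logits = logits[:, seed_y, seed_x]

    # Per-pixel fusion (assumed convex combination weighted by confidence).
    fused = (1.0 - confidence) * logits + confidence * seed_logits
    return seed_logits, fused
```

With confidence close to 0 a pixel keeps its own prediction, and with confidence close to 1 it adopts the prediction of its seed pixel, which is the selective use of neighboring information that the abstract describes.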
Cite
Text
Alexandropoulos et al. "OVeNet: Offset Vector Network for Semantic Segmentation." Winter Conference on Applications of Computer Vision, 2024.
Markdown
[Alexandropoulos et al. "OVeNet: Offset Vector Network for Semantic Segmentation." Winter Conference on Applications of Computer Vision, 2024.](https://mlanthology.org/wacv/2024/alexandropoulos2024wacv-ovenet/)
BibTeX
@inproceedings{alexandropoulos2024wacv-ovenet,
title = {{OVeNet: Offset Vector Network for Semantic Segmentation}},
author = {Alexandropoulos, Stamatis and Sakaridis, Christos and Maragos, Petros},
booktitle = {Winter Conference on Applications of Computer Vision},
year = {2024},
pages = {7407--7418},
url = {https://mlanthology.org/wacv/2024/alexandropoulos2024wacv-ovenet/}
}