SPGNet: Semantic Prediction Guidance for Scene Parsing

Cheng, Bowen; Chen, Liang-Chieh; Wei, Yunchao; Zhu, Yukun; Huang, Zilong; Xiong, Jinjun; Huang, Thomas S.; Hwu, Wen-Mei; Shi, Honghui

doi:10.1109/ICCV.2019.00532


Bowen Cheng, Liang-Chieh Chen, Yunchao Wei, Yukun Zhu, Zilong Huang, Jinjun Xiong, Thomas S. Huang, Wen-Mei Hwu, Honghui Shi

ICCV 2019


Abstract

Multi-scale context modules and single-stage encoder-decoder structures are commonly employed for semantic segmentation. A multi-scale context module aggregates feature responses over a large spatial extent, while a single-stage encoder-decoder structure encodes high-level semantic information in the encoder path and recovers boundary information in the decoder path. In contrast, multi-stage encoder-decoder networks have been widely used in human pose estimation, where they outperform their single-stage counterparts. However, few efforts have been made to bring this effective design to semantic segmentation. In this work, we propose a Semantic Prediction Guidance (SPG) module, which learns to re-weight local features under the guidance of pixel-wise semantic predictions. We find that, by carefully re-weighting features across stages, a two-stage encoder-decoder network coupled with the proposed SPG module can significantly outperform its one-stage counterpart with a similar number of parameters and similar computation. Finally, we report experimental results on the Cityscapes semantic segmentation benchmark, where SPGNet attains 81.1% mIoU on the test set using only 'fine' annotations.
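The abstract does not spell out how the SPG module computes its weights, so the following is only a minimal sketch of the general idea of re-weighting local features with a pixel-wise semantic prediction. It assumes one plausible form: the per-pixel class logits are turned into probabilities, projected to the feature channels with a hypothetical 1x1-convolution weight matrix `proj`, squashed with a sigmoid, and multiplied into the features. The function name `spg_reweight`, the gating form, and all shapes are assumptions for illustration, not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def spg_reweight(features, logits, proj):
    """Gate per-pixel features with a signal derived from semantic predictions.

    features: H x W x C nested lists (local features from an earlier stage)
    logits:   H x W x K nested lists (pixel-wise semantic prediction)
    proj:     C x K weights, standing in for a 1x1 convolution (hypothetical)
    Returns an H x W x C nested list of re-weighted features.
    """
    out = []
    for feat_row, logit_row in zip(features, logits):
        out_row = []
        for feat, logit in zip(feat_row, logit_row):
            probs = softmax(logit)
            # One gate value in (0, 1) per feature channel (assumed gating form).
            gate = [
                sigmoid(sum(w * p for w, p in zip(w_row, probs)))
                for w_row in proj
            ]
            out_row.append([f * g for f, g in zip(feat, gate)])
        out.append(out_row)
    return out


# Toy input: a 1 x 2 feature map with C = 2 channels and K = 3 classes.
feats = [[[2.0, 4.0], [1.0, -3.0]]]
logits = [[[5.0, 0.0, 0.0], [0.0, 0.0, 0.0]]]
proj = [[4.0, -4.0, -4.0], [0.0, 0.0, 0.0]]
reweighted = spg_reweight(feats, logits, proj)
```

In this toy setup, the first channel is boosted where class 0 is predicted confidently and suppressed where the prediction is uniform, while the second channel (all-zero projection) is scaled by a constant 0.5. In a trained network `proj` would be learned; here it is fixed by hand purely to show the data flow.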


Cite

Text

Cheng et al. "SPGNet: Semantic Prediction Guidance for Scene Parsing." Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019. doi:10.1109/ICCV.2019.00532

Markdown

[Cheng et al. "SPGNet: Semantic Prediction Guidance for Scene Parsing." Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019.](https://mlanthology.org/iccv/2019/cheng2019iccv-spgnet/) doi:10.1109/ICCV.2019.00532

BibTeX

@inproceedings{cheng2019iccv-spgnet,
  title     = {{SPGNet: Semantic Prediction Guidance for Scene Parsing}},
  author    = {Cheng, Bowen and Chen, Liang-Chieh and Wei, Yunchao and Zhu, Yukun and Huang, Zilong and Xiong, Jinjun and Huang, Thomas S. and Hwu, Wen-Mei and Shi, Honghui},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year      = {2019},
  doi       = {10.1109/ICCV.2019.00532},
  url       = {https://mlanthology.org/iccv/2019/cheng2019iccv-spgnet/}
}