Pixel-Wise Attentional Gating for Scene Parsing

Abstract

To achieve dynamic inference in pixel labeling tasks, we propose Pixel-wise Attentional Gating (PAG), which learns to selectively process a subset of spatial locations at each layer of a deep convolutional network. PAG is a generic, architecture-independent, problem-agnostic mechanism that can be readily "plugged in" to an existing model with fine-tuning. We utilize PAG in two ways: 1) learning spatially varying pooling fields that improve model performance without the extra computation cost associated with multi-scale pooling, and 2) learning a dynamic computation policy for each pixel to decrease total computation (FLOPs) while maintaining accuracy. We extensively evaluate PAG on a variety of per-pixel labeling tasks, including semantic segmentation, boundary detection, monocular depth estimation, and surface normal estimation. We demonstrate that PAG achieves competitive or state-of-the-art performance on these tasks. Our experiments show that PAG learns a dynamic spatial allocation of computation over the input image, which provides better performance trade-offs than related approaches (e.g., truncating deep models or dynamically skipping whole layers). Generally, we observe that PAG can reduce computation by 10% without noticeable loss in accuracy, and that performance degrades gracefully when stronger computational constraints are imposed.
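
As a rough illustration of the second use described in the abstract (a per-pixel computation policy), the sketch below applies a residual transformation only at pixels selected by a binary gate and passes the remaining pixels through unchanged. It is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the function name `pixelwise_gated_residual`, the 1x1 convolution with ReLU, and the hard threshold on `gate_logits` are all hypothetical choices made for illustration, and the abstract does not specify how the gates are produced or trained.

```python
import numpy as np

def pixelwise_gated_residual(x, weight, gate_logits):
    """Apply a residual 1x1 transform only at gated pixels (illustrative sketch).

    x:           (H, W, C) feature map
    weight:      (C, C) matrix acting as a hypothetical 1x1 convolution
    gate_logits: (H, W) per-pixel scores; a pixel is processed if its score > 0
                 (the hard threshold is an assumption, not taken from the paper)
    """
    # Binary per-pixel gate: True means "compute here", False means "skip".
    gate = gate_logits > 0
    # Skipped pixels keep their input features (identity path).
    out = x.copy()
    # Gather only the selected pixels, so the transform runs on a subset.
    selected = x[gate]                                   # (N_selected, C)
    out[gate] = selected + np.maximum(selected @ weight, 0.0)
    # Fraction of spatial locations that were actually processed.
    return out, gate.mean()

# Usage: process two of the four pixels in a 2x2 feature map.
x = np.ones((2, 2, 3))
logits = np.array([[1.0, -1.0], [-1.0, 1.0]])
out, fraction = pixelwise_gated_residual(x, np.eye(3), logits)
```

In this toy setting the cost of the layer scales with the returned fraction of processed pixels, which is the quantity a computational budget would constrain.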

Cite

Text

Kong and Fowlkes. "Pixel-Wise Attentional Gating for Scene Parsing." IEEE/CVF Winter Conference on Applications of Computer Vision, 2019. doi:10.1109/WACV.2019.00114

Markdown

[Kong and Fowlkes. "Pixel-Wise Attentional Gating for Scene Parsing." IEEE/CVF Winter Conference on Applications of Computer Vision, 2019.](https://mlanthology.org/wacv/2019/kong2019wacv-pixel/) doi:10.1109/WACV.2019.00114

BibTeX

@inproceedings{kong2019wacv-pixel,
  title     = {{Pixel-Wise Attentional Gating for Scene Parsing}},
  author    = {Kong, Shu and Fowlkes, Charless C.},
  booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision},
  year      = {2019},
  pages     = {1024-1033},
  doi       = {10.1109/WACV.2019.00114},
  url       = {https://mlanthology.org/wacv/2019/kong2019wacv-pixel/}
}