Self-Attentive Pooling for Efficient Deep Learning
Abstract
Efficient custom pooling techniques that can aggressively trim the dimensions of a feature map for resource-constrained computer vision applications have recently gained significant traction. However, prior pooling works extract only the local context of the activation maps, limiting their effectiveness. In contrast, we propose a novel non-local self-attentive pooling method that can be used as a drop-in replacement for standard pooling layers, such as max/average pooling or strided convolution. The proposed self-attention module uses patch embedding, multi-head self-attention, and spatial-channel restoration, followed by sigmoid activation and an exponential softmax. This self-attention mechanism efficiently aggregates dependencies between non-local activation patches during down-sampling. Extensive experiments on standard object classification and detection tasks with various convolutional neural network (CNN) architectures demonstrate the superiority of our proposed mechanism over state-of-the-art (SOTA) pooling techniques. In particular, we surpass the test accuracy of existing pooling techniques on different variants of MobileNet-V2 on ImageNet by an average of 1.2%. With aggressive down-sampling of the activation maps in the initial layers (providing up to a 22x reduction in memory consumption), our approach achieves 1.43% higher test accuracy than SOTA techniques at iso-memory footprints. This enables the deployment of our models on memory-constrained devices, such as micro-controllers, without a significant loss in accuracy, because the initial activation maps of the high-resolution images required for complex vision tasks consume a large fraction of on-chip memory. Our pooling method also leverages channel pruning to further reduce memory footprints. Code is available at https://github.com/C-Fun/Non-Local-Pooling.
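The abstract names the building blocks of the pooling module: patch embedding, multi-head self-attention, spatial-channel restoration, sigmoid activation, and an exponential softmax used during down-sampling. The snippet below is a minimal sketch of how such a non-local self-attentive pooling layer might be assembled in PyTorch under those assumptions; it is not the authors' implementation (see the linked repository for that), and every class, parameter, and variable name here is hypothetical, including the interpretation of the exponential softmax as an exp-weighted average within each pooling window.

# Minimal sketch of a non-local self-attentive pooling layer (hypothetical,
# not the authors' code; see https://github.com/C-Fun/Non-Local-Pooling).
import torch
import torch.nn as nn
import torch.nn.functional as F


class SelfAttentivePool2d(nn.Module):
    def __init__(self, channels, stride=2, patch_size=4, embed_dim=64, num_heads=4):
        super().__init__()
        self.stride = stride
        # Patch embedding: each non-overlapping patch becomes one token.
        self.patch_embed = nn.Conv2d(channels, embed_dim,
                                     kernel_size=patch_size, stride=patch_size)
        # Multi-head self-attention over patch tokens (the non-local step).
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        # Spatial-channel restoration: map tokens back to per-pixel, per-channel scores.
        self.restore = nn.Conv2d(embed_dim, channels, kernel_size=1)

    def forward(self, x):
        b, c, h, w = x.shape
        # 1) Patch embedding -> token sequence of shape (B, N, D).
        tokens = self.patch_embed(x)                    # (B, D, h/p, w/p)
        ph, pw = tokens.shape[-2:]
        tokens = tokens.flatten(2).transpose(1, 2)      # (B, N, D)
        # 2) Multi-head self-attention aggregates non-local patch dependencies.
        tokens, _ = self.attn(tokens, tokens, tokens)
        # 3) Spatial-channel restoration back to the input resolution.
        attn_map = tokens.transpose(1, 2).reshape(b, -1, ph, pw)
        attn_map = F.interpolate(attn_map, size=(h, w), mode='nearest')
        attn_map = torch.sigmoid(self.restore(attn_map))  # sigmoid activation
        # 4) Exponential-softmax-style weighting: exp(scores), normalised within
        #    each pooling window, mixes the activations in the down-sampled output.
        weights = torch.exp(attn_map)
        num = F.avg_pool2d(weights * x, self.stride, self.stride)
        den = F.avg_pool2d(weights, self.stride, self.stride)
        return num / (den + 1e-6)


# Drop-in usage in place of nn.MaxPool2d / a strided convolution:
# pool = SelfAttentivePool2d(channels=32, stride=2)
# y = pool(torch.randn(1, 32, 64, 64))   # -> (1, 32, 32, 32)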
Cite
Text
Chen et al. "Self-Attentive Pooling for Efficient Deep Learning." Winter Conference on Applications of Computer Vision, 2023.
Markdown
[Chen et al. "Self-Attentive Pooling for Efficient Deep Learning." Winter Conference on Applications of Computer Vision, 2023.](https://mlanthology.org/wacv/2023/chen2023wacv-selfattentive/)
BibTeX
@inproceedings{chen2023wacv-selfattentive,
title = {{Self-Attentive Pooling for Efficient Deep Learning}},
author = {Chen, Fang and Datta, Gourav and Kundu, Souvik and Beerel, Peter A.},
booktitle = {Winter Conference on Applications of Computer Vision},
year = {2023},
pages = {3974--3983},
url = {https://mlanthology.org/wacv/2023/chen2023wacv-selfattentive/}
}