Automatic Attention Pruning: Improving and Automating Model Pruning Using Attentions

Abstract

Pruning is a promising approach to compressing deep learning models so that they can be deployed on resource-constrained edge devices. However, many existing pruning solutions rely on unstructured pruning, which yields models that cannot run efficiently on commodity hardware, and they often require users to manually explore and tune the pruning process, which is time-consuming and often leads to sub-optimal results. To address these limitations, this paper presents Automatic Attention Pruning (AAP), an adaptive, attention-based, structured pruning approach that automatically generates small, accurate, and hardware-efficient models that meet user objectives. First, it proposes iterative structured pruning that uses activation-based attention maps to identify and prune unimportant filters. Then, it proposes adaptive pruning policies that automatically meet the pruning objectives of accuracy-critical, memory-constrained, and latency-sensitive tasks. A comprehensive evaluation shows that AAP substantially outperforms state-of-the-art structured pruning methods across a variety of model architectures. Our code is available at https://github.com/kaiqi123/Automatic-Attention-Pruning.git.
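
To make the idea of attention-based filter pruning concrete, the following is a minimal sketch of a single pruning round, not the authors' implementation. It assumes that a filter's importance can be scored as the mean of |activation|^p over the batch and spatial dimensions of that filter's output feature map. The exponent `p`, the function names, and the fixed `prune_fraction` are illustrative assumptions; the paper's exact attention definition and its adaptive policies may differ.

```python
import numpy as np

def filter_attention_scores(activations, p=2):
    """Score each filter by the mean of |activation|**p (assumed definition).

    activations: array of shape (batch, filters, height, width).
    Returns an array of shape (filters,).
    """
    return np.mean(np.abs(activations) ** p, axis=(0, 2, 3))

def select_filters_to_prune(scores, prune_fraction):
    """Return the sorted indices of the lowest-scoring filters."""
    n_prune = int(len(scores) * prune_fraction)
    order = np.argsort(scores, kind="stable")  # ascending: weakest first
    return np.sort(order[:n_prune])

# Toy example: 6 filters, three of which produce near-zero activations.
rng = np.random.default_rng(0)
acts = rng.normal(size=(8, 6, 4, 4))
for weak in (1, 4, 5):
    acts[:, weak] *= 0.01  # hypothetical "unimportant" filters

scores = filter_attention_scores(acts)
pruned = select_filters_to_prune(scores, prune_fraction=0.5)
print(pruned.tolist())
```

In an iterative scheme, a round like this would be repeated, with fine-tuning between rounds and with the pruning fraction adjusted according to the accuracy, memory, or latency objective; those steps are omitted here.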

Cite

Text

Zhao et al. "Automatic Attention Pruning: Improving and Automating Model Pruning Using Attentions." Artificial Intelligence and Statistics, 2023.

Markdown

[Zhao et al. "Automatic Attention Pruning: Improving and Automating Model Pruning Using Attentions." Artificial Intelligence and Statistics, 2023.](https://mlanthology.org/aistats/2023/zhao2023aistats-automatic/)

BibTeX

@inproceedings{zhao2023aistats-automatic,
  title     = {{Automatic Attention Pruning: Improving and Automating Model Pruning Using Attentions}},
  author    = {Zhao, Kaiqi and Jain, Animesh and Zhao, Ming},
  booktitle = {Artificial Intelligence and Statistics},
  year      = {2023},
  pages     = {10470--10486},
  volume    = {206},
  url       = {https://mlanthology.org/aistats/2023/zhao2023aistats-automatic/}
}