S$^3$: Sign-Sparse-Shift Reparametrization for Effective Training of Low-Bit Shift Networks

Abstract

Shift neural networks reduce computational complexity by removing expensive multiplication operations and quantizing continuous weights into low-bit discrete values, making them fast and energy-efficient compared to conventional neural networks. However, existing shift networks are sensitive to weight initialization and suffer degraded performance due to the vanishing-gradient and weight-sign-freezing problems. To address these issues, we propose S$^3$ reparameterization, a novel technique for training low-bit shift networks. Our method decomposes a discrete parameter in a three-fold sign-sparse-shift manner. In this way, it efficiently learns a low-bit network whose weight dynamics resemble those of a full-precision network and which is insensitive to weight initialization. Our proposed training method pushes the boundaries of shift neural networks and shows that 3-bit shift networks compete with their full-precision counterparts in terms of top-1 accuracy on ImageNet.
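
The three-fold decomposition in the abstract can be made concrete. Below is a minimal PyTorch sketch, assuming each discrete weight is the product of a sign term in {-1, +1}, a sparsity gate in {0, 1}, and a power-of-two magnitude whose exponent is a sum of binary shift bits, with each factor driven by a continuous latent parameter trained through a straight-through estimator. The names (heaviside_ste, S3Weight, num_shift_bits) and the initialization scale are illustrative assumptions, not the authors' reference implementation.

import torch
import torch.nn as nn

def heaviside_ste(x: torch.Tensor) -> torch.Tensor:
    # Forward: hard {0, 1} step. Backward: identity gradient
    # (straight-through estimator), since the step itself has zero gradient.
    hard = (x > 0).float()
    return hard + (x - x.detach())

class S3Weight(nn.Module):
    # Sign-sparse-shift reparameterization of one weight tensor (a sketch).
    def __init__(self, shape, num_shift_bits: int = 2):
        super().__init__()
        # One continuous latent parameter behind each discrete factor.
        self.w_sign = nn.Parameter(0.01 * torch.randn(shape))
        self.w_sparse = nn.Parameter(0.01 * torch.randn(shape))
        self.w_shift = nn.Parameter(0.01 * torch.randn(num_shift_bits, *shape))

    def forward(self) -> torch.Tensor:
        sign = 2.0 * heaviside_ste(self.w_sign) - 1.0    # in {-1, +1}
        sparse = heaviside_ste(self.w_sparse)            # in {0, 1}
        shift = heaviside_ste(self.w_shift).sum(dim=0)   # in {0, ..., num_shift_bits}
        # Effective discrete weight: 0 or a signed power of two, e.g. {0, ±1, ±2, ±4}.
        return sign * sparse * torch.pow(2.0, shift)

A shift layer would call such a module on every forward pass to obtain its effective weight: the forward computation sees only discrete values, while the continuous latent parameters keep receiving dense gradients, which is how a reparameterization of this kind is meant to sidestep the weight-sign-freezing behavior described above.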

Cite

Text

Li et al. "S$^3$: Sign-Sparse-Shift Reparametrization for Effective Training of Low-Bit Shift Networks." Neural Information Processing Systems, 2021.

Markdown

[Li et al. "S$^3$: Sign-Sparse-Shift Reparametrization for Effective Training of Low-Bit Shift Networks." Neural Information Processing Systems, 2021.](https://mlanthology.org/neurips/2021/li2021neurips-signsparseshift/)

BibTeX

@inproceedings{li2021neurips-signsparseshift,
  title     = {{S$^3$: Sign-Sparse-Shift Reparametrization for Effective Training of Low-Bit Shift Networks}},
  author    = {Li, Xinlin and Liu, Bang and Yu, Yaoliang and Liu, Wulong and Xu, Chunjing and Nia, Vahid Partovi},
  booktitle = {Neural Information Processing Systems},
  year      = {2021},
  url       = {https://mlanthology.org/neurips/2021/li2021neurips-signsparseshift/}
}