Activate or Not: Learning Customized Activation

Abstract

We present a simple, effective, and general activation function we term ACON, which learns whether to activate the neurons or not. Interestingly, we find that Swish, the recent popular NAS-searched activation, can be interpreted as a smooth approximation to ReLU. In the same way, we approximate the more general Maxout family, which yields our novel ACON family; this remarkably improves performance and makes Swish a special case of ACON. Next, we present meta-ACON, which explicitly learns to optimize the parameter that switches between non-linear (activate) and linear (inactivate) modes and provides a new design space. By simply changing the activation function, we show its effectiveness on both small models and highly optimized large models (e.g., it improves ImageNet top-1 accuracy by 6.7% and 1.8% on MobileNet-0.25 and ResNet-152, respectively). Moreover, ACON transfers naturally to object detection and semantic segmentation, showing that it is an effective alternative across a variety of tasks. Code is available at https://github.com/nmaac/acon.
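To make the formulation in the abstract concrete, below is a minimal PyTorch-style sketch of an ACON-C-style unit, f(x) = (p1 - p2) * x * sigmoid(beta * (p1 - p2) * x) + p2 * x, and a meta-ACON variant where the switching factor beta is predicted from the input by a small bottleneck. This is an illustrative reading of the paper, not the authors' code (see the linked repository for the official implementation); the class names, per-channel parameter shapes, and the bottleneck ratio r are our assumptions.

```python
import torch
import torch.nn as nn


class AconC(nn.Module):
    """Sketch of an ACON-C-style activation:
    f(x) = (p1 - p2) * x * sigmoid(beta * (p1 - p2) * x) + p2 * x.
    p1, p2, beta are learnable per-channel parameters. Swish is the special
    case p1 = 1, p2 = 0; ReLU is approached as beta grows large."""

    def __init__(self, channels: int):
        super().__init__()
        self.p1 = nn.Parameter(torch.randn(1, channels, 1, 1))
        self.p2 = nn.Parameter(torch.randn(1, channels, 1, 1))
        self.beta = nn.Parameter(torch.ones(1, channels, 1, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        dpx = (self.p1 - self.p2) * x
        return dpx * torch.sigmoid(self.beta * dpx) + self.p2 * x


class MetaAconC(nn.Module):
    """Sketch of a meta-ACON-style unit: beta is generated per channel by a
    small bottleneck G(x) over globally pooled features (bottleneck ratio r is
    an assumed hyperparameter). beta near 0 makes the unit linear (inactivate);
    larger beta makes it non-linear (activate)."""

    def __init__(self, channels: int, r: int = 16):
        super().__init__()
        mid = max(r, channels // r)
        self.fc1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.fc2 = nn.Conv2d(mid, channels, kernel_size=1)
        self.p1 = nn.Parameter(torch.randn(1, channels, 1, 1))
        self.p2 = nn.Parameter(torch.randn(1, channels, 1, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Global average pool -> two 1x1 convs -> sigmoid gives beta in (0, 1).
        beta = torch.sigmoid(self.fc2(self.fc1(x.mean(dim=(2, 3), keepdim=True))))
        dpx = (self.p1 - self.p2) * x
        return dpx * torch.sigmoid(beta * dpx) + self.p2 * x


if __name__ == "__main__":
    # Quick shape check on a dummy feature map.
    x = torch.randn(2, 32, 8, 8)
    print(AconC(32)(x).shape, MetaAconC(32)(x).shape)
```

In this sketch, swapping AconC (or MetaAconC) in place of ReLU/Swish modules is the "simply changing the activation function" step the abstract refers to.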

Cite

Text

Ma et al. "Activate or Not: Learning Customized Activation." Conference on Computer Vision and Pattern Recognition, 2021. doi:10.1109/CVPR46437.2021.00794

Markdown

[Ma et al. "Activate or Not: Learning Customized Activation." Conference on Computer Vision and Pattern Recognition, 2021.](https://mlanthology.org/cvpr/2021/ma2021cvpr-activate/) doi:10.1109/CVPR46437.2021.00794

BibTeX

@inproceedings{ma2021cvpr-activate,
  title     = {{Activate or Not: Learning Customized Activation}},
  author    = {Ma, Ningning and Zhang, Xiangyu and Liu, Ming and Sun, Jian},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2021},
  pages     = {8032--8042},
  doi       = {10.1109/CVPR46437.2021.00794},
  url       = {https://mlanthology.org/cvpr/2021/ma2021cvpr-activate/}
}