Deep Network Approximation: Beyond ReLU to Diverse Activation Functions

Abstract

This paper explores the expressive power of deep neural networks for a diverse range of activation functions. An activation function set $\mathscr{A}$ is defined to encompass the majority of commonly used activation functions, such as $\mathtt{ReLU}$, $\mathtt{LeakyReLU}$, $\mathtt{ReLU}^2$, $\mathtt{ELU}$, $\mathtt{CELU}$, $\mathtt{SELU}$, $\mathtt{Softplus}$, $\mathtt{GELU}$, $\mathtt{SiLU}$, $\mathtt{Swish}$, $\mathtt{Mish}$, $\mathtt{Sigmoid}$, $\mathtt{Tanh}$, $\mathtt{Arctan}$, $\mathtt{Softsign}$, $\mathtt{dSiLU}$, and $\mathtt{SRS}$. We demonstrate that for any activation function $\varrho\in \mathscr{A}$, a $\mathtt{ReLU}$ network of width $N$ and depth $L$ can be approximated to arbitrary precision by a $\varrho$-activated network of width $3N$ and depth $2L$ on any bounded set. This finding enables the extension of most approximation results achieved with $\mathtt{ReLU}$ networks to a wide variety of other activation functions, albeit with slightly increased constants. Significantly, we establish that the (width, depth) scaling factors can be further reduced from $(3,2)$ to $(1,1)$ if $\varrho$ falls within a specific subset of $\mathscr{A}$. This subset includes activation functions such as $\mathtt{ELU}$, $\mathtt{CELU}$, $\mathtt{SELU}$, $\mathtt{Softplus}$, $\mathtt{GELU}$, $\mathtt{SiLU}$, $\mathtt{Swish}$, and $\mathtt{Mish}$.
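To see intuitively why a single smooth neuron can stand in for a ReLU neuron without increasing width or depth (the situation suggested by the $(1,1)$ scaling for $\mathtt{Softplus}$-type activations), consider the following classical scaling limit; it is offered here as illustration rather than as the paper's construction. For any $t > 0$,

$$
\frac{\mathtt{Softplus}(tx)}{t} = \frac{\ln\!\left(1+e^{tx}\right)}{t},
\qquad
\left|\frac{\mathtt{Softplus}(tx)}{t} - \mathtt{ReLU}(x)\right|
= \frac{\ln\!\left(1+e^{-t|x|}\right)}{t}
\le \frac{\ln 2}{t}
\quad \text{for all } x\in\mathbb{R},
$$

so the error decays uniformly as $t\to\infty$. Since the rescalings $x \mapsto tx$ and $y \mapsto y/t$ can be absorbed into the adjacent affine layers, each $\mathtt{ReLU}$ neuron can be emulated by a single $\mathtt{Softplus}$ neuron to arbitrary precision.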

Cite

Text

Shijun Zhang, Jianfeng Lu, and Hongkai Zhao. "Deep Network Approximation: Beyond ReLU to Diverse Activation Functions." Journal of Machine Learning Research, 25:1-39, 2024.

Markdown

[Shijun Zhang, Jianfeng Lu, and Hongkai Zhao. "Deep Network Approximation: Beyond ReLU to Diverse Activation Functions." Journal of Machine Learning Research, 25:1-39, 2024.](https://mlanthology.org/jmlr/2024/zhang2024jmlr-deep/)

BibTeX

@article{zhang2024jmlr-deep,
  title     = {{Deep Network Approximation: Beyond ReLU to Diverse Activation Functions}},
  author    = {Zhang, Shijun and Lu, Jianfeng and Zhao, Hongkai},
  journal   = {Journal of Machine Learning Research},
  year      = {2024},
  pages     = {1--39},
  volume    = {25},
  url       = {https://mlanthology.org/jmlr/2024/zhang2024jmlr-deep/}
}