Why Deep Neural Networks for Function Approximation?
Abstract
Recently, there has been much interest in understanding why deep neural networks are preferred to shallow networks. We show that, for a large class of piecewise smooth functions, the number of neurons needed by a shallow network to approximate a function is exponentially larger than the corresponding number of neurons needed by a deep network for a given degree of function approximation. First, we consider univariate functions on a bounded interval and require a neural network to achieve an approximation error of $\varepsilon$ uniformly over the interval. We show that shallow networks (i.e., networks whose depth does not depend on $\varepsilon$) require $\Omega(\text{poly}(1/\varepsilon))$ neurons, while deep networks (i.e., networks whose depth grows with $1/\varepsilon$) require $\mathcal{O}(\text{polylog}(1/\varepsilon))$ neurons. We then extend these results to certain classes of important multivariate functions. Our results are derived for neural networks which use a combination of rectified linear units (ReLUs) and binary step units, two of the most popular types of activation functions. Our analysis builds on a simple observation: the multiplication of two bits can be represented by a ReLU.
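The closing observation can be made concrete: for bits x, y in {0, 1}, the product x*y equals ReLU(x + y - 1) = max(x + y - 1, 0), since x + y - 1 is 1 only when both bits are 1 and is at most 0 otherwise. The following minimal Python sketch (illustrative only, not code from the paper; the helper names relu and bit_product are our own) checks this identity on all bit pairs.

def relu(z):
    # Rectified linear unit: max(z, 0).
    return max(z, 0)

def bit_product(x, y):
    # For x, y in {0, 1}, x * y = ReLU(x + y - 1):
    # x + y - 1 equals 1 only when both bits are 1, and is <= 0 otherwise.
    return relu(x + y - 1)

# Verify the identity on all four bit pairs.
for x in (0, 1):
    for y in (0, 1):
        assert bit_product(x, y) == x * y
print("ReLU(x + y - 1) reproduces x * y on every bit pair")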
Cite
Text
Liang and Srikant. "Why Deep Neural Networks for Function Approximation?" International Conference on Learning Representations, 2017. https://mlanthology.org/iclr/2017/liang2017iclr-deep/
BibTeX
@inproceedings{liang2017iclr-deep,
title = {{Why Deep Neural Networks for Function Approximation?}},
author = {Liang, Shiyu and Srikant, R.},
booktitle = {International Conference on Learning Representations},
year = {2017},
url = {https://mlanthology.org/iclr/2017/liang2017iclr-deep/}
}