Does Preprocessing Help Training Over-Parameterized Neural Networks?

Abstract

Deep neural networks have achieved impressive performance in many areas. Designing a fast and provable method for training neural networks is a fundamental question in machine learning. The classical training method requires paying $\Omega(mnd)$ cost for both the forward computation and the backward computation, where $m$ is the width of the neural network and we are given $n$ training points in $d$-dimensional space. In this paper, we propose two novel preprocessing ideas to bypass this $\Omega(mnd)$ barrier:

* First, by preprocessing the initial weights of the neural network, we can train the neural network at $\widetilde{O}(m^{1-\Theta(1/d)} n d)$ cost per iteration.
* Second, by preprocessing the input data points, we can train the neural network at $\widetilde{O}(m^{4/5} nd)$ cost per iteration.

From the technical perspective, our result is a sophisticated combination of tools from different fields: greedy-type convergence analysis in optimization, sparsity observations from practical work, high-dimensional geometric search in data structures, and concentration and anti-concentration in probability. Our results also provide theoretical insights into a large number of previously established fast training methods. In addition, our classical algorithm can be generalized to the quantum computation model. Interestingly, we can achieve a similar sublinear cost per iteration while avoiding preprocessing of either the initial weights or the input data points.
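The sparsity observation underlying the sublinear per-iteration bounds can be illustrated numerically. The sketch below is a toy illustration, not the paper's algorithm: the shifted-ReLU activation and the specific threshold `b` are assumptions for exposition. It shows that with random Gaussian initial weights, only a vanishing fraction of the $m$ neurons fire on a given input, so a geometric-search data structure that reports just the active neurons can avoid the naive $\Omega(md)$ scan per data point.

```python
import numpy as np

rng = np.random.default_rng(0)
m, d = 100_000, 16                    # network width m, input dimension d
W = rng.standard_normal((m, d))       # random initial weights, rows w_i ~ N(0, I_d)
x = rng.standard_normal(d)
x /= np.linalg.norm(x)                # unit-norm input, so <w_i, x> ~ N(0, 1)

# Hypothetical shift: with b = sqrt(0.8 ln m), the Gaussian tail bound gives
# P(<w_i, x> > b) ~ m^{-0.4}, so only ~ m^{0.6} of the m neurons are active.
b = np.sqrt(0.8 * np.log(m))
pre = W @ x                           # all m pre-activations (the naive O(md) scan)
active = int(np.sum(pre > b))         # neurons a shifted ReLU max(z - b, 0) would fire

print(f"{active} of {m} neurons active ({active / m:.4%})")
```

Here the active set is found by scanning all $m$ neurons; the point of preprocessing in the paper is to build a search structure over the weights (or the data) so that this small active set can be reported without the full scan.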

Cite

Text

Song et al. "Does Preprocessing Help Training Over-Parameterized Neural Networks?" Neural Information Processing Systems, 2021.

Markdown

[Song et al. "Does Preprocessing Help Training Over-Parameterized Neural Networks?" Neural Information Processing Systems, 2021.](https://mlanthology.org/neurips/2021/song2021neurips-preprocessing/)

BibTeX

@inproceedings{song2021neurips-preprocessing,
  title     = {{Does Preprocessing Help Training Over-Parameterized Neural Networks?}},
  author    = {Song, Zhao and Yang, Shuo and Zhang, Ruizhe},
  booktitle = {Neural Information Processing Systems},
  year      = {2021},
  url       = {https://mlanthology.org/neurips/2021/song2021neurips-preprocessing/}
}