Mean-Field Analysis for Learning Subspace-Sparse Polynomials with Gaussian Input
Abstract
In this work, we study the mean-field flow for learning subspace-sparse polynomials using stochastic gradient descent (SGD) and two-layer neural networks, where the input distribution is standard Gaussian and the output depends only on the projection of the input onto a low-dimensional subspace. We establish a necessary condition for SGD-learnability, involving both the characteristics of the target function and the expressiveness of the activation function. In addition, we prove that this condition is almost sufficient, in the sense that a slightly stronger condition guarantees exponential decay of the loss functional to zero.
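As an illustrative sketch of the setting the abstract describes — standard Gaussian inputs, a target polynomial depending only on a low-dimensional projection, and a two-layer network in mean-field scaling trained by online SGD — the snippet below shows one concrete instance. All specifics (dimensions, width, the tanh activation, the example degree-2 polynomial, and the learning rate) are assumptions for illustration, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, m = 20, 2, 512  # ambient dim, subspace dim, network width (all assumed)

# Orthonormal basis U of the k-dimensional relevant subspace.
U, _ = np.linalg.qr(rng.standard_normal((d, k)))

def target(x):
    """Subspace-sparse polynomial: depends on x only through z = U^T x."""
    z = x @ U
    return z[:, 0] ** 2 + z[:, 0] * z[:, 1]  # an example degree-2 polynomial

# Two-layer network f(x) = (1/m) * sum_i a_i * tanh(w_i . x + b_i),
# with the mean-field 1/m scaling over the m neurons.
a = rng.standard_normal(m)
W = rng.standard_normal((m, d)) / np.sqrt(d)
b = rng.standard_normal(m)

def model(x):
    return (np.tanh(x @ W.T + b) @ a) / m

def sgd_step(x, y, lr=0.05):
    """One SGD step on the squared loss for a mini-batch (x, y)."""
    global a, W, b
    act = np.tanh(x @ W.T + b)          # (n, m) activations
    err = (act @ a) / m - y             # (n,) residuals
    n = x.shape[0]
    grad_a = act.T @ err / (n * m)
    ga = (err[:, None] * a) * (1.0 - act ** 2) / (n * m)  # (n, m)
    grad_W = ga.T @ x
    grad_b = ga.sum(axis=0)
    # Mean-field scaling: per-neuron learning rate is lr * m.
    a -= lr * m * grad_a
    W -= lr * m * grad_W
    b -= lr * m * grad_b

x_test = rng.standard_normal((4096, d))
mse_init = np.mean((model(x_test) - target(x_test)) ** 2)

# Online SGD: each step draws fresh Gaussian samples.
for _ in range(3000):
    x = rng.standard_normal((64, d))
    sgd_step(x, target(x))

mse_final = np.mean((model(x_test) - target(x_test)) ** 2)
print(mse_final < mse_init)
```

This only illustrates the training dynamics whose infinite-width (mean-field) limit the paper analyzes; the paper's results concern conditions on the target polynomial and activation under which this loss provably decays to zero.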
Cite
Text
Chen and Ge. "Mean-Field Analysis for Learning Subspace-Sparse Polynomials with Gaussian Input." Neural Information Processing Systems, 2024. doi:10.52202/079017-1369
Markdown
[Chen and Ge. "Mean-Field Analysis for Learning Subspace-Sparse Polynomials with Gaussian Input." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/chen2024neurips-meanfield/) doi:10.52202/079017-1369
BibTeX
@inproceedings{chen2024neurips-meanfield,
title = {{Mean-Field Analysis for Learning Subspace-Sparse Polynomials with Gaussian Input}},
author = {Chen, Ziang and Ge, Rong},
booktitle = {Neural Information Processing Systems},
year = {2024},
doi = {10.52202/079017-1369},
url = {https://mlanthology.org/neurips/2024/chen2024neurips-meanfield/}
}