Learning a Single Neuron with Gradient Methods
Abstract
We consider the fundamental problem of learning a single neuron $\mathbf{x}\mapsto \sigma(\mathbf{w}^\top\mathbf{x})$ in a realizable setting, using standard gradient methods with random initialization, and under general families of input distributions and activations. On the one hand, we show that some assumptions on both the distribution and the activation function are necessary. On the other hand, we prove positive guarantees under mild assumptions, which go significantly beyond those studied in the literature so far. We also point out and study the challenges in further strengthening and generalizing our results.
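The abstract describes running a standard gradient method from random initialization to learn a single neuron $\mathbf{x}\mapsto \sigma(\mathbf{w}^\top\mathbf{x})$ in a realizable setting. The snippet below is a minimal sketch of that setup under assumptions the abstract does not fix: a ReLU activation, standard Gaussian inputs, the empirical squared loss, and plain full-batch gradient descent; all hyperparameters are illustrative, not the paper's.

```python
import numpy as np

# Minimal sketch: gradient descent on a single neuron sigma(w^T x) in a
# realizable setting. Activation (ReLU), input distribution (standard Gaussian),
# and step size are illustrative assumptions, not taken from the paper.

rng = np.random.default_rng(0)
d, n = 20, 5000
sigma = lambda z: np.maximum(z, 0.0)          # ReLU activation
sigma_grad = lambda z: (z > 0).astype(float)  # (sub)gradient of ReLU

w_star = rng.standard_normal(d)               # target neuron (realizable labels)
w_star /= np.linalg.norm(w_star)
X = rng.standard_normal((n, d))               # assumed input distribution
y = sigma(X @ w_star)                         # noiseless labels

w = rng.standard_normal(d) / np.sqrt(d)       # random initialization
lr = 0.1
for _ in range(500):
    pred = sigma(X @ w)
    # gradient of the empirical squared loss (1/2n) * sum_i (sigma(w^T x_i) - y_i)^2
    grad = X.T @ ((pred - y) * sigma_grad(X @ w)) / n
    w -= lr * grad

print("distance to target:", np.linalg.norm(w - w_star))
```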
Cite
Text
Yehudai and Shamir. "Learning a Single Neuron with Gradient Methods." Conference on Learning Theory, 2020.
Markdown
[Yehudai and Shamir. "Learning a Single Neuron with Gradient Methods." Conference on Learning Theory, 2020.](https://mlanthology.org/colt/2020/yehudai2020colt-learning/)
BibTeX
@inproceedings{yehudai2020colt-learning,
title = {{Learning a Single Neuron with Gradient Methods}},
author = {Yehudai, Gilad and Shamir, Ohad},
booktitle = {Conference on Learning Theory},
year = {2020},
pages = {3756-3786},
volume = {125},
url = {https://mlanthology.org/colt/2020/yehudai2020colt-learning/}
}