Proximal Mean Field Learning in Shallow Neural Networks
Abstract
We propose a custom learning algorithm for shallow over-parameterized neural networks, i.e., networks with a single hidden layer of infinite width. The infinite width of the hidden layer serves as an abstraction for the over-parameterization. Building on recent mean field interpretations of learning dynamics in shallow neural networks, we realize mean field learning as a computational algorithm, rather than as an analytical tool. Specifically, we design a Sinkhorn regularized proximal algorithm to approximate the distributional flow of the learning dynamics over weighted point clouds. In this setting, a contractive fixed point recursion computes the time-varying weights, numerically realizing the interacting Wasserstein gradient flow of the parameter distribution supported over the neuronal ensemble. An appealing aspect of the proposed algorithm is that the measure-valued recursions allow meshless computation. We demonstrate the proposed computational framework of interacting weighted particle evolution on binary and multi-class classification. Our algorithm performs gradient descent of the free energy associated with the risk functional.
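The computational core of a Sinkhorn regularized proximal recursion is a contractive fixed-point iteration over a Gibbs kernel built from the pairwise cost between particles. As an illustrative sketch only (a plain entropic optimal transport solve between two weighted point clouds in NumPy, not the paper's exact measure-valued proximal recursion; all names and parameter values here are hypothetical):

```python
import numpy as np

def sinkhorn(a, b, C, eps=1.0, n_iter=2000):
    """Entropic optimal transport between two weighted point clouds.

    a, b : marginal weight vectors (nonnegative, each summing to 1)
    C    : pairwise cost matrix of shape (len(a), len(b))
    eps  : entropic regularization strength
    """
    K = np.exp(-C / eps)      # Gibbs kernel of the cost
    u = np.ones_like(a)
    for _ in range(n_iter):   # contractive (Sinkhorn) fixed-point iteration
        v = b / (K.T @ u)     # rescale columns toward marginal b
        u = a / (K @ v)       # rescale rows toward marginal a
    return u[:, None] * K * v[None, :]  # entropic-optimal coupling

# Two small weighted point clouds standing in for parameter samples
rng = np.random.default_rng(0)
x, y = rng.normal(size=(5, 2)), rng.normal(size=(6, 2))
C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)  # squared Euclidean cost
a, b = np.full(5, 1 / 5), np.full(6, 1 / 6)
P = sinkhorn(a, b, C)
```

Each pass alternately rescales the kernel's rows and columns to match the prescribed marginals; in the paper's setting, an analogous fixed point delivers the time-varying particle weights at each proximal step of the Wasserstein gradient flow.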
Cite
Text
Teter et al. "Proximal Mean Field Learning in Shallow Neural Networks." Transactions on Machine Learning Research, 2024.
Markdown
[Teter et al. "Proximal Mean Field Learning in Shallow Neural Networks." Transactions on Machine Learning Research, 2024.](https://mlanthology.org/tmlr/2024/teter2024tmlr-proximal/)
BibTeX
@article{teter2024tmlr-proximal,
  title = {{Proximal Mean Field Learning in Shallow Neural Networks}},
  author = {Teter, Alexis and Nodozi, Iman and Halder, Abhishek},
  journal = {Transactions on Machine Learning Research},
  year = {2024},
  url = {https://mlanthology.org/tmlr/2024/teter2024tmlr-proximal/}
}