Learning Multi-Index Models with Neural Networks via Mean-Field Langevin Dynamics
Abstract
We study the problem of learning multi-index models in high dimensions using a two-layer neural network trained with the mean-field Langevin algorithm. Under mild distributional assumptions on the data, we characterize the effective dimension $d_{\mathrm{eff}}$ that controls both sample and computational complexity by utilizing the adaptivity of neural networks to latent low-dimensional structures. When the data exhibit such a structure, $d_{\mathrm{eff}}$ can be significantly smaller than the ambient dimension. We prove that the sample complexity grows almost linearly with $d_{\mathrm{eff}}$, bypassing the limitations of the information exponent or the leap complexity that appear in recent analyses of gradient-based feature learning. On the other hand, in the worst-case scenario the computational complexity inevitably grows exponentially with $d_{\mathrm{eff}}$.
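For intuition, here is a minimal, illustrative sketch of the algorithm the abstract refers to: the particle discretization of mean-field Langevin dynamics, i.e., noisy gradient descent on the neurons of a two-layer network, applied to a synthetic multi-index target $y = g(Ux)$. Everything in this snippet (the dimensions, the link function `g`, the tanh network, and all hyperparameters) is an assumption chosen for illustration, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative problem: multi-index target y = g(U x) with a
# k-dimensional relevant subspace inside ambient dimension d.
d, k, n, m = 50, 2, 2000, 512                        # assumed sizes
U = np.linalg.qr(rng.standard_normal((d, k)))[0].T   # k x d, orthonormal rows
g = lambda z: np.tanh(z[:, 0]) * np.tanh(z[:, 1])    # assumed link function

X = rng.standard_normal((n, d))
y = g(X @ U.T)

# Two-layer mean-field network: f(x) = (1/m) * sum_j a_j * tanh(<w_j, x> + b_j).
W = rng.standard_normal((m, d)) / np.sqrt(d)
a = rng.standard_normal(m)
b = np.zeros(m)

eta, lam = 0.05, 1e-3   # step size and entropy/L2 regularization strength

for step in range(2001):
    idx = rng.choice(n, size=256, replace=False)     # minibatch
    Xb, yb = X[idx], y[idx]
    act = np.tanh(Xb @ W.T + b)                      # (batch, m) activations
    err = act @ a / m - yb                           # squared-loss residual

    # First-variation (per-particle) gradient of the regularized risk
    # (1/2) E[(f(x) - y)^2] + (lam/2) * |theta_j|^2 at each neuron j.
    dact = (1.0 - act**2)
    grad_W = (err[:, None] * dact * a).T @ Xb / len(idx) + lam * W
    grad_a = err @ act / len(idx) + lam * a
    grad_b = (err[:, None] * dact * a).sum(0) / len(idx) + lam * b

    # Langevin step: gradient descent plus Gaussian noise whose scale
    # sqrt(2 * eta * lam) matches the entropy regularization -- the
    # defining feature of mean-field Langevin dynamics.
    W += -eta * grad_W + np.sqrt(2 * eta * lam) * rng.standard_normal(W.shape)
    a += -eta * grad_a + np.sqrt(2 * eta * lam) * rng.standard_normal(m)
    b += -eta * grad_b + np.sqrt(2 * eta * lam) * rng.standard_normal(m)

    if step % 500 == 0:
        print(f"step {step:4d}  train MSE {np.mean(err**2):.4f}")
```

In this discretization, the injected noise plays the role of the entropy term in the mean-field objective; with the noise removed, the update reduces to plain regularized gradient descent on the neuron parameters.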
Cite
Text
Mousavi-Hosseini et al. "Learning Multi-Index Models with Neural Networks via Mean-Field Langevin Dynamics." ICML 2024 Workshops: HiLD, 2024.
Markdown
[Mousavi-Hosseini et al. "Learning Multi-Index Models with Neural Networks via Mean-Field Langevin Dynamics." ICML 2024 Workshops: HiLD, 2024.](https://mlanthology.org/icmlw/2024/mousavihosseini2024icmlw-learning/)
BibTeX
@inproceedings{mousavihosseini2024icmlw-learning,
title = {{Learning Multi-Index Models with Neural Networks via Mean-Field Langevin Dynamics}},
author = {Mousavi-Hosseini, Alireza and Wu, Denny and Erdogdu, Murat A.},
booktitle = {ICML 2024 Workshops: HiLD},
year = {2024},
url = {https://mlanthology.org/icmlw/2024/mousavihosseini2024icmlw-learning/}
}