Learning Multi-Index Models with Neural Networks via Mean-Field Langevin Dynamics

Abstract

We study the problem of learning multi-index models in high dimensions using a two-layer neural network trained with the mean-field Langevin algorithm. Under mild distributional assumptions on the data, we characterize the effective dimension $d_{\mathrm{eff}}$ that controls both the sample and computational complexity, exploiting the adaptivity of neural networks to latent low-dimensional structure. When the data exhibit such structure, $d_{\mathrm{eff}}$ can be significantly smaller than the ambient dimension. We prove that the sample complexity grows almost linearly with $d_{\mathrm{eff}}$, bypassing the limitations of the information exponent and leap complexity that appear in recent analyses of gradient-based feature learning. On the other hand, the computational complexity may still grow exponentially with $d_{\mathrm{eff}}$ in the worst case.
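To make the setup concrete, here is a minimal sketch (not the paper's implementation) of mean-field Langevin training: noisy gradient descent with L2 regularization on the first-layer weights of a two-layer network, where the injected Gaussian noise corresponds to the entropy term in the mean-field objective. The target is a synthetic multi-index model $y = g(U^\top x)$; the dimensions, the tanh link, and all hyperparameters below are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

d, k, m, n = 32, 2, 512, 4096          # ambient dim, index dim, width, samples
U, _ = np.linalg.qr(rng.standard_normal((d, k)))   # latent low-dim subspace
X = rng.standard_normal((n, d))
y = np.tanh(X @ U).sum(axis=1)          # multi-index target y = g(U^T x)

W = rng.standard_normal((m, d)) / np.sqrt(d)       # first-layer neurons
a = np.full(m, 1.0 / m)                 # mean-field (1/m) second-layer scaling

eta, lam, beta_inv = 0.5, 1e-3, 1e-4    # step size, L2 penalty, temperature
for _ in range(500):
    Z = np.tanh(X @ W.T)                # (n, m) hidden activations
    resid = Z @ a - y                   # prediction error per sample
    # gradient of the mean squared loss w.r.t. each neuron's weight vector
    grad = ((resid[:, None] * (1.0 - Z**2)) * a[None, :]).T @ X / n
    # Langevin step: regularized gradient descent plus isotropic Gaussian noise
    W -= eta * (grad + lam * W)
    W += np.sqrt(2.0 * eta * beta_inv) * rng.standard_normal(W.shape)

print("train MSE:", np.mean((np.tanh(X @ W.T) @ a - y) ** 2))

The noise scale $\sqrt{2\eta\beta^{-1}}$ is the standard Euler discretization of Langevin dynamics at inverse temperature $\beta$; in the mean-field limit, each row of W is a particle approximating the neuron distribution that the dynamics optimizes over.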

Cite

Text

Mousavi-Hosseini et al. "Learning Multi-Index Models with Neural Networks via Mean-Field Langevin Dynamics." ICML 2024 Workshops: HiLD, 2024.

Markdown

[Mousavi-Hosseini et al. "Learning Multi-Index Models with Neural Networks via Mean-Field Langevin Dynamics." ICML 2024 Workshops: HiLD, 2024.](https://mlanthology.org/icmlw/2024/mousavihosseini2024icmlw-learning/)

BibTeX

@inproceedings{mousavihosseini2024icmlw-learning,
  title     = {{Learning Multi-Index Models with Neural Networks via Mean-Field Langevin Dynamics}},
  author    = {Mousavi-Hosseini, Alireza and Wu, Denny and Erdogdu, Murat A.},
  booktitle = {ICML 2024 Workshops: HiLD},
  year      = {2024},
  url       = {https://mlanthology.org/icmlw/2024/mousavihosseini2024icmlw-learning/}
}