Uniform-in-Time Propagation of Chaos for the Mean-Field Gradient Langevin Dynamics

Abstract

The mean-field Langevin dynamics is characterized by a stochastic differential equation that arises from (noisy) gradient descent on an infinite-width two-layer neural network, which can be viewed as an interacting particle system. In this work, we establish a quantitative weak propagation of chaos result for this system, with a finite-particle discretization error of $\mathcal{O}(1/N)$ *uniformly over time*, where $N$ is the width of the neural network. This allows us to directly transfer the optimization guarantee for infinite-width networks to practical finite-width models without excessive overparameterization. On the technical side, our analysis differs from most existing studies of similar mean-field dynamics in that we do not require the interaction between particles to be sufficiently weak in order to obtain a uniform propagation of chaos, because such assumptions may not be satisfied in neural network optimization. Instead, we make use of a logarithmic Sobolev-type condition that can be verified in appropriate regularized risk minimization settings.
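
For orientation, here is a minimal sketch of the two systems being compared, written in generic notation that is not fixed in the abstract itself (an entropy-regularized objective $F$ over probability measures with first variation $\frac{\delta F}{\delta \mu}$, noise intensity $\lambda > 0$, and independent Brownian motions $W_t$, $W_t^i$); the paper's precise objective, assumptions, and constants may differ. The infinite-width (mean-field) Langevin dynamics evolves a single representative neuron $X_t$ whose own law $\mu_t$ enters the drift,

$$
\mathrm{d}X_t = -\nabla \frac{\delta F}{\delta \mu}(\mu_t)(X_t)\,\mathrm{d}t + \sqrt{2\lambda}\,\mathrm{d}W_t,
\qquad \mu_t = \mathrm{Law}(X_t),
$$

while a width-$N$ network corresponds to the interacting particle system

$$
\mathrm{d}X_t^i = -\nabla \frac{\delta F}{\delta \mu}\big(\hat{\mu}_t^N\big)(X_t^i)\,\mathrm{d}t + \sqrt{2\lambda}\,\mathrm{d}W_t^i,
\qquad \hat{\mu}_t^N = \frac{1}{N}\sum_{j=1}^{N} \delta_{X_t^j},
\quad i = 1, \dots, N.
$$

A uniform-in-time weak propagation of chaos result controls the discrepancy between the two systems by a bound of the form $\sup_{t \ge 0} \big(\mathbb{E}\big[F(\hat{\mu}_t^N)\big] - F(\mu_t)\big) = \mathcal{O}(1/N)$, so that convergence guarantees for the mean-field limit $\mu_t$ transfer to the finite-width model with only an $\mathcal{O}(1/N)$ penalty, independent of the time horizon.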

Cite

Text

Suzuki et al. "Uniform-in-Time Propagation of Chaos for the Mean-Field Gradient Langevin Dynamics." International Conference on Learning Representations, 2023.

Markdown

[Suzuki et al. "Uniform-in-Time Propagation of Chaos for the Mean-Field Gradient Langevin Dynamics." International Conference on Learning Representations, 2023.](https://mlanthology.org/iclr/2023/suzuki2023iclr-uniformintime/)

BibTeX

@inproceedings{suzuki2023iclr-uniformintime,
  title     = {{Uniform-in-Time Propagation of Chaos for the Mean-Field Gradient Langevin Dynamics}},
  author    = {Suzuki, Taiji and Nitanda, Atsushi and Wu, Denny},
  booktitle = {International Conference on Learning Representations},
  year      = {2023},
  url       = {https://mlanthology.org/iclr/2023/suzuki2023iclr-uniformintime/}
}