Mirror Descent with Relative Smoothness in Measure Spaces, with Application to Sinkhorn and EM
Abstract
Many problems in machine learning can be formulated as optimizing a convex functional over a vector space of measures. This paper studies the convergence of the mirror descent algorithm in this infinite-dimensional setting. Defining Bregman divergences through directional derivatives, we derive the convergence of the scheme for relatively smooth and convex pairs of functionals. Such assumptions allow us to handle non-smooth functionals such as the Kullback--Leibler (KL) divergence. Applying our result to joint distributions and the KL divergence, we show that Sinkhorn's primal iterations for entropic optimal transport in the continuous setting correspond to a mirror descent, and we obtain a new proof of its (sub)linear convergence. We also show that Expectation Maximization (EM) can always formally be written as a mirror descent. When optimizing only over the latent distribution while fixing the mixture parameters -- which corresponds to the Richardson--Lucy deconvolution scheme in signal processing -- we derive sublinear rates of convergence.
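As an informal sketch (the notation here is ours, not necessarily the paper's): writing $F'(\mu_n)$ for the first variation of the objective $F$ at the current iterate and $D_\phi$ for the Bregman divergence of a potential $\phi$, one mirror descent step with step size $\gamma > 0$ reads

$$\mu_{n+1} \in \operatorname*{arg\,min}_{\mu} \; \gamma \,\langle F'(\mu_n),\, \mu - \mu_n \rangle + D_\phi(\mu \mid \mu_n).$$

When $\phi$ is the negative entropy, so that $D_\phi$ is the KL divergence, the update takes the multiplicative form $\mu_{n+1} \propto \mu_n\, e^{-\gamma F'(\mu_n)}$ (normalized to a probability measure); this is, informally, the structure behind the Sinkhorn and Richardson--Lucy iterations analyzed in the paper.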
Cite
Text
Aubin-Frankowski et al. "Mirror Descent with Relative Smoothness in Measure Spaces, with Application to Sinkhorn and EM." Neural Information Processing Systems, 2022.
Markdown
[Aubin-Frankowski et al. "Mirror Descent with Relative Smoothness in Measure Spaces, with Application to Sinkhorn and EM." Neural Information Processing Systems, 2022.](https://mlanthology.org/neurips/2022/aubinfrankowski2022neurips-mirror/)
BibTeX
@inproceedings{aubinfrankowski2022neurips-mirror,
  title = {{Mirror Descent with Relative Smoothness in Measure Spaces, with Application to Sinkhorn and EM}},
  author = {Aubin-Frankowski, Pierre-Cyril and Korba, Anna and Léger, Flavien},
  booktitle = {Neural Information Processing Systems},
  year = {2022},
  url = {https://mlanthology.org/neurips/2022/aubinfrankowski2022neurips-mirror/}
}