Averaging Stochastic Gradient Descent on Riemannian Manifolds

Abstract

We consider the minimization of a function defined on a Riemannian manifold $\mathcal{M}$ accessible only through unbiased estimates of its gradients. We develop a geometric framework to transform a sequence of slowly converging iterates generated from stochastic gradient descent (SGD) on $\mathcal{M}$ to an averaged iterate sequence with a robust and fast $O(1/n)$ convergence rate. We then present an application of our framework to geodesically-strongly-convex (and possibly Euclidean non-convex) problems. Finally, we demonstrate how these ideas apply to the case of streaming $k$-PCA, where we show how to accelerate the slow rate of the randomized power method (without requiring knowledge of the eigengap) into a robust algorithm achieving the optimal rate of convergence.
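To make the abstract's ingredients concrete, here is a minimal, illustrative Python sketch (not taken from the paper) of the simplest k = 1 streaming-PCA case: Riemannian SGD on the unit sphere with a slowly decaying step size, followed by a simple streaming geodesic average of the iterates. The sphere maps, the spiked-covariance data, and all names (sphere_exp, sphere_log, streaming_pca_averaged) are assumptions made for illustration; the paper's actual averaging construction and its streaming k-PCA algorithm are more involved than this toy version.

```python
import numpy as np

def sphere_exp(x, v):
    """Exponential map on the unit sphere: move from x along tangent vector v."""
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return x
    return np.cos(norm_v) * x + np.sin(norm_v) * (v / norm_v)

def sphere_log(x, y):
    """Logarithm map on the unit sphere: tangent vector at x pointing toward y."""
    proj = y - np.dot(x, y) * x
    norm_proj = np.linalg.norm(proj)
    if norm_proj < 1e-12:
        return np.zeros_like(x)
    theta = np.arccos(np.clip(np.dot(x, y), -1.0, 1.0))
    return theta * proj / norm_proj

def streaming_pca_averaged(samples, step=0.5, seed=0):
    """Riemannian SGD for the top eigenvector (k = 1) on the sphere, plus a
    streaming geodesic average of the iterates (toy stand-in for the paper's
    averaging framework)."""
    d = samples.shape[1]
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(d)
    x /= np.linalg.norm(x)
    x_bar = x.copy()
    for n, a in enumerate(samples, start=1):
        # Stochastic Riemannian gradient of the Rayleigh quotient at x:
        # project the Euclidean gradient (a a^T x) onto the tangent space at x.
        g = np.dot(a, x) * a
        g_tan = g - np.dot(x, g) * x
        # SGD (ascent) step with a slowly decaying step size.
        x = sphere_exp(x, (step / np.sqrt(n)) * g_tan)
        # Streaming geodesic average: move x_bar a fraction 1/(n+1) toward x.
        x_bar = sphere_exp(x_bar, sphere_log(x_bar, x) / (n + 1))
    return x, x_bar

# Toy usage: spiked covariance whose leading eigenvector is e_1.
rng = np.random.default_rng(1)
d, n = 20, 5000
spike = np.zeros(d)
spike[0] = 1.0
data = rng.standard_normal((n, d)) + 3.0 * rng.standard_normal((n, 1)) * spike
x_last, x_avg = streaming_pca_averaged(data)
print(abs(x_avg[0]))  # close to 1 if the averaged iterate aligns with the spike
```

In this sketch the last iterate behaves like plain Riemannian SGD with its slow rate, while the averaged iterate is the quantity that the paper's framework shows can achieve the fast and robust O(1/n) rate under geodesic strong convexity.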

Cite

Text

Tripuraneni et al. "Averaging Stochastic Gradient Descent on Riemannian Manifolds." Annual Conference on Computational Learning Theory, 2018.

Markdown

[Tripuraneni et al. "Averaging Stochastic Gradient Descent on Riemannian Manifolds." Annual Conference on Computational Learning Theory, 2018.](https://mlanthology.org/colt/2018/tripuraneni2018colt-averaging/)

BibTeX

@inproceedings{tripuraneni2018colt-averaging,
  title     = {{Averaging Stochastic Gradient Descent on Riemannian Manifolds}},
  author    = {Tripuraneni, Nilesh and Flammarion, Nicolas and Bach, Francis R. and Jordan, Michael I.},
  booktitle = {Annual Conference on Computational Learning Theory},
  year      = {2018},
  pages     = {650--687},
  url       = {https://mlanthology.org/colt/2018/tripuraneni2018colt-averaging/}
}