Optimizing Optimizers: Regret-Optimal Gradient Descent Algorithms

Abstract

This paper treats the task of designing optimization algorithms as an optimal control problem. Using regret as a metric for an algorithm’s performance, we study the existence, uniqueness, and consistency of regret-optimal algorithms. By providing first-order optimality conditions for the control problem, we show that regret-optimal algorithms must satisfy a specific structure in their dynamics, which we show is equivalent to performing \emph{dual-preconditioned gradient descent} on the value function generated by their regret. Using these optimal dynamics, we provide bounds on their rates of convergence to solutions of convex optimization problems. Though closed-form optimal dynamics cannot be obtained in general, we present fast numerical methods for approximating them, generating optimization algorithms which directly optimize their long-term regret. These are benchmarked against commonly used optimization algorithms to demonstrate their effectiveness.
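To make the regret metric concrete, here is a minimal sketch computing the cumulative regret of plain gradient descent on a fixed convex quadratic. The objective, step-size rule, and horizon are illustrative assumptions, not taken from the paper; it only shows the standard definition of regret as accumulated suboptimality against the best fixed point in hindsight.

```python
import numpy as np

# Illustrative only: cumulative regret of vanilla gradient descent on a
# hypothetical convex quadratic f(x) = 0.5 * x^T A x - b^T x.
rng = np.random.default_rng(0)
d, T = 5, 200

M = rng.standard_normal((d, d))
A = M @ M.T + np.eye(d)                  # positive-definite Hessian
b = rng.standard_normal(d)
x_star = np.linalg.solve(A, b)           # best fixed point in hindsight
eta = 1.0 / np.linalg.eigvalsh(A).max()  # step size 1/L, L = smoothness constant

def f(x):
    return 0.5 * x @ A @ x - b @ x

x = np.zeros(d)
regret = 0.0
for t in range(T):
    regret += f(x) - f(x_star)           # per-step suboptimality accumulates
    x = x - eta * (A @ x - b)            # gradient step

print(f"cumulative regret after {T} steps: {regret:.6f}")
```

A regret-optimal algorithm, in the paper's sense, would choose its update dynamics to make this accumulated quantity as small as possible over the horizon, rather than fixing a gradient-descent rule in advance.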

Cite

Text

Casgrain and Kratsios. "Optimizing Optimizers: Regret-Optimal Gradient Descent Algorithms." Conference on Learning Theory, 2021.

Markdown

[Casgrain and Kratsios. "Optimizing Optimizers: Regret-Optimal Gradient Descent Algorithms." Conference on Learning Theory, 2021.](https://mlanthology.org/colt/2021/casgrain2021colt-optimizing/)

BibTeX

@inproceedings{casgrain2021colt-optimizing,
  title     = {{Optimizing Optimizers: Regret-Optimal Gradient Descent Algorithms}},
  author    = {Casgrain, Philippe and Kratsios, Anastasis},
  booktitle = {Conference on Learning Theory},
  year      = {2021},
  pages     = {883--926},
  volume    = {134},
  url       = {https://mlanthology.org/colt/2021/casgrain2021colt-optimizing/}
}