Optimizing Optimizers: Regret-Optimal Gradient Descent Algorithms
Abstract
This paper treats the task of designing optimization algorithms as an optimal control problem. Using regret as a metric for an algorithm’s performance, we study the existence, uniqueness and consistency of regret-optimal algorithms. By providing first-order optimality conditions for the control problem, we show that regret-optimal algorithms must satisfy a specific structure in their dynamics, which is equivalent to performing \emph{dual-preconditioned gradient descent} on the value function generated by their regret. Using these optimal dynamics, we provide bounds on their rates of convergence to solutions of convex optimization problems. Though closed-form optimal dynamics cannot be obtained in general, we present fast numerical methods for approximating them, generating optimization algorithms which directly optimize their long-term regret. These are benchmarked against commonly used optimization algorithms to demonstrate their effectiveness.
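As a point of reference for the regret framing used in the abstract, the sketch below is not the paper's regret-optimal dynamics (which must be approximated numerically from the control problem); it only illustrates, under assumed choices, how cumulative regret of a plain preconditioned gradient-descent iteration can be measured on a convex quadratic. The quadratic, the preconditioners, the step sizes, and the horizon are all hypothetical values picked for illustration, not taken from the paper.

```python
import numpy as np

# Illustrative convex quadratic f(x) = 0.5 * x^T A x - b^T x (assumed, not from the paper).
A = np.array([[3.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, -2.0])
x_star = np.linalg.solve(A, b)            # fixed comparator: the minimizer of f
f = lambda x: 0.5 * x @ A @ x - b @ x
grad = lambda x: A @ x - b

def preconditioned_gd(x0, P, eta, T):
    """Run x_{t+1} = x_t - eta * P grad f(x_t) and accumulate the regret
    sum_t [f(x_t) - f(x_star)] against the fixed minimizer x_star."""
    x, regret = x0.copy(), 0.0
    for _ in range(T):
        regret += f(x) - f(x_star)
        x = x - eta * P @ grad(x)
    return x, regret

x0 = np.array([5.0, 5.0])
# Compare an identity preconditioner with an inverse-Hessian-style preconditioner.
_, regret_gd  = preconditioned_gd(x0, np.eye(2),        eta=0.1, T=100)
_, regret_pre = preconditioned_gd(x0, np.linalg.inv(A), eta=0.5, T=100)
print(f"plain GD regret:          {regret_gd:.4f}")
print(f"preconditioned GD regret: {regret_pre:.4f}")
```

Running the sketch shows how a better-matched preconditioner lowers cumulative regret on this toy problem; the paper's contribution is to characterize and approximate the dynamics that make this quantity optimal.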
Cite
Text
Casgrain and Kratsios. "Optimizing Optimizers: Regret-Optimal Gradient Descent Algorithms." Conference on Learning Theory, 2021.
Markdown
[Casgrain and Kratsios. "Optimizing Optimizers: Regret-Optimal Gradient Descent Algorithms." Conference on Learning Theory, 2021.](https://mlanthology.org/colt/2021/casgrain2021colt-optimizing/)
BibTeX
@inproceedings{casgrain2021colt-optimizing,
title = {{Optimizing Optimizers: Regret-Optimal Gradient Descent Algorithms}},
author = {Casgrain, Philippe and Kratsios, Anastasis},
booktitle = {Conference on Learning Theory},
year = {2021},
pages = {883--926},
volume = {134},
url = {https://mlanthology.org/colt/2021/casgrain2021colt-optimizing/}
}