Adaptive Regret for Control of Time-Varying Dynamics

Abstract

We consider the problem of online control of systems with time-varying linear dynamics. To state meaningful guarantees over changing environments, we introduce the metric of adaptive regret to the field of control. This metric, originally studied in online learning, measures performance in terms of regret against the best policy in hindsight on any interval in time, and thus captures the adaptation of the controller to changing dynamics. Our main contribution is a novel efficient meta-algorithm: it converts a controller with sublinear regret bounds into one with sublinear adaptive regret bounds in the setting of time-varying linear dynamical systems. The underlying technical innovation is the first adaptive regret bound for the more general framework of online convex optimization with memory. Furthermore, we give a lower bound showing that our attained adaptive regret bound is nearly tight for this general framework.
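
As a rough sketch of what the metric measures, the standard notion of adaptive regret from online learning can be adapted to control as below. The notation is an assumption made for illustration and is not taken from the paper: $c_t$ denotes the cost at time $t$, $(x_t, u_t)$ the state and control produced by the controller, $\Pi$ a comparator policy class, and $(x_t^{\pi}, u_t^{\pi})$ the trajectory that a fixed policy $\pi$ would have produced.

```latex
% Illustrative definition; symbols c_t, \Pi, x_t^{\pi}, u_t^{\pi} are assumed notation.
\[
  \mathrm{AdaptiveRegret}_T
  \;=\;
  \max_{1 \le r \le s \le T}
  \left[
    \sum_{t=r}^{s} c_t(x_t, u_t)
    \;-\;
    \min_{\pi \in \Pi} \sum_{t=r}^{s} c_t\!\left(x_t^{\pi}, u_t^{\pi}\right)
  \right]
\]
```

Choosing the interval $[r, s] = [1, T]$ recovers the usual regret over the whole horizon, so under this definition adaptive regret is never smaller than standard regret. Because the comparator $\pi$ may differ from one interval to the next, a bound on this quantity implies that the controller competes with the best policy for each stretch of the changing dynamics.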

Cite

Text

Gradu et al. "Adaptive Regret for Control of Time-Varying Dynamics." Proceedings of The 5th Annual Learning for Dynamics and Control Conference, 2023.

Markdown

[Gradu et al. "Adaptive Regret for Control of Time-Varying Dynamics." Proceedings of The 5th Annual Learning for Dynamics and Control Conference, 2023.](https://mlanthology.org/l4dc/2023/gradu2023l4dc-adaptive/)

BibTeX

@inproceedings{gradu2023l4dc-adaptive,
  title     = {{Adaptive Regret for Control of Time-Varying Dynamics}},
  author    = {Gradu, Paula and Hazan, Elad and Minasyan, Edgar},
  booktitle = {Proceedings of The 5th Annual Learning for Dynamics and Control Conference},
  year      = {2023},
  pages     = {560--572},
  volume    = {211},
  url       = {https://mlanthology.org/l4dc/2023/gradu2023l4dc-adaptive/}
}