Adaptive Regret for Control of Time-Varying Dynamics
Abstract
We consider the problem of online control of systems with time-varying linear dynamics. To state meaningful guarantees over changing environments, we introduce the metric of adaptive regret to the field of control. This metric, originally studied in online learning, measures performance in terms of regret against the best policy in hindsight on any interval in time, and thus captures the adaptation of the controller to changing dynamics. Our main contribution is a novel efficient meta-algorithm: it converts a controller with sublinear regret bounds into one with sublinear adaptive regret bounds in the setting of time-varying linear dynamical systems. The underlying technical innovation is the first adaptive regret bound for the more general framework of online convex optimization with memory. Furthermore, we give a lower bound showing that our attained adaptive regret bound is nearly tight for this general framework.
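To make the metric concrete, the following is a minimal sketch of how adaptive regret could be computed by brute force from recorded losses. It is an illustration under simplifying assumptions, not the paper's algorithm: the comparator class is a small finite set, each comparator's per-round loss is given directly (so state and memory effects of a control policy are ignored), and the names `adaptive_regret`, `learner_losses`, and `comparator_losses` are hypothetical.

```python
from typing import Sequence


def adaptive_regret(
    learner_losses: Sequence[float],
    comparator_losses: Sequence[Sequence[float]],
) -> float:
    """Largest regret over any contiguous interval of rounds.

    learner_losses[t] is the loss the learner paid at round t.
    comparator_losses[k][t] is the loss fixed comparator k would have
    paid at round t (assumption: independent of earlier rounds).
    """
    horizon = len(learner_losses)
    worst = float("-inf")
    for start in range(horizon):
        learner_total = 0.0
        comparator_totals = [0.0] * len(comparator_losses)
        # Extend the interval [start, end] one round at a time.
        for end in range(start, horizon):
            learner_total += learner_losses[end]
            for k, losses in enumerate(comparator_losses):
                comparator_totals[k] += losses[end]
            # Regret on this interval against its own best comparator.
            interval_regret = learner_total - min(comparator_totals)
            worst = max(worst, interval_regret)
    return worst


# Toy environment that switches halfway: comparator 0 is best early,
# comparator 1 is best late. The learner never adapts and keeps
# following comparator 0.
comparators = [[0.0, 0.0, 1.0, 1.0], [1.0, 1.0, 0.0, 0.0]]
learner = [0.0, 0.0, 1.0, 1.0]

# Regret over the whole horizon only: learner total minus best single total.
full_horizon_regret = sum(learner) - min(sum(c) for c in comparators)
interval_wise_regret = adaptive_regret(learner, comparators)
```

In this toy run the learner looks optimal over the full horizon, yet it pays the maximum possible regret on the last two rounds, which is the kind of failure to adapt that an interval-wise metric is meant to expose.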
Cite
Text
Gradu et al. "Adaptive Regret for Control of Time-Varying Dynamics." Proceedings of The 5th Annual Learning for Dynamics and Control Conference, 2023.
Markdown
[Gradu et al. "Adaptive Regret for Control of Time-Varying Dynamics." Proceedings of The 5th Annual Learning for Dynamics and Control Conference, 2023.](https://mlanthology.org/l4dc/2023/gradu2023l4dc-adaptive/)
BibTeX
@inproceedings{gradu2023l4dc-adaptive,
title = {{Adaptive Regret for Control of Time-Varying Dynamics}},
author = {Gradu, Paula and Hazan, Elad and Minasyan, Edgar},
booktitle = {Proceedings of The 5th Annual Learning for Dynamics and Control Conference},
year = {2023},
pages = {560-572},
volume = {211},
url = {https://mlanthology.org/l4dc/2023/gradu2023l4dc-adaptive/}
}