Online Linear Quadratic Control
Abstract
We study the problem of controlling linear time-invariant systems with known noisy dynamics and adversarially chosen quadratic losses. We present the first efficient online learning algorithms in this setting that guarantee $O(\sqrt{T})$ regret under mild assumptions, where $T$ is the time horizon. Our algorithms rely on a novel SDP relaxation for the steady-state distribution of the system. Crucially, and in contrast to previously proposed relaxations, the feasible solutions of our SDP all correspond to “strongly stable” policies that mix exponentially fast to a steady state.
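The abstract's notion of a policy that "mixes exponentially fast to a steady state" can be illustrated numerically. The sketch below is a hypothetical example (not the paper's SDP relaxation): for a hand-picked stable linear policy $u_t = Kx_t$ on a small noisy system, the state covariance converges geometrically to the fixed point of a discrete Lyapunov recursion, which gives the policy's long-run average quadratic cost. All matrices (`A`, `B`, `K`, `Q`, `R`, `W`) are illustrative values chosen for this sketch.

```python
import numpy as np

# Hypothetical system (illustrative, not from the paper): a discretized
# double integrator x_{t+1} = A x_t + B u_t + w_t with noise covariance W.
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])
W = 0.01 * np.eye(2)   # process-noise covariance
Q = np.eye(2)          # quadratic state cost
R = np.eye(1)          # quadratic control cost

# A fixed linear policy u_t = K x_t, with K chosen by hand so that the
# closed-loop matrix A + BK is stable (spectral radius < 1).
K = np.array([[-1.0, -2.0]])
M = A + B @ K
assert max(abs(np.linalg.eigvals(M))) < 1.0

# A stable policy mixes to a steady-state distribution: its covariance is
# the fixed point of the Lyapunov recursion  Sigma <- M Sigma M^T + W,
# and the iterates converge geometrically (the "mixing" in the abstract).
Sigma = np.zeros((2, 2))
for _ in range(2000):
    Sigma = M @ Sigma @ M.T + W

# Long-run average cost of the policy under the steady-state distribution.
avg_cost = np.trace((Q + K.T @ R @ K) @ Sigma)
print(avg_cost)
```

The paper's SDP relaxation optimizes directly over such steady-state distributions, with constraints ensuring every feasible point corresponds to a strongly stable policy; this sketch only evaluates one fixed policy.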
Cite
Text
Cohen et al. "Online Linear Quadratic Control." International Conference on Machine Learning, 2018.
Markdown
[Cohen et al. "Online Linear Quadratic Control." International Conference on Machine Learning, 2018.](https://mlanthology.org/icml/2018/cohen2018icml-online/)
BibTeX
@inproceedings{cohen2018icml-online,
title = {{Online Linear Quadratic Control}},
author = {Cohen, Alon and Hassidim, Avinatan and Koren, Tomer and Lazic, Nevena and Mansour, Yishay and Talwar, Kunal},
booktitle = {International Conference on Machine Learning},
year = {2018},
pages = {1029-1038},
volume = {80},
url = {https://mlanthology.org/icml/2018/cohen2018icml-online/}
}