Online Linear Quadratic Control

Abstract

We study the problem of controlling linear time-invariant systems with known noisy dynamics and adversarially chosen quadratic losses. We present the first efficient online learning algorithms in this setting that guarantee $O(\sqrt{T})$ regret under mild assumptions, where $T$ is the time horizon. Our algorithms rely on a novel SDP relaxation for the steady-state distribution of the system. Crucially, and in contrast to previously proposed relaxations, the feasible solutions of our SDP all correspond to “strongly stable” policies that mix exponentially fast to a steady state.
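The SDP relaxation mentioned in the abstract can be sketched as follows (a reconstruction from the standard LQ setting, not a verbatim statement from the paper): for dynamics $x_{t+1} = A x_t + B u_t + w_t$ with noise covariance $W$ and quadratic cost matrices $Q, R$, one optimizes over the steady-state joint covariance $\Sigma$ of the state-action pair $(x_t, u_t)$.

```latex
\begin{align*}
\min_{\Sigma \succeq 0} \quad & \operatorname{Tr}\!\left(
  \begin{pmatrix} Q & 0 \\ 0 & R \end{pmatrix} \Sigma \right) \\
\text{s.t.} \quad & \Sigma_{xx} =
  \begin{pmatrix} A & B \end{pmatrix} \Sigma
  \begin{pmatrix} A & B \end{pmatrix}^{\!\top} + W,
\qquad \operatorname{Tr}(\Sigma) \le \nu ,
\end{align*}
```

where $\Sigma_{xx}$ denotes the state block of $\Sigma$ and $\nu$ is a bound assumed for illustration. A linear policy can then be read off as $K = \Sigma_{ux} \Sigma_{xx}^{-1}$; the trace constraint is what rules out marginally stable solutions, so feasible points correspond to the "strongly stable" policies the abstract refers to.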

Cite

Text

Cohen et al. "Online Linear Quadratic Control." International Conference on Machine Learning, 2018.

Markdown

[Cohen et al. "Online Linear Quadratic Control." International Conference on Machine Learning, 2018.](https://mlanthology.org/icml/2018/cohen2018icml-online/)

BibTeX

@inproceedings{cohen2018icml-online,
  title     = {{Online Linear Quadratic Control}},
  author    = {Cohen, Alon and Hassidim, Avinatan and Koren, Tomer and Lazic, Nevena and Mansour, Yishay and Talwar, Kunal},
  booktitle = {International Conference on Machine Learning},
  year      = {2018},
  pages     = {1029--1038},
  volume    = {80},
  url       = {https://mlanthology.org/icml/2018/cohen2018icml-online/}
}